From: "Jimmy Zhang" Newsgroups: comp.arch.fpga Subject: hand placement Lines: 13 X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Message-ID: NNTP-Posting-Host: 12.234.57.72 X-Complaints-To: abuse@attbi.com X-Trace: rwcrnsc53 1017913635 12.234.57.72 (Thu, 04 Apr 2002 09:47:15 GMT) NNTP-Posting-Date: Thu, 04 Apr 2002 09:47:15 GMT Organization: AT&T Broadband Date: Thu, 04 Apr 2002 09:47:15 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.cwix.com!wn2feed!worldnet.att.net!204.127.198.204!attbi_feed4!attbi.com!rwcrnsc53.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16096 Just keep hearing about this hand placement thing, don't know how it is done in reality. Does someone actually use their hands to do the placement as opposed to CAD based P&R. Any hints? -- ----------------------------------------------------- Click here for Free Video!! 
http://www.gohip.com/freevideo/ ###### Message-ID: <3CAC8546.C278B121@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 31 Date: Thu, 04 Apr 2002 16:53:54 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017939234 68.15.41.165 (Thu, 04 Apr 2002 11:53:54 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 11:53:54 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!out.nntp.be!propagator-SanJose!in.nntp.be!nntp-relay.ihug.net!ihug.co.nz!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16092 Hand placement means you direct where the logic goes instead of allowing the automatic placement to do it. Floorplanning the design can make significant gains in performance, density and power consumption. We have typically seen 50+% improvement in max clock rates as a result of floorplanning (hand placing) a design. To do this, work hierarchically and place what you can in the source so that you don't have to manually place every instance in the floorplanner. Jimmy Zhang wrote: > Just keep hearing about this hand placement thing, don't know how it > is done in reality. Does someone actually use their hands to do the > placement as opposed to CAD based P&R. Any hints? > > -- > ----------------------------------------------------- > Click here for Free Video!! > http://www.gohip.com/freevideo/ -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." 
-Benjamin Franklin, 1759 ###### From: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 4 Apr 2002 19:14:25 +0000 (UTC) Organization: Unknown Lines: 13 Message-ID: References: NNTP-Posting-Host: soda.csua.berkeley.edu X-Trace: agate.berkeley.edu 1017947665 23719 128.32.247.226 (4 Apr 2002 19:14:25 GMT) X-Complaints-To: usenet@agate.berkeley.edu NNTP-Posting-Date: Thu, 4 Apr 2002 19:14:25 +0000 (UTC) Originator: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!diablo.netcom.net.uk!netcom.net.uk!deine.net!news-out.spamkiller.net!propagator-la!news-in-la.newsfeeds.com!news-hog.berkeley.edu!ucberkeley!agate.berkeley.edu!agate!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16122 In article , Kevin Brace wrote: > I have seen one Xilinx employee in this newsgroup saying that >automatic P&R is getting better, so low level tools like floorplanner or >FPGA Editor is getting less important. Everytime Xilinx/Altera/Tool people say this, I have to laugh. There is so much low hanging fruit in datapath recognition, which the tools fail MISERABLY to recognise. A simple first order pass, align up the datapath, can be such a win. -- Nicholas C. 
Weaver nweaver@cs.berkeley.edu ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 12:53:46 -0800 Organization: None Lines: 62 Sender: kevinbraceusenet@hotmail.com Message-ID: References: NNTP-Posting-Host: st-66-99-47-20.wheaton.lib.il.us Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1017946857 12331 66.99.47.20 (4 Apr 2002 19:00:57 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Thu, 4 Apr 2002 19:00:57 +0000 (UTC) X-Mailer: Mozilla 4.75 [en] (Win95; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16126 I have seen one Xilinx employee in this newsgroup saying that automatic P&R is getting better, so low level tools like floorplanner or FPGA Editor is getting less important. That can be true to some extent, but still, automatic P&R is so bad that, when I have to reduce setup time (Tsu) of my PCI IP core, I still have to rely on floorplanner. In theory, I can route my design many times to improve the timings, but typically, the improvement seems to end after the 10th routing, and after that, things don't improve at all. What I discovered through wasting lots of time routing my design multiple times is that the problem of Xilinx or Altera's P&R tool is that the tool doesn't place the timing critical LUTs and FFs in the right place, or relevant LUTs and FFs within a CLB (in Xilinx) or a LAB (in Altera). Because the timing critical LUTs and FFs are placed physically so far away from the destination (typically FFs), routing it multiple times just won't save the design, because the path will have greater routing delay inevitably. That's when the designer has to force the placement to certain location by using a floorplanner. 
If you are using ISE WebPACK, click on "View Floorplanner" after you P&R your design. Use the UCF flow if you are using Floorplanner for the first time. You should download the Xilinx Floorplanner manual before trying it, but it only explains how the thing works, and it doesn't include anything like a tutorial. There is no tutorial available from Xilinx (I asked about one here several months ago, but no one gave me a reply. It turns out Xilinx doesn't really have such a tutorial.), but if you are going to use Floorplanner and target Virtex architecture FPGAs, including Spartan-II, keep all relevant LUTs within a CLB because the routing delay within a CLB is small. Getting out of a CLB costs a lot in terms of routing delay, but the delay to a horizontally adjacent CLB is still fairly small. Another obvious piece of advice is to keep signal path distances to a minimum, because the greater the distance, the greater the routing delay. Also, weren't you looking for a low cost PCI card? Insight Electronics recently released an upgraded version of the Spartan-II PCI card, and the new one is a little more expensive ($225) than the older one, but it has a bigger chip, and has more stuff on the card. http://www.insight.na.memec.com/cgi-bin/bvutf8/memec/scripts/local/mc_loc_b.jsp?Div=INSIGHT&Reg=AMERICAS&Country=UNITED_STATES&Lang=EN&EDOID=187428 Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) Jimmy Zhang wrote: > > Just keep hearing about this hand placement thing, don't know how it > is done in reality. Does someone actually use their hands to do the > placement as opposed to CAD based P&R. Any hints? > > -- > ----------------------------------------------------- > Click here for Free Video!! 
> http://www.gohip.com/freevideo/ ###### Reply-To: "Steve Casselman" From: "Steve Casselman" Newsgroups: comp.arch.fpga References: Subject: Re: hand placement Lines: 30 Organization: Virtual Computer Corporation X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Message-ID: NNTP-Posting-Host: 64.174.106.246 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr13.news.prodigy.com 1017954855 ST000 64.174.106.246 (Thu, 04 Apr 2002 16:14:15 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 16:14:15 EST X-UserInfo1: [[OQB\SDJSSURWH]^JKBOW@@YJ_ZTB\MV@BL\QMIWIWTEPIB_NVUAH_[BL[\IRKIANGGJBFNJF_DOLSCENSY^U@FRFUEXR@KFXYDBPWBCDQJA@X_DCBHXR[C@\EOKCJLED_SZ@RMWYXYWE_P@\\GOIW^@SYFFSWHFIXMADO@^[ADPRPETLBJ]RDGENSKQQZN Date: Thu, 04 Apr 2002 21:14:15 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!xara.net!gxn.net!news.tele.dk!small.news.tele.dk!207.115.63.138!newscon04.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr13.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16115 The way the placer works is to do a random placement. Then it takes a net (usually in alphabetical order) and estimates the wire distance to all the pins it is connected to. This is the "cost" of each net. Then it takes two components (LUTs or flops) and swaps them. If the cost is lower it keeps the swap; otherwise it doesn't. The placement can be really improved just by placing a few LUTs or flops. The placed components act like "attractors" for the rest of the components connected to them. I had a chance to look at the old ppr code. I was able to speed up the cost function by 9.8x by putting the function in hardware. Steve "Nicholas Weaver" wrote in message news:a8i8mh$n57$1@agate.berkeley.edu... 
> In article , > Kevin Brace wrote: > > I have seen one Xilinx employee in this newsgroup saying that > >automatic P&R is getting better, so low level tools like floorplanner or > >FPGA Editor is getting less important. > > Everytime Xilinx/Altera/Tool people say this, I have to laugh. > > There is so much low hanging fruit in datapath recognition, which the > tools fail MISERABLY to recognise. A simple first order pass, align > up the datapath, can be such a win. > -- > Nicholas C. Weaver nweaver@cs.berkeley.edu ###### From: Peter Alfke Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 14:12:38 -0800 Organization: Xilinx Lines: 11 Message-ID: <3CACCFD6.75823C66@xilinx.com> References: Reply-To: peter.alfke@xilinx.com NNTP-Posting-Host: peter.xsj.xilinx.com Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353" Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.77C-CCK-MCD {C-UDP; EBM-APPLE} (Macintosh; U; PPC) X-Accept-Language: en To: Steve Casselman Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!wn1feed!wn2feed!worldnet.att.net!204.127.198.203!attbi_feed3!attbi.com!12.120.28.17!attla2!attla1!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16133 I argued ten years ago, and I am still convinced: The human brain is better than any computer in recognizing the underlying structure ( and thus drive some basic hand placement). But a computer is much better at the tedious job of routing. That's why routers have become very good, but the placer is still the problem child. And a bad placement is very difficult to remedy later. 
Peter Alfke ###### From: "Tim" Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 4 Apr 2002 23:58:18 +0100 Message-ID: <1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk> References: NNTP-Posting-Host: tile.demon.co.uk X-NNTP-Posting-Host: tile.demon.co.uk:158.152.50.250 X-Trace: news.demon.co.uk 1017961909 nnrp-01:26884 NO-IDENT tile.demon.co.uk:158.152.50.250 X-Complaints-To: abuse@demon.net X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Lines: 10 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!newsfeed.icl.net!dispose.news.demon.net!news.demon.co.uk!demon!tile.demon.co.uk!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16175 Steve Casselman wrote: > I had a chance to look at the old > ppr code. I was able to speed the cost function by 9.8x by putting the > function in hardware. Sounds interesting. What did you do? 
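A rough sketch of the placer loop Steve Casselman describes earlier in the thread: do a random placement, score each net by estimated wire length, swap two components, and keep the swap only if the total cost drops. Everything here is an assumption made for illustration, not the ppr or par code: the rectangular grid of sites, the half-perimeter bounding box as the wire-length estimate, letting a swap involve an empty site, and the names `net_cost`, `total_cost` and `greedy_swap_place`. Many real placers also accept some cost-increasing swaps (simulated annealing), which this sketch leaves out. Cells passed in `fixed` are never moved, which is one way to picture a few hand-placed LUTs or flops acting as "attractors".

```python
import random

def net_cost(net, pos):
    """Half-perimeter of the bounding box around a net's cells (assumed estimate)."""
    xs = [pos[c][0] for c in net]
    ys = [pos[c][1] for c in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_cost(nets, pos):
    """Sum of the per-net costs for the whole placement."""
    return sum(net_cost(net, pos) for net in nets)

def _swap_sites(occupant, pos, s1, s2):
    """Exchange whatever sits on two sites (either may be empty) and update pos."""
    occupant[s1], occupant[s2] = occupant[s2], occupant[s1]
    for site in (s1, s2):
        if occupant[site] is not None:
            pos[occupant[site]] = site

def greedy_swap_place(cells, nets, width, height, fixed=None, tries=2000, seed=0):
    """Random start, then keep only swaps that lower the total cost."""
    rng = random.Random(seed)
    fixed = dict(fixed or {})          # hand-placed cells: cell -> (x, y)
    taken = set(fixed.values())
    free_sites = [(x, y) for x in range(width) for y in range(height)
                  if (x, y) not in taken]
    movable = [c for c in cells if c not in fixed]
    if len(movable) > len(free_sites):
        raise ValueError("not enough free sites for the movable cells")

    rng.shuffle(free_sites)                       # random initial placement
    occupant = dict.fromkeys(free_sites)          # site -> cell or None
    occupant.update(zip(free_sites, movable))
    pos = dict(fixed)                             # cell -> site
    pos.update((cell, site) for site, cell in occupant.items() if cell is not None)

    cost = total_cost(nets, pos)
    if len(free_sites) < 2:
        return pos, cost
    for _ in range(tries):
        s1, s2 = rng.sample(free_sites, 2)
        _swap_sites(occupant, pos, s1, s2)
        new_cost = total_cost(nets, pos)
        if new_cost < cost:
            cost = new_cost                       # keep the improving swap
        else:
            _swap_sites(occupant, pos, s1, s2)    # undo it
    return pos, cost
```

Calling `greedy_swap_place(cells, nets, 4, 4, fixed={"a": (0, 0)})` pins the hypothetical cell `a` at a corner and lets the loop pull its neighbours toward it. Recomputing the full cost on every swap is the slow, obvious version; an incremental update over only the affected nets would be the usual refinement.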
###### Reply-To: "Steve Casselman" From: "Steve Casselman" Newsgroups: comp.arch.fpga References: <1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk> Subject: Re: hand placement Lines: 28 Organization: Virtual Computer Corporation X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Message-ID: NNTP-Posting-Host: 64.174.106.246 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr13.news.prodigy.com 1017963595 ST000 64.174.106.246 (Thu, 04 Apr 2002 18:39:55 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 18:39:55 EST X-UserInfo1: [[PAPDCAOHV[RPT^XZKBOFTBTR\B@GXLN@GZ_GYO^BSZUSAANVUEAE[YETZPIWWI[FCIZA^NBFXZ_D[BFNTCNVPDTNTKHWXKB@X^B_OCJLPZ@ET_O[G\XSG@E\G[ZKVLBL^CJINM@I_KVIOR\T_M_AW_M[_BWU_HFA_]@A_A^SGFAUDE_DFTMQPFWVW[QPJN Date: Thu, 04 Apr 2002 23:39:55 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!fr.usenet-edu.net!usenet-edu.net!freenix!sunqbc.risq.qc.ca!cyclone2.usenetserver.com!usenetserver.com!newscon06.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr13.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16134 I took the cost function put it in hardware and ran the database past it several times. The cost function accounted for 30% of the placer performance. That part of the placer took about 1/3 of a xc4010. From my analysis I concluded that ppr could be speed up by 10x and would take about 50K gates. This holds to the normal 90/10 rule. Of course Xilinx was moving over to par at the time and they concluded that they didn't need the speedup. After spending a lot of time with the code I'm convinced that P&R is a sure bet for acceleration. Now with the PPC and Virtex II I'm sure that over all speedups of 8-10x would be pretty straight forward. I estimate about 2 man years of work and a design with 4-8 gig on board would do it. 
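As a back-of-the-envelope check on these figures, Amdahl's law relates the speedup of one part of a program to the speedup of the whole. The sketch below assumes "30% of the placer performance" means 30% of placer run time; the function names are made up for illustration. The post does not say which other parts of ppr would be moved into hardware to reach the 10x estimate, so the code only shows what the stated numbers imply on their own.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    is made `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def fraction_needed(target, local_speedup):
    """Fraction of run time that must be accelerated by `local_speedup`
    to reach `target` overall; None if the target cannot be reached."""
    if local_speedup <= 1.0 or target > local_speedup:
        return None
    return (1.0 - 1.0 / target) / (1.0 - 1.0 / local_speedup)

# Figures from the post: cost function is 30% of the placer, 9.8x faster in hardware.
print(round(overall_speedup(0.30, 9.8), 2))       # about 1.37x for the placer as a whole

# Even with an infinitely fast accelerator, 10x overall needs 90% of the run time covered.
print(fraction_needed(10.0, float("inf")))        # 0.9
```

Accelerating the cost function alone therefore gives roughly 1.37x under this reading, and a 10x overall figure requires at least 90% of the run time to be accelerated, which lines up with the 90/10 rule mentioned in the post.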
Steve "Tim" wrote in message news:1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk... > Steve Casselman wrote: > > I had a chance to look at the old > > ppr code. I was able to speed the cost function by 9.8x by putting the > > function in hardware. > > Sounds interesting. What did you do? > > > > ###### Message-ID: <3CACEF37.DCA68151@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 46 Date: Fri, 05 Apr 2002 00:25:58 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017966358 68.15.41.165 (Thu, 04 Apr 2002 19:25:58 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 19:25:58 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!netnews.com!nntp.abs.net!news-out.visi.com!hermes.visi.com!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16140 We've heard the "place and route is good enough you don't need to do floorplanning unless you are doing the 1% designs from hell" line for as far back as I can remember from Xilinx. Fact is, floorplanning seems to be getting larger gains, not smaller, with the new devices. I typically see 50-70% performance improvement over a automatic placement. Routing multiple times without running placement is not going get much in the way of performance gains. The router does a pretty decent job if the placement is good, and can't do much to salvage a poor placement. Xilinx, as a company, promotes not using the floorplanner probably to avoid a feeling that the devices are more difficult to design in (which is not the case, in fact the ability to improve performance and density through floorplanning is a big plus). 
Floorplanning has always been the closet case, and from the looks of it will continue to be. Therefore you get poor documentation, lots of bugs, and very low priority on getting the bugs fixed compared with the rest of the software package. Until floorplanning becomes a mainstream design event, I doubt it will ever be anything more than the poor cousin no one will admit to having. Unfortunately, the mainstream doesn't use it because a) they are told they don't need it*, that it is only there for the FAEs to get you out of trouble in special cases, b) they don't know the benefits because those are not told to them and the tool is not easy to learn without doing it a lot (and living with numerous bugs), and c) even if someone convinces them to use it, the documentation is next to useless as far as learning how to floorplan. Part of the problem is that floorplanning is sort of like putting together a puzzle with many acceptable solutions. Some people have the knack for it, some don't, and if you don't, you will probably never acquire it. Kevin Brace wrote: > I have seen one Xilinx employee in this newsgroup saying that > automatic P&R is getting better, so low level tools like floorplanner or > FPGA Editor is getting less important. > That can be true to some extent, but still, automatic P&R is so bad > that, when I have to reduce setup time (Tsu) of my PCI IP core, I still > have to rely on floorplanner. > and so on... > > > ----------------------------------------------------- > > Click here for Free Video!! > > http://www.gohip.com/freevideo/ -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." 
-Benjamin Franklin, 1759 ###### Message-ID: <3CACFBB3.1C73@designtools.co.nz> From: Jim Granville Reply-To: jim.granville@designtools.co.nz Organization: Mandeno Granville elect X-Mailer: Mozilla 3.0C-XTRA (Win95; I) MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 26 Date: Fri, 05 Apr 2002 13:19:47 +1200 NNTP-Posting-Host: 203.79.98.93 X-Complaints-To: abuse@tsnz.net X-Trace: news02.tsnz.net 1017970415 203.79.98.93 (Fri, 05 Apr 2002 13:33:35 NZST) NNTP-Posting-Date: Fri, 05 Apr 2002 13:33:35 NZST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!news02.tsnz.net!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16152 Steve Casselman wrote: > > I took the cost function put it in hardware and ran the database past it > several times. The cost function accounted for 30% of the placer > performance. That part of the placer took about 1/3 of a xc4010. From my > analysis I concluded that ppr could be speed up by 10x and would take about > 50K gates. This holds to the normal 90/10 rule. Of course Xilinx was moving > over to par at the time and they concluded that they didn't need the > speedup. After spending a lot of time with the code I'm convinced that P&R > is a sure bet for acceleration. Now with the PPC and Virtex II I'm sure that > over all speedups of 8-10x would be pretty straight forward. I estimate > about 2 man years of work and a design with 4-8 gig on board would do it. > > Steve If I have this right, you are talking about using a VirtexPRO as an engine to route VirtexPRO (et al) ?. This becomes the silicon equivalent of the 'compiler bootstrap' :-) Maybe it's also a problem, the 'solution of 4 PPCs' is looking for ? 
Xilinx could sell route-boxes, and it would make a pretty impressive product demonstrator... -jg ###### Message-ID: <3CACFD0D.1B5EB315@iprimus.com.au> From: Russell Shaw X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Original-NNTP-Posting-Host: 202.138.59.87 Lines: 10 X-Original-NNTP-Posting-Host: 127.0.0.1 Organization: iPrimus Customer - reports relating to abuse should be sent to abuse@iprimus.com.au Date: Fri, 05 Apr 2002 01:24:38 GMT NNTP-Posting-Host: 203.134.67.67 X-Complaints-To: news@primus.ca X-Trace: news.tor.primus.ca 1017969878 203.134.67.67 (Thu, 04 Apr 2002 20:24:38 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 20:24:38 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!feed.newsfeeds.com!feed.cgocable.net!feed.tor.primus.ca!feed.nntp.primus.ca!news.tor.primus.ca!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16148 Ray Andraka wrote: > > We've heard the "place and route is good enough you don't need to > do floorplanning unless you are doing the 1% designs from hell" line for as far back > as I can remember from Xilinx... From what i can tell, quartus2 doesn't even have a way of doing manual routing. 
###### Reply-To: "Kelvin Xu Qijun" From: "Kelvin Xu Qijun" Newsgroups: comp.arch.fpga References: Subject: Re: hand placement Date: Fri, 5 Apr 2002 09:33:10 +0800 Lines: 26 Organization: OTCS Singapore X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 NNTP-Posting-Host: 203.116.53.151 Message-ID: <3cacff43@news.starhub.net.sg> X-Trace: 5 Apr 2002 09:34:59 +0800, 203.116.53.151 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newscore.univie.ac.at!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-out.nuthinbutnews.com!feed-ev1!propagator-sterling!news-in.nuthinbutnews.com!news.stealth.net!news.stealth.net!newsvr.starhub.net.sg!news.starhub.net.sg!203.116.53.151 Xref: chonsp.franklin.ch comp.arch.fpga:16153 Nicholas, pls stop laugh since they have made it possible that you stay on your job for a while... I feel the laziness of CAD companies has saved many jobs in IC design area...:-) Email: qijun@okigrp.com.sg "Nicholas Weaver" wrote in message news:a8i8mh$n57$1@agate.berkeley.edu... > In article , > Kevin Brace wrote: > > I have seen one Xilinx employee in this newsgroup saying that > >automatic P&R is getting better, so low level tools like floorplanner or > >FPGA Editor is getting less important. > > Everytime Xilinx/Altera/Tool people say this, I have to laugh. > > There is so much low hanging fruit in datapath recognition, which the > tools fail MISERABLY to recognise. A simple first order pass, align > up the datapath, can be such a win. > -- > Nicholas C. 
Weaver nweaver@cs.berkeley.edu ###### Message-ID: <3CAD0269.C22D5B98@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> <3CACFD0D.1B5EB315@iprimus.com.au> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 38 Date: Fri, 05 Apr 2002 01:47:50 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017971270 68.15.41.165 (Thu, 04 Apr 2002 20:47:50 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 20:47:50 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-out.visi.com!hermes.visi.com!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16136 My point exactly a little later in the post. (Although I am told there is a real floorplanner capability in Quartus now...I haven't seen it. In Maxplus, the floorplanner was really only useful as a viewer since you couldn't apply your 'floorplan' to a future iteration.). I can sort of see Xilinx's point. A potential customer could point at ALtera's tools and say why should I buy yours that I need to floorplan when I can buy ALtera's that doesn't even have a floorplanner (so it must not be needed). Fact is, there is much more than meets the eye here. There is a fundamental difference in the routing structure. The altera routing structure happens to be less sensitive to placement, but as a result you don't have the capability of using very short high speed routes (outside of a LAB that is). Xilinx has shorter interconnects that permit a higher top speed in data flow designs, but in order to gain that benefit you need to exercise more care in the design and layout. Unfortunately, these types of distinctions seem to be lost on the marketing folks. 
Russell Shaw wrote: > Ray Andraka wrote: > > > > We've heard the "place and route is good enough you don't need to > > do floorplanning unless you are doing the 1% designs from hell" line for as far back > > as I can remember from Xilinx... > > From what i can tell, quartus2 doesn't even have > a way of doing manual routing. -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759 ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 20:15:11 -0600 Organization: None Lines: 190 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: 1cust82.tnt75.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1017972341 21982 67.195.57.82 (5 Apr 2002 02:05:41 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 02:05:41 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16163 Ray Andraka wrote: > > We've heard the "place and route is good enough you don't need to do > floorplanning unless you are doing the 1% designs from hell" line for as far back > as I can remember from Xilinx. Ray, I thought it was, "5% designs from hell." http://www.eetimes.com/story/OEG20000211S0011 Although my PCI IP core didn't used to meet Tsu < 7ns without floorplanning it in Spartan-II XC2S150-5 a few months ago, with improvements in the logic design, I got it to a point where I only have to route it once at the highest routing effort to meet the setup time requirement. 
So, I guess my design is no longer a "5% designs from hell," but I have to admit that for 66MHz PCI, floorplanning is a must. (I already figured out how that secret PCILOGIC thing works, and I now only need to know how Bitgen's /Gclkdel option works to insert more delay on the global clock buffer . . . ) > Fact is, floorplanning seems to be getting larger gains, > not smaller, with the new devices. I typically see 50-70% > performance improvement over a automatic placement. > I guess Spartan-II is not really a new device, but in my case, I have seen 20% to 30% improvement in setup time when I manually floorplanned timing critical unregistered input control signals of PCI. (e.g., FRAME#, IRDY#, DEVSEL#) > Routing multiple times without running placement is not going get much in the way of performance gains. The router does a pretty decent job if the > placement is good, and can't do much to salvage a poor placement. > The idea of routing the design multiple times came from an article I read in Xcell magazine a while ago where a guy from Lucent said he sometimes routes the design over the weekend multiple times, and picks the one that's useful. I believe it was for a network router or something like that which utilized an XCV800 and an XCV150. So, I tried it, and once I ran it for about 10 hours while I was sleeping, and what I saw was that the improvement in timing score hit a plateau after 10 to 15 iterations. After that disappointment, I think that's when I started to use Floorplanner to look at how the design was getting mapped, and I quickly realized why the timings weren't improving at all. It was because the relevant LUTs were getting spread so far apart that no matter how many times I routed the design, things weren't going to get better. 
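One way to picture why the extra routing passes plateau: once the placement is frozen, no route between two cells can be shorter than the grid (Manhattan) distance between them, so that distance puts a floor under what the router can recover. The sketch below is a simplified model with hypothetical cell names and CLB coordinates; it assumes routing follows the CLB grid and ignores which routing resources are actually used, so it says nothing about real delays in nanoseconds.

```python
def manhattan(a, b):
    """Grid distance between two (column, row) CLB coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def path_length_floor(path, placement):
    """Shortest possible total route length, in CLB hops, along a timing path.
    Rerouting cannot go below this; only moving the cells can."""
    return sum(manhattan(placement[src], placement[dst])
               for src, dst in zip(path, path[1:]))

# Hypothetical critical path: input pad -> two LUTs -> destination flip-flop.
path = ["pad_frame", "lut_a", "lut_b", "ff_state"]

# Automatic placement that scattered the related logic across the chip.
scattered = {"pad_frame": (0, 0), "lut_a": (12, 7), "lut_b": (3, 15), "ff_state": (14, 2)}

# Hand placement: both LUTs share one CLB next to the pad, flip-flop in the adjacent CLB.
packed = {"pad_frame": (0, 0), "lut_a": (1, 0), "lut_b": (1, 0), "ff_state": (2, 0)}

print(path_length_floor(path, scattered))   # 60
print(path_length_floor(path, packed))      # 2
```

In this toy model the scattered placement can never do better than 60 hops however many times it is routed, while the packed one needs 2.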
> Xilinx, as a company, promotes not using the floorplanner probably > to avoid a feeling that the devices are more difficult to design > in (which is not > the case, in fact the ability to improve performance and density > through floorplanning is a big plus). My conspiracy theory about Xilinx's attitude is that the more waste there is in a design because manual floorplanning wasn't done, the more users will have to buy larger parts and faster speed grades, which cost more, and that will certainly make Xilinx richer. I used to think a Spartan-II-5 was too slow for 33MHz PCI, and would have preferred that Insight Electronics put a faster speed grade -6 part on the Spartan-II PCI card, even though Xilinx can get it right with speed grade -5. It turns out my logic design was bad, and I can now easily meet Tsu < 7ns with only automatic P&R. > Floorplanning has always been the closet case, > and from the looks of it will continue to be. Therefore you get poor documentation, lots of bugs, and very low priority on getting the bugs fixed > compared with the rest of the software package. I posted a question about a floorplanner tutorial in this newsgroup several months ago, but no one replied. It turns out no such thing exists, so I just decided to place the relevant LUTs where I felt they belonged, and I saw a huge improvement in setup time. > Until floorplanning becomes a mainstream design event, I doubt it > will ever be anything more than the > poor cousin no one will admit to having. I personally doubt that floorplanning will ever become a mainstream tool, but it still has to exist because the automatic P&R tool is soooo bad. However, each time I click on a LUT of a path violating timing requirements in Timing Analyzer, the floorplanner suddenly zooms in, and each time I have to manually zoom out to get a view of where the thing is located on the chip. Is there a way to disable this? It is really annoying, and killing my productivity. 
> Unfortunately, the mainstream doesn't use it because a) they > are told they don't need it*, that it is only > there for the FAEs to get you out of trouble in special cases, Another conspiracy theory I came up with will be that more the users are discouraged from doing manual floorplanning, more likely they will rely on Xilinx for engineering services when something goes wrong (i.e., The design doesn't meet timings or doesn't fit inside a specific chip.), but I guess that creates more opportunity (?) for consultants to make some bucks, too. However, this "It's an FAE's tool" mentality seems to create a myth that only Xilinx or really experienced users can use it, and that myth seems to be similar to a PCILOGIC myth people talk about when a question comes up about "The special IRDY and TRDY pin for PCI in Virtex." People always say only Xilinx knows how PCILOGIC works, but should I shatter the PCILOGIC myth for once and for all by posting a sample code, and some explanation on how to instantiate it and actually use it? > b) They don't know the benefits because those are not told to > them and the tool is not > easy to learn without doing it alot (and living with numerous bugs), > and My conspiracy theory seems to hold that the dumber the users, the more they will have to pay for chips, tools, and services. > c) Even if someone convinces them to use it, the documentation > is next to > useless as far as learning how to floorplan. I think Xilinx has to release a tutorial on how to use a floorplanner, and I will also like to see more information on what the routing is like inside a Virtex architecture chip (Altera documentation is slightly better since it has more figures about the routing structure in their datasheets.) Although, I suppose if I had access to FPGA Editor, I won't complain about it . . . > Part of the problem is that floorplanning is sort of like > putting together a puzzle with many acceptable > solutions. 
Some people have the knack for it, some don't and if you don't you will probably not inherit it ever. > > > -- At least, I am glad that Xilinx didn't remove Floorplanner from ISE WebPACK. Without Floorplanner, the design wouldn't have met the timings a couple of months ago. However, I don't appreciate the fact that FPGA Editor doesn't come with ISE WebPACK. Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 21:03:13 -0600 Organization: None Lines: 99 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CACFD0D.1B5EB315@iprimus.com.au> <3CAD0269.C22D5B98@andraka.com> NNTP-Posting-Host: 1cust82.tnt75.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1017975223 22279 67.195.57.82 (5 Apr 2002 02:53:43 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 02:53:43 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16161 Ray Andraka wrote: > > My point exactly a little later in the post. (Although I am told there is a real > floorplanner capability in Quartus now...I haven't seen it. > In Maxplus, the floorplanner > was really only useful as a viewer since you couldn't apply your 'floorplan' to a future > iteration.). Quartus II got the floorplanner, but I think it is broken unless I am doing something wrong. For example, recently, I was able to assign several (two FFs) to a single LE. Of course, that's a no-no considering that only one FF exists for one LE. But the floorplanner doesn't stop me from doing that, and only at fitting stage (P&R stage) the tool will complain. 
That doesn't happen in Xilinx Floorplanner because you cannot place a FF to a location where already a FF was assigned to. Another problem which I don't know what to do is when I assign multiple FFs (up to four) to a LAB (not talking a single LE here, a single LAB), I have seen fitter failing to place my design. If I cannot constrain certain timing critical FFs to a certain LAB, that's really bad because I am at the mercy of the horrible automatic fitter. > I can sort of see Xilinx's point. A potential customer could point > at > ALtera's tools and say why should I buy yours that I need to floorplan when I can buy > ALtera's that doesn't even have a floorplanner (so it must not be needed). > I just hate marketers who bend the truth. Not having a floorplanner can really make life miserable when the automatic P&R tool does a poor job. Because my PCI IP core is written in Verilog without using any vendor specific (device specific or synthesis tool specific) features, I often target FLEX10KE in Quartus II 2.0 Web Edition to see how it does. I am an Altera nay-sayer in general from my bad experiences with their tools (Altera tools seem to be less reliable than Xilinx tools. Sure, Xilinx tools have their own bugs, but I haven't seen too many of them, but I have seen Quartus II 2.0 WE's router crash several times when routing my design.), and to make the matter worse, Altera's P&R tool's placement seems much worse than Xilinx's P&R tool. (Pretty slow, too.) One of the reason I think the placement is bad is because of lack of multiple IOE FFs, and not letting active low output FFs reside inside an IOE (That seems to have been fixed in APEX20KE, but I am still targeting FLEX10KE here.), so output and OE (Output Enable) FFs can get placed anywhere in the chip, and the P&R tool seems to place it at really bad locations (Locations really far away from the pin.) compared to Xilinx tools which tries to place the thing close to the pin. 
Of course, Spartan-II's IOBs are much better than FLEX10KE's, so I always use IOB FFs, and that probably makes the timings more predictable. Another weakness of FLEX10KE in PCI I think is the fact that FLEX10KE IOE FF doesn't have an asynchronous preset input, almost all of the PCI control signals are active low (i.e., FRAME#, IRDY#, DEVSEL#, etc.), and that makes my life very hard when I target FLEX10KE. Why do I care about FLEX10KE/ACEX1K? That's because that's the last Altera chip that officially support 5V PCI, and most desktops still use 5V PCI. > Fact is, there is much more than meets the eye here. There is a fundamental difference in > the routing structure. The altera routing structure happens to be less sensitive to > placement, but as a result you don't have the capability of using very short high speed > routes (outside of a LAB that is). Xilinx has shorter interconnects that permit a higher > top speed in data flow designs, but in order to gain that benefit you need to exercise > more care in the design and layout. Unfortunately, these types of distinctions seem to be > lost on the marketing folks. > Ray, I thought you said in the past that Altera's chips are good in random logic, but from my experience dealing with Spartan-II and FLEX10KE, Spartan-II seems better even in random logic and setup time. Altera's device seems to be better in fmax (Okay, maybe that's random logic.), but for 33MHz PCI, frequency above 33MHz is not too important. I believe Altera claims in their datasheets that their LUT-based PLDs (Please, just call it FPGAs.) have more predictable routing delays than FPGAs (Probably means Xilinx.), but I find Spartan-II's routing delays far more predictable. With the stuff I wrote, I am sure I will get firestorm of criticism from die-hard Altera fans now (I have gotten an unprofessional personal attack recently.), but I stand behind what I said here. (Unless I misunderstood something.) 
Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) ###### Message-ID: <3CAD173C.71FB0077@attbi.com> From: Phil Hays Organization: phil-hays at above domain X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 44 NNTP-Posting-Host: 12.230.124.133 X-Complaints-To: abuse@attbi.com X-Trace: sccrnsc01 1017976243 12.230.124.133 (Fri, 05 Apr 2002 03:10:43 GMT) NNTP-Posting-Date: Fri, 05 Apr 2002 03:10:43 GMT Date: Fri, 05 Apr 2002 03:10:43 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!xmission!sunqbc.risq.qc.ca!newsfeed.mathworks.com!wn3feed!worldnet.att.net!204.127.198.204!attbi_feed4!attbi_feed3!attbi.com!sccrnsc01.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16179 Jimmy Zhang wrote: > Just keep hearing about this hand placement thing, don't know how it > is done in reality. Does someone actually use their hands to do the > placement as opposed to CAD based P&R. Any hints? The first way I learned to do this was with a paper diagram of the target chip, writing the constraints with a text editor, and coloring on the paper to indicate what had been put where. I didn't do the best of jobs (had an register reversed, with the msb where the lsb should be), but it was still ~30% faster resulting clock speed than the automatic placement. Made place and route times drop nicely as well. It was even better than that once I got the twist removed. But this is as close to "by hand" as I can picture. The floorplanner that Xilinx provides is just a nicely automated way of doing the same sort of puzzle. Do the data path(s) first, fit things together in a "logical" fashion, and for the first one floor plan at least plan on spending some time fiddling. 
Some people seem to get this skill right away, and some take longer. A slightly "higher level" way of gaining much of the benefit from floorplanning with potentially rather less effort is to use a "physical design" tool. Synplicity had the first ("Amplify") aimed at FPGA design (and I'm not sure if Mentor, Synopsys or anyone else have anything in this space yet), however there were physical design tools for ASIC design long before Amplify. These work by putting large chunks of the design into subsections of the target chip. Synopsys's ASIC physical design tool set: http://www.synopsys.com/products/phy_syn/phy_syn.html Amplify is at: http://www.synplicity.com/products/amplify.html -- Phil Hays ###### From: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 5 Apr 2002 03:13:21 +0000 (UTC) Organization: Unknown Lines: 26 Message-ID: References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: soda.csua.berkeley.edu X-Trace: agate.berkeley.edu 1017976401 42733 128.32.247.226 (5 Apr 2002 03:13:21 GMT) X-Complaints-To: usenet@agate.berkeley.edu NNTP-Posting-Date: Fri, 5 Apr 2002 03:13:21 +0000 (UTC) Originator: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!enews.sgi.com!news-hog.berkeley.edu!ucberkeley!agate.berkeley.edu!agate!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16155 In article <3CACEF37.DCA68151@andraka.com>, Ray Andraka wrote: >We've heard the "place and route is good enough you don't need to do >floorplanning unless you are doing the 1% designs from hell" line for >as far back as I can remember from Xilinx. Fact is, floorplanning >seems to be getting larger gains, not smaller, with the new devices. >I typically see 50-70% performance improvement over a automatic >placement. 
I actually assert that a lot of the problem is right in the synthesis, not in the P&R: A lot of work has been done which, given a datapath, constructs an orderly layout. Yet none of this seems to have found its way into the commercial synthesis toolflows. So much information is thrown away. >Routing multiple times without running placement is not going to get >much in the way of performance gains. The router does a pretty >decent job if the placement is good, and can't do much to salvage a >poor placement. This is especially true when you consider just how rich the modern FPGA interconnects are: Compare the number of wires/LUT in an XC4000 with a 4000XL with a Virtex. -- Nicholas C. Weaver nweaver@cs.berkeley.edu ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 21:13:24 -0600 Organization: None Lines: 28 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CACFD0D.1B5EB315@iprimus.com.au> NNTP-Posting-Host: 1cust82.tnt75.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1017975836 22357 67.195.57.82 (5 Apr 2002 03:03:56 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 03:03:56 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16162 Quartus II 2.0 Web Edition has a floorplanner, although I think it has more problems than Xilinx's one. (I am not saying Xilinx's floorplanner is perfect. I just think Altera's one has more problems. Refer to the reply I made to Ray Andraka.) To open QII floorplanner, click Processing -> Open Current Assignments Floorplan.
You should open the P&Red current design too (Processing -> Open Last Compilation Floorplan), and drag and drop the currently placed LEs to the new floorplan. That's all you have to do, and things seem to get constrained to certain locations, but I have seen routing fail during Fitting several times, so it is not very reliable. When I say fail, what I mean is Fitter displaying error messages during Fitting. Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) Russell Shaw wrote: > > > > From what i can tell, quartus2 doesn't even have > a way of doing manual routing. ###### Message-ID: <3CAD2F02.5D39F79A@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 126 Date: Fri, 05 Apr 2002 04:58:05 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017982685 68.15.41.165 (Thu, 04 Apr 2002 23:58:05 EST) NNTP-Posting-Date: Thu, 04 Apr 2002 23:58:05 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!out.nntp.be!propagator-SanJose!in.nntp.be!nntp-relay.ihug.net!ihug.co.nz!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16138 Kevin Brace wrote: > Ray, I thought it was, "5% designs from hell." What's a few percent between friends ;-) Xilinx used to call it the 5% designs from hell, but at the current volume that is still a significant number of designs, so I took the liberty of toning it down to reflect the prevailing attitude toward floorplanning. That article underlines the fact that floorplanning has always been the unwanted baby. 
The floorplanner has suffered in every major release since XACT6, and the new one in 4.2 is no exception (try placing an RPM in VirtexE using the floorplanner in 4.2). > > > http://www.eetimes.com/story/OEG20000211S0011 > > > So, I guess my design is no longer a "5% designs from hell," but I have > to admit for a 66MHz PCI, Floorplanning is a must. (I already figured > out how that secret PCILOGIC thing works, and I only now need to know > how Bitgen's /Gclkdel option works to insert more delay on the global > clock buffer . . . ) Fact is, lots of folks are having a hard time making VirtexE and VirtexII designs run at 50 MHz, as evidenced by many posts on this group. A combination of floorplanning and good pipelined synchronous design practices makes designs at 3-4 times that clock rate quite doable in the same parts. Our DSP designs typically are running at between 160 and 180 MHz in VirtexE-6 parts, in fact I have one on the bench right now with a pair of 2000e-6's running at 160 MHz...that one has Pentium heatsinks with fans on the FPGAs. The point is these designs are very doable with floorplanning, will meet timing every time through the tools with margin, and route within an hour or two. Without floorplanning, PAR gives up after about a day and a half, and automatic placement of some of the partitions small enough to fit after automatic PAR are limited to around 100 MHz. Some of the floorplanning gains are a function of the design. Our designs are DSP, so they naturally include data paths with lots of arithmetic. That is precisely what PAR does such a lousy job at. In order to get the best speed out of the carry chains, you need to precede the carry chain with a register located in an adjacent slice and limit fanouts. The PAR seems to go out of its way to avoid putting the driving register next to the carry chain, and it does an exceptionally poor job at lining up data paths.
This is low hanging fruit that gives big gains, which is why we see the gains we do with floorplanning. > > > Fact is, floorplanning seems to be getting larger gains, > > not smaller, with the new devices. I typically see 50-70% > > performance improvement over an automatic placement. > > I'm willing to bet he was running multipass PLACE and route. The multipass placement oftentimes will find a better placement. It is not an incremental improvement, rather it is just doing the placement several times with different cost tables. Usually one or two cost tables will do better than others for a particular design. If you don't get a good layout in the first 5 or 6 tables, you are not likely to get one at all. > The idea of routing the design multiple times came from an > article I read in Xcell magazine a while ago where a guy from Lucent > said he sometimes routes the design over the weekend multiple times, and > picks the one that's useful. Yada yada yada... No conspiracy that I can see. Believe it or not, it is in their best interest to get you in a smaller device too...why do you think the marketing gate numbers are the way they are (BTW, the equivalent gate count that comes out of the tools is typically a little more than double the marketing gate count in our designs, which is an indication of the density we are getting by doing floorplanning). There is not much worse than fixing someone else's design that isn't meeting timing. The worse the design, the more defensive the designer/manager responsible for it. I can't imagine a company purposely making this happen to create or protect work. Yikes. > > My conspiracy theory of Xilinx attitude will be that the more > waste there is in a design because manual floorplanning wasn't done, the > users will have to use larger and faster speed grades which costs more, > and that will certainly make Xilinx richer.
and later: > Another conspiracy theory I came up with will be that more the users are discouraged from doing manual floorplanning, more likely they will rely on Xilinx for engineering services when something goes wrong (i.e., The design doesn't meet timings or doesn't fit inside a specific chip.), but I guess that creates more opportunity (?) for consultants to make some bucks, too. > > > Unfortunately, the mainstream doesn't use it because a) they > > are told they don't need it*, that it is only > > there for the FAEs to get you out of trouble in special cases, The *, for which I forgot to add the footnote, was supposed to be a comment that the Xilinx design courses, specifically the one by Avnet, come right out and say (paraphrased) don't floorplan the design. It is beyond the scope of the class, and is not necessary except in extenuating circumstances in which case your FAE should be involved. --A travesty if you ask me. > I think Xilinx has to release a tutorial on how to use a > floorplanner, and I will also like to see more information on what the > routing is like inside a Virtex architecture chip (Altera documentation > is slightly better since it has more figures about the routing structure > in their datasheets.) > Although, I suppose if I had access to FPGA Editor, I won't complain > about it . . . Problem is, I don't think there is anybody there at Xilinx that has done much floorplanning in real designs. As I alluded to earlier, it really is more an art than a science at this point. In my experience, some people are naturals at it, some just can't seem to do it no matter how much coaching they get. It is one thing to show you how to work the tool, quite another to explain how to come up with a good floorplan. For those who get the knack, it is pretty much sufficient to just say "keep the routes as short and as rectangular as possible. Place the logic to meet that goal".
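One rough way to put a number on "short and rectangular" is the half-perimeter of each net's bounding box, a common placement estimate. The sketch below is only an illustration: the grid coordinates, the cell names, and the `hpwl` helper are assumptions made up for the example, not anything taken from the Xilinx tools or from the designs discussed in this thread.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, Tuple[int, int]]  # cell name -> (row, col) on a hypothetical grid
Netlist = Dict[str, List[str]]          # net name  -> cells that the net connects


def hpwl(cells: List[str], placement: Placement) -> int:
    """Half-perimeter of the bounding box around the cells on one net."""
    rows = [placement[c][0] for c in cells]
    cols = [placement[c][1] for c in cells]
    return (max(rows) - min(rows)) + (max(cols) - min(cols))


def total_hpwl(nets: Netlist, placement: Placement) -> int:
    """Sum of the per-net estimates; smaller means shorter expected routes."""
    return sum(hpwl(cells, placement) for cells in nets.values())


# Hypothetical 4-bit register feeding a 4-bit adder, one net per bit.
nets: Netlist = {f"d{i}": [f"reg{i}", f"add{i}"] for i in range(4)}

# Hand placement: bit i of the register sits right next to bit i of the adder.
aligned: Placement = {}
for i in range(4):
    aligned[f"reg{i}"] = (i, 0)
    aligned[f"add{i}"] = (i, 1)

# A made-up scattered placement of the same cells, for comparison.
scattered: Placement = {
    "reg0": (0, 0), "add0": (3, 5),
    "reg1": (7, 2), "add1": (1, 1),
    "reg2": (4, 6), "add2": (2, 0),
    "reg3": (5, 3), "add3": (6, 7),
}

print("aligned  :", total_hpwl(nets, aligned))    # 4
print("scattered:", total_hpwl(nets, scattered))  # 28
```

The estimate says nothing about real routing delays in any particular device; it only shows why lining up the bits of a datapath keeps every connection to a single short hop.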
You can pretty much tell who will get it and who won't in the first 5 minutes of showing them the tool. Xilinx has fairly complete information not only in the tools but in the data books, enough to know what routing resources you have and the connection pattern. Altera, at least last time I checked, leaves you in the dark as to the routing connectivity. With the 10K, doing hand placement could make your design worse because there is *no* information available as to the connection pattern of the LABs on the row (only a limited set of row routes goes to any particular LAB. If you put logic in two LABs in one row, without that information, your route may have to go through a 3rd LAB to make the connection). Fortunately for Altera, the performance is not as sensitive to layout because the delay is the same to any connected LAB in the row, so floorplanning there is not as important. I think for the newer devices, starting with the 20K that have direct connects to adjacent LABs, however, floorplanning becomes a lot more important for a data-flow design. In those cases, it would be nice to have the detail on the row route connection matrix. > > > > Part of the problem is that floorplanning is sort of like > > putting together a puzzle with many acceptable > > solutions. Some people have the knack for it, some don't and if you don't you will probably not inherit it ever. > > > > > > -- > -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759 ###### Message-ID: <3CAD3083.D2EA0B0C@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 60 Date: Fri, 05 Apr 2002 05:04:30 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017983070 68.15.41.165 (Fri, 05 Apr 2002 00:04:30 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 00:04:30 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!feed.newsfeeds.com!news-out.visi.com!hermes.visi.com!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16137 Nicholas Weaver wrote: > In article <3CACEF37.DCA68151@andraka.com>, > Ray Andraka wrote: > >We've heard the "place and route is good enough you don't need to do > >floorplanning unless you are doing the 1% designs from hell" line for > >as far back as I can remember from Xilinx. Fact is, floorplanning > >seems to be getting larger gains, not smaller, with the new devices. > >I typically see 50-70% performance improvement over a automatic > >placement. > > I actually assert that a lot of the problem is right in the synthesis, > not in the P&R: There has been a lot of work which has been done on > work which, given a datapath, constructs an orderly layout. Yet none > of this seems to have found its way into the commercial synthesis > toolflows. So much informtion is thrown away. I don't buy this. The information is there in the form of the netlist. The fact that we get the gains we do out of floorplanning indicates that the synthesis is doing fine. Perhaps you mean the synthesis needs to infer placement and add the placement info to the primitives? 
> > > >Routing multiple times without running placement is not going get > >much in the way of performance gains. The router does a pretty > >decent job if the placement is good, and can't do much to salvage a > >poor placement. > > This is especially true when you consider just how rich the modern > FPGA interconnects are: Compare the number of wires/LUT in an XC4000 > with a 4000XL with a Virtex. I think more the inverse. The new parts have enough routing resources that even a poor placement will route. The older 4K devices would route well, and with consistent results if the placement was good. It was there I first learned the importance of floorplanning. The reason the floorplanning seems to be getting bigger gains in the new devices is mostly due to the size of the device giving the automatic placement more placement options with which to hang itself. > > > -- > Nicholas C. Weaver nweaver@cs.berkeley.edu -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." 
-Benjamin Franklin, 1759 ###### Message-ID: <3CAD31A9.F50F3653@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CAD173C.71FB0077@attbi.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 67 Date: Fri, 05 Apr 2002 05:09:24 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1017983364 68.15.41.165 (Fri, 05 Apr 2002 00:09:24 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 00:09:24 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!out.nntp.be!propagator-SanJose!in.nntp.be!nntp-relay.ihug.net!ihug.co.nz!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16135 Frankly, I don't see the cost justified by the marginal added value with amplify. You can do almost* everything it does with the area constraints in the floorplanner, plus with the floorplanner you can lock down some and area constrain other logic. *The one thing it does buy you is to take into consideration the layout while doing the synthesis. Like you mention, there was a time when we did floorplanning with graph paper. The GUI makes it easier, but I still use the graph paper method for doing placement in the source. Phil Hays wrote: > Jimmy Zhang wrote: > > > Just keep hearing about this hand placement thing, don't know how it > > is done in reality. Does someone actually use their hands to do the > > placement as opposed to CAD based P&R. Any hints? > > The first way I learned to do this was with a paper diagram of the > target chip, writing the constraints with a text editor, and coloring on > the paper to indicate what had been put where. 
I didn't do the best of > jobs (had an register reversed, with the msb where the lsb should be), > but it was still ~30% faster resulting clock speed than the automatic > placement. Made place and route times drop nicely as well. It was even > better than that once I got the twist removed. But this is as close to > "by hand" as I can picture. > > The floorplanner that Xilinx provides is just a nicely automated way of > doing the same sort of puzzle. Do the data path(s) first, fit things > together in a "logical" fashion, and for the first one floor plan at > least plan on spending some time fiddling. Some people seem to get this > skill right away, and some take longer. > > A slightly "higher level" way of gaining much of the benefit from > floorplanning with potentially rather less effort is to use a "physical > design" tool. Synplicity had the first ("Amplify") aimed at FPGA design > (and I'm not sure if Mentor, Synopsys or anyone else have anything in > this space yet), however there were physical design tools for ASIC > design long before Amplify. These work by putting large chunks of the > design into subsections of the target chip. > > Synopsys's ASIC physical design tool set: > > http://www.synopsys.com/products/phy_syn/phy_syn.html > > Amplify is at: > > http://www.synplicity.com/products/amplify.html > > -- > Phil Hays -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." 
-Benjamin Franklin, 1759 ###### From: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 5 Apr 2002 05:31:31 +0000 (UTC) Organization: Unknown Lines: 34 Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CAD3083.D2EA0B0C@andraka.com> NNTP-Posting-Host: soda.csua.berkeley.edu X-Trace: agate.berkeley.edu 1017984691 47659 128.32.247.226 (5 Apr 2002 05:31:31 GMT) X-Complaints-To: usenet@agate.berkeley.edu NNTP-Posting-Date: Fri, 5 Apr 2002 05:31:31 +0000 (UTC) Originator: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!news.maxwell.syr.edu!news-hog.berkeley.edu!ucberkeley!agate.berkeley.edu!agate!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16159 In article <3CAD3083.D2EA0B0C@andraka.com>, Ray Andraka wrote: >> I actually assert that a lot of the problem is right in the synthesis, >> not in the P&R: There has been a lot of work which has been done on >> work which, given a datapath, constructs an orderly layout. Yet none >> of this seems to have found its way into the commercial synthesis >> toolflows. So much information is thrown away. > >I don't buy this. The information is there in the form of the netlist. The >fact that we get the gains we do out of floorplanning indicates that the >synthesis is doing fine. Perhaps you mean the synthesis needs to infer >placement and add the placement info to the primitives? I mean that there is convenient structure which the synthesis tool can easily exploit, as the high level information is there, while the P&R tool would have to infer and recover it. It is much the same way where, yes, you can do all the compiler optimizations at the assembly level, but it is MUCH more straightforward to do a lot of it earlier in the process, in the intermediate form.
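As a toy illustration of the point about high-level structure: if the tool still knows that a signal is bit i of a bus in a given pipeline stage, an orderly layout falls out of a simple loop. The sketch below assumes a hypothetical grid where each bit gets its own row and each stage its own column; the stage names and the `align_datapath` helper are invented for the example and do not correspond to any real synthesis or P&R tool.

```python
from typing import Dict, List, Tuple


def align_datapath(stages: List[str], width: int) -> Dict[str, Tuple[int, int]]:
    """Place bit i of every stage on row i, and stage k in column k.

    This only works because the bus name and bit index are still known.
    In a flattened netlist that structure would have to be recovered first.
    """
    placement: Dict[str, Tuple[int, int]] = {}
    for col, stage in enumerate(stages):
        for bit in range(width):
            placement[f"{stage}[{bit}]"] = (bit, col)
    return placement


# Hypothetical three-stage, 4-bit datapath.
placement = align_datapath(["in_reg", "adder", "out_reg"], width=4)

# Print column by column so the aligned rows are easy to see.
for cell, (row, col) in sorted(placement.items(), key=lambda kv: (kv[1][1], kv[1][0])):
    print(f"{cell:12s} row {row}  col {col}")
```

A real tool would also have to respect slice capacity, carry-chain direction and so on; the only point here is that the row assignment comes for free when the bus structure has not been thrown away.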
>> This is especially true when you consider just how rich the modern >> FPGA interconnects are: Compare the number of wires/LUT in an XC4000 >> with a 4000XL with a Virtex. > >I think more the inverse. The new parts have enough routing resources that >even a poor placement will route. I think we are trying to say the same thing: Interconnects are much richer than they used to be, so the router has more freedom and an easier time of things. -- Nicholas C. Weaver nweaver@cs.berkeley.edu ###### From: "Jimmy Zhang" Newsgroups: comp.arch.fpga References: <3cacff43@news.starhub.net.sg> Subject: Re: hand placement Lines: 41 X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Message-ID: NNTP-Posting-Host: 12.234.57.72 X-Complaints-To: abuse@attbi.com X-Trace: sccrnsc02 1017985333 12.234.57.72 (Fri, 05 Apr 2002 05:42:13 GMT) NNTP-Posting-Date: Fri, 05 Apr 2002 05:42:13 GMT Organization: AT&T Broadband Date: Fri, 05 Apr 2002 05:42:17 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!wn14eed!wn1feed!worldnet.att.net!204.127.198.204!attbi_feed4!attbi_feed3!attbi.com!sccrnsc02.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16144 Well, since I know Nick personally, I can say for sure that Nick doesn't have to worry about jobs simply because he is on the other side of this whole job thing. -- ----------------------------------------------------- Click here for Free Video!! http://www.gohip.com/freevideo/ "Kelvin Xu Qijun" wrote in message news:3cacff43@news.starhub.net.sg... > Nicholas, pls stop laugh since they have made it possible that you stay on > your job for a while... 
> I feel the laziness of CAD companies has saved many jobs in IC design > area...:-) > > > > > Email: qijun@okigrp.com.sg > "Nicholas Weaver" wrote in message > news:a8i8mh$n57$1@agate.berkeley.edu... > > In article , > > Kevin Brace wrote: > > > I have seen one Xilinx employee in this newsgroup saying that > > >automatic P&R is getting better, so low level tools like floorplanner or > > >FPGA Editor is getting less important. > > > > Everytime Xilinx/Altera/Tool people say this, I have to laugh. > > > > There is so much low hanging fruit in datapath recognition, which the > > tools fail MISERABLY to recognise. A simple first order pass, align > > up the datapath, can be such a win. > > -- > > Nicholas C. Weaver nweaver@cs.berkeley.edu > > ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Thu, 04 Apr 2002 23:59:12 -0600 Organization: None Lines: 54 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CAD2F02.5D39F79A@andraka.com> NNTP-Posting-Host: 1cust237.tnt89.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1017985784 24636 67.195.71.237 (5 Apr 2002 05:49:44 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 05:49:44 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16164 Ray Andraka wrote: > > > I'm willing to bet he was running multipass PLACE and route. > The multipass placement often times will find a better placement. > It is not an incremental improvement, rather it is just doing the placement > several times with different cost tables. Usually one or two cost tables will > do better than others > for a particular design. 
If you don't get a good layout in the first 5 or 6 tables, you are not likely to get one at all. > Yes, in that article, I think Lucent used multi-pass P&R. (It is coming back to me.) > > The *, for which I forgot to add the footnote, was supposed to be a comment that the Xilinx design courses, specifically the one by Avnet, come right out and > say (paraphrased) don't floorplan the design. It is beyond the scope of the class, and is not necessary except in extenuating circumstances in which > case your FAE should be involved. --A travesty if you ask me. > Sounds to me like Avnet is recommending that ordinary users not use the Floorplanner so that it gets more consulting work. > > Problem is, I don't think there is anybody there at Xilinx that has done > much floorplanning in real designs. Didn't Xilinx have to resort to Floorplanner/FPGA Editor to get Virtex/Spartan-II-6 to meet 66MHz PCI's Tsu < 3ns? In practice, their design should have a little more time than 3ns because of a global clock buffer delay (I am not 100% sure if it is safe to assume global clock buffer delay is going to be there.) which is about 1.5ns (in Spartan-II XC2S150-6). I personally doubt from my own experience that they can handle unregistered signal paths (a signal coming in from a pin, going through several levels of 4-input LUT, and eventually reaching IOB FFs.) in only 4.5ns, and they do admit that the user has to add additional delay on the global clock buffer in Bitgen. It is called a /Gclkdel option, but it is a guarded secret, and I have no idea how it works. Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)
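The 4.5ns figure in the post above is simply the 3ns Tsu requirement plus the roughly 1.5ns global clock buffer delay. A minimal sketch of that arithmetic follows. It is not taken from any Xilinx tool: the function names and the 5.2ns example path delay are hypothetical, and only the 3ns and 1.5ns figures come from the post.

```python
# Rough input-setup budget for the 66 MHz PCI case discussed above.
# The 3 ns and 1.5 ns figures come from the post; the function names and
# the 5.2 ns example path delay are hypothetical, for illustration only.


def input_path_budget_ns(tsu_ns, clock_delay_ns):
    """Time the data has to travel from the pin to the IOB FF input.

    The capturing clock edge reaches the FF clock_delay_ns after it
    reaches the clock pin, so the data path gains that much time.
    """
    return tsu_ns + clock_delay_ns


def extra_clock_delay_needed_ns(path_delay_ns, budget_ns):
    """Additional clock delay needed to cover a setup shortfall (0 if none)."""
    return max(0.0, path_delay_ns - budget_ns)


budget = input_path_budget_ns(tsu_ns=3.0, clock_delay_ns=1.5)
print(f"budget: {budget:.1f} ns")

# Hypothetical unregistered path: pin -> several LUT levels -> IOB FF.
shortfall = extra_clock_delay_needed_ns(path_delay_ns=5.2, budget_ns=budget)
print(f"extra clock delay needed: {shortfall:.1f} ns")
```

This only looks at the setup side. It ignores the flip-flop's own setup time, and any delay added to the clock would be expected to work against hold time at the pin and against clock-to-out, so it is a way to see the size of the gap rather than a recipe for closing it.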
###### From: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 5 Apr 2002 08:24:24 +0000 (UTC) Organization: Unknown Lines: 11 Message-ID: References: <3cacff43@news.starhub.net.sg> NNTP-Posting-Host: soda.csua.berkeley.edu X-Trace: agate.berkeley.edu 1017995064 52333 128.32.247.226 (5 Apr 2002 08:24:24 GMT) X-Complaints-To: usenet@agate.berkeley.edu NNTP-Posting-Date: Fri, 5 Apr 2002 08:24:24 +0000 (UTC) Originator: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!tethys.csu.net!news-hog.berkeley.edu!ucberkeley!agate.berkeley.edu!agate!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16157 In article <3cacff43@news.starhub.net.sg>, Kelvin Xu Qijun wrote: >Nicholas, pls stop laugh since they have made it possible that you stay on >your job for a while... >I feel the laziness of CAD companies has saved many jobs in IC design >area...:-) Hehehehhe. I'm an academic, so, well, I find it entertaining, not a guarantee of employment. -- Nicholas C. 
Weaver nweaver@cs.berkeley.edu ###### From: Philip Freidin Newsgroups: comp.arch.fpga Subject: Re: hand placement Organization: Fliptronics Reply-To: philip@fliptronics.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> X-Newsreader: Forte Agent 1.9/32.560 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 43 NNTP-Posting-Host: 216.103.85.188 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr13.news.prodigy.com 1018005320 ST000 216.103.85.188 (Fri, 05 Apr 2002 06:15:20 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 06:15:20 EST X-UserInfo1: [[PGG]WD\BR[C_TX@BCD^VX@WB]^PCPDLXUNNH\KMAVNDQUBLNTC@AWZWDXZXQ[K\FFSKCVM@F_N_DOBWVWG__LG@VVOIPLIGX\\BU_B@\P\PFX\B[APHTWAHDCKJF^NHD[YJAZMCY_CWG[SX\Y]^KC\HSZRWSWKGAY_PC[BQ[BXAS\F\\@DMTLFZFUE@\VL Date: Fri, 05 Apr 2002 11:15:20 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.imp.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!cyclone2.usenetserver.com!usenetserver.com!newscon06.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr13.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16129 On Thu, 04 Apr 2002 20:15:11 -0600, Kevin Brace wrote: >(I already figured >out how that secret PCILOGIC thing works, and I only now need to know >how Bitgen's /Gclkdel option works to insert more delay on the global >clock buffer . . . ) > > .... > >However, this "It's an FAE's tool" mentality seems to create a myth that >only Xilinx or really experienced users can use it, and that myth seems >to be similar to a PCILOGIC myth people talk about when a question comes >up about "The special IRDY and TRDY pin for PCI in Virtex." >People always say only Xilinx knows how PCILOGIC works, but should I >shatter the PCILOGIC myth for once and for all by posting a sample code, >and some explanation on how to instantiate it and actually use it? 
> Before you hurt your self with too much patting yourself on the back, you might want to look at this, from a year ago: http://www.fpga-faq.com/archives/30000.html#30017 and http://www.fpga-faq.com/archives/30000.html#30018 Not much of a secret really. :-) :-) :-) But /Gclkdel is a different story. Maybe you should write an explanation for inclusion in the FAQ?? =================== Philip Freidin philip@fliptronics.com Host for WWW.FPGA-FAQ.COM ###### From: Philip Freidin Newsgroups: comp.arch.fpga Subject: Re: hand placement Organization: Fliptronics Reply-To: philip@fliptronics.com Message-ID: <5s1rauk5eln8if18gq9iattmp0f7ik6r0r@4ax.com> References: <3CACEF37.DCA68151@andraka.com> <3CACFD0D.1B5EB315@iprimus.com.au> <3CAD0269.C22D5B98@andraka.com> X-Newsreader: Forte Agent 1.9/32.560 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 19 NNTP-Posting-Host: 216.103.85.188 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr13.news.prodigy.com 1018005593 ST000 216.103.85.188 (Fri, 05 Apr 2002 06:19:53 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 06:19:53 EST X-UserInfo1: [[PGG]WD\BR[C_TX@BCD^VX@WB]^PCPDLXUNNH\KMAVNDQUBLNTC@AWZWDXZXQ[K\FFSKCVM@F_N_DOBWVWG__LG@VVOIPLIGX\\BU_B@\P\PFX\B[APHTWAHDCKJF^NHD[YJAZMCY_CWG[SX\Y]^KC\HSZRWSWKGAY_PC[BQ[BXAS\F\\@DMTLFZFUE@\VL Date: Fri, 05 Apr 2002 11:19:53 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!fr.usenet-edu.net!usenet-edu.net!news.tele.dk!small.news.tele.dk!207.115.63.138!newscon04.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr13.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16173 For a tutorial (somewhat dated, but a reasonable intro) you might want to look at: http://www.fliptronics.com/floorplanning1.html and to see some really beautiful examples of floorplanning you can see them at http://www.fliptronics.com/gallery.html In particular, the designs MFILT, DVB_DEMOD, 
LINEAR, and FORMATTER show what can be achieved. Philip Freidin Fliptronics ###### Message-ID: <3CADD533.5050706@synplicity.com> From: Ken McElvain User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1 X-Accept-Language: en-us MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CAD173C.71FB0077@attbi.com> <3CAD31A9.F50F3653@andraka.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Lines: 113 Date: Fri, 05 Apr 2002 16:47:35 GMT NNTP-Posting-Host: 209.157.48.1 X-Complaints-To: abuse@verio.net X-Trace: sea-read.news.verio.net 1018025255 209.157.48.1 (Fri, 05 Apr 2002 16:47:35 GMT) NNTP-Posting-Date: Fri, 05 Apr 2002 16:47:35 GMT Organization: Verio Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!sjc-peer.news.verio.net!news.verio.net!sea-read.news.verio.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16146 I think a more detailed description of what Amplify does is in order. The logical hierarchy of a design can be easily reorganized into a good physical hierarchy without touching the source code. This may not matter to some people who thought physically from the beginning of their design and have the experience to know up front what the organization should be. Part of this reorganization capability includes the ability to replicate chunks of RTL objects. You may find that replicating an FSM or counter into different parts of the chip yields large improvements. Because the floorplan definition in Amplify is at an RTL level and we worked very hard at making generated RTL object names repeatable even with design changes, the RTL floorplan can survive design changes. Gate level floorplans that don't stick to module boundaries often have to be redone for even trivial design changes. 
(If you are doing structural netlisting in your HDL and generating placements, then that would be an exception). Amplify will also perform boundary optimization on your floorplan, which involves timing optimizations mixed with placing results back into the regions. The current version of Amplify (3.0) includes a full detail placer for regions that works in cooperation with timing optimization of the logic. This feature is currently only available for Virtex/VirtexE. Detail placement obviously gives Amplify much more accurate delay information. An experienced Amplify user can get substantial performance improvements in 3-5 iterations through P&R. The first iteration gets you calibrated and the following iterations are like a game of whack-a-mole where the mole doesn't come back up again. Amplify obviously works best when the critical paths in your design are through chunks of RTL. If you are willing to do a structural design and hand place everything, then it won't be useful to you. Most designs aren't done that way. Ken McElvain CTO Synplicity, Inc. Ray Andraka wrote: > Frankly, I don't see the cost justified by the marginal added value with > amplify. You can do almost* everything it does with the area constraints in > the floorplanner, plus with the floorplanner you can lock down some and area > constrain other logic. > > *The one thing it does buy you is to take into consideration the layout > while doing the synthesis. > > Like you mention, there was a time when we did floorplanning with graph > paper. The GUI makes it easier, but I still use the graph paper method for > doing placement in the source. > > > Phil Hays wrote: > > >>Jimmy Zhang wrote: >> >> >>>Just keep hearing about this hand placement thing, don't know how it >>>is done in reality. Does someone actually use their hands to do the >>>placement as opposed to CAD based P&R. Any hints? 
>>> >>The first way I learned to do this was with a paper diagram of the >>target chip, writing the constraints with a text editor, and coloring on >>the paper to indicate what had been put where. I didn't do the best of >>jobs (had a register reversed, with the msb where the lsb should be), >>but it was still ~30% faster resulting clock speed than the automatic >>placement. Made place and route times drop nicely as well. It was even >>better than that once I got the twist removed. But this is as close to >>"by hand" as I can picture. >> >>The floorplanner that Xilinx provides is just a nicely automated way of >>doing the same sort of puzzle. Do the data path(s) first, fit things >>together in a "logical" fashion, and for the first floorplan at >>least, plan on spending some time fiddling. Some people seem to get this >>skill right away, and some take longer. >> >>A slightly "higher level" way of gaining much of the benefit from >>floorplanning with potentially rather less effort is to use a "physical >>design" tool. Synplicity had the first ("Amplify") aimed at FPGA design >>(and I'm not sure if Mentor, Synopsys or anyone else have anything in >>this space yet), however there were physical design tools for ASIC >>design long before Amplify. These work by putting large chunks of the >>design into subsections of the target chip. >> >>Synopsys's ASIC physical design tool set: >> >>http://www.synopsys.com/products/phy_syn/phy_syn.html >> >>Amplify is at: >> >>http://www.synplicity.com/products/amplify.html >> >>-- >>Phil Hays >> > > -- > --Ray Andraka, P.E. > President, the Andraka Consulting Group, Inc. > 401/884-7930 Fax 401/884-7950 > email ray@andraka.com > http://www.andraka.com > > "They that give up essential liberty to obtain a little > temporary safety deserve neither liberty nor safety." 
> -Benjamin Franklin, 1759 > > > ###### Message-ID: <3CADD560.7013D4FC@mail.com> From: John_H X-Mailer: Mozilla 4.75 [en]C-CCK-MCD (Win95; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 16 Date: Fri, 05 Apr 2002 16:48:34 GMT NNTP-Posting-Host: 192.65.17.17 X-Complaints-To: postmaster@opbu.xerox.com X-Trace: news-west.eli.net 1018025314 192.65.17.17 (Fri, 05 Apr 2002 09:48:34 MST) NNTP-Posting-Date: Fri, 05 Apr 2002 09:48:34 MST Organization: Xerox Officeprinting NewsReader Service Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed.earthlink.net!sjc1.nntp.concentric.net!newsfeed.concentric.net!news-west.eli.net!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16177 Please, please... Tell me what constructs I can manually edit into my edif file to produce orderly layout! If the synthesis tool isn't giving the P&R the info it needs to get the job done, what can I possibly do to have the P&R do better than the fumbling I get now? Nicholas Weaver wrote: > I actually assert that a lot of the problem is right in the synthesis, > not in the P&R: There has been a lot of work which has been done on > work which, given a datapath, constructs an orderly layout. Yet none > of this seems to have found its way into the commercial synthesis > toolflows. So much information is thrown away. 
###### Message-ID: <3CADE6DE.4ECAAF5A@mail.com> From: John_H X-Mailer: Mozilla 4.75 [en]C-CCK-MCD (Win95; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CAD173C.71FB0077@attbi.com> <3CAD31A9.F50F3653@andraka.com> <3CADD533.5050706@synplicity.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 34 Date: Fri, 05 Apr 2002 18:03:11 GMT NNTP-Posting-Host: 192.65.17.17 X-Complaints-To: postmaster@opbu.xerox.com X-Trace: news-west.eli.net 1018029791 192.65.17.17 (Fri, 05 Apr 2002 11:03:11 MST) NNTP-Posting-Date: Fri, 05 Apr 2002 11:03:11 MST Organization: Xerox Officeprinting NewsReader Service Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!newsfeed.cwix.com!sjc-peer.news.verio.net!news.verio.net!news.sanjose1.Level3.net!Level3.net!news-west.eli.net!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16178 Let me start by saying thanks for providing the Synplify tool to the market. Compile times are great and the results are reasonable though I still have my troubles with every design. I need to hand-tweak much less of the code than with other tools I've used in the past. Ken, you have my respect. The replication of chunks of RTL is potentially helpful: it save the manual replication I regularly do by hand in my code for a good hour or two savings in a large project. Thank you for working "very hard at making generated RTL object names repeatable even with design changes" since this is the biggest open sore of the tool that I actually own. I just wish it could benefit me. (more below...) Ken McElvain wrote: > An experienced Amplify user can get substantial performance improvements > in 3-5 iterations through P&R. The first iteration gets you calibrated > and the following iterations are like a game of whack a mole where the > mole doesn't come back up again. 
I've suggested in the past that my 8 year old nephew could get just about as good results using Amplify as an engineer with intimate knowledge of the design. I leave it to you to decide if this is praise for the ease of use of the product or a criticism of the tool to take that "extra step" and decide on the area optimizations itself. I realize the Amplify tool has a market because of the inadequacies of the Xilinx P&R tools but it also has a place because Synplify falls short of allowing us to do the light work ourselves. ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 05 Apr 2002 13:57:12 -0600 Organization: None Lines: 107 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: st-66-99-47-20.wheaton.lib.il.us Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1018037065 24514 66.99.47.20 (5 Apr 2002 20:04:25 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 20:04:25 +0000 (UTC) X-Mailer: Mozilla 4.75 [en] (Win95; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16180 Philip Freidin wrote: > > On Thu, 04 Apr 2002 20:15:11 -0600, Kevin Brace > wrote: > > > Before you hurt your self with too much patting yourself on the > back, you might want to look at this, from a year ago: > Not a very nice comment. > http://www.fpga-faq.com/archives/30000.html#30017 > > and > > http://www.fpga-faq.com/archives/30000.html#30018 > When I figured it out how to instantiate it from XST, I used the information from the following website. http://www.opencores.org/forums/pci/2001/09/00003 The problem was, the guy who did the analysis didn't post the code he wrote, and it wasn't in Verilog. (I am a VHDL hater.) 
Until I thought of cracking the PCILOGIC secret, I never used a blackbox in my code (Now I use it. It is a very useful feature.), so I had to learn that, and not having an initiator/target PCI IP core (It was a target only one at that time.) to play around with at the time also caused some problems when I experimented with it. > Not much of a secret really. :-) :-) :-) > Before reading that opencores.org conversation, I had already done a Google Groups search of this newsgroup, but I didn't find any useful information about PCILOGIC at that time. Only after I figured out how the PCILOGIC works did I notice that the equation inside was indeed discussed at this newsgroup. I wasn't visiting this newsgroup at the time it was discussed either. I find it surprising that Eric Crabill of Xilinx, who supposedly works with LogiCORE PCI at Xilinx, is publicly admitting the equations inside. Isn't that supposed to be a trade secret of Xilinx? PCILOGIC may not be a huge secret to you, but a question about it seems to come up once every two months or so, and people who answer it keep saying that it is a secret feature only Xilinx knows. Yes, if you ask a Xilinx employee about it, they won't tell you at all, saying that it is an undisclosed feature. So, I will still call it a secret feature, but I already got it. After I figured out how it works, it turns out there is really no big deal other than that unregistered long signal paths (unregistered IRDY# and TRDY#) leading to the CE (Clock Enable) input of IOB output FFs will not cause timing problems during P&R. So, there is really no big deal, and I would like to shatter that myth that it is some kind of a magic box. > But /Gclkdel is a different story. > All the publicly available Xilinx documents say, "Don't use it unless you are instructed by Xilinx." Trying to keep something secret, of course, makes me more interested in knowing about it. 
Supposedly, the customers of their 66MHz PCI IP core know how to use it according to LogiCORE PCI implementation document, because they are required to use it in order to meet Tsu < 3ns. One concern I have with a /Gclkdel option is that it is added after a static timing analysis is done, so the static timing analyzer's numbers won't reflect the /Gclkdel option. > Maybe you should write an explanation for inclusion in the FAQ?? > > =================== > Philip Freidin > philip@fliptronics.com > Host for WWW.FPGA-FAQ.COM Yes, if it is possible, I will rather have a sample code, a constraint file, and the analysis of how it works included in the FPGA FAQ than simply being posted to this newsgroup because that way, people can refer to the FAQ rather than having to search this newsgroup. Where should I send all the information? Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 05 Apr 2002 14:21:50 -0600 Organization: None Lines: 47 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CACFD0D.1B5EB315@iprimus.com.au> <3CAD0269.C22D5B98@andraka.com> <5s1rauk5eln8if18gq9iattmp0f7ik6r0r@4ax.com> NNTP-Posting-Host: st-66-99-47-20.wheaton.lib.il.us Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1018038543 25152 66.99.47.20 (5 Apr 2002 20:29:03 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Fri, 5 Apr 2002 20:29:03 +0000 (UTC) X-Mailer: Mozilla 4.75 [en] (Win95; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16181 Thanks, Philip for the information. Why not update the information to cover Virtex architecture FPGAs? 
I believe I already know the basics of floorplanning in Virtex architecture FPGAs because I had to floorplan my design to meet Tsu < 7ns requirement of 33MHz PCI a few months ago. Although I no longer have to do so now because of design improvements I made. Several times in the past, I have "tried" to get my PCI IP core to meet Tsu < 3ns in Spartan-II-6, and, of course, I had to manually floorplan very timing critical unregistered control signal paths. (i.e., unregistered DEVSEL#, TRDY#, and STOP# to FRAME# and IRDY IOB FFs.) I came within about 15% of meeting Tsu < 3ns after tying two global clock buffers in series to artificially generate more setup time. (I consider this an alternative to Bitgen's /Gclkdel option, but I am not sure how reliable it is.) Likely, Virtex-E-7 might make it if everything works out, but I haven't tried that yet. However, the problem I am having here is not with the Xilinx Floorplanner, but with Quartus II 2.0 Web Edition's floorplanner, and do you know any online resources about it? I guess Altera users don't use floorplanner that much like some Xilinx users do, but for things like PCI, I think it is very important. Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) Philip Freidin wrote: > > For a tutorial (somewhat dated, but a reasonable intro) you > might want to look at: > > http://www.fliptronics.com/floorplanning1.html > > and to see some really beautiful examples of floorplanning > you can see them at > > http://www.fliptronics.com/gallery.html > > In particular, the designs MFILT, DVB_DEMOD, LINEAR, and > FORMATTER show what can be achieved. 
> > Philip Freidin > Fliptronics ###### From: Philip Freidin Newsgroups: comp.arch.fpga Subject: Re: hand placement Organization: Fliptronics Reply-To: philip@fliptronics.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> X-Newsreader: Forte Agent 1.9/32.560 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 118 NNTP-Posting-Host: 216.103.85.188 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr21.news.prodigy.com 1018039108 ST000 216.103.85.188 (Fri, 05 Apr 2002 15:38:28 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 15:38:28 EST X-UserInfo1: T[O_BXCDEBSWSQPXNCOF_W\@PJ_^PBQLGPQRZQMIWIWTEPIB_NVUAH_[BL[\IRKIANGGJBFNJF_DOLSCENSY^U@FRFUEXR@KFXYDBPWBCDQJA@X_DCBHXR[C@\EOKCJLED_SZ@RMWYXYWE_P@\\GOIW^@SYFFSWHFIXMADO@^[ADPRPETLBJ]RDGENSKQQZN Date: Fri, 05 Apr 2002 20:38:28 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!fr.usenet-edu.net!usenet-edu.net!news.tele.dk!small.news.tele.dk!207.115.63.138!newscon04.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr21.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16182 On Fri, 05 Apr 2002 13:57:12 -0600, Kevin Brace wrote: >Philip Freidin wrote: >> >> On Thu, 04 Apr 2002 20:15:11 -0600, Kevin Brace >> wrote: >> >> Before you hurt your self with too much patting yourself on the >> back, you might want to look at this, from a year ago: > > Not a very nice comment. Oh, come on! I know that other have written personal attacks that were uncalled for, but this hardly counts. How much more gentle could I have been? The info is available, it's in the archive, anyone can see it. >> Not much of a secret really. :-) :-) :-) > > Before reading that opencore.org conversation, I have already >done a Google Groups search of this newsgroup, but I didn't find any >useful information about PCILOGIC at that time. 
>Only after I figured out how the PCILOGIC works, I noticed that the >equation inside was indeed discussed at this newsgroup. >I didn't used to visit this newsgroup at the time it was discussed >either. Well, that doesn't make it a secret. Just an undocumented feature of the chip. Others have reverse engineered its functionality and published the info in this news group. >I find it surprising that Eric Crabill of Xilinx who supposedly works >with LogiCORE PCI at Xilinx publically admitting the equations inside. Typical Xilinx person being helpful. >Isn't that supposed to be a trade secret of Xilinx? Who supposes? Not much of a trade secret if it has been published. Since Xilinx tools even output a simulation model for it for post P&R simulation, there is no way it could be considered a secret. Just poorly (i.e. not at all) documented. > PCILOGIC may not be a huge secret to you, but a question about >it seems to come up once every two months or so, and people who answer >it keeps saying that it is a secret feature only Xilinx knows. Who said that ??? Show them to me !!! I'll give them the URL too. >Yes, if you ask a Xilinx employee about it, they won't tell you at all >saying that it is an undisclosed feature. More likely, they just don't know. Xilinx probably has 2000 people, and of them only 6 people (approx) know the details: The Product Planner The I.C. designer The I.C. test engineer The SW QA engineer The SW engineer that created the simulation model The engineer in the IP group that created the PCI cores. If you talk to any of the other 1994 people, they will look at the same info you have access to, and won't find an answer. Then they might answer "It must be a secret". Just like the secret that you can do higher quality designs with schematics and hierarchical floorplanning. And timing based simulation is unnecessary if you have done fully synchronous design and have 100% coverage of timespecs with static timing analysis. 
>So, I will still call it a secret feature, but I already got it. Good enough. >After I figured it out how it works, it turns out there is really no big >deal other than unregistered long signal paths (unregistered IRDY# and >TRDY#) leading to CE (Clock Enable) input of IOB output FFs will not >cause timing problems during P&R. >So, there is really not big deal, and I will like to shatter that myth >that it is some kind of a magic box. > >> Maybe you should write an explanation for inclusion in the FAQ?? >> >> =================== >> Philip Freidin >> philip@fliptronics.com >> Host for WWW.FPGA-FAQ.COM > > > Yes, if it is possible, I will rather have a sample code, a >constraint file, and the analysis of how it works included in the FPGA >FAQ than simply being posted to this newsgroup because that way, people >can refer to the FAQ rather than having to search this newsgroup. Sure it is possible. >Where should I send all the information? Go to http://www.fpga-faq.com/FAQ_Root.htm , and at the bottom of the page, download the template of a FAQ page. Then fill in the page the way you would like it to be, and email it to philip@fpga-faq.com If you need help doing this, let me know. It will be published for all to see in the FAQ, and will not be a secret anymore. You may want to look at some other FAQ pages, to get a feel for the current style. This is also an open invitation to everyone else to write some FAQ pages. It is only as good as the sum of the contributions !! 
Philip Freidin Philip Freidin Fliptronics ###### From: Eric Crabill Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 05 Apr 2002 13:48:54 -0800 Organization: Xilinx, Incorporated Lines: 46 Message-ID: <3CAE1BC6.5FE0A871@xilinx.com> References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: 149.199.14.189 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.7 [en]C-CCK-MCD (WinNT; U) X-Accept-Language: en,pdf Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!cyclone.bc.net!sunqbc.risq.qc.ca!newsfeed.mathworks.com!wn3feed!worldnet.att.net!204.127.198.204!attbi_feed4!attbi_feed3!attbi.com!12.120.28.17!attla2!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16188 Hi, Kevin Brace wrote: > I find it surprising that Eric Crabill of Xilinx who supposedly works > with LogiCORE PCI at Xilinx publically admitting the equations inside. > Isn't that supposed to be a trade secret of Xilinx? I don't supposedly work at Xilinx, I actually work at Xilinx, in the IP Solutions Group, developing IP. I am, in a tangential way, associated with the PCI and PCI-X cores... I wouldn't call it a trade secret. You may find the general invention patented and assigned to Xilinx, however. The implementation in Virtex, Spartan-II, Virtex-E, and Spartan-IIE is the same and intended to assist implementation of PCI cores in these devices -- the logic implemented is quite literally a tiny part of our core cast into silicon (and, in the grand scheme of things, immaterial; the real advantage of this "feature" is the dedicated routing associated with it). This feature is, however, undocumented, unsupported, and not intended for general use. It is supported in the context of the Xilinx PCI LogiCORE. The feature was put in the silicon by the request of the PCI Development team, for use by the PCI Development team. 
If you use it in your own designs, that is fine. However, if you run into problems/issues, you are on your own -- the feature is undocumented and unsupported. If you were to file a case with the Support Hotline, they probably won't be able to help you directly. Such a case would most likely be forwarded to me, and I would write back, "This feature is undocumented, unsupported, and not intended for general use. Sorry." > So, there is really not big deal, and I will like to shatter that myth > that it is some kind of a magic box. I prefer to think of it as a magic box. In fact, the instance name of it in our core is "MAGICBOX". Maybe that is where the notion came from... Anyone who is really interested can determine the logic function using the publically available tool set. Likewise, if you are very curious about the GCLKDEL option, you can experimentally determine what it does. Eric ###### From: Eric Crabill Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Fri, 05 Apr 2002 13:56:29 -0800 Organization: Xilinx, Incorporated Lines: 18 Message-ID: <3CAE1D8D.145767D2@xilinx.com> References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: 149.199.14.189 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.7 [en]C-CCK-MCD (WinNT; U) X-Accept-Language: en,pdf To: philip@fliptronics.com Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.cwix.com!prairie.attcanada.net!newsfeed.attcanada.net!12.127.17.144!attbt1!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16187 Philip Freidin wrote: > More likely, they just don't know. Xilinx probably has 2000 > people, and of them only 6 people (approx) know the details: > > The Product Planner > The I.C. designer > The I.C. 
test engineer > The SW QA engineer > The SW engineer that created the simulation model > The engineer in the IP group that created the PCI cores. Even fewer, if by chance one person did, say, three or four of those things... You are all trying to steal my magic bag! Eric ###### Message-ID: <3CAE6202.4DECF164@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CACEF37.DCA68151@andraka.com> <3CAD3083.D2EA0B0C@andraka.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 34 Date: Sat, 06 Apr 2002 02:47:58 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1018061278 68.15.41.165 (Fri, 05 Apr 2002 21:47:58 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 21:47:58 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news-out.visi.com!hermes.visi.com!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16190 I'm still not following you here. If you do your design hierarchically and keep it hierarchical in the EDIF netlist, what information is getting lost? You still have the signal names with the bit numbering intact, you still have the structure. Maybe I am missing your point. Nicholas Weaver wrote: > >I don't buy this. The information is there in the form of the netlist. The > >fact that we get the gains we do out of floorplanning indicates that the > >synthesis is doing fine. Perhaps you mean the synthesis needs to infer > >placement and add the placement info to the primitives? > > That there is convenient structure which the synthesis tool can easily > exploit, as the high level information is there, while the P&R tool > would have to infer and recover.
> > It is much the same way where, yes, you can do all the compiler > optimizations at the assembly level, but it is MUCH more > straightforward to do alot of it earler in the process, in the > intermediate form. > -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759 ###### Message-ID: <3CAE6581.204DEBA2@andraka.com> From: Ray Andraka Organization: Andraka Consulting Group, Inc X-Mailer: Mozilla 4.77 [en] (WinNT; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: hand placement References: <3CAD173C.71FB0077@attbi.com> <3CAD31A9.F50F3653@andraka.com> <3CADD533.5050706@synplicity.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 149 Date: Sat, 06 Apr 2002 03:02:57 GMT NNTP-Posting-Host: 68.15.41.165 X-Complaints-To: abuse@cox.net X-Trace: news1.east.cox.net 1018062177 68.15.41.165 (Fri, 05 Apr 2002 22:02:57 EST) NNTP-Posting-Date: Fri, 05 Apr 2002 22:02:57 EST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsfeed.icl.net!newsfeed.esat.net!priapus.visi.com!news-out.visi.com!hermes.visi.com!cox.net!news1.east.cox.net.POSTED!53ab2750!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16189 OK, Uncle. I admit that I already do these things in my designs, so perhaps I am not realizing the full benefit of Amplify. We do lots of gate level floorplanning in the code, and use the appropriate attributes to make sure our planning is not disrupted by the good intentions of the tools. Yes, we do a fair amount of structural netlisting...anything that gets used in more than one design or that is replicated in a design basically gets at least the registers structurally instantiated and RLOC's in the code. 
This is done hierarchically, so it is not nearly as onerous as it sounds...we get a lot of reuse out of some pretty basic modules and the modules created from them. Before you jump to the conclusion that my methodology is inefficient, I should mention that I do turn out somewhere around 15-20M equivalent gates/year; last year that was 8 major designs (V1000 and larger devices, all near the top end of the clock rates for the device...several needed heatsinks on the FPGAs) and several smaller ones. I suppose it would have been fairer for me to say I don't see the cost justified by the added value _for_the_type_of_designs_I_do. In any event, thanks for the clarification on the capabilities of Amplify. I did look at it closely when it came out, and found that I was already getting the gains it claimed and more with my methodology. I think it likely is helpful for the average user who has only a foggy inkling of the innards of the FPGA, and not much patience for physical design. Ken McElvain wrote: > I think a more detailed description of what Amplify does is in order. > > The logical hierarchy of a design can be easily reorganized into > a good physical hierarchy without touching the source code. This > may not matter to some people who thought physically from the > beginning of their design and have the experience to know up front > what the organization should be. Part of this reorganization > capability includes the ability to replicate chunks of RTL objects. > You may find that replicating an FSM or counter into different parts > of the chip yields large improvements. > > Because the floorplan definition in Amplify is at an RTL level and we > worked very hard at making generated RTL object names repeatable even > with design changes, the RTL floorplan can survive design changes. > Gate level floorplans that don't stick to module boundaries often have > to be redone for even trivial design changes.
(If you are doing > structural netlisting in your HDL and generating placements, then that > would be an exception). Amplify will also perform boundary optimization > on your floorplan which involve timing optimizations mixed with > placing results back into the regions. > > The current version of Amplify (3.0) includes a full detail placer for > regions that works in cooperation with timing optimization of the logic. > This feature is currently only available for Virtex/VirtexE. Detail > placement obviously gives Amplify much more accurate delay information. > > An experienced Amplify user can get substantial performance improvements > in 3-5 iterations through P&R. The first iteration gets you calibrated > and the following iterations are like a game of whack a mole where the > mole doesn't come back up again. > > Amplify obviously works best when the critical paths in your design are > through a chunks of RTL. If you are willing to do a structural design > and hand place everything, then it won't be useful to you. Most designs > aren't done that way. > > Ken McElvain CTO > Synplicity, Inc. > > Ray Andraka wrote: > > > Frankly, I don't see the cost justified by the marginal added value with > > amplify. You can do almost* everything it does with the area constraints in > > the floorplanner, plus with the floorplanner you can lock down some and area > > constrain other logic. > > > > *The one thing it does buy you is to take into consideration the layout > > while doing the synthesis. > > > > Like you mention, there was a time when we did floorplanning with graph > > paper. The GUI makes it easier, but I still use the graph paper method for > > doing placement in the source. > > > > > > Phil Hays wrote: > > > > > >>Jimmy Zhang wrote: > >> > >> > >>>Just keep hearing about this hand placement thing, don't know how it > >>>is done in reality. Does someone actually use their hands to do the > >>>placement as opposed to CAD based P&R. Any hints? 
> >>> > >>The first way I learned to do this was with a paper diagram of the > >>target chip, writing the constraints with a text editor, and coloring on > >>the paper to indicate what had been put where. I didn't do the best of > >>jobs (had an register reversed, with the msb where the lsb should be), > >>but it was still ~30% faster resulting clock speed than the automatic > >>placement. Made place and route times drop nicely as well. It was even > >>better than that once I got the twist removed. But this is as close to > >>"by hand" as I can picture. > >> > >>The floorplanner that Xilinx provides is just a nicely automated way of > >>doing the same sort of puzzle. Do the data path(s) first, fit things > >>together in a "logical" fashion, and for the first one floor plan at > >>least plan on spending some time fiddling. Some people seem to get this > >>skill right away, and some take longer. > >> > >>A slightly "higher level" way of gaining much of the benefit from > >>floorplanning with potentially rather less effort is to use a "physical > >>design" tool. Synplicity had the first ("Amplify") aimed at FPGA design > >>(and I'm not sure if Mentor, Synopsys or anyone else have anything in > >>this space yet), however there were physical design tools for ASIC > >>design long before Amplify. These work by putting large chunks of the > >>design into subsections of the target chip. > >> > >>Synopsys's ASIC physical design tool set: > >> > >>http://www.synopsys.com/products/phy_syn/phy_syn.html > >> > >>Amplify is at: > >> > >>http://www.synplicity.com/products/amplify.html > >> > >>-- > >>Phil Hays > >> > > > > -- > > --Ray Andraka, P.E. > > President, the Andraka Consulting Group, Inc. > > 401/884-7930 Fax 401/884-7950 > > email ray@andraka.com > > http://www.andraka.com > > > > "They that give up essential liberty to obtain a little > > temporary safety deserve neither liberty nor safety." > > -Benjamin Franklin, 1759 > > > > > > -- --Ray Andraka, P.E. 
President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759 ###### From: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Sat, 6 Apr 2002 03:30:20 +0000 (UTC) Organization: Unknown Lines: 16 Message-ID: References: <3CAD3083.D2EA0B0C@andraka.com> <3CAE6202.4DECF164@andraka.com> NNTP-Posting-Host: soda.csua.berkeley.edu X-Trace: agate.berkeley.edu 1018063820 80650 128.32.247.226 (6 Apr 2002 03:30:20 GMT) X-Complaints-To: usenet@agate.berkeley.edu NNTP-Posting-Date: Sat, 6 Apr 2002 03:30:20 +0000 (UTC) Originator: nweaver@CSUA.Berkeley.EDU (Nicholas Weaver) Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-out.spamkiller.net!propagator2-maxim!propagator-maxim!news-in.spamkiller.net!tethys.csu.net!news-hog.berkeley.edu!ucberkeley!agate.berkeley.edu!agate!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16199 In article <3CAE6202.4DECF164@andraka.com>, Ray Andraka wrote: >I'm still not following you here. If you do your design hierarchically and keep >in hierarchical in the edif netlist, what information is getting lost? You >still have the signal names with the bit numbering intact, you still have the >structure. Maybe I am missing your point. Well, there have been a few academically developed but uncommercialized mapping techniques which work better if you don't have to reinfer structure from the netlist, but actively combine synthesis and placement: eg, gama-mapping and Koch's datapath placement, which are convenient to do in a higher level form, or at least in the academic world. -- Nicholas C. 
Weaver nweaver@cs.berkeley.edu ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Mon, 08 Apr 2002 00:11:10 -0500 Organization: None Lines: 206 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> NNTP-Posting-Host: 1cust214.tnt76.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1018242121 25361 67.195.182.214 (8 Apr 2002 05:02:01 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Mon, 8 Apr 2002 05:02:01 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16261 Philip Freidin wrote: > > > > > Not a very nice comment. > > Oh, come on! I know that other have written personal attacks > that were uncalled for, but this hardly counts. How much more > gentle could I have been? The info is available, it's in the > archive, anyone can see it. > When I tried to look up about PCILOGIC within news:comp.arch.fpga before trying to crack it, I couldn't find it, and only after I already figured it out using information from a posting at Opencores.org, I saw that it was discussed at this newsgroup months before. The posting you are referring to assumes that the user have access to FPGA Editor, and it didn't really give instructions on how to instantiate from an HDL design. For an ISE WebPACK user, instantiating from an HDL file is pretty much the only way to use it other than maybe from an ECS schematics tool. You might argue that instantiating from an HDL design is easy, and just declare a blackbox, but for most users, including myself two months ago, it wasn't so obvious. 
I believe you have more than 10 years of experience dealing with FPGAs, but not everyone in this newsgroup has such experience; therefore, things that seem obvious to you might not be so obvious to other less experienced users. I hope you understand that. > > Well, that doesn't make it a secret. Just an undocumented feature > of the chip. Others have reverse engineered its functionality and > published the info in this news group. > Again, the posting you are referring to assumes that the user has access to FPGA Editor. > >I find it surprising that Eric Crabill of Xilinx who supposedly works > >with LogiCORE PCI at Xilinx publically admitting the equations inside. > > Typical Xilinx person being helpful. > Although he isn't telling me how to use Bitgen's /Gclkdel option. > >Isn't that supposed to be a trade secret of Xilinx? > > Who supposes? Not much of a trade secret if it has been published. > > Since Xilinx tools even output a simulation model for it for > post P&R simulation, there is no way it could be considered a > secret. Just poorly (i.e. not at all) documented. > I guess I will agree with you that if a simulation model can be extracted with ngd2ver or ngd2vhd, it is not top secret after all, but Xilinx certainly doesn't make it obvious to most users how to obtain the simulation model. Again, it might be obvious to you, because you are a lot more experienced than most users, but I don't believe it is obvious to most users, including myself two months ago. Therefore, I will still call PCILOGIC a secret feature, but you probably won't agree with that, and neither will I agree with you that it is just another undisclosed feature. > > PCILOGIC may not be a huge secret to you, but a question about > >it seems to come up once every two months or so, and people who answer > >it keep saying that it is a secret feature only Xilinx knows. > > Who said that ??? Show them to me !!! I'll give them the URL too. > I think you are getting too excited.
(It seems like that to me.) Here are a few posters who called PCILOGIC a "magic box." You might think it is obvious, but not too many people seem to know about it. http://groups.google.com/groups?hl=en&selm=u95o353k4qrq92%40corp.supernews.com http://groups.google.com/groups?hl=en&selm=a6tujt%24h4nlk%243%40ID-84877.news.dfncis.de When providing the URL to the people who didn't know much about PCILOGIC, I think you should provide the Opencores.org URL I used when I cracked it, and also tell them that a detailed FAQ answer is coming up shortly. http://www.opencores.org/forums/pci/2001/09/00003 > >Yes, if you ask a Xilinx employee about it, they won't tell you at all > >saying that it is an undisclosed feature. > > More likely, they just don't know. Xilinx probably has 2000 people, and > of them only 6 people (approx) know the details: > > The Product Planner > The I.C. designer > The I.C. test engineer > The SW QA engineer > The SW engineer that created the simulation model > The engineer in the IP group that created the PCI cores. > > If you talk to any of the other 1994 people, they will look at the > same info you have access to, and won't find an answer. Then they > might answer "It must be a secret". > Someone who posted a question a few weeks ago said he asked a Xilinx applications engineer about PCILOGIC, but that person didn't tell him the details of it. You might be assuming that it is easy to talk to the person who knows the details of PCILOGIC, but in a company with several thousand employees, I am sure it is virtually impossible to get hold of the engineer who knows about it. Plus, since it is a secret (or, as you would call it, undocumented) feature, they have much less incentive to tell people about it. Eric Crabill of Xilinx, who knows about PCILOGIC, just happens to be a regular poster in this newsgroup; that's a coincidence, because not all Xilinx employees post or reply to postings in this newsgroup.
> Just like the secret that you can do higher quality designs with > schematics and hierarchial floorplanning. And timing based simulation > is unnecessary if you have done fully synchronous design and have 100% > coverage of timespecs with static timing analysis. > Wouldn't you still do Post P&R simulation to make sure the synthesis tool correctly synthesized the RTL code? I have seen a synthesis tool messing up synthesis, causing a crash when I plugged a Spartan-II PCI card with my PCI IP core in it. The RTL was fine, but the way I found what was going wrong was through doing a Post P&R simulation. Some outputs were going undefined, leading to a crash. I turned off several optimization options, and everything worked fine. Ever since that experience, I always do a Post P&R simulation before burning a Configuration PROM. > > >Where should I send all the information? > > Go to http://www.fpga-faq.com/FAQ_Root.htm , and at the bottom of the > page, down load the template of a FAQ page. Then fill in the page the > way you would like it to be, and email it to philip@fpga-faq.com > If you need help doing this, let me know. > > It will be published for all to see in the FAQ, and will not be a > secret anymore. You may want to look at some other FAQ pages, to get a > feel for the current style. > > This is also an open invitation to everyone else to write some FAQ > pages. It is only as good as the sum of the contributions !! > > Philip Freidin > > Philip Freidin > Fliptronics That will be great. PCILOGIC should no longer be secret to anyone who wants to know about it. I started writing the text, and it will take a few more days to finish it. When I am done, I will send you a copy of it, but it will be in a regular text format. (Not HTML.) Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.) 
###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Mon, 08 Apr 2002 01:27:11 -0500 Organization: None Lines: 126 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CAE1BC6.5FE0A871@xilinx.com> NNTP-Posting-Host: 1cust229.tnt89.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1018246681 26117 67.195.71.229 (8 Apr 2002 06:18:01 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Mon, 8 Apr 2002 06:18:01 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16267 Eric Crabill wrote: > > Hi, > > Kevin Brace wrote: > > I find it surprising that Eric Crabill of Xilinx who supposedly works > > with LogiCORE PCI at Xilinx publically admitting the equations inside. > > Isn't that supposed to be a trade secret of Xilinx? > > I don't supposedly work at Xilinx, I actually work at Xilinx, in the IP > Solutions Group, developing IP. I am, in a tangential way, associated > with the PCI and PCI-X cores... > Sorry, I didn't word it too well. What I wanted to say was, "Eric Crabill of Xilinx who supposedly works with LogiCORE PCI there." > I wouldn't call it a trade secret. You may find the general invention > patented and assigned to Xilinx, however. I will be very interested if such a circuit (After all, PCILOGIC is just a tiny circuit with a few NAND gates.) can be patented. Or are you saying that the concept of CE (Clock Enable) was first patented by Xilinx? What do you mean when you say "general invention," and which patents are you referring to? 
> The implementation in Virtex, > Spartan-II, Virtex-E, and Spartan-IIE is the same and intended to assist > implementation of PCI cores in these devices -- the logic implemented is > quite literally a tiny part of our core cast into silicon (and, in the > grand scheme of things, immaterial; the real advantage of this "feature" > is the dedicated routing associated with it). > > This feature is, however, undocumented, unsupported, and not intended > for > general use. It is supported in the context of the Xilinx PCI LogiCORE. > The feature was put in the silicon by the request of the PCI Development > team, for use by the PCI Development team. > > If you use it in your own designs, that is fine. However, if you run > into > problems/issues, you are on your own -- the feature is undocumented and > unsupported. > > If you were to file a case with the Support Hotline, they > probably won't be able to help you directly. Such a case would most > likely be forwarded to me, and I would write back, "This feature is > undocumented, unsupported, and not intended for general use. Sorry." > I am fine with not supporting it. I just wish I knew more about how to use NGD2VER, NGD2VHD, and declaring a blackbox in a design earlier. > > I prefer to think of it as a magic box. In fact, the instance name of > it > in our core is "MAGICBOX". Maybe that is where the notion came from... > Yes, calling it "MAGICBOX" in LogiCORE PCI is probably the reason why people started to call it a "magic box." I guess it is indeed a magic box if that solves the timing issues of long unregistered signal paths (IRDY# and TRDY# paths towards AD[63:0]) in 66MHz PCI. However, the delay of unregistered paths going through it seems large. (Tpcilog ~= 1.6ns for IRDY and TRDY in XC2S150-5). > Anyone who is really interested can determine the logic function using > the publically available tool set. Someone else did that. I appreciate that person. 
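[Editor's sketch, not part of the original posts.] The logic function in question is written out later in this thread as NAND_IRDY = !(!IRDY * !I1), NAND_TRDY = !(!TRDY * !I3), PCI_CE = !(!I2 * NAND_IRDY * NAND_TRDY). A minimal Python model of those equations, assuming the five inputs are plain logic levels at the block's pins and that the equations as posted are accurate (the function names here are hypothetical), might look like this:

```python
from itertools import product


def pcilogic_ce(irdy: bool, trdy: bool, i1: bool, i2: bool, i3: bool) -> bool:
    """PCI_CE computed literally from the NAND-form equations quoted in the thread."""
    nand_irdy = not ((not irdy) and (not i1))   # NAND_IRDY = !(!IRDY * !I1)
    nand_trdy = not ((not trdy) and (not i3))   # NAND_TRDY = !(!TRDY * !I3)
    return not ((not i2) and nand_irdy and nand_trdy)


def pcilogic_ce_simplified(irdy: bool, trdy: bool, i1: bool, i2: bool, i3: bool) -> bool:
    """De Morgan form: I2, or IRDY and I1 both low, or TRDY and I3 both low."""
    return i2 or (not irdy and not i1) or (not trdy and not i3)


def forms_agree() -> bool:
    """Exhaustively compare both forms over all 32 input combinations."""
    return all(
        pcilogic_ce(*bits) == pcilogic_ce_simplified(*bits)
        for bits in product([False, True], repeat=5)
    )
```

Read this way, PCI_CE is high whenever I2 is high, or when IRDY and I1 are both low, or when TRDY and I3 are both low. This only models the logic function; it says nothing about the dedicated routing that Eric describes as the real advantage.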
> Likewise, if you are very curious > about the GCLKDEL option, you can experimentally determine what it does. > > Eric Okay, here is what I have so far. When I set the /Gclkdel option to 00000, I got an error saying that the design won't function. For every other value besides 11111 (the default), Bitgen doesn't say anything. That makes it pretty hard to figure out what the values mean. My guess is that the smaller the value, the greater the delay. When you say, "you can experimentally determine what it does.," do you mean like I have to put in some kind of value, and see if the PCI card will crash to determine the approximate delay it inserts? Won't that be fairly risky considering silicon variation of actual parts being used? Also, the delay added by using /Gclkdel doesn't get reflected during static timing analysis. Why isn't the delay added before static timing analysis? As an alternative to /Gclkdel, I have come up with an idea of tying two adjacent GCLKBUFs together to create some extra global clock buffer delay. How does this approach compare to the /Gclkdel option, and is it more desirable than the /Gclkdel option? Tying up two GCLKBUFs creates about 1.0ns of extra delay. Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)
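[Editor's sketch, not part of the original posts.] The observations above (00000 rejected, 11111 the default, every other value silently accepted) leave 31 usable settings, which matches the 31 bitstreams suggested in the follow-up below. One way to organize such a sweep is sketched here in Python. It assumes the settings are 5-bit binary strings as written in the post; the function names are hypothetical, and how a value is actually passed to Bitgen is deliberately left out because the thread does not give the exact syntax.

```python
def gclkdel_settings() -> list[str]:
    """All 5-bit settings except 00000, which Bitgen is reported to reject."""
    return [format(value, "05b") for value in range(1, 32)]


def delay_vs_default(clock_to_out_ns: dict[str, float],
                     default: str = "11111") -> dict[str, float]:
    """Measured clock-to-out of each setting minus that of the default setting.

    clock_to_out_ns maps a setting string to a lab measurement in ns.
    The result only describes the one device, temperature and VCC measured.
    """
    base = clock_to_out_ns[default]
    return {setting: round(t - base, 3) for setting, t in clock_to_out_ns.items()}
```

A positive difference for a setting would suggest it adds clock delay relative to the default, which would confirm or refute the guess that smaller values mean more delay.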
###### Reply-To: "Steve Casselman" From: "Steve Casselman" Newsgroups: comp.arch.fpga References: <1017961909.26884.0.nnrp-01.9e9832fa@news.demon.co.uk> <3CACFBB3.1C73@designtools.co.nz> Subject: Re: hand placement Lines: 46 Organization: Virtual Computer Corporation X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 6.00.2600.0000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 Message-ID: NNTP-Posting-Host: 64.174.106.246 X-Complaints-To: abuse@prodigy.net X-Trace: newssvr14.news.prodigy.com 1018289709 ST000 64.174.106.246 (Mon, 08 Apr 2002 14:15:09 EDT) NNTP-Posting-Date: Mon, 08 Apr 2002 14:15:09 EDT X-UserInfo1: FKPO@SBEQJUMA]DYMBCD^VX@WB]^PCPDLXUNNHLHEQR@ETUCCNSKQFCY@TXDX_WHSVB]ZEJLSNY\^J[CUVSA_QLFC^RQHUPH[P[NRWCCMLSNPOD_ESALHUK@TDFUZHBLJ\XGKL^NXA\EVHSP[D_C^B_^JCX^W]CHBAX]POG@SSAZQ\LE[DCNMUPG_VSC@VJM Date: Mon, 08 Apr 2002 18:15:09 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!deine.net!fr.clara.net!heighliner.fr.clara.net!news.tele.dk!small.news.tele.dk!207.115.63.138!newscon04.news.prodigy.com!newsmst01.news.prodigy.com!prodigy.com!postmaster.news.prodigy.com!newssvr14.news.prodigy.com.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16232 I hate to keep talking about my patents but... The first patent http://www.delphion.com/details?pn10=US05684980 is about that. I call it the runtime generation patent. It is really an exercise on how to put together and program a bunch of field programmable gates to be a computer system. Steve Casselman "Jim Granville" wrote in message news:3CACFBB3.1C73@designtools.co.nz... > Steve Casselman wrote: > > > > I took the cost function put it in hardware and ran the database past it > > several times. The cost function accounted for 30% of the placer > > performance. That part of the placer took about 1/3 of a xc4010. From my > > analysis I concluded that ppr could be speed up by 10x and would take about > > 50K gates. 
This holds to the normal 90/10 rule. Of course Xilinx was moving > > over to par at the time and they concluded that they didn't need the > > speedup. After spending a lot of time with the code I'm convinced that P&R > > is a sure bet for acceleration. Now with the PPC and Virtex II I'm sure that > > over all speedups of 8-10x would be pretty straight forward. I estimate > > about 2 man years of work and a design with 4-8 gig on board would do it. > > > > Steve > > If I have this right, you are talking about using a VirtexPRO as an > engine to route VirtexPRO (et al) ?. > > This becomes the silicon equivalent of the 'compiler bootstrap' :-) > > Maybe it's also a problem, the 'solution of 4 PPCs' is looking for ? > > Xilinx could sell route-boxes, and it would make a pretty impressive > product demonstrator... > > -jg ###### From: Eric Crabill Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Tue, 09 Apr 2002 11:48:16 -0700 Organization: Xilinx, Incorporated Lines: 58 Message-ID: <3CB33770.FD1492F9@xilinx.com> References: <3CACEF37.DCA68151@andraka.com> <3CAE1BC6.5FE0A871@xilinx.com> NNTP-Posting-Host: 149.199.14.189 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.7 [en]C-CCK-MCD (WinNT; U) X-Accept-Language: en,pdf Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!wn14eed!wn1feed!worldnet.att.net!204.127.198.203!attbi_feed3!attbi_feed4!attbi.com!12.120.28.17!attla2!attla1!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16285 Hi Kevin, > I will be very interested if such a circuit (After all, PCILOGIC > is just a tiny circuit with a few NAND gates.) can be patented. > Or are you saying that the concept of CE (Clock Enable) was first > patented by Xilinx? 
I am not saying either; I am talking about US6292020, which covers "low skew programmable control routing for a programmable logic device", which is certainly a rather obscure topic, but probably relevant to this discussion. > However, the delay of unregistered paths going through it seems large. > (Tpcilog ~= 1.6ns for IRDY and TRDY in XC2S150-5). I think you are not considering that this number includes both the logic delay (equivalent to a LUT + MUXF5) and the input routing delay on the IRDY# and TRDY# signals. It is respectable... The real advantage is the dedicated routing of the output net. That is what saves time. Regarding the gclkdel option: > When you say, "you can experimentally determine what it does.," do you > mean like I have to put in some kind of value, and see if the PCI card will > crash to determine the approximate delay it inserts? No, I mean you can make a simple test design to measure the clock to out of a flip flop, place and route it, then generate 31 bitstreams (one for each of the valid gclkdel options). Take all 31 into the lab, and measure the clock to out. The differences you observe will likely be related to the use of that option. At room temperature, at nominal VCC, on that one device you happen to be measuring. > Also, the delay added by using /Gclkdel doesn't get reflected during > static timing analysis. Why isn't the delay added before static timing > analysis? Probably because this "feature" is not intended for general use (you may re-read my boilerplate about "unsupported, undocumented") but is intended for use with the Xilinx PCI core for 66 MHz designs, and in that context only. > As an alternative to /Gclkdel, I have come up with an idea of > tying two adjacent GCLKBUFs together to create some extra global clock buffer > delay. How does this approach compare to the /Gclkdel option, and is > it more desirable than the /Gclkdel option? Tying up two GCLKBUFs creates > about 1.0ns of extra delay.
Your approach is a valid way to add delay, but have you considered: 1. What it does to the clock to out performance? 2. What it does to the input hold (0 ns) requirements? It may fix your input setup problems, but break something else... Eric ###### From: Kevin Brace Newsgroups: comp.arch.fpga Subject: Re: hand placement Date: Wed, 10 Apr 2002 17:18:28 -0500 Organization: None Lines: 239 Sender: kevinbraceusenet@hotmail.com Message-ID: References: <3CACEF37.DCA68151@andraka.com> <3CAE1BC6.5FE0A871@xilinx.com> <3CB33770.FD1492F9@xilinx.com> NNTP-Posting-Host: 1cust70.tnt76.chi5.da.uu.net Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: newsreader.mailgate.org 1018476550 13340 67.195.182.70 (10 Apr 2002 22:09:10 GMT) X-Complaints-To: abuse@mailgate.org NNTP-Posting-Date: Wed, 10 Apr 2002 22:09:10 +0000 (UTC) X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!newsreader.mailgate.org!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:16403 Eric Crabill wrote: > > Hi Kevin, > > > I will be very interested if such a circuit (After all, PCILOGIC > > is just a tiny circuit with a few NAND gates.) can be patented. > > Or are you saying that the concept of CE (Clock Enable) was first > > patented by Xilinx? > > I am not saying either; I am talking about US6292020, which covers > "low skew programmable control routing for a programmable logic device" > which is certainly a rather obscure topic, but probably relevant to > this discussion. > Eric, I have done patent search on patents assigned to Xilinx in the past, but I totally missed yours . . . The essence of your patent seems to be that a small logic block on each side of the chip can control all the IOBs of that side. I only saw part of the patent because Delphion no longer lets me see the actual image of the patent without paying them. 
I also tried the USPTO website, but I didn't have the correct plug-in installed on my computer, so I couldn't see all the images. I guess I will try a computer at a library later. However, I find it interesting that PCILOGIC existed from the first Virtex, which was released in 1998, but the PCILOGIC patent (or the general concept of the patent) wasn't filed until August 2000. Why the delay, although the patent didn't seem to get stuck at the USPTO for years? Anyhow, I will call US Patent 6,292,020 "the PCILOGIC patent", and I will mention it in the FPGA FAQ entry about PCILOGIC I am writing right now. > > However, the delay of unregistered paths going through it seems large. > > (Tpcilog ~= 1.6ns for IRDY and TRDY in XC2S150-5). > > I think you are not considering that this number includes both the logic > delay (equivalent to a LUT + MUXF5) and the input routing delay on the > IRDY# and TRDY# signals. It is respectable... The real advantage is the > dedicated routing of the output net. That is what saves time. > I didn't realize until yesterday, but when IRDY# or TRDY# goes through PCILOGIC, its input delay is reported as Tiopci rather than Tiopi. In a XC2S150-6CPQ208, Tiopi is 0.664ns, but Tiopci is 0.538ns, so that helps the PCI_CE (Clock Enable) line of the chip. You are right that the Tpcilog I am talking about includes routing delay from IRDY or TRDY to PCILOGIC and the gate delay through it because the gate delay between the pin and PCILOGIC is always 0ns. You are also right that Tpcilog of 1.352 ns in a XC2S150-6CPQ208 is probably going to be better than the routing delay to a 5-input LUT (XST doesn't seem to infer a 5-input LUT for my PCILOGIC emulation logic, so it gets broken into several 4-input LUTs, making the matter worse . . . ) placed right next to the pin. I once compared the routing delay from PCILOGIC and from the emulated version to AD's CE input, and PCILOGIC's routing delay was far less than the emulated one's.
However, what disappointed me was this: why does the gate delay through one NOT gate (which can be a NAND gate acting as an inverter), one 2-input NAND gate, and one 3-input NAND gate, together with the routing delay from the pin to PCILOGIC, come to 1.352 ns?

(NAND_IRDY = !(!IRDY * !I1),
 NAND_TRDY = !(!TRDY * !I3),
 PCI_CE = !(!I2 * NAND_IRDY * NAND_TRDY))

Why is the delay through those NAND gates that large, and is that normal for a chip fabricated in a 0.18/0.22u process?

> Regarding the gclkdel option:
>
> > When you say, "you can experimentally determine what it does.," do you
> > mean like I have to put some kind value, and see if the PCI card will
> > crash to determine the approximate delay it inserts?
>
> No, I mean you can make a simple test design to measure the clock to out
> of a flip flop, place and route it, then generate 31 bitstreams (each of
> the valid gclkdel options). Take all 31 into the lab, and measure the
> clock to out. The differences you observe will likely be related to the
> use of that option. At room temperature, at nominal VCC, on that one
> device you happen to be measuring.
>

Okay, you are right, there is always a fairly simple way to figure something out.
However, the lack of an oscilloscope will prevent me from experimenting with it.
Won't I need a few-GHz oscilloscope to accurately observe the delay?

> > Also, the delay added by using /Gclkdel doesn't get reflected during
> > static timing analysis. Why isn't the delay added before static timing
> > analysis?
>
> Probably because this "feature" is not intended for general use (you may
> re-read my boilerplate about "unsupported, undocumented") but is intended
> for use with the Xilinx PCI core for 66 MHz designs, and in that context
> only.
>

I guess I was right (obviously) that /Gclkdel's delay information won't be reflected during static timing analysis.
My further guess is that if /Gclkdel information were used to calculate setup/hold/output times during static timing analysis, it would not be a secret at all, just as some people figured out how to obtain a simulation model of PCILOGIC.
I am aware of the "unsupported and undocumented" nature of these features though.
Despite it being unsupported, yesterday I fired up my PCI IP core with PCILOGIC, and it worked perfectly fine in the two computers I tested it in. (Intel chipset and SiS chipset.)
However, all the testing I did was with single-cycle I/O and Configuration cycles, not burst memory cycles where PCILOGIC shines.
So, I guess PCILOGIC works just like the simulation model.

> > As an alternative to /Gclkdel, I have come up with an idea of
> > tying two adjacent GCLKBUF to create some extra global clock buffer
> > delay. How does this approach compared to /Gclkdel option, and is
> > it more desirable than /Gclkdel option? Tying up two GCLKBUF creates
> > about 1.0ns of extra delay.
>
> Your approach is a valid way to add delay, but have you considered:
>

I wasn't sure if my approach was a correct one, but since you say it is a valid way to add delay, I feel a little better.

> 1. What it does to the clock to out performance?

When only one GCLKBUF is used, the global clock buffer delay of an XC2S150-6CPQ208 is about 1.5ns.
That 1.5ns number can change depending on how many FFs are actually being used and on which chip is used (it gets larger with a bigger chip).
When I tie two adjacent GCLKBUFs together (in my case, the clock signal enters GCLKPAD3, goes through GCLKBUF3, and then goes to GCLKBUF2), that creates about 1.0ns of extra global clock buffer delay.
Fortunately, Spartan-II-6's PCI66_3 output buffer is very fast, so when only one GCLKBUF is used, the worst Clock-to-Output delay (Tco or Tval) is about 4.6ns.
When two GCLKBUFs are tied together to get the extra 1.0ns of delay, Tval is still only about 5.6ns.
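To make the arithmetic above explicit, here is a minimal Python sketch of the Tval budget. It assumes the approximate figures quoted in this post (about 4.6ns worst-case clock-to-output with one GCLKBUF, about 1.0ns extra from the second GCLKBUF) and the 66MHz PCI limit of Tval < 6ns. The constant and function names are made up for illustration and do not come from any Xilinx tool.

```python
# Clock-to-output (Tval) budget for 66 MHz PCI, using the approximate
# figures quoted in this post for an XC2S150-6CPQ208. All values in ns.
# Names are hypothetical; nothing here is read from a timing report.

TVAL_LIMIT_NS = 6.0        # 66 MHz PCI requires Tval < 6 ns
TCO_ONE_GCLKBUF_NS = 4.6   # worst clock-to-output with one GCLKBUF
EXTRA_GCLKBUF_NS = 1.0     # added delay from chaining a second GCLKBUF


def tval_margin(tco_ns, extra_clock_delay_ns, limit_ns=TVAL_LIMIT_NS):
    """Return the Tval margin left after adding extra clock delay."""
    return limit_ns - (tco_ns + extra_clock_delay_ns)


# Round to hide floating-point noise in the subtraction.
margin_one = round(tval_margin(TCO_ONE_GCLKBUF_NS, 0.0), 3)
margin_two = round(tval_margin(TCO_ONE_GCLKBUF_NS, EXTRA_GCLKBUF_NS), 3)

print(f"margin with one GCLKBUF:  {margin_one} ns")
print(f"margin with two GCLKBUFs: {margin_two} ns")
```

A positive margin means the output still meets Tval. Any extra clock delay is subtracted from this margin one for one, which is why the clock can only be delayed so far before clock-to-out breaks.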
Since 66MHz PCI's Tval is < 6ns, I can still give up another 0.4ns in theory to help the setup time, but the clock delay is not really adjustable, so I will be happy with whatever I get.
(I got 1.0ns of extra time by tying two GCLKBUFs together, but I would prefer a little more. An additional 0.3ns would help a lot, but it won't happen unless I use a bigger chip.)
When going from XC2S150-6CPQ208 to XC2S200-6CPQ208, I saw about a 0.5ns increase in global clock delay, because the chip size got larger. (More loading on the global clock line.)
If the current trend continues, at XCV1000 (The largest FPGA that can handle 5V PCI.), Tval could be about 5.8ns or 5.9ns, barely meeting the Tval < 6ns requirement, but that is only my guess because I don't have ISE Foundation.
So, my own theory is that it is probably easier to meet 66MHz PCI timings with a bigger chip than with a smaller one. (In a typical design, a smaller chip runs faster than a bigger chip, but 66MHz PCI seems like an exception.)
I can only imagine what goes on at Xilinx, but whoever worked on the IOB output part must have worked really hard to keep Clock-to-Output really low, so that the Tco margin (6.0ns - 4.6ns = 1.4ns) can be given up to help meet 66MHz PCI's stringent Tsu < 3ns requirement.

> 2. What it does to the input hold (0 ns) requirements?
>
> It may fix your input setup problems, but break something else...
>
> Eric

I believe I already have the hold time issue under control.
The IOB input FFs' programmable delay seems to be more than two GCLKBUFs + clock distribution delay.
Therefore hold time won't be an issue there.
However, I cannot always rely on those input FFs for some signal paths, and in that case, I have to place some FFs far away from the pin to create routing delay.
So, my "tie two GCLKBUFs together" scheme doesn't cause problems regarding Tco and Th (Hold Time).
fmax is not a big issue either.
Therefore, Tsu < 3ns is pretty much the only problem here.
If I calculate the Tsu I get, I have 3.0ns (Tsu of 66MHz PCI.)
+ 1.5ns (The normal clock distribution delay.) + 1.0ns of extra global clock buffer delay.
However, even with 5.5ns to 5.6ns of total Tsu, meeting that number is very hard in an XC2S150-6CPQ208.
The key to meeting the total Tsu of 5.6ns seems to be keeping the number of 4-input LUT levels below a certain number and using the floorplanner to group relevant LUTs together within a CLB.
Yes, I did that, but . . . I still have 19 paths not meeting the requirement.
The best timing score I have gotten so far is 4,909, so I am close, but the design cannot seem to make it . . .

Eric, you probably picked the pinout for the XC2S150-6CPQ208, so you probably know what I am talking about, but how come the REQ# and GNT# pins are not placed near the rest of the control signals? (i.e., FRAME#, IRDY#, DEVSEL#, TRDY#, STOP#, and PAR. Around P23 through P34.)
That choice seems okay for 33MHz PCI, but not for 66MHz PCI. (Okay, I guess you can say no one intended to do 66MHz PCI in a PQ208 package.)
REQ#'s outcome depends on FRAME# and IRDY#, and those two signals have to travel very long distances, so there is virtually no chance of meeting the timing requirements of 66MHz PCI.
I guess I will have to use an FG456 version of Spartan-II to have any chance (even a very slim chance) of meeting 66MHz PCI's Tsu, because that one will likely have the REQ# and GNT# pins closer to the rest of the control pins.
But I haven't tried that out yet because I cannot seem to get the Xilinx LogiCORE PCI pinout for the Spartan-II FG456 package without paying something.

Kevin Brace (In general, don't respond to me directly, and respond within the newsgroup.)
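The setup budget worked out in the post above (3.0ns + 1.5ns + 1.0ns) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the approximate delays quoted in the post, it ignores the flip-flop's own setup time, and the path names and path delays in `example_paths` are invented, not taken from any timing report.

```python
# Internal setup budget for 66 MHz PCI inputs, using the approximate
# figures from the post above for an XC2S150-6CPQ208. All values in ns.
# Function names and the example paths are hypothetical.

PCI66_TSU_NS = 3.0           # 66 MHz PCI setup time at the pin
CLOCK_DISTRIBUTION_NS = 1.5  # normal global clock delay, one GCLKBUF
EXTRA_GCLKBUF_NS = 1.0       # extra delay from the second GCLKBUF


def setup_budget(tsu_ns, *clock_delays_ns):
    """Time a signal has to get from the pin to a flip-flop.

    The clock edge reaches the flip-flop later than it reaches the pin,
    so each clock delay is added to the pin-level setup time.
    """
    return round(tsu_ns + sum(clock_delays_ns), 3)


def failing_paths(path_delays_ns, budget_ns):
    """Return the names of paths whose pin-to-FF delay exceeds the budget."""
    return sorted(name for name, delay in path_delays_ns.items()
                  if delay > budget_ns)


budget = setup_budget(PCI66_TSU_NS, CLOCK_DISTRIBUTION_NS, EXTRA_GCLKBUF_NS)

# Invented path delays, purely for illustration.
example_paths = {"FRAME# -> REQ#": 6.2, "IRDY# -> AD CE": 5.1}

print(f"setup budget: {budget} ns")
print("failing:", failing_paths(example_paths, budget))
```

The same extra clock delay that widens this budget is what eats into the clock-to-out margin and the hold margin, so the three numbers have to be checked together.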
######

From: Eric Crabill
Newsgroups: comp.arch.fpga
Subject: Re: hand placement
Date: Wed, 10 Apr 2002 16:16:50 -0700
Organization: Xilinx, Incorporated
Message-ID: <3CB4C7E2.AE1F9B76@xilinx.com>
References: <3CACEF37.DCA68151@andraka.com> <3CAE1BC6.5FE0A871@xilinx.com> <3CB33770.FD1492F9@xilinx.com>

Hi,

> The essence of your patent seems to be that a small logic block
> on each side of the chip can control all the IOBs of that side.

I think it is more about low skew distribution of control signals along the sides of the programmable array.
As you have no doubt figured out, in bus interfaces like PCI, the I/O timing is the tough part and what makes it even more difficult is that there are both minimum (hold) and maximum (setup) delays.
The timing needs to fall in a certain "window" for it to be correct.

> Why the delay, although the patent didn't seemed to get stuck
> at USPTO for years?

There is some "time limit" between public disclosure and when you can no longer file.
I'm not a lawyer.
However, I did check this with our legal department before bothering to file.

> I didn't realize until yesterday, but when IRDY# or TRDY# go
> through PCILOGIC, their input delay name won't be called Tiopi,
> but Tiopci instead.

I think you mentioned that you don't have FPGA Editor, but if you were to look at these special IOBs, you would notice that they are different from most other IOBs.
They are called PCIIOBs, and are used for the special pins on the left and right sides.
They have a "direct" output that goes to the PCILOGIC without going through any switch boxes.

> Why is the delay through those NAND gates that large, and is
> that normal for a chip fabricated in a 0.18/0.22u process?

You need to be aware of the difference between the actual silicon and a model of the silicon.
All of the stuff you see in the tools, and the speedfiles, is a software model of the silicon.
From a modeling point of view, does it matter if:

    Tpcilogic    = 2 ns
    pci_ce route = 2 ns

-or-

    Tpcilogic    = 1 ns
    pci_ce route = 3 ns

The answer is no, it does not, if you cannot use these separately.
The sum of the path is what is important.
The individual timing parameters don't make a difference in this case.
Another thing to consider is people building the models may elect to simplify the model if it makes sense.
For example, those I1, I2, I3 inputs.
I doubt they all have the same propagation delay to the output of the PCILOGIC block.
But they are probably modeled that way, using one "worst case" delay.

> Won't I need a few GHz oscilloscope to accurately observe the
> delay?

My example was only that.
You can craft all sorts of test patterns, using multiple global buffers, which could exhibit four times the delay (might be easier to measure).

> So, my own theory is that a bigger chip is probably easier to meet
> 66MHz PCI timings than a smaller one. (In a typical design, a
> smaller chip runs faster than a bigger chip, but 66MHz PCI seems
> like an exception.)

As the array size gets bigger:

* setup gets "easier" to meet
* hold gets "harder" to meet
* clock to out gets "harder" to meet

> I believe I already got hold time issue under control. IOB
> input FFs' programmable delay seem to be more than two GCLKBUFs
> + clock distribution delay. Therefore hold time won't be an
> issue there.
The Virtex/Spartan-II datasheet guarantees zero hold time for input flip flops in IOBs only under the condition that the input delay buffer is enabled, and that you are using one global buffer.
If you do anything else, that guarantee goes out the window...
The only other thing I'm aware of which quotes hold times is the "pin to pin" datasheet style report that comes out of the timing analyzer.
You may want to check that.

> However, I cannot always rely on those input FFs for some
> signal paths, and in that case, I have to place some FFs
> far away from the pin to create routing delay.

Yes, this is a general problem and you need to account for it when you design a PCI interface.

> but how come REQ# and GNT# pins not placed near the rest
> of the control signals? (i.e., FRAME#, IRDY#, DEVSEL#,
> TRDY#, STOP#, and PAR. Around P23 through P34.)
> That choice seems okay for 33MHz PCI, but not for 66MHz
> PCI. (Okay, I guess you can say no one intended to do
> 66MHz PCI in a PQ208 package.)

There is a two way tradeoff (at least) when you are picking a pinout:

1. Pin ordering to match external edge connector
2. Pin ordering to maximize internal performance

It's my guess that whoever picked the pinout elected to optimize this pinout for #1, probably thinking that it was not going to be used for a 66 MHz design.
Hope that helps,
Eric

######

Reply-To: "Steve Casselman"
From: "Steve Casselman"
Newsgroups: comp.arch.fpga
References: <3CACEF37.DCA68151@andraka.com> <3CAE1BC6.5FE0A871@xilinx.com> <3CB33770.FD1492F9@xilinx.com> <3CB4C7E2.AE1F9B76@xilinx.com>
Subject: Re: hand placement
Organization: Virtual Computer Corporation
Date: Thu, 11 Apr 2002 00:17:47 GMT

There is a one-year time limit from the time the invention is 1) disclosed publicly or 2) offered for sale.
I am pretty sure you have to sign some papers to that effect.
You should really watch out about this. I just read that the inventor of the blue LED said in court that "he told some lies in his patent application" and now he might be put on trial for that (either Nature or Science this week).

Steve

-- from 35 USC - Patent Laws uspto--
CHAPTER 10 - PATENTABILITY OF INVENTIONS
Sec. 100 Definitions.
(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on sale in this country, more than one year prior to the date of the application for patent in the United States, or

> There is some "time limit" between public disclosure and when you
> can no longer file. I'm not a lawyer. However, I did check this
> with our legal department before bothering to file.