From: "Joel Kolstad" Newsgroups: comp.arch.fpga Subject: Bad Xilinx bitstream=big bang? Lines: 71 X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Message-ID: Date: Sat, 03 Mar 2001 02:45:35 GMT NNTP-Posting-Host: 63.53.248.157 X-Complaints-To: abuse@earthlink.net X-Trace: newsread2.prod.itd.earthlink.net 983587535 63.53.248.157 (Fri, 02 Mar 2001 18:45:35 PST) NNTP-Posting-Date: Fri, 02 Mar 2001 18:45:35 PST Organization: EarthLink Inc. -- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4943

We have some PCI cards at work that we recently upgraded -- for both price and performance reasons -- to use Xilinx XCV600E parts instead of the XCV400 parts that the old board used. We've found out the hard way that Very Bad Things happen if you attempt to load the old bitstream for the XCV400 onto the XCV600E board. In particular:

-- In the PC that we initially bring boards up on, the power supply usually shuts down and requires hard power cycling (pulling the plug) to get the PC to come back to life. On a PC power supply this behavior indicates a severe overcurrent condition -- I've seen it happen before when a power rail on the PC has been inadvertently grounded. The power supply in this PC is 145 watts (standard microATX power supply).

-- On another machine, the 1.8V power supply on our PCB blew up. :-) This machine is used to test full systems (lots-o-PCI cards), and as such has a 450 watt ATX power supply.
Hence it would appear that loading the XCV400 bitstream into an XCV600E causes a rather extreme overcurrent condition, and the 145W power supply just gives up the ghost and shuts down, whereas the 450W power supply watches and laughs as our board blows up instead.

Power to this board is achieved as follows: We tap 12V off a floppy drive connector and feed it to a Power Trends "Big Hammer" 3.3V switching power supply module; this module can produce something like 18A at 3.3V. The 3.3V powers almost everything on the board, including the PCI interface IC we're using (PLX 9054) that has its local bus connected to the FPGA.

The FPGA power supply is the same on the new board as the old (there was very little changed from the old board to the new board), except that it's set up for 1.8V instead of 2.5V, of course. The power supply consists of a Linear Technology LT1575 linear regulator controller controlling two paralleled International Rectifier N channel MOSFETs. The LT1575 obtains gate drive for the FETs from the 12V input, and the FETs themselves have the 3.3V rail connected to their drains and therefore drop 1.5V to obtain 1.8V.

Note that we don't connect _any_ power pins from the PCI bus; we required significantly more 3.3V current than the PCI bus is rated to carry. All the ground pins are connected, of course.

On the 450W machine, both FETs literally blew up, making decent sized craters on top of their SO-8 packages. Oops.

Now what I want to know is... I had always thought that there was some CRC checking in the FPGA bitstream files, and that you could pretty much feed the FPGA random gibberish and be very unlikely to actually get the thing to accept the bitstream and go through power-on initialization.
In fact, we're manually bit-banging the CClk line on the FPGA (we're using serial slave mode), so there aren't even enough clock pulses provided to the 600E to make it think it should even _consider_ going through power-on initialization, since the 600E requires about two hundred thousand more bits (and CClks) than what the 400 file would provide it with (we stop generating CClk when we're out of configuration data bits).

With our current setup it's difficult to probe around on the board and try to figure out exactly _when_ the overcurrent condition starts. Loading the 400 file takes a couple of seconds, and the 145W PC will power down within a couple of seconds after that. My suspicion is that the overcurrent condition has already started long before we've gotten anywhere near to finishing the transmission of the 400's bitstream.

So... does anybody have any experience with this? The possibility that feeding a 600E a 400 bitstream causes it to draw massive currents seems awfully remote to me. The LT1575/dual FET power supply can put out 2A all day long (this was its design goal -- we're dissipating 1.5V*2A=3W or 1.5W/FET in this case), I would wager it can put out 4A for many minutes, and to physically blow up both FETs I would have to think that it's passing at least 10A for a little while. Strange, very strange.

---Joel Kolstad

######
Message-ID: <3AA08CCD.341AA107@earthlink.net> From: Peter Alfke Reply-To: palfke@earthlink.net X-Mailer: Mozilla 4.61 (Macintosh; I; PPC) X-Accept-Language: en,pdf MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang?
References: Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353" Content-Transfer-Encoding: 7bit Lines: 87 Date: Sat, 03 Mar 2001 06:20:02 GMT NNTP-Posting-Host: 209.179.246.220 X-Complaints-To: abuse@earthlink.net X-Trace: newsread1.prod.itd.earthlink.net 983600402 209.179.246.220 (Fri, 02 Mar 2001 22:20:02 PST) NNTP-Posting-Date: Fri, 02 Mar 2001 22:20:02 PST Organization: EarthLink Inc. -- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.ip-plus.net!news.tesion.net!news.belwue.de!news.uni-ulm.de!rz.uni-karlsruhe.de!schlund.de!newsfeed01.sul.t-online.de!newsfeed00.sul.t-online.de!t-online.de!colt.net!dispose.news.demon.net!demon!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4939 Joel, I am very sorry about your mishap, and I have inserted my comments in the appropriate places in your story. The gist is: Virtex parts do check for CRC errors, but not for formatting errors. And you sent a legitimately CRC-protected file, just the wrong format... Horrendous amount of internal contention. Joel Kolstad wrote: > We have some PCI cards at work that we recently upgraded -- for both price > and performance reasons -- to use Xilinx XCV600E parts instead of the XCV400 > parts that the old board used. We've found out the hard way that Very Bad > Things happen if you attempt to load the old bitstream for the XCV400 onto > the XCV600E board. In particular: > > -- > > Now what I want to know is... I had always thought that there was some CRC > checking in the FPGA bitstream files, and that you could pretty much feed > the FPGA random gibberish and be very unlikely to actually get the thing to > accept the bitstream Correct. If there were a CRC error, the part would recognize this. 
But there was no CRC error...

> and go through power-on initialization.

No, no. Power-on initialization is done much earlier, right after you applied Vcc. This has nothing to do with CCLK. The parts use their own internal oscillator for that purpose.

> In fact, we're
> manually bit-banging the CClk line on the FPGA (we're using serial slave
> mode), so there aren't even enough clock pulses provided to the 600E to make
> it think it should even _consider_ going through power-on initialization,

See above. It has done this successfully long before.

>
> since the 600E requires about two hundred thousand more bits (and CClks)
> than what the 400 file would provide it with (we stop generating CClk when
> we're out of configuration data bits).
>
> With our current setup it's difficult to probe around on the board and try
> to figure out exactly _when_ the overcurrent condition starts. Loading the
> 400 file takes a couple of seconds, and the 145W PC will power down within a
> couple of seconds after that. My suspicion is that the overcurrent
> condition has already started long before we've gotten anywhere near to
> finishing the transmission of the 400's bitstream.

Yes. The internal logic becomes active as you feed in the data. (I may stand corrected here. I carry too much XC4000 baggage in my head, and I am at home, with no access to other experts. But Austin can jump in while I am gone for the coming week. Seminars in Europe.)

>
> So... does anybody have any experience with this? The possibility that
> feeding a 600E a 400 bitstream causes it to draw massive currents seems
> awfully remote to me.

No, it's ugly, but not surprising. The part considers this a garbage bitstream, but with legitimate CRC. I know this is not ideal, but that's the way it is.
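
The distinction drawn above, between a bitstream that is intact and one that is meant for this particular part, can be sketched in a few lines of host-side Python. This is only an illustration under stated assumptions: `zlib.crc32` stands in for whatever CRC the Virtex configuration logic actually uses, and the `target` field, `make_bitstream`, `crc_ok`, and `safe_to_load` are hypothetical names, not part of any Xilinx file format or tool.

```python
import zlib


def make_bitstream(target: str, payload: bytes) -> dict:
    """Package configuration data with a CRC, roughly as a design tool might.

    The 'target' tag is a hypothetical addition kept by the host software;
    it is not something the XCV parts in this thread check themselves.
    """
    return {"target": target, "payload": payload, "crc": zlib.crc32(payload)}


def crc_ok(bitstream: dict) -> bool:
    """Integrity check only: was the payload corrupted after it was built?"""
    return zlib.crc32(bitstream["payload"]) == bitstream["crc"]


def safe_to_load(bitstream: dict, board_device: str) -> bool:
    """Integrity check plus a host-side check that the file matches the part."""
    return crc_ok(bitstream) and bitstream["target"] == board_device


# A file correctly built for the old part (payload bytes are arbitrary here).
old_file = make_bitstream("XCV400", b"\x55" * 64)

# The CRC passes no matter which board the file is sent to.
print(crc_ok(old_file))                    # True
# Only the extra device comparison refuses the mismatched load.
print(safe_to_load(old_file, "XCV600E"))   # False
print(safe_to_load(old_file, "XCV400"))    # True

# A genuinely damaged file is what the CRC is there to catch.
damaged = dict(old_file, payload=b"\x55" * 63 + b"\x54")
print(crc_ok(damaged))                     # False
```

The point of the sketch is that the CRC answers "did the bits arrive as generated?", not "were the bits generated for this device?", so the second question has to be answered somewhere else, here by the loading software.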
> The LT1575/dual FET power supply can put out 2A all > day long (this was its design goal -- we're dissipating 1.5V*2A=3W or > 1.5W/FET in this case), I would wager it can put out 4A for many minutes, > and to physically blow up both FETs I would have to think that it's passing > at least 10A for a little while. Strange, very strange. Not so strange. Consider the very large number of internal nodes, let's say over 50,000. Let's assume that, through nonsense configuration, 10% are driven by contending levels on both sides of the wire. And let's assume a realistic 5 mA per contention: 5000 times 5 mA = 25 A ! This distributed nature of the current also shows why the Virtex part ( most likely ?) survived. The current is more or less evenly spread over the whole die, which is more than a square centimeter in area. I am not making excuses, just describe the phenomenon, which is quite rational, albeit not desirable. Ask Austin whether Virtex-II is protected against this kind of mishap. Peter Alfke, Xilinx Applications (Friday-night emergency services) ###### Message-ID: <3AA1318A.F2D8E5EF@aracnet.com> From: eteam Reply-To: eteam@aracnet.com Organization: The E-Team X-Mailer: Mozilla 4.73 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? 
References: <3AA08CCD.341AA107@earthlink.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 39 Date: Sat, 03 Mar 2001 10:01:46 -0800 NNTP-Posting-Host: 198.102.179.71 X-Complaints-To: news@aracnet.com X-Trace: typhoon.aracnet.com 983642505 198.102.179.71 (Sat, 03 Mar 2001 10:01:45 PST) NNTP-Posting-Date: Sat, 03 Mar 2001 10:01:45 PST Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.he.net!typhoon.aracnet.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4936 Peter (and Austin and Phil), Here is my take (*please* correct me where I'm mistaken): It sounds like there are two weaknesses in (at least) some of the Xilinx device families that can lead to catastrophic failures: 1. Devices don't check the programming data stream for a "match" of target device. Thus, you can try to program a Virtex 600E device with a Virtex 400E configuration data stream, and the configuration data will be accepted. This "hole" allows the designer to make a mistake, and be burned by it. The workaround is, simply, to make sure that your configuration files were compiled for the correct target device; you screw up at your own peril. 2. More ominous is that drivers for internal multi-source busses are not disabled (tri-stated, if you will) before and during the configuration and powerup sequence, when the internal state of the device cannot be controlled or specified by the designer. I'm not sure there is *any* workaround to this, short of a re-design of the FPGA die. We need to understand the breadth of this problem (if the above assessments are basically correct): which device families are affected (afflicted), etc. etc. I'm not posting this to cause alarm, but to distill the issues at hand as clearly as possible, and avoid any FUD. 
Rather than get excited, it would be good for all concerned to await Xilinx's response which, if history is a guide, will be an honest and open discussion of the facts, and which will provide essential guidance to the design community. Bob Elkind, eteam@aracnet.com ###### From: "Joel Kolstad" Newsgroups: comp.arch.fpga References: <3AA08CCD.341AA107@earthlink.net> Subject: Re: Bad Xilinx bitstream=big bang? Lines: 10 X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.50.4133.2400 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 Message-ID: Date: Sat, 03 Mar 2001 18:26:17 GMT NNTP-Posting-Host: 63.53.248.41 X-Complaints-To: abuse@earthlink.net X-Trace: newsread2.prod.itd.earthlink.net 983643977 63.53.248.41 (Sat, 03 Mar 2001 10:26:17 PST) NNTP-Posting-Date: Sat, 03 Mar 2001 10:26:17 PST Organization: EarthLink Inc. -- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!psinet-eu-nl!newsfeeds.belnet.be!news.belnet.be!feed2.onemain.com!feed1.onemain.com!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4945 Thanks for the explanation, Peter. I'm thinking we'll add something to the board so that the PC's software will be able to detect what type of FPGA the board has loaded on it, and not feed it incorrect bitstreams. The FPGA itself survived just fine, as far as I can tell. :-) ---Joel ###### Message-ID: <3AA1736E.FE23FF67@earthlink.net> From: Peter Alfke Reply-To: palfke@earthlink.net X-Mailer: Mozilla 4.61 (Macintosh; I; PPC) X-Accept-Language: en,pdf MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? 
References: <3AA08CCD.341AA107@earthlink.net> <3AA1318A.F2D8E5EF@aracnet.com> Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353" Content-Transfer-Encoding: 7bit Lines: 62 Date: Sat, 03 Mar 2001 22:43:44 GMT NNTP-Posting-Host: 209.178.167.76 X-Complaints-To: abuse@earthlink.net X-Trace: newsread1.prod.itd.earthlink.net 983659424 209.178.167.76 (Sat, 03 Mar 2001 14:43:44 PST) NNTP-Posting-Date: Sat, 03 Mar 2001 14:43:44 PST Organization: EarthLink Inc. -- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!unlisys!news.snafu.de!news1.ebone.net!news.ebone.net!nycmny1-snf1.gtei.net!cpk-news-hub1.bbnplanet.com!lsanca1-snf1!news.gtei.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4964 Before we get all worked up, let me clarify: Power-up is an event that occurs long before any bitstream is started. So it is a totally different subject. Feeding a legitimate bitstream, that has passed the standard design-rule check, DRC, and is compiled for this particular device, will never cause any contention or other strange behavior. The problem reported was one of ( obviously unintentional ) feeding a bitstream that was properly created for a different device. If it had been a "bad" bitstream, the CRC would have caught it. I think Virtex-II takes care of such issues, but I let Austin answer that. Peter Alfke eteam wrote: > Peter (and Austin and Phil), > > Here is my take (*please* correct me where I'm mistaken): > > It sounds like there are two weaknesses in (at least) > some of the Xilinx device families that can lead to > catastrophic failures: > > 1. Devices don't check the programming data stream > for a "match" of target device. 
Thus, you can try to > program a Virtex 600E device with a Virtex 400E > configuration data stream, and the configuration data > will be accepted. > > This "hole" allows the designer to make a mistake, and be > burned by it. The workaround is, simply, to make sure that > your configuration files were compiled for the correct target > device; you screw up at your own peril. > > 2. More ominous is that drivers for internal > multi-source busses are not disabled (tri-stated, if > you will) before and during the configuration and powerup > sequence, when the internal state of the device cannot be > controlled or specified by the designer. I'm not sure there > is *any* workaround to this, short of a re-design of the FPGA die. > > We need to understand the breadth of this problem (if the above > assessments are basically correct): which device families are > affected (afflicted), etc. etc. > > I'm not posting this to cause alarm, but to distill the > issues at hand as clearly as possible, and avoid any FUD. > > Rather than get excited, it would be good for all concerned > to await Xilinx's response which, if history is a guide, will be > an honest and open discussion of the facts, and which will provide > essential guidance to the design community. > > Bob Elkind, eteam@aracnet.com ###### Message-ID: <3AA175EC.F280E232@earthlink.net> From: Peter Alfke Reply-To: palfke@earthlink.net X-Mailer: Mozilla 4.61 (Macintosh; I; PPC) X-Accept-Language: en,pdf MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? References: <3AA08CCD.341AA107@earthlink.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 23 Date: Sat, 03 Mar 2001 22:54:16 GMT NNTP-Posting-Host: 209.178.167.76 X-Complaints-To: abuse@earthlink.net X-Trace: newsread1.prod.itd.earthlink.net 983660056 209.178.167.76 (Sat, 03 Mar 2001 14:54:16 PST) NNTP-Posting-Date: Sat, 03 Mar 2001 14:54:16 PST Organization: EarthLink Inc. 
-- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4963

I am quite certain that XC4000 and prior devices automatically avoid this problem, since they check for a start bit in specific bitstream locations. If the frame structure is "wrong", then there is a 50% chance of detecting the error after every frame. After a few frames, the probability of detection gets very high. After hundreds of frames it is practically 100%.

I am glad the Virtex devices survived this jolt.

Peter Alfke

Joel Kolstad wrote:
> Thanks for the explanation, Peter. I'm thinking we'll add something to the
> board so that the PC's software will be able to detect what type of FPGA the
> board has loaded on it, and not feed it incorrect bitstreams.
>
> The FPGA itself survived just fine, as far as I can tell. :-)
>
> ---Joel

######
Sender: eric@ruckus.brouhaha.com From: Eric Smith Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? References: <3AA08CCD.341AA107@earthlink.net> X-Disclaimer: Everything I write is false. Organization: Eric Conspiracy Secret Labs X-Eric-Conspiracy: There is no conspiracy.
Date: 03 Mar 2001 15:44:07 -0800 Message-ID: Lines: 17 User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii NNTP-Posting-Host: ruckus.brouhaha.com X-Trace: 3 Mar 2001 15:44:39 -0800, ruckus.brouhaha.com Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!newsfeed.germany.net!news.tele.dk!171.64.14.106!newsfeed.stanford.edu!news.kjsl.com!news.spies.com!ruckus.brouhaha.com Xref: chonsp.franklin.ch comp.arch.fpga:4967 Peter Alfke writes: > The gist is: > Virtex parts do check for CRC errors, but not for formatting errors. And you > sent a legitimately CRC-protected file, just the wrong format... Horrendous > amount of internal contention. [...] > Correct. If there were a CRC error, the part would recognize this. But there > was no CRC error... Is there some reason why the part doesn't ALSO recognize that the bitstream is too short? I wouldn't think it would expect the CRC until it had filled all of the RAM cells. This suggests that in addition to length checking, you guys might want to design a part id number into future parts, and have it fail to configure if the part id in the bitstream doesn't match the part id of the part. ###### Message-ID: <3AA1A50D.404532B1@earthlink.net> From: Peter Alfke Reply-To: palfke@earthlink.net X-Mailer: Mozilla 4.61 (Macintosh; I; PPC) X-Accept-Language: en,pdf MIME-Version: 1.0 Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? References: <3AA08CCD.341AA107@earthlink.net> Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353" Content-Transfer-Encoding: 7bit Lines: 16 Date: Sun, 04 Mar 2001 02:15:35 GMT NNTP-Posting-Host: 209.179.193.230 X-Complaints-To: abuse@earthlink.net X-Trace: newsread2.prod.itd.earthlink.net 983672135 209.179.193.230 (Sat, 03 Mar 2001 18:15:35 PST) NNTP-Posting-Date: Sat, 03 Mar 2001 18:15:35 PST Organization: EarthLink Inc. 
-- http://www.EarthLink.net Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!news-nue1.dfn.de!news-lei1.dfn.de!news-fra1.dfn.de!news.tele.dk!199.60.229.5!newsfeed.direct.ca!look.ca!newsfeed1.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:4962 Eric Smith wrote: > This suggests that in addition to length checking, you guys might want > to design a part id number into future parts, and have it fail to > configure if the part id in the bitstream doesn't match the part id of > the part. Done, in the new family Virtex-II. Peter Alfke ###### From: Brian Drummond Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? 
Date: Mon, 05 Mar 2001 17:31:38 +0000 Message-ID: References: <3AA08CCD.341AA107@earthlink.net> NNTP-Posting-Host: shapes.demon.co.uk X-NNTP-Posting-Host: shapes.demon.co.uk:158.152.228.158 X-Trace: news.demon.co.uk 983813254 nnrp-01:19180 NO-IDENT shapes.demon.co.uk:158.152.228.158 X-Complaints-To: abuse@demon.net X-Newsreader: Forte Agent 1.7/32.534 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 21 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!dispose.news.demon.net!news.demon.co.uk!demon!shapes.demon.co.uk!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:5029 On 03 Mar 2001 15:44:07 -0800, Eric Smith wrote: >Peter Alfke writes: >> The gist is: >> Virtex parts do check for CRC errors, but not for formatting errors. And you >> sent a legitimately CRC-protected file, just the wrong format... Horrendous >> amount of internal contention. >[...] >> Correct. If there were a CRC error, the part would recognize this. But there >> was no CRC error... > >Is there some reason why the part doesn't ALSO recognize that the bitstream >is too short? I wouldn't think it would expect the CRC until it had filled >all of the RAM cells. > Anything to do with partial reconfiguration maybe? Like, is it possible to generate a _valid_ short bitstream to reprogram part of the device but leaving the remainder unchanged? - Brian ###### From: krw@btv.ibm.com (Keith R. Williams) Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? 
Date: Mon, 05 Mar 2001 18:06:41 GMT Organization: IBM Global Services North -- Burlington, Vermont, USA Lines: 31 Message-ID: <3aa3d52f.17694513@mdnews.btv.ibm.com> References: <3AA08CCD.341AA107@earthlink.net> NNTP-Posting-Host: sneakers.btv.ibm.com X-Trace: news.btv.ibm.com 983815766 24696 9.66.117.41 (5 Mar 2001 18:09:26 GMT) X-Complaints-To: news@btv.ibm.com NNTP-Posting-Date: 5 Mar 2001 18:09:26 GMT X-Newsreader: Forte Free Agent 1.21/32.243 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!news.maxwell.syr.edu!newsfeed.skycache.com!Cidera!portc03.blue.aol.com!newsjunkie.ans.net!news.chips.ibm.com!newsfeed.btv.ibm.com!news.btv.ibm.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:5012 On Mon, 05 Mar 2001 17:31:38 +0000, Brian Drummond wrote: >On 03 Mar 2001 15:44:07 -0800, Eric Smith > wrote: > >>Peter Alfke writes: >>> The gist is: >>> Virtex parts do check for CRC errors, but not for formatting errors. And you >>> sent a legitimately CRC-protected file, just the wrong format... Horrendous >>> amount of internal contention. >>[...] >>> Correct. If there were a CRC error, the part would recognize this. But there >>> was no CRC error... >> >>Is there some reason why the part doesn't ALSO recognize that the bitstream >>is too short? I wouldn't think it would expect the CRC until it had filled >>all of the RAM cells. >> >Anything to do with partial reconfiguration maybe? >Like, is it possible to generate a _valid_ short bitstream to reprogram >part of the device but leaving the remainder unchanged? Perhaps you've just stepped on another reason the tools don't support partial reconfiguration? Two _valid_ short bitstreams may create many drivers on the same wire. ---- Keith ###### Sender: eric@ruckus.brouhaha.com From: Eric Smith Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? References: <3AA08CCD.341AA107@earthlink.net> X-Disclaimer: Everything I write is false. 
Organization: Eric Conspiracy Secret Labs X-Eric-Conspiracy: There is no conspiracy. Date: 05 Mar 2001 10:07:40 -0800 Message-ID: Lines: 18 User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii NNTP-Posting-Host: ruckus.brouhaha.com X-Trace: 5 Mar 2001 10:08:32 -0800, ruckus.brouhaha.com Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!news-nue1.dfn.de!news-lei1.dfn.de!news-fra1.dfn.de!news.tele.dk!212.74.64.35!colt.net!nycmny1-snf1.gtei.net!cpk-news-hub1.bbnplanet.com!news.gtei.net!newsfeed.mathworks.com!news.kjsl.com!news.spies.com!ruckus.brouhaha.com Xref: chonsp.franklin.ch comp.arch.fpga:5017 I wrote: > Is there some reason why the part doesn't ALSO recognize that the bitstream > is too short? I wouldn't think it would expect the CRC until it had filled > all of the RAM cells. Brian Drummond writes: > Anything to do with partial reconfiguration maybe? > Like, is it possible to generate a _valid_ short bitstream to reprogram > part of the device but leaving the remainder unchanged? Having looked over the XAPP 176 appnote on configuration and readback of the Spartan II over the weekend, I now have a better appreciation for how the config process works. I think your hypothesis is correct. I guess the thing that still surprises me is that each size of FPGA needs a different frame size in the bitstream (table 4 on page 15, XC2S200 not specified), so the parts should have been able to detect the wrong frame size, even if they can't detect the wrong total image size. ###### From: Brian Drummond Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? 
Date: Tue, 06 Mar 2001 19:43:02 +0000 Message-ID: <89v9atgl80fgeteo8tjvbi5nkosoq7rci6@4ax.com> References: <3AA08CCD.341AA107@earthlink.net> <3aa3d52f.17694513@mdnews.btv.ibm.com> NNTP-Posting-Host: shapes.demon.co.uk X-NNTP-Posting-Host: shapes.demon.co.uk:158.152.228.158 X-Trace: news.demon.co.uk 983907528 nnrp-09:3390 NO-IDENT shapes.demon.co.uk:158.152.228.158 X-Complaints-To: abuse@demon.net X-Newsreader: Forte Agent 1.7/32.534 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Lines: 37 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!fr.clara.net!heighliner.fr.clara.net!diablo.netcom.net.uk!netcom.net.uk!newsfeed.icl.net!dispose.news.demon.net!news.demon.co.uk!demon!shapes.demon.co.uk!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:5055 On Mon, 05 Mar 2001 18:06:41 GMT, krw@btv.ibm.com (Keith R. Williams) wrote: >On Mon, 05 Mar 2001 17:31:38 +0000, Brian Drummond > wrote: > >>On 03 Mar 2001 15:44:07 -0800, Eric Smith >> wrote: >> >>>Peter Alfke writes: >>>> The gist is: >>>> Virtex parts do check for CRC errors, but not for formatting errors. And you >>>> sent a legitimately CRC-protected file, just the wrong format... Horrendous >>>> amount of internal contention. >>>[...] >>>> Correct. If there were a CRC error, the part would recognize this. But there >>>> was no CRC error... >>> >>>Is there some reason why the part doesn't ALSO recognize that the bitstream >>>is too short? I wouldn't think it would expect the CRC until it had filled >>>all of the RAM cells. >>> >>Anything to do with partial reconfiguration maybe? >>Like, is it possible to generate a _valid_ short bitstream to reprogram >>part of the device but leaving the remainder unchanged? > >Perhaps you've just stepped on another reason the tools don't support >partial reconfiguration? Two _valid_ short bitstreams may create many >drivers on the same wire. Ouch! 
The Virtex-II device ID feature can't protect against THAT one! I'm not sure anything can. Except maybe some design rule checker running on the set of placed/routed NCD files prior to bitfile generation. Doesn't look like an easy problem. - Brian ###### From: Austin Lesea Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang? Date: Tue, 06 Mar 2001 19:19:40 -0800 Organization: Xilinx Lines: 59 Message-ID: <3AA5A8CB.DDAB961A@xilinx.com> References: <3AA08CCD.341AA107@earthlink.net> <3aa3d52f.17694513@mdnews.btv.ibm.com> <89v9atgl80fgeteo8tjvbi5nkosoq7rci6@4ax.com> NNTP-Posting-Host: 149.199.249.6 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: Mozilla 4.7 [en]C-CCK-MCD (WinNT; U) X-Accept-Language: en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!fu-berlin.de!newsfeed.direct.ca!look.ca!nntp2.aus1.giganews.com!NetNews1!attws2!attsl2!attla2!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:5037 All, When using the new Virtex II Internal Configuration Access Port (ICAP), one has to also use the tools properly with the logic floorplanner, nailing down the separate regions, and IO interfaces between the separate regions to be reconfigured. We are on the "cutting edge" of providing the tools, the hardware, and the methodologies, so this certainly isn't a smooth and easy process (yet). The basic features are there (ICAP, floorplanner, etc). There must be first the floor planning, and the identification of reconfigurable regions (some systems architecture must go in first). This is an exciting feature that has the universities all asking the right questions, and hopefully ready to provide some answers to today's problems in reconfigurable computing. Austin
###### From: alfred fuchs Newsgroups: comp.arch.fpga Subject: Re: Bad Xilinx bitstream=big bang?
Date: Wed, 07 Mar 2001 12:19:55 +0100 Organization: Siemens Lines: 58 Message-ID: <3AA6195B.EFAB63D1@siemens.at> References: <3AA08CCD.341AA107@earthlink.net> NNTP-Posting-Host: pca540.erd.siemens.at Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Trace: news.siemens.at 983963987 32310 158.226.169.109 (7 Mar 2001 11:19:47 GMT) X-Complaints-To: usenet@news.siemens.at NNTP-Posting-Date: Wed, 7 Mar 2001 11:19:47 +0000 (UTC) X-Mailer: Mozilla 4.73 [de]C-CCK-MCD DT (WinNT; U) X-Accept-Language: fr,de,en Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!unlisys!news.snafu.de!nautilus.visp-europe.psi.com!newsfeed.Austria.EU.net!newsfeed.kpnqwest.at!news.siemens.at!not-for-mail Xref: chonsp.franklin.ch comp.arch.fpga:5034 In October 1999 at the EXUS in Paris we made Xilinx aware of that self-destruct feature of the Virtex series. A second participant confirmed the phenomenon immediately. Xilinx confirmed it about a week later. Apparently they did not find a workaround. Is anyone aware of a warning issued by Xilinx? Alfred Fuchs Siemens PSE PRO LMS Peter (and Austin and Phil), Here is my take (*please* correct me where I'm mistaken): It sounds like there are two weaknesses in (at least) some of the Xilinx device families that can lead to catastrophic failures: 1. Devices don't check the programming data stream for a "match" of target device. Thus, you can try to program a Virtex 600E device with a Virtex 400E configuration data stream, and the configuration data will be accepted. This "hole" allows the designer to make a mistake, and be burned by it. The workaround is, simply, to make sure that your configuration files were compiled for the correct target device; you screw up at your own peril. 2. 
More ominous is that drivers for internal multi-source busses are not disabled (tri-stated, if you will) before and during the configuration and powerup sequence, when the internal state of the device cannot be controlled or specified by the designer. I'm not sure there is *any* workaround to this, short of a re-design of the FPGA die. We need to understand the breadth of this problem (if the above assessments are basically correct): which device families are affected (afflicted), etc. etc. I'm not posting this to cause alarm, but to distill the issues at hand as clearly as possible, and avoid any FUD. Rather than get excited, it would be good for all concerned to await Xilinx's response which, if history is a guide, will be an honest and open discussion of the facts, and which will provide essential guidance to the design community. Bob Elkind, eteam@aracnet.com
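For what it's worth, the workaround in point 1 (make sure the file was compiled for the part that is actually on the board) can be enforced in whatever host-side program downloads the bitstream. What follows is only a minimal sketch in Python, not anything Xilinx provides. It assumes the build flow writes down the target part and the image size next to each bitstream; the names ImageInfo, DeviceMismatch and check_before_download are made up for illustration, and the expected length for a given part has to come from the data sheet (no real figures are given here). It does nothing about point 2.

```python
from dataclasses import dataclass
from typing import Optional


class DeviceMismatch(Exception):
    """Raised instead of starting configuration."""


@dataclass(frozen=True)
class ImageInfo:
    # Hypothetical build record kept alongside the bitstream.
    # This is NOT a field of any Xilinx file format.
    target_part: str   # part the design was compiled for
    length_bytes: int  # size of the configuration image


def check_before_download(image: ImageInfo,
                          fitted_part: str,
                          expected_length_bytes: Optional[int] = None) -> bool:
    """Return True only if the image was built for the fitted part.

    The FPGA itself checks the CRC but (per this thread) not the target
    device, so the comparison is done on the host before any data is sent.
    """
    built_for = image.target_part.strip().upper()
    fitted = fitted_part.strip().upper()
    if built_for != fitted:
        raise DeviceMismatch(
            f"image built for {built_for}, board carries {fitted}")
    # Optional second check: image size against the data sheet value.
    if (expected_length_bytes is not None
            and image.length_bytes != expected_length_bytes):
        raise DeviceMismatch(
            f"image is {image.length_bytes} bytes, "
            f"expected {expected_length_bytes} for {fitted}")
    return True


if __name__ == "__main__":
    # The case from the original post: old image, new board.
    old_image = ImageInfo(target_part="XCV400", length_bytes=0)  # size not used here
    try:
        check_before_download(old_image, fitted_part="XCV600E")
    except DeviceMismatch as err:
        print("refusing to configure:", err)
```

The point of raising an exception rather than returning False is that a download script cannot ignore it by accident; the check has to pass before a single configuration byte goes out.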