From: "Joel Kolstad" <JoelKolstad@Earthlink.Net>
Newsgroups: comp.arch.fpga
Subject: Bad Xilinx bitstream=big bang?
Lines: 71
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Message-ID: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net>
Date: Sat, 03 Mar 2001 02:45:35 GMT
NNTP-Posting-Host: 63.53.248.157
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread2.prod.itd.earthlink.net 983587535 63.53.248.157 (Fri, 02 Mar 2001 18:45:35 PST)
NNTP-Posting-Date: Fri, 02 Mar 2001 18:45:35 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4943

We have some PCI cards at work that we recently upgraded -- for both price
and performance reasons -- to use Xilinx XCV600E parts instead of the XCV400
parts that the old board used.  We've found out the hard way that Very Bad
Things happen if you attempt to load the old bitstream for the XCV400 onto
the XCV600E board.  In particular:

-- In the PC that we initially bring boards up on, the power supply usually
shuts down and requires hard power cycling (pulling the plug) to get the PC
to come back to life.  On a PC power supply this behavior indicates a
severe overcurrent condition -- I've seen it happen before when a power rail
on the PC has been inadvertently grounded.  The power supply in this PC is
145 watts (standard microATX power supply).

-- On another machine, the 1.8V power supply on our PCB blew up. :-)  This
machine is used to test full systems (lots-o-PCI cards), and as such has a
450 watt ATX power supply.

Hence it would appear that loading the XCV400 bitstream into an XCV600E
causes a rather extreme overcurrent condition, and the 145W power supply
just gives up the ghost and shuts down, whereas the 450W power supply
watches and laughs as our board blows up instead.

Power to this board is achieved as follows: We tap 12V off a floppy drive
connector and feed it to a Power Trends "Big Hammer" 3.3V switching power
supply module; this module can produce something like 18A at 3.3V.  The 3.3V
powers almost everything on the board, including the PCI interface IC we're
using (PLX 9054) that has its local bus connected to the FPGA.  The FPGA
power supply is the same on the new board as the old (there was very little
changed from the old board to the new one), except that it's set up for 1.8V
instead of 2.5V, of course.  The power supply consists of a Linear
Technology LT1575 linear regulator controller controlling two paralleled
International Rectifier N channel MOSFETs.  The LT1575 obtains gate drive
for the FETs from the 12V input, and the FETs themselves have the 3.3V rail
connected to their drains and therefore drop 1.5V to obtain 1.8V.  Note that
we don't connect _any_ power pins from the PCI bus; we required
significantly more 3.3V current than the PCI bus is rated to carry.  All the
ground pins are connected, of course.

On the 450W machine, both FETs literally blew up, making decent sized
craters on top of their SO-8 packages.  Oops.

Now what I want to know is... I had always thought that there was some CRC
checking in the FPGA bitstream files, and that you could pretty much feed
the FPGA random gibberish and be very unlikely to actually get the thing to
accept the bitstream and go through power-on initialization.  In fact, we're
manually bit-banging the CClk line on the FPGA (we're using serial slave
mode), so there aren't even enough clock pulses provided to the 600E to make
it think it should even _consider_ going through power-on initialization,
since the 600E requires about two hundred thousand more bits (and CClks)
than the 400 file would provide it with (we stop generating CClk when
we're out of configuration data bits).
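
For readers unfamiliar with the bit-banged slave serial loading described above, here is a minimal sketch of such a loop. The pin-driver callbacks (set_din, set_cclk) are hypothetical stand-ins for whatever register writes the host really uses, and the MSB-first bit order and rising-edge sampling are assumptions to check against the device's configuration documentation. The point it illustrates is the one made in the post: the device receives exactly one CCLK pulse per data bit and none after the file ends.

```python
from typing import Callable, Iterable


def shift_out_bitstream(data: Iterable[int],
                        set_din: Callable[[int], None],
                        set_cclk: Callable[[int], None]) -> int:
    """Clock bytes out serially (MSB first, an assumption); return CCLK pulse count."""
    pulses = 0
    for byte in data:
        for bit in range(7, -1, -1):
            set_din((byte >> bit) & 1)  # present the data bit first
            set_cclk(1)                 # rising edge: data bit is sampled here
            set_cclk(0)                 # return the clock low
            pulses += 1
    return pulses


# Stand-in pin drivers that only record what would have been sampled.
sampled = []
pin_state = {"din": 0}


def fake_din(value: int) -> None:
    pin_state["din"] = value


def fake_cclk(value: int) -> None:
    if value == 1:
        sampled.append(pin_state["din"])


n_pulses = shift_out_bitstream(b"\xA5", fake_din, fake_cclk)
print(n_pulses, sampled)
```

Because the loop stops when the data runs out, a file built for a smaller device simply leaves the larger device short of clocks and data.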

With our current setup it's difficult to probe around on the board and try
to figure out exactly _when_ the overcurrent condition starts.  Loading the
400 file takes a couple of seconds, and the 145W PC will power down within a
couple of seconds after that.  My suspicion is that the overcurrent
condition has already started long before we've gotten anywhere near to
finishing the transmission of the 400's bitstream.

So... does anybody have any experience with this?  The possibility that
feeding a 600E a 400 bitstream causes it to draw massive currents seems
awfully remote to me.  The LT1575/dual FET power supply can put out 2A all
day long (this was its design goal -- we're dissipating 1.5V*2A=3W or
1.5W/FET in this case), I would wager it can put out 4A for many minutes,
and to physically blow up both FETs I would have to think that it's passing
at least 10A for a little while.  Strange, very strange.
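
The dissipation figures above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the regulator holds 1.8V from the 3.3V rail at every load and that the two FETs share the current equally, neither of which is guaranteed in an overload.

```python
V_IN = 3.3    # volts on the FET drains (from the post)
V_OUT = 1.8   # regulated FPGA core voltage (from the post)
N_FETS = 2    # paralleled pass FETs (from the post)


def pass_fet_dissipation(load_amps):
    """Return (total watts, watts per FET) for a linear pass stage."""
    total = (V_IN - V_OUT) * load_amps
    return total, total / N_FETS


# Load currents mentioned in the post: design goal, short-term guess, failure guess.
table = {amps: pass_fet_dissipation(amps) for amps in (2, 4, 10)}
for amps, (total, per_fet) in table.items():
    print(f"{amps:>2} A: {total:.1f} W total, {per_fet:.2f} W per FET")
```

At the guessed 10A this works out to about 7.5W in each SO-8 package, five times the design point.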

---Joel Kolstad

######

Message-ID: <3AA08CCD.341AA107@earthlink.net>
From: Peter Alfke <palfke@earthlink.net>
Reply-To: palfke@earthlink.net
X-Mailer: Mozilla 4.61 (Macintosh; I; PPC)
X-Accept-Language: en,pdf
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net>
Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353"
Content-Transfer-Encoding: 7bit
Lines: 87
Date: Sat, 03 Mar 2001 06:20:02 GMT
NNTP-Posting-Host: 209.179.246.220
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread1.prod.itd.earthlink.net 983600402 209.179.246.220 (Fri, 02 Mar 2001 22:20:02 PST)
NNTP-Posting-Date: Fri, 02 Mar 2001 22:20:02 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.ip-plus.net!news.tesion.net!news.belwue.de!news.uni-ulm.de!rz.uni-karlsruhe.de!schlund.de!newsfeed01.sul.t-online.de!newsfeed00.sul.t-online.de!t-online.de!colt.net!dispose.news.demon.net!demon!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4939

Joel, I am very sorry about your mishap, and I have inserted my comments in the
appropriate places in your story.
The gist is:
Virtex parts do check for CRC errors, but not for formatting errors. And you
sent a legitimately CRC-protected file, just the wrong format... Horrendous
amount of internal contention.

Joel Kolstad wrote:

> We have some PCI cards at work that we recently upgraded -- for both price
> and performance reasons -- to use Xilinx XCV600E parts instead of the XCV400
> parts that the old board used.  We've found out the hard way that Very Bad
> Things happen if you attempt to load the old bitstream for the XCV400 onto
> the XCV600E board.  In particular:
>
> -- <snip>

>
> Now what I want to know is... I had always thought that there was some CRC
> checking in the FPGA bitstream files, and that you could pretty much feed
> the FPGA random gibberish and be very unlikely to actually get the thing to
> accept the bitstream

Correct. If there were a CRC error, the part would recognize this. But there
was no CRC error...

> and go through power-on initialization.

No, no. Power-on initialization is done much earlier, right after you apply
Vcc. This has nothing to do with CCLK. The parts use their own internal
oscillator for that purpose.

> In fact, we're
> manually bit-banging the CClk line on the FPGA (we're using serial slave
> mode), so there aren't even enough clock pulses provided to the 600E to make
> it think it should even _consider_ going through power-on initialization,

See above. It has done this successfully long before.

>
> since the 600E requires about two hundred thousand extra bits (and CClks)
> than what the 400 file would provide it with (we stop generating CClk when
> we're out of configuration data bits).
>
> With our current setup it's difficult to probe around on the board and try
> to figure out exactly _when_ the overcurrent condition starts.  Loading the
> 400 file takes a couple of seconds, and the 145W PC will power down within a
> couple of seconds after that.  My suspicion is that the overcurrent
> condition has already started long before we've gotten anywhere near to
> finishing the transmission of the 400's bitstream.

Yes. The internal logic becomes active as you feed in the data.
(I may stand corrected here. I carry too much XC4000 baggage in my head, and I
am at home, with no access to other experts. But Austin can jump in while I am
gone for the coming week. Seminars in Europe.)

>
> So... does anybody have any experience with this?  The possibility that
> feeding a 600E a 400 bitstream causes it to draw massive currents seems
> awfully remote to me.

No, it's ugly, but not surprising. The part sees this as a garbage bitstream,
but one with a legitimate CRC. I know this is not ideal, but that's the way it is.

> The LT1575/dual FET power supply can put out 2A all
> day long (this was its design goal -- we're dissipating 1.5V*2A=3W or
> 1.5W/FET in this case), I would wager it can put out 4A for many minutes,
> and to physically blow up both FETs I would have to think that it's passing
> at least 10A for a little while.  Strange, very strange.

Not so strange. Consider the very large number of internal nodes, let's say
over 50,000.
Let's assume that, through nonsense configuration, 10% are driven by contending
levels on both sides of the wire. And let's assume a realistic 5 mA per
contention: 5,000 times 5 mA = 25 A!
The distributed nature of this current also shows why the Virtex part (most
likely?) survived. The current is spread more or less evenly over the whole
die, which is more than a square centimeter in area.
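
The back-of-envelope estimate above is easy to parameterize. The sketch below uses only the three figures assumed in the post (50,000 nodes, 10% in contention, 5 mA each). They are rough guesses, not measured values, and the function name is purely illustrative.

```python
def contention_current_amps(nodes, contending_fraction, milliamps_each):
    """Total supply current if a fraction of nodes are each in contention."""
    contending_nodes = nodes * contending_fraction
    return contending_nodes * milliamps_each / 1000.0


# Figures taken from the estimate in the post.
estimate = contention_current_amps(50_000, 0.10, 5)
print(f"estimated contention current: {estimate:.1f} A")
```

Since the result scales linearly with each input, even a 1% contention fraction would still give about 2.5 A under the same assumptions.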

I am not making excuses, just describing the phenomenon, which is quite rational,
albeit not desirable.

Ask Austin whether Virtex-II is protected against this kind of mishap.

Peter Alfke, Xilinx Applications (Friday-night emergency services)

######

Message-ID: <3AA1318A.F2D8E5EF@aracnet.com>
From: eteam <eteam@aracnet.com>
Reply-To: eteam@aracnet.com
Organization: The E-Team
X-Mailer: Mozilla 4.73 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 39
Date: Sat, 03 Mar 2001 10:01:46 -0800
NNTP-Posting-Host: 198.102.179.71
X-Complaints-To: news@aracnet.com
X-Trace: typhoon.aracnet.com 983642505 198.102.179.71 (Sat, 03 Mar 2001 10:01:45 PST)
NNTP-Posting-Date: Sat, 03 Mar 2001 10:01:45 PST
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!news.he.net!typhoon.aracnet.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4936

Peter (and Austin and Phil),

Here is my take (*please* correct me where I'm mistaken):

It sounds like there are two weaknesses in (at least)
some of the Xilinx device families that can lead to
catastrophic failures:

1.  Devices don't check the programming data stream
for a "match" of target device.  Thus, you can try to
program a Virtex 600E device with a Virtex 400E
configuration data stream, and the configuration data
will be accepted.

This "hole" allows the designer to make a mistake, and be
burned by it.  The workaround is, simply, to make sure that
your configuration files were compiled for the correct target
device; you screw up at your own peril.

2.  More ominous is that drivers for internal
multi-source busses are not disabled (tri-stated, if
you will) before and during the configuration and powerup
sequence, when the internal state of the device cannot be
controlled or specified by the designer.  I'm not sure there
is *any* workaround to this, short of a re-design of the FPGA die.

We need to understand the breadth of this problem (if the above
assessments are basically correct): which device families are
affected (afflicted), etc. etc.

I'm not posting this to cause alarm, but to distill the
issues at hand as clearly as possible, and avoid any FUD.

Rather than get excited, it would be good for all concerned
to await Xilinx's response which, if history is a guide, will be
an honest and open discussion of the facts, and which will provide
essential guidance to the design community.

Bob Elkind, eteam@aracnet.com

######

From: "Joel Kolstad" <JoelKolstad@Earthlink.Net>
Newsgroups: comp.arch.fpga
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net>
Subject: Re: Bad Xilinx bitstream=big bang?
Lines: 10
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Message-ID: <dHao6.10761$7Y1.1213191@newsread2.prod.itd.earthlink.net>
Date: Sat, 03 Mar 2001 18:26:17 GMT
NNTP-Posting-Host: 63.53.248.41
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread2.prod.itd.earthlink.net 983643977 63.53.248.41 (Sat, 03 Mar 2001 10:26:17 PST)
NNTP-Posting-Date: Sat, 03 Mar 2001 10:26:17 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!psinet-eu-nl!newsfeeds.belnet.be!news.belnet.be!feed2.onemain.com!feed1.onemain.com!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4945

Thanks for the explanation, Peter.  I'm thinking we'll add something to the
board so that the PC's software will be able to detect what type of FPGA the
board has loaded on it, and not feed it incorrect bitstreams.
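
A host-side guard of the kind described above might look like the following sketch. Everything in it is hypothetical: the board ID codes, the way they would be read (strap resistors, a spare register, and so on), and the file names are placeholders. The only idea taken from the thread is to refuse to load unless the board's part and the bitstream's target are known to match.

```python
class BitstreamMismatch(Exception):
    """Raised when no bitstream is known to be safe for the detected part."""


# Hypothetical mapping from a board ID code to the FPGA fitted on that board.
BOARD_ID_TO_PART = {0: "XCV400", 1: "XCV600E"}


def select_bitstream(board_id, bitstreams):
    """Return the bitstream file for this board, or raise instead of guessing."""
    part = BOARD_ID_TO_PART.get(board_id)
    if part is None:
        raise BitstreamMismatch(f"unknown board ID {board_id!r}")
    if part not in bitstreams:
        raise BitstreamMismatch(f"no bitstream available for {part}")
    return bitstreams[part]


# Placeholder file names, one per supported part.
available = {"XCV400": "design_v400.bit", "XCV600E": "design_v600e.bit"}
chosen = select_bitstream(1, available)
print(chosen)
```

The design choice here is to fail closed: an unknown ID or a missing file stops the load rather than falling back to a default bitstream.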

The FPGA itself survived just fine, as far as I can tell. :-)

---Joel

######

Message-ID: <3AA1736E.FE23FF67@earthlink.net>
From: Peter Alfke <palfke@earthlink.net>
Reply-To: palfke@earthlink.net
X-Mailer: Mozilla 4.61 (Macintosh; I; PPC)
X-Accept-Language: en,pdf
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <3AA1318A.F2D8E5EF@aracnet.com>
Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353"
Content-Transfer-Encoding: 7bit
Lines: 62
Date: Sat, 03 Mar 2001 22:43:44 GMT
NNTP-Posting-Host: 209.178.167.76
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread1.prod.itd.earthlink.net 983659424 209.178.167.76 (Sat, 03 Mar 2001 14:43:44 PST)
NNTP-Posting-Date: Sat, 03 Mar 2001 14:43:44 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!unlisys!news.snafu.de!news1.ebone.net!news.ebone.net!nycmny1-snf1.gtei.net!cpk-news-hub1.bbnplanet.com!lsanca1-snf1!news.gtei.net!newsfeed2.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4964

Before we get all worked up, let me clarify:

Power-up is an event that occurs long before any bitstream is
started. So it is a totally different subject.

Feeding a legitimate bitstream, that has passed the standard
design-rule check, DRC, and is compiled for this particular device,
will never cause any contention or other strange behavior.

The problem reported was one of (obviously unintentionally) feeding a
bitstream that was properly created for a different device. If it had
been a "bad" bitstream, the CRC would have caught it.

I think Virtex-II takes care of such issues, but I'll let Austin answer
that.

Peter Alfke


eteam wrote:

> Peter (and Austin and Phil),
>
> Here is my take (*please* correct me where I'm mistaken):
>
> It sounds like there are two weaknesses in (at least)
> some of the Xilinx device families that can lead to
> catastrophic failures:
>
> 1.  Devices don't check the programming data stream
> for a "match" of target device.  Thus, you can try to
> program a Virtex 600E device with a Virtex 400E
> configuration data stream, and the configuration data
> will be accepted.
>
> This "hole" allows the designer to make a mistake, and be
> burned by it.  The workaround is, simply, to make sure that
> your configuration files were compiled for the correct target
> device; you screw up at your own peril.
>
> 2.  More ominous is that drivers for internal
> multi-source busses are not disabled (tri-stated, if
> you will) before and during the configuration and powerup
> sequence, when the internal state of the device cannot be
> controlled or specified by the designer.  I'm not sure there
> is *any* workaround to this, short of a re-design of the FPGA die.
>
> We need to understand the breadth of this problem (if the above
> assessments are basically correct): which device families are
> affected (afflicted), etc. etc.
>
> I'm not posting this to cause alarm, but to distill the
> issues at hand as clearly as possible, and avoid any FUD.
>
> Rather than get excited, it would be good for all concerned
> to await Xilinx's response which, if history is a guide, will be
> an honest and open discussion of the facts, and which will provide
> essential guidance to the design community.
>
> Bob Elkind, eteam@aracnet.com

######

Message-ID: <3AA175EC.F280E232@earthlink.net>
From: Peter Alfke <palfke@earthlink.net>
Reply-To: palfke@earthlink.net
X-Mailer: Mozilla 4.61 (Macintosh; I; PPC)
X-Accept-Language: en,pdf
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <dHao6.10761$7Y1.1213191@newsread2.prod.itd.earthlink.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 23
Date: Sat, 03 Mar 2001 22:54:16 GMT
NNTP-Posting-Host: 209.178.167.76
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread1.prod.itd.earthlink.net 983660056 209.178.167.76 (Sat, 03 Mar 2001 14:54:16 PST)
NNTP-Posting-Date: Sat, 03 Mar 2001 14:54:16 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed2.news.rcn.net!rcn!newsfeed1.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread1.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4963

I am quite certain that XC4000 and prior devices automatically avoid this
problem, since they check for a start bit in specific bitstream locations. If
the frame structure is "wrong", then there is a 50% chance of detecting the
error after every frame. After a few frames, the probability of detection gets
very high. After hundreds of frames it is practically 100%.
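
The claim about cumulative detection can be put in numbers. The sketch below assumes each frame's check is an independent 50% trial, which the post implies but does not state, so treat the result as illustrative.

```python
def detection_probability(frames, p_per_frame=0.5):
    """Chance that at least one of `frames` independent checks catches the error."""
    return 1.0 - (1.0 - p_per_frame) ** frames


for n in (1, 3, 10, 100):
    print(f"{n:>3} frames: {detection_probability(n):.10f}")
```

Under that assumption the miss probability halves with every frame, so ten frames already give better than 99.9% detection.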

I am glad the Virtex devices survived this jolt.
Peter Alfke

Joel Kolstad wrote:

> Thanks for the explanation, Peter.  I'm thinking we'll add something to the
> board so that the PC's software will be able to detect what type of FPGA the
> board has loaded on it, and not feed it incorrect bitstreams.
>
> The FPGA itself survived just fine, as far as I can tell. :-)
>
> ---Joel

######

Sender: eric@ruckus.brouhaha.com
From: Eric Smith <eric-no-spam-for-me@brouhaha.com>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net>
X-Disclaimer: Everything I write is false.
Organization: Eric Conspiracy Secret Labs
X-Eric-Conspiracy: There is no conspiracy.
Date: 03 Mar 2001 15:44:07 -0800
Message-ID: <qhn1b2a054.fsf@ruckus.brouhaha.com>
Lines: 17
User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
NNTP-Posting-Host: ruckus.brouhaha.com
X-Trace: 3 Mar 2001 15:44:39 -0800, ruckus.brouhaha.com
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!newsfeed.germany.net!news.tele.dk!171.64.14.106!newsfeed.stanford.edu!news.kjsl.com!news.spies.com!ruckus.brouhaha.com
Xref: chonsp.franklin.ch comp.arch.fpga:4967

Peter Alfke <palfke@earthlink.net> writes:
> The gist is:
> Virtex parts do check for CRC errors, but not for formatting errors. And you
> sent a legitimately CRC-protected file, just the wrong format... Horrendous
> amount of internal contention.
[...]
> Correct. If there were a CRC error, the part would recognize this. But there
> was no CRC error...

Is there some reason why the part doesn't ALSO recognize that the bitstream
is too short?  I wouldn't think it would expect the CRC until it had filled
all of the RAM cells.

This suggests that in addition to length checking, you guys might want
to design a part id number into future parts, and have it fail to
configure if the part id in the bitstream doesn't match the part id of
the part.

######

Message-ID: <3AA1A50D.404532B1@earthlink.net>
From: Peter Alfke <palfke@earthlink.net>
Reply-To: palfke@earthlink.net
X-Mailer: Mozilla 4.61 (Macintosh; I; PPC)
X-Accept-Language: en,pdf
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com>
Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353"
Content-Transfer-Encoding: 7bit
Lines: 16
Date: Sun, 04 Mar 2001 02:15:35 GMT
NNTP-Posting-Host: 209.179.193.230
X-Complaints-To: abuse@earthlink.net
X-Trace: newsread2.prod.itd.earthlink.net 983672135 209.179.193.230 (Sat, 03 Mar 2001 18:15:35 PST)
NNTP-Posting-Date: Sat, 03 Mar 2001 18:15:35 PST
Organization: EarthLink Inc. -- http://www.EarthLink.net
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!news-nue1.dfn.de!news-lei1.dfn.de!news-fra1.dfn.de!news.tele.dk!199.60.229.5!newsfeed.direct.ca!look.ca!newsfeed1.earthlink.net!newsfeed.earthlink.net!newsmaster1.prod.itd.earthlink.net!newsread2.prod.itd.earthlink.net.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:4962


Eric Smith wrote:

> This suggests that in addition to length checking, you guys might want
> to design a part id number into future parts, and have it fail to
> configure if the part id in the bitstream doesn't match the part id of
> the part.

Done,
in the new family Virtex-II.


Peter Alfke

######

From: Brian Drummond <brian@shapes.demon.co.uk>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
Date: Mon, 05 Mar 2001 17:31:38 +0000
Message-ID: <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com>
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com>
NNTP-Posting-Host: shapes.demon.co.uk
X-NNTP-Posting-Host: shapes.demon.co.uk:158.152.228.158
X-Trace: news.demon.co.uk 983813254 nnrp-01:19180 NO-IDENT shapes.demon.co.uk:158.152.228.158
X-Complaints-To: abuse@demon.net
X-Newsreader: Forte Agent 1.7/32.534
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 21
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!dispose.news.demon.net!news.demon.co.uk!demon!shapes.demon.co.uk!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:5029

On 03 Mar 2001 15:44:07 -0800, Eric Smith
<eric-no-spam-for-me@brouhaha.com> wrote:

>Peter Alfke <palfke@earthlink.net> writes:
>> The gist is:
>> Virtex parts do check for CRC errors, but not for formatting errors. And you
>> sent a legitimately CRC-protected file, just the wrong format... Horrendous
>> amount of internal contention.
>[...]
>> Correct. If there were a CRC error, the part would recognize this. But there
>> was no CRC error...
>
>Is there some reason why the part doesn't ALSO recognize that the bitstream
>is too short?  I wouldn't think it would expect the CRC until it had filled
>all of the RAM cells.
>
Anything to do with partial reconfiguration maybe?
Like, is it possible to generate a _valid_ short bitstream to reprogram
part of the device but leaving the remainder unchanged?

- Brian

######

From: krw@btv.ibm.com (Keith R. Williams)
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
Date: Mon, 05 Mar 2001 18:06:41 GMT
Organization: IBM Global Services North -- Burlington, Vermont, USA
Lines: 31
Message-ID: <3aa3d52f.17694513@mdnews.btv.ibm.com>
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com> <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com>
NNTP-Posting-Host: sneakers.btv.ibm.com
X-Trace: news.btv.ibm.com 983815766 24696 9.66.117.41 (5 Mar 2001 18:09:26 GMT)
X-Complaints-To: news@btv.ibm.com
NNTP-Posting-Date: 5 Mar 2001 18:09:26 GMT
X-Newsreader: Forte Free Agent 1.21/32.243
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!news.maxwell.syr.edu!newsfeed.skycache.com!Cidera!portc03.blue.aol.com!newsjunkie.ans.net!news.chips.ibm.com!newsfeed.btv.ibm.com!news.btv.ibm.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:5012

On Mon, 05 Mar 2001 17:31:38 +0000, Brian Drummond
<brian@shapes.demon.co.uk> wrote:

>On 03 Mar 2001 15:44:07 -0800, Eric Smith
><eric-no-spam-for-me@brouhaha.com> wrote:
>
>>Peter Alfke <palfke@earthlink.net> writes:
>>> The gist is:
>>> Virtex parts do check for CRC errors, but not for formatting errors. And you
>>> sent a legitimately CRC-protected file, just the wrong format... Horrendous
>>> amount of internal contention.
>>[...]
>>> Correct. If there were a CRC error, the part would recognize this. But there
>>> was no CRC error...
>>
>>Is there some reason why the part doesn't ALSO recognize that the bitstream
>>is too short?  I wouldn't think it would expect the CRC until it had filled
>>all of the RAM cells.
>>
>Anything to do with partial reconfiguration maybe?
>Like, is it possible to generate a _valid_ short bitstream to reprogram
>part of the device but leaving the remainder unchanged?

Perhaps you've just stepped on another reason the tools don't support
partial reconfiguration? Two _valid_ short bitstreams may create many
drivers on the same wire.

----
  Keith

######

Sender: eric@ruckus.brouhaha.com
From: Eric Smith <eric-no-spam-for-me@brouhaha.com>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com> <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com>
X-Disclaimer: Everything I write is false.
Organization: Eric Conspiracy Secret Labs
X-Eric-Conspiracy: There is no conspiracy.
Date: 05 Mar 2001 10:07:40 -0800
Message-ID: <qhlmqkgkcz.fsf@ruckus.brouhaha.com>
Lines: 18
User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
NNTP-Posting-Host: ruckus.brouhaha.com
X-Trace: 5 Mar 2001 10:08:32 -0800, ruckus.brouhaha.com
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!uni-erlangen.de!news-nue1.dfn.de!news-lei1.dfn.de!news-fra1.dfn.de!news.tele.dk!212.74.64.35!colt.net!nycmny1-snf1.gtei.net!cpk-news-hub1.bbnplanet.com!news.gtei.net!newsfeed.mathworks.com!news.kjsl.com!news.spies.com!ruckus.brouhaha.com
Xref: chonsp.franklin.ch comp.arch.fpga:5017

I wrote:
> Is there some reason why the part doesn't ALSO recognize that the bitstream
> is too short?  I wouldn't think it would expect the CRC until it had filled
> all of the RAM cells.

Brian Drummond <brian@shapes.demon.co.uk> writes:
> Anything to do with partial reconfiguration maybe?
> Like, is it possible to generate a _valid_ short bitstream to reprogram
> part of the device but leaving the remainder unchanged?

Having looked over the XAPP 176 appnote on configuration and readback of
the Spartan II over the weekend, I now have a better appreciation
for how the config process works.  I think your hypothesis is correct.

I guess the thing that still surprises me is that each size of FPGA
needs a different frame size in the bitstream (table 4 on page 15,
XC2S200 not specified), so the parts should have been able to detect the
wrong frame size, even if they can't detect the wrong total image size.
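
A host-side version of that frame-size check might look like the sketch
below. It assumes the loader can find out what frame length the bitstream
declares and knows which part is on the board. The part names and word
counts are placeholders, not values from XAPP 176 or any data sheet, and
check_frame_length is a hypothetical helper.

```python
# Hypothetical frame lengths in 32-bit words. Placeholder values only;
# real numbers would have to come from the device's configuration docs.
EXPECTED_FRAME_WORDS = {"small_part": 10, "large_part": 14}


def check_frame_length(part, declared_words, table=EXPECTED_FRAME_WORDS):
    """Raise ValueError if the declared frame length does not fit the part."""
    expected = table[part]
    if declared_words != expected:
        raise ValueError(
            f"frame length {declared_words} words does not match "
            f"{part} (expected {expected})"
        )


# A stream built for the small part passes for the small part.
check_frame_length("small_part", 10)
```

This only catches a wrong frame size. As noted above, it says nothing
about a wrong total image size.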

######

From: Brian Drummond <brian@shapes.demon.co.uk>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
Date: Tue, 06 Mar 2001 19:43:02 +0000
Message-ID: <89v9atgl80fgeteo8tjvbi5nkosoq7rci6@4ax.com>
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com> <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com> <3aa3d52f.17694513@mdnews.btv.ibm.com>
NNTP-Posting-Host: shapes.demon.co.uk
X-NNTP-Posting-Host: shapes.demon.co.uk:158.152.228.158
X-Trace: news.demon.co.uk 983907528 nnrp-09:3390 NO-IDENT shapes.demon.co.uk:158.152.228.158
X-Complaints-To: abuse@demon.net
X-Newsreader: Forte Agent 1.7/32.534
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 37
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!fr.clara.net!heighliner.fr.clara.net!diablo.netcom.net.uk!netcom.net.uk!newsfeed.icl.net!dispose.news.demon.net!news.demon.co.uk!demon!shapes.demon.co.uk!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:5055

On Mon, 05 Mar 2001 18:06:41 GMT, krw@btv.ibm.com (Keith R. Williams)
wrote:

>On Mon, 05 Mar 2001 17:31:38 +0000, Brian Drummond
><brian@shapes.demon.co.uk> wrote:
>
>>On 03 Mar 2001 15:44:07 -0800, Eric Smith
>><eric-no-spam-for-me@brouhaha.com> wrote:
>>
>>>Peter Alfke <palfke@earthlink.net> writes:
>>>> The gist is:
>>>> Virtex parts do check for CRC errors, but not for formatting errors. And you
>>>> sent a legitimately CRC-protected file, just the wrong format... Horrendous
>>>> amount of internal contention.
>>>[...]
>>>> Correct. If there were a CRC error, the part would recognize this. But there
>>>> was no CRC error...
>>>
>>>Is there some reason why the part doesn't ALSO recognize that the bitstream
>>>is too short?  I wouldn't think it would expect the CRC until it had filled
>>>all of the RAM cells.
>>>
>>Anything to do with partial reconfiguration maybe?
>>Like, is it possible to generate a _valid_ short bitstream to reprogram
>>part of the device but leaving the remainder unchanged?
>
>Perhaps you've just stepped on another reason the tools don't support
>partial reconfiguration? Two _valid_ short bitstreams may create many
>drivers on the same wire.

Ouch! The Virtex-II device ID feature can't protect against THAT one!
I'm not sure anything can. Except maybe some design rule checker running
on the set of placed/routed NCD files prior to bitfile generation.
Doesn't look like an easy problem.
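
As a rough illustration of what such a checker would have to do, here is
a minimal sketch. It assumes each configuration has already been reduced
to a list of (wire, driver) pairs and that every listed configuration
stays active at the same time. That data model is hypothetical and much
simpler than a real NCD file; find_contention and the wire names are made
up for the example.

```python
from collections import defaultdict


def find_contention(configs):
    """Return {wire: sorted [(config, driver), ...]} for multiply driven wires.

    `configs` maps a configuration name to an iterable of (wire, driver)
    pairs. Hypothetical, simplified model; not the NCD file format.
    """
    drivers = defaultdict(set)
    for name, pairs in configs.items():
        for wire, driver in pairs:
            drivers[wire].add((name, driver))
    # Keep only wires that end up with more than one driver.
    return {w: sorted(d) for w, d in drivers.items() if len(d) > 1}


designs = {
    "base": [("long_line_3", "tbuf_a")],
    "partial_1": [("long_line_3", "tbuf_b"), ("net_7", "lut_x")],
}
conflicts = find_contention(designs)
```

The hard part is not this loop but extracting a trustworthy driver list
from every placed and routed design, and knowing which regions are live
together.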

- Brian

######

From: Austin Lesea <austin.lesea@xilinx.com>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
Date: Tue, 06 Mar 2001 19:19:40 -0800
Organization: Xilinx
Lines: 59
Message-ID: <3AA5A8CB.DDAB961A@xilinx.com>
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com> <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com> <3aa3d52f.17694513@mdnews.btv.ibm.com> <89v9atgl80fgeteo8tjvbi5nkosoq7rci6@4ax.com>
NNTP-Posting-Host: 149.199.249.6
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD   (WinNT; U)
X-Accept-Language: en
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!fu-berlin.de!newsfeed.direct.ca!look.ca!nntp2.aus1.giganews.com!NetNews1!attws2!attsl2!attla2!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:5037

All,

When using the new Virtex II Internal Configuration Access Port (ICAP), one also
has to use the tools properly with the logic floorplanner, nailing down the separate
regions and the IO interfaces between the separate regions to be reconfigured.

We are on the "cutting edge" of providing the tools, the hardware, and the
methodologies, so this certainly isn't a smooth and easy process (yet).

The basic features are there (ICAP, floorplanner, etc).

The floor planning and the identification of reconfigurable regions must come
first (some systems architecture must go in first).

This is an exciting feature that has the universities all asking the right questions,
and hopefully ready to provide some answers to today's problems in reconfigurable
computing.

Austin

Brian Drummond wrote:

> On Mon, 05 Mar 2001 18:06:41 GMT, krw@btv.ibm.com (Keith R. Williams)
> wrote:
>
> >On Mon, 05 Mar 2001 17:31:38 +0000, Brian Drummond
> ><brian@shapes.demon.co.uk> wrote:
> >
> >>On 03 Mar 2001 15:44:07 -0800, Eric Smith
> >><eric-no-spam-for-me@brouhaha.com> wrote:
> >>
> >>>Peter Alfke <palfke@earthlink.net> writes:
> >>>> The gist is:
> >>>> Virtex parts do check for CRC errors, but not for formatting errors. And you
> >>>> sent a legitimately CRC-protected file, just the wrong format... Horrendous
> >>>> amount of internal contention.
> >>>[...]
> >>>> Correct. If there were a CRC error, the part would recognize this. But there
> >>>> was no CRC error...
> >>>
> >>>Is there some reason why the part doesn't ALSO recognize that the bitstream
> >>>is too short?  I wouldn't think it would expect the CRC until it had filled
> >>>all of the RAM cells.
> >>>
> >>Anything to do with partial reconfiguration maybe?
> >>Like, is it possible to generate a _valid_ short bitstream to reprogram
> >>part of the device but leaving the remainder unchanged?
> >
> >Perhaps you've just stepped on another reason the tools don't support
> >partial reconfiguration? Two _valid_ short bitstreams may create many
> >drivers on the same wire.
>
> Ouch! The Virtex-II device ID feature can't protect against THAT one!
> I'm not sure anything can. Except maybe some design rule checker running
> on the set of placed/routed NCD files prior to bitfile generation.
> Doesn't look like an easy problem.
>
> - Brian

######

From: alfred fuchs <alfred.fuchs@siemens.at>
Newsgroups: comp.arch.fpga
Subject: Re: Bad Xilinx bitstream=big bang?
Date: Wed, 07 Mar 2001 12:19:55 +0100
Organization: Siemens
Lines: 58
Message-ID: <3AA6195B.EFAB63D1@siemens.at>
References: <jVYn6.9307$7Y1.998079@newsread2.prod.itd.earthlink.net> <3AA08CCD.341AA107@earthlink.net> <qhn1b2a054.fsf@ruckus.brouhaha.com> <pbj7at01r7c7ci8bt4diigabhqscljd2ku@4ax.com> <qhlmqkgkcz.fsf@ruckus.brouhaha.com>
NNTP-Posting-Host: pca540.erd.siemens.at
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: news.siemens.at 983963987 32310 158.226.169.109 (7 Mar 2001 11:19:47 GMT)
X-Complaints-To: usenet@news.siemens.at
NNTP-Posting-Date: Wed, 7 Mar 2001 11:19:47 +0000 (UTC)
X-Mailer: Mozilla 4.73 [de]C-CCK-MCD DT  (WinNT; U)
X-Accept-Language: fr,de,en
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!unlisys!news.snafu.de!nautilus.visp-europe.psi.com!newsfeed.Austria.EU.net!newsfeed.kpnqwest.at!news.siemens.at!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:5034

In October 1999 at the EXUS in Paris we made Xilinx aware of that self-destruct
feature of the Virtex series.
A second participant confirmed the phenomenon immediately.
Xilinx confirmed it about a week later.

Apparently they did not find a workaround.

Is anyone aware of a warning issued by Xilinx?

Alfred Fuchs
Siemens PSE PRO LMS


Peter (and Austin and Phil),

Here is my take (*please* correct me where I'm mistaken):

It sounds like there are two weaknesses in (at least)
some of the Xilinx device families that can lead to
catastrophic failures:

1.  Devices don't check the programming data stream
for a "match" of target device.  Thus, you can try to
program a Virtex 600E device with a Virtex 400E
configuration data stream, and the configuration data
will be accepted.

This "hole" allows the designer to make a mistake, and be
burned by it.  The workaround is, simply, to make sure that
your configuration files were compiled for the correct target
device; you screw up at your own peril.

2.  More ominous is that drivers for internal
multi-source busses are not disabled (tri-stated, if
you will) before and during the configuration and powerup
sequence, when the internal state of the device cannot be
controlled or specified by the designer.  I'm not sure there
is *any* workaround to this, short of a re-design of the FPGA die.
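
For weakness 1, the "make sure" step can be automated on the host before
any data is sent. The sketch below assumes the programmer can read the
part's 32-bit JTAG IDCODE and that the build flow records the IDCODE the
bitstream was compiled for. Both are assumptions about the setup, the
function names are hypothetical, and the IDCODE values are made up. The
only fixed fact used is the IEEE 1149.1 layout, where bits 31..28 hold
the version.

```python
VERSION_MASK = 0x0FFFFFFF  # clears bits 31..28, the IDCODE version field


def same_part(idcode_a, idcode_b):
    """True if two 32-bit JTAG IDCODEs match apart from the version field."""
    return (idcode_a & VERSION_MASK) == (idcode_b & VERSION_MASK)


def check_before_download(board_idcode, bitstream_idcode):
    """Refuse to configure when the file was built for a different part."""
    if not same_part(board_idcode, bitstream_idcode):
        raise ValueError(
            f"bitstream built for {bitstream_idcode:#010x}, "
            f"board reports {board_idcode:#010x}"
        )


# Made-up IDCODEs: same part, different silicon revision. This passes.
check_before_download(0x10001001, 0x20001001)
```

This does nothing for weakness 2, which is inside the device.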

We need to understand the breadth of this problem (if the above
assessments are basically correct): which device families are
affected (afflicted), etc. etc.

I'm not posting this to cause alarm, but to distill the
issues at hand as clearly as possible, and avoid any FUD.

Rather than get excited, it would be good for all concerned
to await Xilinx's response which, if history is a guide, will be
an honest and open discussion of the facts, and which will provide
essential guidance to the design community.

Bob Elkind, eteam@aracnet.com