From: "Carl Brannen" <carl.brannen@terabeam.com>
Newsgroups: comp.arch.fpga
Subject: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Tue, 18 Dec 2001 19:20:37 +0000 (UTC)
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
Lines: 92
Message-ID: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org>
NNTP-Posting-Host: firewall.terabeam.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.mailgate.org 1008686318 2620 216.137.15.2 (Tue Dec 18 20:20:37 2001)
X-Complaints-To: abuse@mailgate.org
NNTP-Posting-Date: Tue, 18 Dec 2001 19:20:37 +0000 (UTC)
Injector-Info: news.mailgate.org; posting-host=firewall.terabeam.com; posting-account=51709; posting-date=1008686318
User-Agent: Mailgate Web Server
X-URL: http://www.Mailgate.ORG
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!web2news!firewall.terabeam.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12513

This came as a result of thinking about how to make more efficient barrel
shifters.  Most readers are familiar with fall-through barrel shifters and how
they're usually implemented (with columns of 2 to 1 muxes, each column
dedicated to shifting the data by a different power of 2).
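
A minimal behavioral sketch of that conventional arrangement, in Python rather
than HDL.  It assumes a rotate (wrap-around) shifter with bit 0 as the LSB; the
function names and the default width and select-bit count are illustrative and
are not taken from the design discussed in this thread.

```python
def mux2(sel, a, b):
    """2-to-1 mux: pass b when sel is 1, otherwise a."""
    return b if sel else a


def to_bits(value, width):
    """Integer to list of bits, index 0 = LSB."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (index 0 = LSB) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def barrel_rotate_left(value, amount, width=16, select_bits=3):
    """Rotate left using one column of 2:1 muxes per select bit.

    Column k either passes the data through or rotates it by 2**k,
    depending on bit k of the shift amount.
    """
    data = to_bits(value, width)
    for k in range(select_bits):
        sel = (amount >> k) & 1
        step = 1 << k
        data = [mux2(sel, data[i], data[(i - step) % width])
                for i in range(width)]
    return from_bits(data)


print(hex(barrel_rotate_left(0x8001, 3)))  # 0xc
```

Each column costs one 2 to 1 mux per data bit, which is why the packing of
muxes into slices dominates the size of the shifter.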

I got to thinking about how to use MUXF5s for barrel shifters.  It's clear that
if you could bring out the terms that feed the MUXF5 you could get three
results out of a slice instead of just two.  That would essentially give me
three 2 to 1 muxes in one slice instead of two, and it would really improve
barrel shifter packing (and maybe be good for random logic use).
 
The problem with doing it is that it's hard to get the output of the "F" LUT
out of the slice.  But it can be done by bringing it out the CARRY-OUT.  You use
a MUXCY, and apply '0' and '1' to the DI and CI inputs, and the "F" LUT output
to the S input of the MUXCY.  That programs the MUXCY to be a buffer of the "F"
LUT, and you get the (otherwise hidden) "F" LUT output as the carry-out (which
can easily route to the outside of the slice).
 
The only problem with that is that it's very hard to program the CI of the
MUXCY to '1' and still use the MUXF5.  In fact, since the BX input is going to
have to be used by the 'S' input of the MUXF5, you have to use the carry input
of the slice.  And that puts the problem of generating a '1' into the next door
neighbor to the slice, where you'll have to use a LUT to generate it, thereby
wasting the LUT you were trying to save.
 
That was where my analysis ended, but the other night I realized that for a
barrel shifter, I don't have to control the value of that carry-out at all
times.  I only need to control it when I'm actually going to select it in the
next stage of logic.  And since the selector for the next stage of logic will
be the same selector as is coming in on the 'S' pin for the MUXF5, that
suggests that there might be a solution.
 
In fact there is.  You program the "DI" input of the MUXCY to '1', and connect
both the MUXCY.CI input  and the MUXF5.S input to the same BX input.  That's a
natural use for the carry input, but it's usually not done because normally
arithmetic logic is not used at the same time as the MUXF5 pin.  But you can do
it.
 
The result is that when the MUXF5 selects the "F" LUT, (i.e. when the BX input
is '1'), the MUXF5 operates normally, but the Carry-out will be forced to '1'.
But that's the condition under which you would normally ignore the carry-out
anyway, if you were using the circuit as a barrel shifter (with positive shift
amount).
 
On the other hand, when the SHIFT input (BX) is low, the "G" LUT is selected
for the MUXF5, and the DI and CI inputs of the MUXCY end up as '1' and '0'
respectively.  That causes the CARRY-OUT to follow the complement of the "F"
LUT output.  But we all know how easy it is to invert logic in a Xilinx part,
so I just put an inverter on the CARRY-OUT and the logic takes care of getting
rid of the inversion for me.
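
The two MUXCY configurations described above can be checked with a small
truth-table model.  This is only a sketch in Python: it assumes the MUXCY
passes CI when S is '1' and DI when S is '0', and it takes the MUXF5 polarity
(F selected when BX is '1') from the description in this post rather than from
a data sheet.  The function names are made up for illustration.

```python
def muxcy(s, di, ci):
    """Carry mux: passes CI when S is 1, DI when S is 0."""
    return ci if s else di


def muxf5(bx, f, g):
    """F5 mux with the polarity used in this post: F when BX is 1, else G."""
    return f if bx else g


def buffered_f(f):
    """First idea: DI = 0, CI = 1, S = F LUT, so carry-out equals F."""
    return muxcy(s=f, di=0, ci=1)


def slice_outputs(f, g, bx):
    """Shared-BX trick: DI = 1, and BX drives both MUXCY.CI and MUXF5.S."""
    f5 = muxf5(bx, f, g)
    carry_out = muxcy(s=f, di=1, ci=bx)
    return f5, carry_out


for bx in (0, 1):
    for f in (0, 1):
        for g in (0, 1):
            f5, cout = slice_outputs(f, g, bx)
            print(f"BX={bx} F={f} G={g} -> F5={f5} COUT={cout}")
```

In this model the carry-out is stuck at '1' whenever BX is '1' and is the
complement of F whenever BX is '0', which matches the behavior described above.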
 
The router isn't too good at combining MUXCYs and MUXF5s, so to get this into a
single slice I have to RLOC it.  But the good news is that this works.
 
Generalizing, this means that I can get certain collections of 3 logic
functions in a single slice.  The general rule is:
 
The three logic functions are {F,G',F5}, and the 9 input variables are
{F1,F2,F3,F4,G1,G2,G3,G4, and BX}
 
F <= LUT(F1,F2,F3,F4);
 
G' <= LUT(G1,G2,G3,G4) nand  BX;
 
with BX select
    F5 <=
        G when '0',
        F when others;
 
I thought this was cool.  It allows a 16-bit wide 0 to 7 bit barrel shifter in
just 36 LUTs, which is 12 fewer than the number needed to create a barrel
shifter the usual way.
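
The "12 fewer" figure is consistent with a simple count, sketched here in
Python.  It assumes the usual approach spends one 4-input LUT per 2 to 1 mux,
i.e. one LUT per data bit per stage; that cost model is an assumption for
illustration, and the 36 is simply the figure quoted above.

```python
def conventional_lut_count(width, select_bits):
    """One LUT per 2:1 mux: `width` muxes in each of `select_bits` columns."""
    return width * select_bits


conventional = conventional_lut_count(width=16, select_bits=3)
reported = 36  # figure quoted in the post
print(conventional, reported, conventional - reported)  # 48 36 12
```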

The logic for the above was implemented in schematics, but I could easily
convert this to VHDL if anyone is interested.

I also figured out a way to program a column of slices to perform a vector of 3
to 1 muxes instead of just 2 to 1 muxes.  This can be used to create very
efficient barrel shifters where the number of shift positions is a power of 3.
An example would be a barrel shifter that shifts between 0 and 8 bits.  With
the usual barrel shift technique, such a barrel shifter would require 4 stages,
but using 3 to 1 muxes it requires only 2 stages.  There are some fixed costs
associated with computing the controls for the stages and (since it uses
arithmetic functions) driving the CARRY-IN for each stage.  (It's actually more
complex than I'm implying here.)  I'll post code for it if anyone is interested.
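
The stage count can be illustrated with a behavioral sketch in Python.  It
models only the data path (each stage selects a rotation of 0, 1 or 2 times
its step, with steps 1 and 3); it says nothing about how a column of slices
would actually implement the 3 to 1 muxes or the carry-in controls, which are
not described here.  A rotate with bit 0 as the LSB is assumed, and all names
are illustrative.

```python
def mux3(sel, a, b, c):
    """3-to-1 mux: sel is a base-3 digit (0, 1 or 2)."""
    return (a, b, c)[sel]


def stages_needed(max_shift, radix):
    """Number of radix-`radix` mux columns needed to cover 0..max_shift."""
    count, span = 0, 1
    while span <= max_shift:
        span *= radix
        count += 1
    return count


def rotate_left_base3(value, amount, width=16, stages=2):
    """Rotate left by 0..3**stages - 1 using columns of 3:1 muxes."""
    data = [(value >> i) & 1 for i in range(width)]
    step = 1
    for _ in range(stages):
        digit = amount % 3          # control for this column
        amount //= 3
        data = [mux3(digit,
                     data[i],
                     data[(i - step) % width],
                     data[(i - 2 * step) % width])
                for i in range(width)]
        step *= 3
    return sum(bit << i for i, bit in enumerate(data))


print(stages_needed(8, 2), stages_needed(8, 3))  # 4 2
```

Decomposing the shift amount into base-3 digits is the part that carries the
"fixed cost" of computing the stage controls when the amount arrives in binary.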

Carl


-- 
Posted from firewall.terabeam.com [216.137.15.2] 
via Mailgate.ORG Server - http://www.Mailgate.ORG

######

From: Steven Derrien <sderrien@irisa.fr>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Tue, 18 Dec 2001 20:36:44 +0100
Organization: INRIA  - RENNES
Lines: 105
Message-ID: <3C1F9ACC.AFDA823@irisa.fr>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org>
NNTP-Posting-Host: spyder.irisa.fr
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Trace: news.irisa.fr 1008704203 30176 131.254.51.10 (18 Dec 2001 19:36:43 GMT)
X-Complaints-To: usenet@irisa.fr
NNTP-Posting-Date: 18 Dec 2001 19:36:43 GMT
X-Mailer: Mozilla 4.75 [en] (WinNT; U)
X-Accept-Language: en, fr
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!skynet.be!skynet.be!freenix!fr.usenet-edu.net!usenet-edu.net!ciril.fr!univ-angers.fr!news!univ-rennes1.fr!irisa.fr!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12550

Hello,

This could be very useful for optimizing floating point adders or subtractors,
since they use large barrel shifters for normalization and denormalization. If
you have some VHDL for this I'd be eager to use it to see how it affects area
and speed!

Steven



######

From: Peter Alfke <peter.alfke@xilinx.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Tue, 18 Dec 2001 12:12:25 -0800
Organization: Xilinx
Lines: 113
Message-ID: <3C1FA329.76322373@xilinx.com>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr>
Reply-To: peter.alfke@xilinx.com
NNTP-Posting-Host: peter.xsj.xilinx.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; x-mac-type="54455854"; x-mac-creator="4D4F5353"
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.77C-CCK-MCD {C-UDP; EBM-APPLE} (Macintosh; U; PPC)
X-Accept-Language: en
To: Steven Derrien <sderrien@irisa.fr>
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.cwix.com!news!nntp.wetware.com!attdv1!attdv2!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12482

I agree that it looks very clever and interesting.
But, just as an aside, floating point needs arithmetic shifters for
normalization, not barrel shifters.
Also, remember that Virtex-II has lots of multipliers, many of them begging to be
used as "free" shifters (multiply by a power of 2).
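
The multiplier-as-shifter idea amounts to decoding the shift amount into a
one-hot power of 2 and multiplying by it.  A minimal Python sketch, with a
hypothetical 16-bit result width and a made-up function name:

```python
def shift_left_via_multiply(value, amount, width=16):
    """Left shift done as a multiply by 2**amount, truncated to `width` bits."""
    one_hot = 1 << amount            # decoded shift amount: exactly one bit set
    return (value * one_hot) & ((1 << width) - 1)


print(hex(shift_left_via_multiply(0x0123, 4)))  # 0x1230
```

In hardware the one-hot decode of the shift amount still has to be built
somewhere, so the shifter is only "free" in terms of the multiplier itself.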

Peter Alfke

######

Message-ID: <3C1FB0C3.37E32F2D@andraka.com>
From: Ray Andraka <ray@andraka.com>
Organization: Andraka Consulting Group, Inc
X-Mailer: Mozilla 4.77 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 34
Date: Tue, 18 Dec 2001 21:09:15 GMT
NNTP-Posting-Host: 24.13.238.93
X-Complaints-To: abuse@home.net
X-Trace: news1.wwck1.ri.home.com 1008709755 24.13.238.93 (Tue, 18 Dec 2001 13:09:15 PST)
NNTP-Posting-Date: Tue, 18 Dec 2001 13:09:15 PST
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feeder.via.net!newshub2.rdc1.sfba.home.com!news.home.com!news1.wwck1.ri.home.com.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12496

Here's my two cents worth (maybe not even that much).

1) Peter, the term barrel shift is commonly (although technically incorrectly) applied
to shifters which have a variable shift distance.  The Virtex-II multipliers can in
fact be used this way, but it can be done considerably faster (with more pipelining) in
the fabric for very little additional cost, especially when you consider the resources
taken by the added pipeline registers you need in front of and behind the multiplier to
get anywhere close to the data sheet speeds.  It all comes down to how best to use
the resources available to me.

2) The carry chain can also be used for a free doubler circuit.   However, watch the
timing.  There exist false paths (that are also quite slow comparatively speaking)
introduced by the non-standard use of the carry chain (the chain connections are only
used to the next neighbor, not all the way up the chain).  Timingwise, the conventional
approach seems to yield better propagation delays in combinatorial only shifters, and
considerably better times in fully pipelined shifters.  This is a good trick to put in
your back pocket for those times where the need for density outweighs the needs of the
clock cycle.

3) I'd be interested in seeing your layout solution.  The layout is not a trivial
part of making this perform well.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

######

From: "Carl Brannen" <carl.brannen@terabeam.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Wed, 19 Dec 2001 04:05:32 +0000 (UTC)
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
Lines: 98
Message-ID: <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com>
NNTP-Posting-Host: firewall.terabeam.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.mailgate.org 1008715124 10527 216.137.15.2 (Wed Dec 19 05:05:32 2001)
X-Complaints-To: abuse@mailgate.org
NNTP-Posting-Date: Wed, 19 Dec 2001 04:05:32 +0000 (UTC)
Injector-Info: news.mailgate.org; posting-host=firewall.terabeam.com; posting-account=51709; posting-date=1008715124
User-Agent: Mailgate Web Server
X-URL: http://www.Mailgate.ORG
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!web2news!firewall.terabeam.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12511

Ray:

>2) The carry chain can also be used for a free doubler circuit.
> However, watch the timing.  There exist false paths (that are
> also quite slow comparatively speaking) introduced by the
> non-standard use of the carry chain (the chain connections
> are only used to the next neighbor, not all the way up the
> chain).  Timingwise, the conventional approach seems to yield
> better propagation delays in combinatorial only shifters, and
> considerably better times in fully pipelined shifters.  This
> is a good trick to put in your back pocket for those times
> where the need for density outweighs the needs of the  clock
> cycle.

This is true.  I have another barrel shift design that's based on the use
of the carry chain.  I'll post it to the thread later, but as a new topic
head.  But the way I used the carry chain, the false path was a true one.
That is, under certain (rare?) circumstances, the carry chain may have to
propagate a signal from one end to the other.  I'll post an explanation when
I put up the VHDL for that barrel shifter (later tonight, maybe).  I think
this is enough cool circuits for one thread.

> 3) I'd be interested in seeing your layout solution.  The layout
> is not trivial to making this perform well.

No floor planning involved.  Here are the statistics when it's implemented by
itself with flip-flops on all inputs and outputs.  The design is a 16-input
barrel shifter, with 3 select inputs giving shifts from 0 to 7 bits.  The
design is a fall through, as is probably most efficient for this type of
barrel shifter.  It's placed and routed into a small VirtexE-8, which is a
speedy little part.  I put a clock period constraint on it of 5ns, but it only
got to 165MHz.  Still, this isn't bad for a fall through 16-wide barrel
shifter with no floor planning and no buffering on the control lines.  If I
get around to it, I'll convert the source from schematic to (readable) VHDL.
The reason it's in schematic form is because I hate to deal with RLOCs in
VHDL:

<<<
Design Information
------------------
Command Line   : map -p xcv50e-8-cs144 -o map.ncd arith.ngd arith.pcf 
Target Device  : xv50e
Target Package : cs144
Target Speed   : -8
Mapper Version : virtexe -- D.27
Mapped Date    : Tue Dec 18 19:57:47 2001

Design Summary
--------------
   Number of errors:      0
   Number of warnings:    1
   Number of Slices:                 36 out of    768    4%
   Number of Slices containing
      unrelated logic:                0 out of     36    0%
   Number of Slice Flip Flops:       51 out of  1,536    3%
   Number of 4 input LUTs:           36 out of  1,536    2%
   Number of bonded IOBs:            35 out of     94   37%
   Number of GCLKs:                   1 out of      4   25%
   Number of GCLKIOBs:                1 out of      4   25%
Total equivalent gate count for design:  696
Additional JTAG gate count for IOBs:  1,728
>>>

<<<
The Number of signals not completely routed for this design is: 0

   The Average Connection Delay for this design is:        0.885 ns
   The Maximum Pin Delay is:                               2.310 ns
   The Average Connection Delay on the 10 Worst Nets is:   1.645 ns
...

--------------------------------------------------------------------------------
  Constraint                                | Requested  | Actual     | Logic
                                            |            |            | Levels
--------------------------------------------------------------------------------
* NET "CLK" PERIOD =  5 nS   LOW 50.000 %   | 5.000ns    | 6.036ns    | 4
--------------------------------------------------------------------------------
>>>

<<<
Constraints cover 276 paths, 0 nets, and 184 connections (92.0% coverage)

Design statistics:
   Minimum period:   6.036ns (Maximum frequency: 165.673MHz)

Analysis completed Tue Dec 18 19:58:13 2001

--------------------------------------------------------------------------------
>>>
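
The frequency figures follow directly from the periods in the report.  A
short Python check, using only the numbers printed above (the variable names
are illustrative):

```python
requested_ns = 5.000   # PERIOD constraint from the report
actual_ns = 6.036      # minimum period achieved

target_mhz = 1000.0 / requested_ns   # what the constraint asked for
fmax_mhz = 1000.0 / actual_ns        # what place and route delivered
slack_ns = requested_ns - actual_ns  # negative: the constraint was missed

print(f"target {target_mhz:.1f} MHz, achieved {fmax_mhz:.3f} MHz, "
      f"slack {slack_ns:.3f} ns")
```

So the 5 ns constraint corresponds to 200 MHz, and the design misses it by
about 1 ns spread over the 4 logic levels the report lists.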

Carl



######

Message-ID: <3C2027D6.F445EA75@andraka.com>
From: Ray Andraka <ray@andraka.com>
Organization: Andraka Consulting Group, Inc
X-Mailer: Mozilla 4.77 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 124
Date: Wed, 19 Dec 2001 05:37:11 GMT
NNTP-Posting-Host: 24.13.238.93
X-Complaints-To: abuse@home.net
X-Trace: news1.wwck1.ri.home.com 1008740231 24.13.238.93 (Tue, 18 Dec 2001 21:37:11 PST)
NNTP-Posting-Date: Tue, 18 Dec 2001 21:37:11 PST
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!44051!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.direct.ca!look.ca!newshub2.rdc1.sfba.home.com!news.home.com!news1.wwck1.ri.home.com.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12492

We don't use them in a fall through configuration very often.  As a point of
reference, we have one in a 160 MHz design in a VirtexE-6 that does a 0 to 15
position shift with a 2 clock latency, including rounding at the output.  It is 19
bits wide at the output.  It is not the critical path in the design.  IIRC, there
are 3 levels of conventional shifter, a register, then the last layer along with a
carry chain for the round.  That one is VHDL with the RLOCs in the code.  We prefer
VHDL for placed designs now because of the capability of the generate statement (the
design I was describing is a parameterized generate so it can take arbitrary input
and output widths as well as number of clocks of latency).  Off-hand, I think with
careful layout you could get a 3 layer fall through design using the conventional
approach well above 200 MHz in an E-8.


--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

######

From: "Carl Brannen" <carl.brannen@terabeam.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Wed, 19 Dec 2001 08:31:45 +0000 (UTC)
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
Lines: 34
Message-ID: <865212a8446d17759b69d0dd8852d212.51709@mygate.mailgate.org>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org> <3C2027D6.F445EA75@andraka.com>
NNTP-Posting-Host: firewall.terabeam.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.mailgate.org 1008744250 13480 216.137.15.2 (Wed Dec 19 09:31:45 2001)
X-Complaints-To: abuse@mailgate.org
NNTP-Posting-Date: Wed, 19 Dec 2001 08:31:45 +0000 (UTC)
Injector-Info: news.mailgate.org; posting-host=firewall.terabeam.com; posting-account=51709; posting-date=1008744250
User-Agent: Mailgate Web Server
X-URL: http://www.Mailgate.ORG
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!web2news!firewall.terabeam.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12499

Hi Ray,

No question that messing around with the carries is going to slow down a barrel
shifter. It's been a very long time since I used one (okay, it was actually a
"funnel shifter", but I hate that term, so I call them all barrel shifters).  I
just got the one that uses the carries to effectively fit a 3 to 1 mux into a
LUT to place and route correctly.  It uses 38 LUTs to do a two stage shift of 0
to 8 bits (i.e. a 9-bit barrel shift) on a 16-bit input.

It's very efficient when you need a power of 3 shift size.  You get a 9-bit
shift with only 2 stages of logic instead of the usual 4.  I'll post it to the
thread in a minute.
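
A minimal behavioral sketch of the two-stage idea, for readers who want to see
why two 3 to 1 stages cover nine shift amounts.  The entity name, the port
names, and the choice of a zero-filled right shift are assumptions made for
illustration; this does not reproduce the carry-chain mapping or the
instantiated LUT4s described above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: two cascaded 3-to-1 selections give 3 x 3 = 9 shifts.
entity shift9_sketch is
  port (
    din  : in  std_logic_vector(15 downto 0);
    sel0 : in  natural range 0 to 2;   -- stage 1 shifts by 0, 1 or 2
    sel1 : in  natural range 0 to 2;   -- stage 2 shifts by 0, 3 or 6
    dout : out std_logic_vector(15 downto 0));
end entity shift9_sketch;

architecture behav of shift9_sketch is
  signal mid : unsigned(15 downto 0);
begin
  -- Stage 1: each output bit picks one of three neighbouring input bits.
  mid  <= shift_right(unsigned(din), sel0);
  -- Stage 2: same structure with a stride of 3, so the total is sel0 + 3*sel1.
  dout <= std_logic_vector(shift_right(mid, 3 * sel1));
end architecture behav;
```

With 2 to 1 muxes the same 0 to 8 range needs stages for 1, 2, 4 and 8, which
is where the "usual 4" comes from.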

The reason I made it two stage was simply to prevent the synthesizer from
messing with my logic.  It's slow enough (due to the full length carry) that it
would make more engineering sense as a space saving circuit rather than a
highly pipelined design.  But I'm doing this for fun, so what the heck.

The last rough spot was getting the synthesizer to recognize that a LUT4 wasn't
interfering with a MULT_AND.  I had to instantiate the LUT4s.  I haven't
figured out how to apply an attribute to a generated component (i.e. like your
RLOC usage on generated components).  I'm assuming here that "generated" means
the use of the "generate" command in VHDL.  The basic problem is that I haven't
figured out how to properly address the components.  So I went ahead and
instantiated them individually.

If you have a way around that, I'd appreciate the secret.

Carl



######

From: "Stephen Melnikoff" <s.j.melnikoff@iee.org>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Wed, 19 Dec 2001 14:18:24 -0000
Organization: The University of Birmingham news server
Lines: 16
Message-ID: <9vq7ts$ggv$1@usenet.bham.ac.uk>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org>
NNTP-Posting-Host: eee553.bham.ac.uk
X-Trace: usenet.bham.ac.uk 1008771836 16927 147.188.145.247 (19 Dec 2001 14:23:56 GMT)
X-Complaints-To: usenet@usenet.bham.ac.uk
NNTP-Posting-Date: 19 Dec 2001 14:23:56 GMT
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2600.0000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newscore.univie.ac.at!194.25.134.126.MISMATCH!newsfeed01.sul.t-online.de!newsfeed00.sul.t-online.de!t-online.de!news-lei1.dfn.de!news-fra1.dfn.de!news-koe1.dfn.de!lnewspeer00.lnd.ops.eu.uu.net!emea.uu.net!server1.netnews.ja.net!usenet.bham.ac.uk!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12618

> The problem with doing it is that it's hard to get the output of the "F" LUT
> out of the slice.  But it can be done by bringing it out the CARRY-OUT.

Perhaps I'm missing something, but why can't you send the output of the
F-LUT to output X, and the output of the F5-MUX to output F5?

Stephen Melnikoff.

--
Stephen Melnikoff - s.j.melnikoff@iee.org
Electronic, Electrical and Computer Engineering
University of Birmingham, Birmingham, UK

######

From: gah@ugcs.caltech.edu (glen herrmannsfeldt)
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: 19 Dec 2001 23:27:08 GMT
Organization: California Institute of Technology, Pasadena
Lines: 28
Message-ID: <9vr7oc$rrn@gap.cco.caltech.edu>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org> <3C2027D6.F445EA75@andraka.com> <865212a8446d17759b69d0dd8852d212.51709@mygate.mailgate.org>
NNTP-Posting-Host: zloty.ugcs.caltech.edu
X-Newsreader: NN version 6.5.0 #1 (NOV)
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.stanford.edu!logbridge.uoregon.edu!nntp-server.caltech.edu!gah
Xref: chonsp.franklin.ch comp.arch.fpga:12620

"Carl Brannen" <carl.brannen@terabeam.com> writes:

>No question that messing around with the carries is going to slow 
>down a barrel shifter. It's been a very long time since I used one
> (okay, it was actually a "funnel shifter", but I hate that term, 
>so I call them all barrel shifters).  I just got the one that uses 
>the carries to effectively fit a 3 to 1 mux into a LUT to place and 
>route correctly.  It uses 38 LUTs to do a two stage 
>shift of 0 to 8 bits (i.e. a 9-bit barrel shift) on a 16-bit input.

>It's very efficient when you need a power of 3 shift size.  
>You get a 9-bit shift with only 2 stages of logic instead 
>of the usual 4.  I'll post it to the thread in a minute.

It would seem that you could use it for an 8 bit shift, as part
of a larger binary barrel shifter.  Two would do 64, for example.

Otherwise, one use for barrel shifters is floating point normalization.

IBM used a base 16, instead of base 2, floating point in S/360,
S/370, through the beginning of ESA/390.  I believe one reason
for base 16 floating point is the reduced need for shift logic.

I think that for floating point in FPGAs, where barrel shifting
is relatively more expensive, base 16 should be considered.
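
To put rough numbers on that: a 24-bit fraction (six hex digits, as in the
S/360 short format) normalized in base 2 can need any shift from 0 to 23, which
is five levels of 2 to 1 muxes, while in base 16 it only needs a shift of 0 to
5 digits, which is three levels.  A minimal behavioral sketch of the base 16
case follows; the entity and port names are hypothetical and the leading-digit
detection that would drive "digits" is not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical normalization shifter that only moves in 4-bit (hex digit) steps.
entity hex_norm_shift is
  port (
    frac   : in  std_logic_vector(23 downto 0);
    digits : in  natural range 0 to 5;   -- number of leading zero hex digits
    dout   : out std_logic_vector(23 downto 0));
end entity hex_norm_shift;

architecture behav of hex_norm_shift is
begin
  -- Only six shift amounts exist (0, 4, 8, 12, 16, 20), so the select is 3 bits.
  dout <= std_logic_vector(shift_left(unsigned(frac), 4 * digits));
end architecture behav;
```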

-- glen

######

From: Frederic Rivoallon <frederic.rivoallon@xilinx.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Thu, 20 Dec 2001 11:29:15 -0800
Organization: Xilinx, Inc.
Lines: 18
Message-ID: <3C223C0B.C9E220CF@xilinx.com>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org> <3C2027D6.F445EA75@andraka.com> <865212a8446d17759b69d0dd8852d212.51709@mygate.mailgate.org>
NNTP-Posting-Host: 149.199.9.135
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mozilla 4.74 [en]C-CCK-MCD   (WinNT; U)
X-Accept-Language: en
To: Carl Brannen <carl.brannen@terabeam.com>
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!cpk-news-hub1.bbnplanet.com!cambridge1-snf1.gtei.net!news.gtei.net!bos-service1.ext.raytheon.com!attla1!ip.att.net!newsgate.xilinx.com!cliff.xsj.xilinx.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12575


Carl Brannen wrote:

> ....  I haven't figured out how to apply an attribute to a generated component
> (i.e. like your RLOC usage on generated components).  I'm assuming here that
> "generated" means the use of the "generate" command in VHDL.  The basic problem
> is that I haven't figured out how to properly address the components.  So I
> went ahead and instantiated them individually.
>
> If you have a way around that, I'd appreciate the secret.
>

Check this link (How to Attach Attributes Inside of Generate?):
http://tech-www.informatik.uni-hamburg.de/vhdl/doc/faq/FAQ1.html#attributes

Frédéric

######

From: "Carl Brannen" <carl.brannen@terabeam.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Thu, 20 Dec 2001 23:11:24 +0000 (UTC)
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
Lines: 17
Message-ID: <fa498b3b516e368fdff035a3a2aefb41.51709@mygate.mailgate.org>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org> <3C2027D6.F445EA75@andraka.com> <865212a8446d17759b69d0dd8852d212.51709@mygate.mailgate.org> <3C223C0B.C9E220CF@xilinx.com>
NNTP-Posting-Host: firewall.terabeam.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.mailgate.org 1008884794 14787 216.137.15.2 (Fri Dec 21 00:11:24 2001)
X-Complaints-To: abuse@mailgate.org
NNTP-Posting-Date: Thu, 20 Dec 2001 23:11:24 +0000 (UTC)
Injector-Info: news.mailgate.org; posting-host=firewall.terabeam.com; posting-account=51709; posting-date=1008884794
User-Agent: Mailgate Web Server
X-URL: http://www.Mailgate.ORG
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!3330945!news.imp.ch!fr.clara.net!heighliner.fr.clara.net!proxad.net!news-hub.cableinet.net!blueyonder!btnet-peer!btnet-peer0!btnet!news.mailgate.org!web2news!firewall.terabeam.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12615

Frédéric, that link is much appreciated, and though I haven't tried it yet, I
believe it will do the trick.

Carl

> > 
> Check this link (How to Attach Attributes Inside of Generate?):
> http://tech-www.informatik.uni-hamburg.de/vhdl/doc/faq/FAQ1.html#attributes
> 
> Frédéric



######

Message-ID: <3C227282.44E19DC4@andraka.com>
From: Ray Andraka <ray@andraka.com>
Organization: Andraka Consulting Group, Inc
X-Mailer: Mozilla 4.77 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <3C1F9ACC.AFDA823@irisa.fr> <3C1FA329.76322373@xilinx.com> <3C1FB0C3.37E32F2D@andraka.com> <04af47cf81c9461ae53b109bdf127ac1.51709@mygate.mailgate.org> <3C2027D6.F445EA75@andraka.com> <865212a8446d17759b69d0dd8852d212.51709@mygate.mailgate.org> <3C223C0B.C9E220CF@xilinx.com>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Lines: 56
Date: Thu, 20 Dec 2001 23:20:15 GMT
NNTP-Posting-Host: 24.13.238.93
X-Complaints-To: abuse@home.net
X-Trace: news1.wwck1.ri.home.com 1008890415 24.13.238.93 (Thu, 20 Dec 2001 15:20:15 PST)
NNTP-Posting-Date: Thu, 20 Dec 2001 15:20:15 PST
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!126361!news.imp.ch!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!netnews.com!newshub2.rdc1.sfba.home.com!news.home.com!news1.wwck1.ri.home.com.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12578

Since this seems to come up fairly frequently, here is a simple example of RLOCs
applied as an attribute inside the generate.  The itoa function is a homebrew
function that converts an integer into an ASCII string.  If your synthesizer
recognizes 'image, you could use that instead.  Because of visibility, the
attributes have to be assigned at the same level of code as the labels they are
attached to.  In other words, components inside a generate are not visible
outside the generate, so the attributes have to be in the generate's
declaration section.

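   LEN: for i in 0 to bits-1 generate
         -- Placement constants, evaluated once per generate iteration.
         constant row      : natural := ((width-1)/2)-(i/2);
         constant column   : natural := 0;
         constant slice    : natural := 0;
         constant rloc_str : string  := "R" & itoa(row) & "C" & itoa(column) &
                                        ".S" & itoa(slice);
         -- Declared inside the generate so that the label U1 is visible here.
         attribute RLOC of U1: label is rloc_str;
    begin

         -- j, ff_d, lcl_en and en_idx are not shown in this excerpt.
         U1: FDE port map (
              Q  => dd(j),
              D  => ff_d,
              C  => clk,
              CE => lcl_en(en_idx));
   end generate LEN;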
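
The homebrew itoa is not shown in the post.  A minimal sketch of what such a
function could look like, assuming only non-negative values are needed and
using a hypothetical package name, is below; it is a stand-in for illustration,
not the actual function used above.  Where the synthesizer supports it,
integer'image(row) is the built-in alternative already mentioned.

```vhdl
-- Hypothetical package; the name rloc_util is an assumption.
package rloc_util is
  -- Converts a non-negative integer to its decimal string, e.g. for RLOCs.
  function itoa (n : natural) return string;
end package rloc_util;

package body rloc_util is
  function itoa (n : natural) return string is
    constant digits : string(1 to 10) := "0123456789";
    constant d      : positive := (n mod 10) + 1;  -- index of the last digit
  begin
    if n < 10 then
      return digits(d to d);                   -- single digit, no recursion
    else
      return itoa(n / 10) & digits(d to d);    -- leading digits, then the last
    end if;
  end function itoa;
end package body rloc_util;
```

Because every argument in the generate is a constant, the recursion is resolved
at elaboration time and produces no logic.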


Frederic Rivoallon wrote:

> Carl Brannen wrote:
>
> > ....  I haven't figured out how to apply an attribute to a generated
> > component (i.e. like your RLOC usage on generated components).  I'm assuming
> > here that "generated" means the use of the "generate" command in VHDL.  The
> > basic problem is that I haven't figured out how to properly address the
> > components.  So I went ahead and instantiated them individually.
> >
> > If you have a way around that, I'd appreciate the secret.
> >
>
> Check this link (How to Attach Attributes Inside of Generate?):
> http://tech-www.informatik.uni-hamburg.de/vhdl/doc/faq/FAQ1.html#attributes
>
> Frédéric


######

From: "Carl Brannen" <carl.brannen@terabeam.com>
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
Date: Fri, 21 Dec 2001 23:38:44 +0000 (UTC)
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
Lines: 17
Message-ID: <b328360181a3d2b93fcd80254a5de7a7.51709@mygate.mailgate.org>
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <9vq7ts$ggv$1@usenet.bham.ac.uk>
NNTP-Posting-Host: firewall.terabeam.com
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.mailgate.org 1008971919 29799 216.137.15.2 (Sat Dec 22 00:38:44 2001)
X-Complaints-To: abuse@mailgate.org
NNTP-Posting-Date: Fri, 21 Dec 2001 23:38:44 +0000 (UTC)
Injector-Info: news.mailgate.org; posting-host=firewall.terabeam.com; posting-account=51709; posting-date=1008971919
User-Agent: Mailgate Web Server
X-URL: http://www.Mailgate.ORG
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.mailgate.org!web2news!firewall.terabeam.com!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12710

Hi Stephen,

> Perhaps I'm missing something, but why can't you send the output of the
> F-LUT to output X, and the output of the F5-MUX to output F5?

As far as I can see, you can do that, and then use the same algorithm I
gave in order to get the G-LUT output out of the slice with the F6-MUX.

Unfortunately, my brain is just a few neurons short of quickly figuring out
how efficient it would be.  My guess is that it is an improvement...

Carl



######

Message-ID: <3C2411A5.28EB3EE5@andraka.com>
From: Ray Andraka <ray@andraka.com>
Organization: Andraka Consulting Group, Inc
X-Mailer: Mozilla 4.77 [en] (WinNT; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.arch.fpga
Subject: Re: Barrel shifter puts three 2->1 muxes / slice in Xilinx
References: <fc4fe869771999cd5c1de2cca90f054a.51709@mygate.mailgate.org> <9vq7ts$ggv$1@usenet.bham.ac.uk>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 33
Date: Sat, 22 Dec 2001 04:51:22 GMT
NNTP-Posting-Host: 24.13.238.93
X-Complaints-To: abuse@home.net
X-Trace: news1.wwck1.ri.home.com 1008996682 24.13.238.93 (Fri, 21 Dec 2001 20:51:22 PST)
NNTP-Posting-Date: Fri, 21 Dec 2001 20:51:22 PST
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!newsfeed00.sul.t-online.de!t-online.de!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!wn4feed!worldnet.att.net!24.0.0.38!newshub2.rdc1.sfba.home.com!news.home.com!news1.wwck1.ri.home.com.POSTED!not-for-mail
Xref: chonsp.franklin.ch comp.arch.fpga:12707

The F5 output only goes to the F6 mux in the neighboring slice, nowhere else.
At the F6 you run into a similar problem, because it has to get out somewhere,
and its other input has to be sourced by the F5 in that slice, which in turn
is sourced by the LUTs.

Stephen Melnikoff wrote:

> > The problem with doing it is that it's hard to get the output of the "F" LUT
> > out of the slice.  But it can be done by bringing it out the CARRY-OUT.
>
> Perhaps I'm missing something, but why can't you send the output of the
> F-LUT to output X, and the output of the F5-MUX to output F5?
>
> Stephen Melnikoff.
>
> --
> Stephen Melnikoff - s.j.melnikoff@iee.org
> Electronic, Electrical and Computer Engineering
> University of Birmingham, Birmingham, UK
