From: "Cary McCormick"
Newsgroups: comp.arch.fpga
Subject: Clock skew with Xilinx DLLs...
Organization: WebUseNet Corp. - ReInventing The UseNet
Date: Thu, 2 Aug 2001 16:47:27 -0500

Hi folks,
I'm using a DLL in a SpartanII design and have discovered with lab experimentation that lo and behold, the /4 output lags the edge of the x1 output by about 1ns. I'm certain that I'm using the DLL correctly (BUFGs on both outputs, feedback comes from BUFG'd x1 output) and I imagine that the phase difference is due entirely to loading differences since the /4 clock is *much* more heavily loaded than the x1 clock.
So, given that we're kind of stuck with this (what's the point of BUFG's anyway if this happens?) how can I design with this? Will the Design Manager (using 3.1) check for setup problems? Any design tricks that the gurus can share on this matter?? Safety precautions I can add to the UCF file??
Thanks!!

Cary McCormick

######
Message-ID: <3B69F32D.8EF5B75F@mail.com>
From: John_H
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
Date: Fri, 03 Aug 2001 00:41:14 GMT
Organization: Tektronix NewsReader Service

I don't know the DLL and BUFG count around where you're working, but how about slaving the 1x clock to the /4? It'll take some rethinking to get the whole system to come back and align, but if your feedbacks are what give you the edge matching, you need the heavily loaded net in the feedback path.

Good luck!

Cary McCormick wrote:

> Hi folks,
> I'm using a DLL in a SpartanII design and have discovered with lab
> experimentation that lo and behold, the /4 output lags the edge of the x1
> output by about 1ns. I'm certain that I'm using the DLL correctly (BUFGs on
> both outputs, feedback comes from BUFG'd x1 output) and I imagine that the
> phase difference is due entirely to loading differences since the /4 clock
> is *much* more heavily loaded than the x1 clock.
> So, given that we're kind of stuck with this (what's the point of BUFG's
> anyway if this happens?) how can I design with this? Will the Design Manager
> (using 3.1) check for setup problems? Any design tricks that the gurus can
> share on this matter?? Safety precautions I can add to the UCF file??
> Thanks!!
>
> Cary McCormick

######
Message-ID: <3B6A0A3D.1BEF585D@andraka.com>
From: Ray Andraka
Organization: Andraka Consulting Group, Inc
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
Date: Fri, 03 Aug 2001 02:18:12 GMT

1ns sounds pretty high for just loading differences. Make sure your clock is clean and stable. Also, if you have lots of outputs toggling on the same bank as the clock input, you might move them. For experimentation purposes, perhaps you can hold those outputs constant to see if the skew improves. Also, what is the resolution of your test setup? It might be exaggerating the skew.

Anyway, I have come to avoid depending on the edges being aligned (the case where we had a problem had on the order of 500ps skew, partly because of greatly unequal loading, partly because of clock jitter we could do nothing about). Changing the UCF generally isn't going to help the situation. The problem comes about in the design because you are moving data between clock domains on edges that are supposed to be aligned. You'll have to change your design to make sure data is stable on both sides of the transition where you cross clock domains. There are a number of ways to do it. You might refer to Peter Alfke's (Xilinx) app note on synchronizing data.
It is intended for asynch interfaces, but the methods apply here just as well.

Cary McCormick wrote:

> Hi folks,
> I'm using a DLL in a SpartanII design and have discovered with lab
> experimentation that lo and behold, the /4 output lags the edge of the x1
> output by about 1ns. I'm certain that I'm using the DLL correctly (BUFGs on
> both outputs, feedback comes from BUFG'd x1 output) and I imagine that the
> phase difference is due entirely to loading differences since the /4 clock
> is *much* more heavily loaded than the x1 clock.
> So, given that we're kind of stuck with this (what's the point of BUFG's
> anyway if this happens?) how can I design with this? Will the Design Manager
> (using 3.1) check for setup problems? Any design tricks that the gurus can
> share on this matter?? Safety precautions I can add to the UCF file??
> Thanks!!
>
> Cary McCormick

--
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

######
Message-ID: <3B6AA687.2E3E73E4@andraka.com>
From: Ray Andraka
Organization: Andraka Consulting Group, Inc
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
References: <3B69F32D.8EF5B75F@mail.com>
Date: Fri, 03 Aug 2001 13:25:01 GMT

In the case we had, the output of the driving flip-flop was going to the direct input of a flip-flop in the other slice in the same CLB, without passing through a LUT and without using the general routing. IIRC the route delay for that fast route is a max of 0.17ns in a -4 part. The Tcko and Tsu/Th values are based on worst case, so a typical chip is going to go much faster.

To measure the skew, we toggled output flip-flops in adjacent IOBs, one on each of the two clocks. This gives you the shortest connection between the clock net and the pad. By using adjacent IOBs, we assumed similar Tcko for both flip-flops. There is no routing variability introduced because the output flop to pad is a dedicated route (through a tristate buffer). We did this in several locations around the chip.

Falk wrote:

> How can 500ps of clock skew cause this kind of trouble? The clock to out time plus some routing should always be greater than this? How did you measure the skew (skew matching of IO cells, equal routing to IO cells)
>
> Regards
> Falk

--
-Ray Andraka, P.E.

######
From: "Kevin Neilson"
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
Date: Fri, 03 Aug 2001 17:18:41 GMT
Organization: EarthLink Inc. -- http://www.EarthLink.net

This is exactly what I don't understand about the divided outputs on the DLL. They aren't fed back to the feedback, so they aren't synchronous to the input, so what good are they? There's no way for the DLL to know the delay across the BUFG unless it is fed back, and you can only do that with 1X or 2X outputs.

The best thing for you to do is probably use the same clock and clock-enable the slow logic every fourth cycle. Then you have to constrain all that circuitry as 4-cycle multicycle paths. Since the clock enable is then the critical path (since it's not a multicycle path) you may have to use a directive (like syn_direct_enable in Synplify) to ensure that the clock enable gets directly connected to the CE on the flops.

"Cary McCormick" wrote in message
news:qLja7.10341$C7.5227435@e3500-chi1.usenetserver.com...
>
> Hi folks,
> I'm using a DLL in a SpartanII design and have discovered with lab
> experimentation that lo and behold, the /4 output lags the edge of the x1
> output by about 1ns.
> I'm certain that I'm using the DLL correctly (BUFGs on
> both outputs, feedback comes from BUFG'd x1 output) and I imagine that the
> phase difference is due entirely to loading differences since the /4 clock
> is *much* more heavily loaded than the x1 clock.
> So, given that we're kind of stuck with this (what's the point of BUFG's
> anyway if this happens?) how can I design with this? Will the Design Manager
> (using 3.1) check for setup problems? Any design tricks that the gurus can
> share on this matter?? Safety precautions I can add to the UCF file??
> Thanks!!
>
> Cary McCormick

######
Message-ID: <3B6AE71E.52E22A18@andraka.com>
From: Ray Andraka
Organization: Andraka Consulting Group, Inc
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
Date: Fri, 03 Aug 2001 18:00:35 GMT

Only one output from the DLL can be fed back to its input, so even using the 2x you run into the same thing. The DLL is designed with the individual outputs closely matched so that there is very little skew coming out of them. Likewise, the BUFGs that you can reach with a DLL are co-located and closely matched so that with an equal loading the individual clock trees are phase aligned. The problems occur when the loading on the clock networks is heavily skewed.
We also found that jitter on the DLL clock input seems to cause the phase alignment of the individual outputs to move, probably more so than the clock network loading.

The problem with using the clock enable as described below is that you can't put that on a clock network, and the "low skew" global networks are too slow if you're clocking at even half of what the part is capable of. In this particular case, he said most of the chip is being clocked by the 1/4 clock, in which case you would need to distribute a fast CE over the whole chip. You also unnecessarily congest the routing and increase power. Using two clocks is a good solution, just be careful when crossing the boundaries of the clock domains.

Kevin Neilson wrote:

> This is exactly what I don't understand about the divided outputs on the
> DLL. They aren't fed back to the feedback, so they aren't synchronous to
> the input, so what good are they? There's no way for the DLL to know the
> delay across the BUFG unless it is fed back, and you can only do that with
> 1X or 2X outputs.
>
> The best thing for you to do is probably use the same clock and clock-enable
> the slow logic every fourth cycle. Then you have to constrain all that
> circuitry as 4-cycle multicycle paths. Since the clock enable is then the
> critical path (since it's not a multicycle path) you may have to use a
> directive (like syn_direct_enable in Synplify) to ensure that the clock
> enable gets directly connected to the CE on the flops.
>
> "Cary McCormick" wrote in message
> news:qLja7.10341$C7.5227435@e3500-chi1.usenetserver.com...
> >
> > Hi folks,
> > I'm using a DLL in a SpartanII design and have discovered with lab
> > experimentation that lo and behold, the /4 output lags the edge of the x1
> > output by about 1ns.
> > I'm certain that I'm using the DLL correctly (BUFGs on
> > both outputs, feedback comes from BUFG'd x1 output) and I imagine that the
> > phase difference is due entirely to loading differences since the /4 clock
> > is *much* more heavily loaded than the x1 clock.
> > So, given that we're kind of stuck with this (what's the point of BUFG's
> > anyway if this happens?) how can I design with this? Will the Design Manager
> > (using 3.1) check for setup problems? Any design tricks that the gurus can
> > share on this matter?? Safety precautions I can add to the UCF file??
> > Thanks!!
> >
> > Cary McCormick

--
-Ray Andraka, P.E.

######
From: Falk Brunner
Newsgroups: comp.arch.fpga
Subject: Re: Clock skew with Xilinx DLLs...
Date: Sat, 04 Aug 2001 17:44:38 +0200
Message-ID: <3B6C1866.DC44C320@gmx.de>
References: <3B69F32D.8EF5B75F@mail.com> <3B6AA687.2E3E73E4@andraka.com>

Ray Andraka wrote:

>
> that fast route is a max of 0.17ns in a -4 part. The Tcko and Tsu/Th values are based on worst case, so a typical chip is going to go much faster.

Hmmm.

>
> We toggled the output flip flops in adjacent IOBs by the two clocks. This gives you the shortest connection between the clock net and the pad. By using adjacent IOBs, we assumed similar Tcko for both flip-flops.
> There is no routing variability introduced because the output flop to pad is a dedicated route (through a tristate buffer). We did this in several locations around the chip.

;-))
It's always nice to gain knowledge from a master. Thanks a lot.

--
Best regards,
Falk
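
######
A rough way to see why a few hundred picoseconds of skew can bite on the direct flip-flop-to-flip-flop path Ray describes: the receiving flop is safe only if the new data arrives no earlier than the (late) receiving clock edge plus its hold time. The sketch below just does that subtraction in picoseconds. The 500ps and 1ns skews and the 0.17ns route figure come from the posts above; the Tcko and Th numbers are made-up placeholders (not datasheet values), and the function name is only for illustration. Note also that 0.17ns was quoted as a maximum, so using it here is optimistic; a real hold check would use minimum delays.

```python
def hold_margin_ps(tcko_ps, route_ps, skew_ps, th_ps=0):
    """Hold margin in ps when the receiving clock lags the launching clock by skew_ps.

    Negative means the new data can reach the receiving flop before it has
    finished capturing the old data (a hold violation / race).
    """
    return tcko_ps + route_ps - skew_ps - th_ps


ROUTE_PS = 170         # direct in-CLB route, quoted in the thread as a max for a -4 part
ASSUMED_TCKO_PS = 300  # ASSUMPTION: hypothetical fast-silicon clock-to-out, not a datasheet value
ASSUMED_TH_PS = 0      # ASSUMPTION: hypothetical hold requirement, not a datasheet value

# 500 ps is the skew Ray reported in his case, 1000 ps is what Cary measured.
for skew_ps in (500, 1000):
    margin = hold_margin_ps(ASSUMED_TCKO_PS, ROUTE_PS, skew_ps, ASSUMED_TH_PS)
    status = "ok" if margin >= 0 else "hold violation"
    print(f"skew {skew_ps} ps -> margin {margin} ps ({status})")
```

With these placeholder numbers both cases come out negative, which matches the point made in the thread: the datasheet Tcko is a worst-case (slow) figure, so a typical part can launch data faster than the skewed clock arrives. Plug in real minimum delays for your part before drawing any conclusion.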