From: "GERRIT SLAVENBURG" Newsgroups: comp.arch,alt.folklore.computers Subject: CPU's with booleans in general purpose registers [new thread] Date: Wed, 10 Nov 1999 15:21:38 -0800 Organization: Prodigy Internet http://www.prodigy.com Lines: 21 Message-ID: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> NNTP-Posting-Host: snfcb612-32.splitrock.net X-Trace: newssvr03-int.news.prodigy.com 942276152 4148544 209.252.7.135 (10 Nov 1999 23:22:32 GMT) X-Complaints-To: abuse@prodigy.net NNTP-Posting-Date: 10 Nov 1999 23:22:32 GMT X-Newsreader: Microsoft Outlook Express 5.00.2314.1300 X-MSMail-Priority: Normal X-Priority: 3 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!fu-berlin.de!howland.erols.net!newscon04-ext.news.prodigy.com.MISMATCH!newscon04!prodigy.com!not-for-mail Thanks to all for suggesting the different architectures and solutions to comparing/branching. What I was actually looking for are CPU architectures that use as PRIMARY method for conditional control flow: predicate operations that produce booleans in gen. purp. registers e.g. SET Rd IF GEQ(Ra, Rb), which sets GPR Rd as 0 or 1 (like the MIPS operations that do exactly this) jump true/jump false operations JUMP if Rd to address (immediate or register content) Several architectures mentioned (MIPS) can do this. It is obviously a good method for efficient evaluation of C-language expressions. Are there any architectures that do ALL conditional jumps this way ? Gerrit ###### From: "Andy Glew" Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: Wed, 10 Nov 1999 21:01:56 -0600 Organization: U. Wisc. 
& Intel (not official) Lines: 20 Message-ID: <80db8a$i1j@spool.cs.wisc.edu> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> NNTP-Posting-Host: egeus.cs.wisc.edu X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.00.2314.1300 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!naxos.belnet.be!news.belnet.be!newshub.bart.net!colt.net!news.maxwell.syr.edu!newsswitch.lcs.mit.edu!uchinews!uwvax!news > What I was actually looking for are CPU architectures that > use as PRIMARY method for conditional control flow: Why do I get this sneaking feeling that somebody is preparing a patent lawsuit? I've been burned in this particular area: Xerox has patents on comparisons that generate all 0s or all 1s, forms suitable for use in subsequent masking operations. Apparently it doesn't matter that machines dating all the way back to the 1960s do this; it doesn't matter that my undergraduate CPU project notes have a section discussing whether you should generate 0x00000000/0xFFFFFFFF (good for masking) or 0/1 (good for C) or multiple condition codes in the destination register (like the M88K later did). Apparently what matters are precise wordings, such as "PRIMARY method for comparisons". 
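The two result conventions mentioned above (0/1, which suits C, versus all-zeros/all-ones, which suits masking) can be sketched in Python. This is only an illustration assuming 32-bit registers; the function names are hypothetical and no particular instruction set is being modeled.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits


def set_ge_bool(a, b):
    """0/1 convention (C style): 1 if a >= b, else 0."""
    return 1 if a >= b else 0


def set_ge_mask(a, b):
    """Mask convention: 0xFFFFFFFF if a >= b, else 0x00000000."""
    return MASK32 if a >= b else 0


def select_with_mask(mask, x, y):
    """Branch-free select: x where the mask is all ones, y where it is all zeros."""
    return (x & mask) | (y & ~mask & MASK32)


def bool_to_mask(flag):
    """Convert a 0/1 result to a mask by negating it in 32-bit two's complement."""
    return (-flag) & MASK32


def branchless_max(a, b):
    """max(a, b) for non-negative 32-bit values, built from a compare and a select."""
    return select_with_mask(set_ge_mask(a, b), a, b)
```

With the 0/1 form, a comparison result can be added directly to a counter; with the mask form, it can feed AND/OR selection without a branch. Negation converts one form into the other, so the choice mostly decides which use needs the extra instruction.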
###### From: dpeschel@u.washington.edu (Derek Peschel) Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: 11 Nov 1999 00:45:29 GMT Organization: University of Washington, Seattle Lines: 35 Message-ID: <80d3j9$8e2$1@nntp6.u.washington.edu> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> NNTP-Posting-Host: saul1.u.washington.edu X-Trace: nntp6.u.washington.edu 942281129 8642 (None) 140.142.17.38 X-Complaints-To: help@cac.washington.edu NNTP-Posting-User: dpeschel Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!rz.uni-karlsruhe.de!newsfeed.nacamar.de!news.maxwell.syr.edu!newsfeed.berkeley.edu!news.u.washington.edu!dpeschel In article <80cuno$u0e$1@newssvr03-int.news.prodigy.com>, GERRIT SLAVENBURG wrote: >Thanks to all for suggesting the different architectures >and solutions to comparing/branching. > >What I was actually looking for are CPU architectures that >use as PRIMARY method for conditional control flow: > > predicate operations that produce booleans in gen. purp. registers > e.g. SET Rd IF GEQ(Ra, Rb), which sets GPR Rd as 0 or 1 > (like the MIPS operations that do exactly this) > > jump true/jump false operations > JUMP if Rd to address (immediate or register content) > >Several architectures mentioned (MIPS) can do this. It is obviously >a good method for efficient evaluation of C-language expressions. >Are there any architectures that do ALL conditional jumps this way ? The Transputer, I think. (I was just reading about it the other day... what a strange architecture!) Note that it uses a stack, however, so I think the _only_ GPR you can set to 0 or 1 is the top-of-stack. I could be wrong... The Transputer has a short opcode/operand format where each is 4 bits, and 13 useful opcodes out of the 16. 
Two more opcodes allow you to extend the size of the operand beyond 4 bits, and the last opcode is an "escape" mechanism that lets you treat the operand as one of a large additional set of opcodes. So there could be conditional instructions hiding in the extended opcodes, but I doubt it. You may find that other stack-oriented machines share the same conditional jump mechanism. But again, they probably would allow you only to set, and branch based on, the top-of-stack (not other registers). -- Derek ###### Newsgroups: comp.arch,alt.folklore.computers From: nickerson@mirage.boeing.com () Subject: Re: CPU's with booleans in general purpose registers [new thread] X-Nntp-Posting-Host: pundit.ds.boeing.com Message-ID: Lines: 35 Sender: nntp@news.boeing.com (Boeing NNTP News Access) Reply-To: nickerson@mirage.boeing.com () Organization: Boeing Defense & Space Group / Software Systems X-Newsreader: mxrn 6.18-32 References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> Date: Thu, 11 Nov 1999 02:52:28 GMT Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.ip-plus.net!News.Amsterdam.UnisourceCS!skynet.be!hermes.visi.com!news-out.visi.com!uunet!ffx.uu.net!xyzzy!mirage.boeing.com!nickerson In article <80cuno$u0e$1@newssvr03-int.news.prodigy.com>, "GERRIT SLAVENBURG" writes: |>Thanks to all for suggesting the different architectures |>and solutions to comparing/branching. |> |>What I was actually looking for are CPU architectures that |>use as PRIMARY method for conditional control flow: |> |> predicate operations that produce booleans in gen. purp. registers |> e.g. SET Rd IF GEQ(Ra, Rb), which sets GPR Rd as 0 or 1 |> (like the MIPS operations that do exactly this) |> |> jump true/jump false operations |> JUMP if Rd to address (immediate or register content) |> |>Several architectures mentioned (MIPS) can do this. It is obviously |>a good method for efficient evaluation of C-language expressions. 
|>Are there any architectures that do ALL conditional jumps this way ? |> |> Gerrit I'm No Expert but ...; looks like Compaq/DEC Alpha ONLY uses registers to contain the condition for conditional branch & jump; this is opposite from the VAX and thus probably done as a big lesson they learned on bottlenecks to pipelining and multiissue; given that they are also big on both hints and prediction I'd guess that they want to try real hard to be "right" in each memory access while the clock zips along; ref) Alpha AXP Architecture Reference Manual 2nd ed, 1995, Digital Press, R.L. Sites & R.T. Witek, ISBN 1-55558-145-5 --bn (Bart Nickerson) nickerson@pundit.ds.boeing.com (206) 662-0183 ###### From: s_d_lew@my-deja.com Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: Thu, 11 Nov 1999 07:19:00 GMT Organization: Deja.com - Before you buy. Lines: 73 Message-ID: <80dql1$8t4$1@nnrp1.deja.com> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> NNTP-Posting-Host: 140.174.105.2 X-Article-Creation-Date: Thu Nov 11 07:19:00 1999 GMT X-Http-User-Agent: Mozilla/4.5 [en] (Win98; I) X-Http-Proxy: 1.0 x26.deja.com:80 (Squid/1.1.22) for client 140.174.105.2 X-MyDeja-Info: XMYDJUIDs_d_lew Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.ip-plus.net!News.Amsterdam.UnisourceCS!newspeer.te.net!news.indigo.ie!iol.ie!newsfeed.icl.net!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!nntp2.deja.com!nnrp1.deja.com!not-for-mail In article <80d3j9$8e2$1@nntp6.u.washington.edu>, dpeschel@u.washington.edu (Derek Peschel) wrote: > In article <80cuno$u0e$1@newssvr03-int.news.prodigy.com>, > GERRIT SLAVENBURG wrote: > >Thanks to all for suggesting the different architectures > >and solutions to comparing/branching. 
> > > >What I was actually looking for are CPU architectures that > >use as PRIMARY method for conditional control flow: > > > > predicate operations that produce booleans in gen. purp. registers > > e.g. SET Rd IF GEQ(Ra, Rb), which sets GPR Rd as 0 or 1 > > (like the MIPS operations that do exactly this) > > > > jump true/jump false operations > > JUMP if Rd to address (immediate or register content) > > > >Several architectures mentioned (MIPS) can do this. It is obviously > >a good method for efficient evaluation of C-language expressions. > >Are there any architectures that do ALL conditional jumps this way ? > > The Transputer, I think. (I was just reading about it the other day... what > a strange architecture!) Note that it uses a stack, however, so I think the > _only_ GPR you can set to 0 or 1 is the top-of-stack. > > I could be wrong... The Transputer has a short opcode/operand format where > each is 4 bits, and 13 useful opcodes out of the 16. Two more opcodes allow > you to extend the size of the operand beyond 4 bits, and the last opcode is > an "escape" mechanism that lets you treat the operand as one of a large > additional set of opcodes. So there could be conditional instructions > hiding in the extended opcodes, but I doubt it. > > You may find that other stack-oriented machines share the same conditional > jump mechanism. But again, they probably would allow you only to set, and > branch based on, the top-of-stack (not other registers). > > -- Derek Basically the transputer is similar to the mips except more primative (at least mips has both beqz and bneqz) The transputer (ST20) has a 3 operand stack and used the "cj" (conditional jump if zero/false) instruction which checks the top of stack and does a relative jump based on the operand. To compute the true/false value (non-zero/zero) value unless it was there already you could use "eqc" (equal constant), or "gt" (greater than). 
However, one quirky thing is that if A (the top of stack) was zero, the branch was taken and the stack was undisturbed; if the branch was not taken, the stack was popped and A disappeared, making it a pain to write boolean logic code... :-(

The only other branch instruction is "lend" (loop end), which is really just a specialized form of cj and was only used by the occam compiler...

There is no conditionally executed code, except that, since the instructions were variable length, you could get funny effects by jumping into the middle of an instruction...

-slew

######
From: "Peter Klausler"
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Thu, 11 Nov 1999 09:38:26 -0600
Organization: Silicon Graphics, Inc.
Message-ID: <80enr1$a18$1@murrow.corp.sgi.com>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com>
NNTP-Posting-Host: eagan-rip213.cray.com
X-Newsreader: Microsoft Outlook Express 4.72.3110.5
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
Lines: 70
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!rz.uni-karlsruhe.de!newsfeed.nacamar.de!dispose.news.demon.net!demon!newsfeed.berkeley.edu!enews.sgi.com!news.corp.sgi.com!not-for-mail

GERRIT SLAVENBURG wrote in message <80cuno$u0e$1@newssvr03-int.news.prodigy.com>...
>Thanks to all for suggesting the different architectures
>and solutions to comparing/branching.

It seems to me that there are at least five separable questions here:

1) are comparisons explicit instructions, or are they implicit secondary results of other ops?
2) where do the Boolean results of comparisons go? (general regs, cc, special regs)
3) do comparisons include a predicate and generate a truth value (and perhaps its complement), or do they produce multiple results?
4) where do conditional branches get their data?
(one general reg, two regs, cc) 5) what predicates can be used on conditional branches? (t/f, compares to 0, cc mask, ?) Instruction set architectures show incredible variation here. Some have different answers for integer vs. floating-point branching. These questions are among the most interesting in instruction set design. Alpha seems cleanest to me among the scalar architectures: 1) comparisons are explicit 2) all comparisons generate results in registers (though the FP truth value is weird) 3) comparisons specify their predicates and produce single truth values 4) conditional branches read a single register 5) branch predicates include all six comparisons of a single value against zero (x=0, x!=0, x<0, x<=0, x>=0, x>0) The various proprietary Cray architectures are close, with the exceptions that the results of vector comparisons go into mask registers, and the predicate set contains only four comparisons of a single value against zero (=0, !=0, <0, and >=0; so <=0 and >0 are missing). The CDC 6600 and successors were similar. Support for "unsigned" integer comparison was weak. SV2 is of course perfect in this regard but I can't say much about it yet. :-) MIPS uses a condition code register to catch the results of FP comparisons, but has explicit integer comparison instructions for signed/unsigned "less than" that produce register results. MIPS integer branches can compare a single register value for inequality with zero all four ways, and can compare two register values for equality or inequality. MIPS has a permanently zero integer register, so the branch comparisons for =0 and !=0 are just special cases of the two-register equality/inequality branches. SPARC has both integer and FP condition codes, if memory serves. The integer cc is set as a side effect of an integer op when a bit is set in the instruction; the FP cc is the result of an explicit compare. 
Older micro architectures (680x, x86, 6502) all had integer condition codes that were unconditionally set as side effects of integer ops, which is hard to compile for.

Explicit floating-point comparison instructions seem to me to be a necessary consequence of using IEEE-754 FP arithmetic.

One likely reason for the schizophrenia in micro architectures between the integer side and the FP side (and not just in regard to branching) is their heritage from the days of separate FP coprocessors.

Peter Klausler, Cray Research

######
From: "Andy Glew"
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Thu, 11 Nov 1999 09:43:47 -0600
Organization: U. Wisc. & Intel (not official)
Lines: 15
Message-ID: <80enss$ao7@spool.cs.wisc.edu>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80db8a$i1j@spool.cs.wisc.edu> <80ec37$rrd@senator-bedfellow.MIT.EDU>
NNTP-Posting-Host: egeus.cs.wisc.edu
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!rz.uni-karlsruhe.de!newsfeed.nacamar.de!newspeer.monmouth.com!newsfeed.mathworks.com!bloom-beacon.mit.edu!newsfeed.sgi.net!nntp2.cerf.net!nntp3.cerf.net!qualcomm.com!uwvax!news

> I am informed that US patent litigation is slow and expensive even
> if the issues appear fairly clear. A patent lawyer once told me that
> he expected his company to get sued over a patent that was clearly
> invalid but it would cost a few hundred thousand dollars to win the
> case.

That's cheap. I think the expectation now is that winning a patent challenge, even the most simple and direct one, costs at least a million dollars.

You don't need too many such Pyrrhic victories.
###### From: Jan Vorbrueggen Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: 11 Nov 1999 10:20:37 +0100 Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany Lines: 11 Message-ID: References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80db8a$i1j@spool.cs.wisc.edu> NNTP-Posting-Host: luda.neuroinformatik.ruhr-uni-bochum.de X-Trace: sunu789.rz.ruhr-uni-bochum.de 942312026 18990 134.147.176.178 (11 Nov 1999 09:20:26 GMT) NNTP-Posting-Date: 11 Nov 1999 09:20:26 GMT X-Newsreader: Gnus v5.3/Emacs 19.33 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!fu-berlin.de!news.ruhr-uni-bochum.de!not-for-mail "Andy Glew" writes: > I've been burned in this particular area: Xerox has patents on comparisons > that generate all 0s or all 1s, forms suitable for use in subsequent masking > operations. Apparently it doesn't matter that machines dating all the way > back to the 1960s do this; I thought adequate documentation of prior art was a killer for a patent, even if already granted? 
Jan ###### From: "Mike Duffy" Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: Thu, 11 Nov 1999 10:31:11 -0500 Lines: 132 Message-ID: <80eng4$q7i$1@autumn.news.rcn.net> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> X-Trace: yCxCyNWvLJRh6zlGcXPl23Zv8wPdajUPNXJ6FaK0bq4= X-Complaints-To: abuse@rcn.com NNTP-Posting-Date: 11 Nov 1999 15:31:16 GMT X-Newsreader: Microsoft Outlook Express 4.72.3110.5 X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!newsfeed-zh.ip-plus.net!news.ip-plus.net!News.Amsterdam.UnisourceCS!uunet!ams.uu.net!grolier!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!feed1.news.rcn.net!rcn!not-for-mail Hi Bart, Perhaps I can provide a bit of background on the VAX and Alpha approaches. There are over 20 variations on the VAX Branch instruction, but I'll explain only a couple here. The VAX contains condition codes (bits) in the PSL (Processor Status Longword), which are affected by instruction execution. Branch instructions *may* check these bits to decide on a branch. Specifically Bxxxx instructions (Branch on condition) use these bits, whereas BBxxx (Branch on Bit) , BLxxx (Branch on Low Bit) instructions do not. The Alpha PS (counterpart to the VAX PSL) does *not* contain any such bits. The Alpha evaluates registers, whereas the VAX may evaluate either registers, memory, or PSL bits. There are two popular ways to branch and jump on VAX, and one of them is decidedly boolean, and based on register contents. One depends on condition codes in the PSL (Processor Status Longword). Consider the following code: CMPL R4,08(R7) BEQL 1$ The CMPL compares the contents of register R4 to the memory location 08 bytes past the address contained in R7 (This is a common approach, registers are often used to contain the base address of a data structure; fields within the structure are referenced by offset. 
The Alpha architecture also provides this addressing mode.) The CMPL instruction then sets the condition bits in the PSL. In this example, the critical bit is the Z (zero) bit. If the values are equal, Z will be set. The BEQL (Branch if Equal) then tests the Z bit and branches to label 1$ if Z is set.

The other main approach is to branch depending on register contents, or some location indicated by register, using the BLBx instructions. Consider this code:

    CALLS #0,routine
    BLBC R0,1$

The CALLS instruction calls a routine with stack arguments. The routine returns, leaving a success (or failure) code in register R0. (This is the convention - R0 and R1 are considered "scratch" registers and you may not depend on their value across calls.) The BLBC (Branch if Low Bit Clear) branches only if the low-order bit of R0 is clear. Under VMS, it is customary for all success codes to be odd and all failure codes to be even. Thus success and failure may be determined by BLBC or BLBS (Branch if Low Bit Set), without determining exactly which return value (of all the possible codes) was actually returned. Thus, all even values effectively evaluate to zero, and all odd values evaluate to one, when BLBx is used.

Both the VAX and Alpha architectures provide BEQL, BNEQ, BLBC, BLBS, and some others, but the collection of VAX variations is larger than the Alpha's, as you might expect.

Thus, the VAX architecture, as it is used by real software, quite often uses registers as the operand to a branch instruction, but is not limited to doing so.

Rather than being a "performance bottleneck" to multiissue and pipelining, I prefer to think that all the variations would simply be hideously difficult to implement on an architecture like Alpha. After all, the "Reduced" in "RISC" comes from only implementing relatively quick-and-easy instructions.
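The low-bit status convention described above (odd means success, even means failure, tested with BLBS/BLBC) can be sketched in Python. This is only an illustration: the status values and function names below are hypothetical and are not taken from any VMS header.

```python
# Hypothetical status values, for illustration only.
STATUS_OK = 1         # odd: success
STATUS_OK_ALT = 9     # odd: a different flavour of success
STATUS_FAILED = 12    # even: failure


def low_bit_set(status):
    """Models the BLBS test: true when the low-order bit is 1."""
    return (status & 1) == 1


def low_bit_clear(status):
    """Models the BLBC test: true when the low-order bit is 0."""
    return (status & 1) == 0


def call_and_check(routine):
    """Call a routine and classify its status by the low bit alone,
    without looking at which specific code came back."""
    status = routine()
    if low_bit_clear(status):
        return "failed"
    return "ok"
```

The point of the convention is that a single bit test separates success from failure, so the caller only needs to decode the full status value on the path that cares about it.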
I am a fan of both architectures, but, truth be told, evaluating a crash dump or code in memory on Alpha is more difficult, due to instruction reordering, imprecise trap delivery and the like. These things are necessary to Make It Go Fast, but complicate debugging and support. -Mike Duffy Sr. Software Engineer Process Software Corp. www.process.com nickerson@mirage.boeing.com wrote in message ... > >In article <80cuno$u0e$1@newssvr03-int.news.prodigy.com>, >"GERRIT SLAVENBURG" writes: >|>Thanks to all for suggesting the different architectures >|>and solutions to comparing/branching. >|> >|>What I was actually looking for are CPU architectures that >|>use as PRIMARY method for conditional control flow: >|> >|> predicate operations that produce booleans in gen. purp. registers >|> e.g. SET Rd IF GEQ(Ra, Rb), which sets GPR Rd as 0 or 1 >|> (like the MIPS operations that do exactly this) >|> >|> jump true/jump false operations >|> JUMP if Rd to address (immediate or register content) >|> >|>Several architectures mentioned (MIPS) can do this. It is obviously >|>a good method for efficient evaluation of C-language expressions. >|>Are there any architectures that do ALL conditional jumps this way ? >|> >|> Gerrit > >I'm No Expert but ...; looks like Compaq/DEC Alpha ONLY uses >registers to contain the condition for conditional branch & jump; >this is opposite from the VAX and thus probably done as a big >lesson they learned on bottlenecks to pipelining and multiissue; >given that they are also big on both hints and prediction I'd >guess that they want to try real hard to be "right" in each >memory access while the clock zips along; >ref) Alpha AXP Architecture Reference Manual 2nd ed, 1995, >Digital Press, R.L. Sites & R.T. 
Witek, ISBN 1-55558-145-5 > >--bn (Bart Nickerson) >nickerson@pundit.ds.boeing.com >(206) 662-0183 ###### From: jfc@mit.edu (John F Carr) Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: 11 Nov 1999 12:16:39 GMT Organization: Massachvsetts Institvte of Technology Lines: 15 Message-ID: <80ec37$rrd@senator-bedfellow.MIT.EDU> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80db8a$i1j@spool.cs.wisc.edu> NNTP-Posting-Host: nerd-xing.mit.edu Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!fu-berlin.de!news.maxwell.syr.edu!newsfeed.cwix.com!bloom-beacon.mit.edu!senator-bedfellow.mit.edu!jfc In article , Jan Vorbrueggen wrote: >I thought adequate documentation of prior art was a killer for a patent, even >if already granted? I am informed that US patent litigation is slow and expensive even if the issues appear fairly clear. A patent lawyer once told me that he expected his company to get sued over a patent that was clearly invalid but it would cost a few hundred thousand dollars to win the case. -- John Carr (jfc@mit.edu) ###### From: "Bill Todd" Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: Thu, 11 Nov 1999 13:27:03 -0500 Organization: Why won't Outlook let me leave this blank? 
Lines: 32 Message-ID: <80f1oh$1fp$1@pyrite.mv.net> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net> NNTP-Posting-Host: bnh-3-30.mv.com X-Trace: pyrite.mv.net 942344785 1529 199.125.99.158 (11 Nov 1999 18:26:25 GMT) X-Complaints-To: abuse@mv.com NNTP-Posting-Date: 11 Nov 1999 18:26:25 GMT X-Priority: 3 X-MSMail-Priority: Normal X-Newsreader: Microsoft Outlook Express 5.00.2314.1300 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!rz.uni-karlsruhe.de!newsfeed.nacamar.de!newspeer.monmouth.com!newsfeed.mathworks.com!news.mv.net!not-for-mail Jan Vorbrueggen wrote in message news:y466z9uj55.fsf@mailhost.neuroinformatik.ruhr-uni-bochum.de... > "Mike Duffy" writes: > > > Rather than being a "performance bottleneck" to multiissue and pipelining, > > But they are. > > > I prefer to think that all the variations would simply be hideously difficult > > to implement on an architecture like Alpha. > > That's not the problem. The problem is that many of the CICS/VAX ways > introduce artificial interinstruction dependencies that hinder both faster > hardware and faster software. > > Jan Ah, yes. The VAX condition-code architecture was only slightly less convoluted than the 11's, where once the 'carry' condition code was set you could cheerfully write another page or so of assembly before RETURNing from the function with carry as the success/fail indicator - as long, of course, as you were careful not to include any instruction that affected the carry code (adding a comment to that effect was also considered helpful to subsequent maintainers). Architectures that like to re-order instructions in hardware or software could find this awkward. 
- bill

######
From: Jan Vorbrueggen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 11 Nov 1999 14:17:30 +0100
Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany
Lines: 12
Message-ID:
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80db8a$i1j@spool.cs.wisc.edu> <80ec37$rrd@senator-bedfellow.MIT.EDU>
NNTP-Posting-Host: luda.neuroinformatik.ruhr-uni-bochum.de
X-Trace: sunu789.rz.ruhr-uni-bochum.de 942326238 563 134.147.176.178 (11 Nov 1999 13:17:18 GMT)
NNTP-Posting-Date: 11 Nov 1999 13:17:18 GMT
X-Newsreader: Gnus v5.3/Emacs 19.33
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!rz.uni-karlsruhe.de!news.uni-stuttgart.de!news.ruhr-uni-bochum.de!not-for-mail

jfc@mit.edu (John F Carr) writes:

> I am informed that US patent litigation is slow and expensive even
> if the issues appear fairly clear. A patent lawyer once told me that
> he expected his company to get sued over a patent that was clearly
> invalid but it would cost a few hundred thousand dollars to win the case.

Ah yes, that American "innovation" where you get to pay for the privilege of winning a civil suit, be it as plaintiff or even as defendant. Makes sense from a "big business" perspective.
Jan ###### From: Jan Vorbrueggen Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: 11 Nov 1999 16:45:26 +0100 Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany Lines: 14 Message-ID: References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net> NNTP-Posting-Host: luda.neuroinformatik.ruhr-uni-bochum.de X-Trace: sunu789.rz.ruhr-uni-bochum.de 942335112 7898 134.147.176.178 (11 Nov 1999 15:45:12 GMT) NNTP-Posting-Date: 11 Nov 1999 15:45:12 GMT X-Newsreader: Gnus v5.3/Emacs 19.33 Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news-ge.switch.ch!newsfeed.mathworks.com!feeder.qis.net!europa.netcrusader.net!205.252.116.205!howland.erols.net!f.de.uu.net!news.uni-stuttgart.de!news.belwue.de!informatik.tu-muenchen.de!news.ruhr-uni-bochum.de!not-for-mail "Mike Duffy" writes: > Rather than being a "performance bottleneck" to multiissue and pipelining, But they are. > I prefer to think that all the variations would simply be hideously difficult > to implement on an architecture like Alpha. That's not the problem. The problem is that many of the CICS/VAX ways introduce artificial interinstruction dependencies that hinder both faster hardware and faster software. Jan ###### Path: chonsp.franklin.ch!usenet From: Neil Franklin Newsgroups: comp.arch,alt.folklore.computers Subject: Re: CPU's with booleans in general purpose registers [new thread] Date: 11 Nov 1999 22:18:18 +0100 Organization: My own Private Self Lines: 67 Sender: neil@chonsp.franklin.ch Message-ID: <6ud7tglobp.fsf@chonsp.franklin.ch> References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> X-Newsreader: Gnus v5.3/Emacs 19.34 s_d_lew@my-deja.com writes: > > In article <80d3j9$8e2$1@nntp6.u.washington.edu>, > dpeschel@u.washington.edu (Derek Peschel) wrote: > > > > The Transputer, I think. 
(I was just reading about it the other day... what
> > a strange architecture!) Note that it uses a stack, however, so I think the
> > _only_ GPR you can set to 0 or 1 is the top-of-stack.

Yes. The Transputer has no visible flags. So the only way to do conditionals is to use the stack, like for any other intermediate result.

> > You may find that other stack-oriented machines share the same conditional
> > jump mechanism. But again, they probably would allow you only to set, and
> > branch based on, the top-of-stack (not other registers).

The Transputer does it that way. The cj (conditional jump) instruction (0xAn) is actually "add n to the instruction pointer if the A register is 0".

> Basically the transputer is similar to the mips except more primative
> (at least mips has both beqz and bneqz)

Where? The Transputer is a 3-level stack machine. The MIPS is a 32-register, 3-address machine. The only thing they have in common is that they are both fairly simple designs that rely on fast instruction execution. And both use no condition flags.

Back to the original question about conditions in general registers: ARM is also such an architecture, very MIPS like, albeit with 16 registers. AFAIK of all modern processors (i.e. non 86x80 or 68k) only PPC uses dedicated condition registers (at least it has 8 of them).

> The transputer (ST20) has a 3 operand stack and used the "cj"
> (conditional jump if zero/false) instruction which checks the top
> of stack and does a relative jump based on the operand. To compute
> the true/false value (non-zero/zero) value unless it was there already
> you could use "eqc" (equal constant), or "gt" (greater than).

Yes.

> However the a quirky thing is that if A (the top of stack) was
> zero, the branch was taken and the stack was undisturbed, however,
> if the branch was not taken, the stack was pop-ed and A disappeared
> making it a pain to write boolean logic code... :-(

Yes, annoying. They should have always popped.
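The asymmetric cj behaviour discussed above can be modeled with a small Python sketch. It follows only what the thread states (three stack registers with A on top; jump taken and stack untouched when A is zero; A popped when the jump is not taken). The class and function names are hypothetical, the contents of C after a pop are an assumption, and the instruction pointer passed in is assumed to already point past the cj instruction.

```python
class EvalStack:
    """Three-register evaluation stack; A is the top of stack."""

    def __init__(self):
        self.a = 0
        self.b = 0
        self.c = 0

    def push(self, value):
        # Pushing moves B into C and A into B; the old C is lost.
        self.c, self.b, self.a = self.b, self.a, value

    def pop(self):
        value = self.a
        # Assumption: C keeps its old contents after a pop.
        self.a, self.b = self.b, self.c
        return value


def cj(stack, iptr, offset):
    """Conditional jump as described in the thread.

    If A is zero (false), the jump is taken and the stack is left alone.
    Otherwise execution falls through and A is popped.
    Returns the new instruction pointer.
    """
    if stack.a == 0:
        return iptr + offset
    stack.pop()
    return iptr
```

The sketch makes the complaint concrete: after a taken jump the zero is still sitting in A, while after a fall-through the tested value is gone, so code on the two paths sees differently shaped stacks.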
> The only other branch instruction is "lend" (loop end) which is really
> just a specialized form of cj which was only used by the occam
> compiler...

Actually doing roughly the job of Z80s DJNZ or 68ks DBcc.

--
Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/

######
From: "Andy Glew"
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Thu, 11 Nov 1999 19:47:39 -0600
Organization: U. Wisc. & Intel (not official)
Lines: 88
Message-ID: <80fr98$g46@spool.cs.wisc.edu>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net> <80f1oh$1fp$1@pyrite.mv.net> <382B3B8E.1DBE@hda.hydro.com>
NNTP-Posting-Host: egeus.cs.wisc.edu
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!newsfeed.rhein-neckar.de!news.rhein-neckar.de!fu-berlin.de!newsfeed.berkeley.edu!uchinews!uwvax!news

> next4: ...
>   adc eax,ebx ; Add to running total, with carry wraparound
>   mov [edi+ecx*4] ; Write to destination
>   inc ecx
>   jnz next4:
>
> The key here is that the ADC instruction will use the incoming state of
> the carry flag and update it for the next iteration, while the loop
> count code
>
>   INC ECX
>   JNZ next4
>
> will update all flags _except_ the carry flag, so it can be used on the
> next iteration!

Warning: on P6 family processors, this may incur the flags equivalent of a partial register stall:

ADC writes all of the flags
    the reorder buffer entry is marked "all valid"
INC writes all but the carry flag.
    P6 does not track the flags individually, and just marks the reorder buffer entry "all but CF valid".
JNZ uses ZF, not CF, so it can read a value from the reorder buffer
...
ADC reads CF.
If the input register has not yet retired, it will see that the
reorder buffer entry is marked "all but CF valid". And it will stall
at the rename pipestage until CF becomes valid, when the last
instruction that wrote the flags, the INC, retires.

Now, I understand that in real code you will unroll the loop, so
normally you will only have CF span an INC once every unroll-factor
ADCs. But, this could introduce a hiccup at every loop iteration.

Now^2, it is possible that this code, when running flat out, has
minimal delay from executing the INC to the INC retiring. That *might*
remove the partial flags stall. Nevertheless, I'd be wont to check it.

Other versions of this code accumulate the carries in a higher order
part of the accumulator, which they only have to add into the main
accumulator once every 2^16 or 2^32 passes.

        mov esi, ....source...
        mov edi, ....destination....
        xor eax,eax             ; Zero the checksum reg & clear Carry at the same time
        xor edx,edx             ; zero inter-iteration carry
next4:
        mov ebx,[esi]           ; Load next 4 bytes from the packet
        add eax,ebx             ; Add to running total (ADD, not ADC: on loop-back
                                ; CF holds the CMP result, not a checksum carry)
        mov [edi], ebx          ; Write to destination
        mov ebx,[esi+4]
        adc eax,ebx
        mov [edi+4], ebx
        ...
        mov ebx,[esi+MAX_SPAN]
        adc eax,ebx
        mov [edi+MAX_SPAN], ebx ; Write to destination
**      adc edx, 0              ; save this pass's final carry
        lea esi,[esi+MAX_SPAN]
        lea edi,[edi+MAX_SPAN]
        cmp edi,....end of destination...   ; yeah, yeah, I can eliminate this...
        jl next4
;; finally, at end absorb carries
        add eax,edx
        adc eax,0

I think this avoids the partial flag stall.
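
The carry handling discussed above can be modelled outside assembly. The
following is only a sketch in Python, not code from either poster: the
function names, the `span` parameter and the use of Python integers to
stand in for EAX, EDX and CF are all assumptions made for illustration.
It compares a plain ADC chain, a version that banks one carry per
unrolled pass (the "adc edx,0" idea), and a wide-accumulator reference.

```python
MASK32 = 0xFFFFFFFF

def checksum_adc_chain(words):
    """Model one long ADC chain: each add consumes the previous carry-out."""
    acc, carry = 0, 0                      # stand-ins for EAX and CF
    for w in words:
        total = acc + w + carry
        acc, carry = total & MASK32, total >> 32
    total = acc + carry                    # final "adc eax,0"
    return (total & MASK32) + (total >> 32)

def checksum_banked_carries(words, span=4):
    """Model the unrolled variant: the carry left at the end of each pass
    is banked in a second register instead of surviving the loop overhead."""
    acc, banked = 0, 0                     # stand-ins for EAX and EDX
    for start in range(0, len(words), span):
        carry = 0                          # the first add of a pass is a plain ADD
        for w in words[start:start + span]:
            total = acc + w + carry
            acc, carry = total & MASK32, total >> 32
        banked += carry                    # "adc edx,0"
    total = acc + banked                   # "add eax,edx"
    return (total & MASK32) + (total >> 32)  # "adc eax,0"

def checksum_reference(words):
    """Reference: sum in an unbounded integer, then fold the high bits back in."""
    total = sum(words)
    while total >> 32:
        total = (total & MASK32) + (total >> 32)
    return total

sample = [0xFFFFFFFF, 0x00000002, 0x80000000, 0x80000001,
          0x12345678, 0xFFFFFFFF, 0xFFFFFFFF]
print(hex(checksum_adc_chain(sample)),
      hex(checksum_banked_carries(sample, span=2)),
      hex(checksum_reference(sample)))
```

All three agree because an end-around carry is just arithmetic modulo
2^32 - 1; when the carry is added back in does not change the result,
which is what lets the loop overhead clobber CF between passes.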

######
From: Terje Mathisen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Thu, 11 Nov 1999 22:56:30 +0100
Organization: Hydro
Message-ID: <382B3B8E.1DBE@hda.hydro.com>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net> <80f1oh$1fp$1@pyrite.mv.net>

Bill Todd wrote:
>
> Jan Vorbrueggen wrote:
> > That's not the problem. The problem is that many of the CISC/VAX ways
> > introduce artificial interinstruction dependencies that hinder both faster
> > hardware and faster software.
>
> Ah, yes. The VAX condition-code architecture was only slightly less
> convoluted than the 11's, where once the 'carry' condition code was set you
> could cheerfully write another page or so of assembly before RETURNing from
> the function with carry as the success/fail indicator - as long, of course,
> as you were careful not to include any instruction that affected the carry
> code (adding a comment to that effect was also considered helpful to
> subsequent maintainers).

I agree re.
the comment requirement, but this is still the standard way to write
high-performance x86 code. In fact, all of you who use x86 Linux depend
on this kind of code millions of times per day, since it is a crucial
part of the tcpip copy&checksum function.

Here's a simplified, non-pipelined version:

        xor eax,eax          ; Zero the checksum reg & clear Carry at the same time
next4:
        mov ebx,[esi]        ; Load next 4 bytes from the packet
        lea esi,[esi+4]      ; Update input pointer, without any flag modification
        adc eax,ebx          ; Add to running total, with carry wraparound
        mov [edi+ecx*4],ebx  ; Write to destination
        inc ecx
        jnz next4

3 cycles per dword copied, easily unrolled to get close to 1.5
cycles/dword.

The key here is that the ADC instruction will use the incoming state of
the carry flag and update it for the next iteration, while the loop
count code

        INC ECX
        JNZ next4

will update all flags _except_ the carry flag, so it can be used on the
next iteration!

Terje

--
Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"

######
From: "Thomas Womack"
Newsgroups: comp.arch,alt.folklore.computers
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80enr1$a18$1@murrow.corp.sgi.com>
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Fri, 12 Nov 1999 01:18:02 -0000
Message-ID: <382c6e7c_2@einstien.netscapeonline.co.uk>

Peter Klausler wrote

> SV2 is of course perfect in this regard but I can't say much about it yet.
> :-)

I'm slightly surprised that there's not a single SV1 in the new Top500 -
weren't they supposed to be the replacement for customers currently
using {C|T}90 systems? It's interesting that the T932 comes in only at
#267.

What impresses me about the new Top500 is how close to peak some of the
Japanese vector supercomputers manage to run linpack (a Fujitsu
VPP800/63 gets 482.5 Linpack GFLOPs from 63 8GFLOP-peak processors). Is
it really the case that there's only one such computer in existence, or
is the table not entirely complete?

Tom

######
From: Jan Vorbrueggen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 12 Nov 1999 08:56:55 +0100
Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch>

Neil Franklin writes:

> > However, a quirky thing is that if A (the top of stack) was
> > zero, the branch was taken and the stack was undisturbed, however,
> > if the branch was not taken, the stack was popped and A disappeared
> > making it a
> > pain to write boolean logic code... :-(
>
> Yes, annoying. They should have always popped.

I think the better solution was to pop on branch, i.e., remove the known
value 0, or pop never. Always removing TOS kills that valuable value you
might just have calculated laboriously... at least the T8 and follow-on
had DUP and REV.

Jan

######
From: "Mike Duffy"
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Fri, 12 Nov 1999 10:57:53 -0500
Message-ID: <80hde3$9pu$1@autumn.news.rcn.net>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net>

Jan Vorbrueggen wrote in message ...
>
>"Mike Duffy" writes:
>
>> Rather than being a "performance bottleneck" to multiissue and pipelining,
>
>But they are.
>

Hi Jan, thanks for answering.

I was looking at it from the standpoint that a common set of PSL bits
potentially modified by multiple instructions at once presents an
obvious problem. The natural solution is for each set of related
instructions to use a unique set of bits, "somewhere else".

In an architecture (VAX) which specifically disallows pipelining, that
particular concern never arises. When you're designing an architecture
which includes multiple issue, such dependencies go out the window
before you ever ask whether you can make it fast.

Perhaps I don't understand why a Branch instruction reading from a
processor register is necessarily slower than one reading from a general
register.

Or did I miss the real point? Is it because the individual Alpha
Compare instructions are concerned with only one relationship (equality,
less-than, etc) rather than a VAX Compare operation setting all the bits
for multiple relationships and then letting the Branch instruction
decide which bit is important? I can certainly see a speed implication
there.

>
>> I prefer to think that all the variations would simply be hideously difficult
>> to implement on an architecture like Alpha.
>
>That's not the problem. The problem is that many of the CISC/VAX ways
>introduce artificial interinstruction dependencies that hinder both faster
>hardware and faster software.
>

Now that I think further about it, I believe my point is still a
problem, but not the only problem.

I still maintain that a RISC architecture makes little sense if you
throw in every instruction variation you can think of (like VAX). Hard
things to implement would be among the first things discarded (unless
they were absolutely necessary). Too many similar instructions would be
discarded because they take up space and your compiler might never use
some of them.

Dependencies still exist on RISC boxes, but we try to minimize them by
interleaving unrelated instructions while we wait for the dependencies
to be satisfied. Anything which gets in the way of that (such as shared
processor registers) goes away because it makes no sense, and it ruins
the magic trick which creates speed.

"Not Simple" = "Not Fast"

Mike

######
From: s_d_lew@my-deja.com
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Fri, 12 Nov 1999 18:38:04 GMT
Organization: Deja.com - Before you buy.
Message-ID: <80hmqb$43l$1@nnrp1.deja.com>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch>

Jan Vorbrueggen wrote:
> Neil Franklin writes:
>
> > > However, a quirky thing is that if A (the top of stack) was
> > > zero, the branch was taken and the stack was undisturbed, however,
> > > if the branch was not taken, the stack was popped and A disappeared
> > > making it a pain to write boolean logic code... :-(
> >
> > Yes, annoying. They should have always popped.
>
> I think the better solution was to pop on branch, i.e., remove the known
> value 0, or pop never. Always removing TOS kills that valuable value you
> might just have calculated laboriously... at least the T8 and follow-on had
> DUP and REV.
>
> Jan

The reason those Bristol guys always gave me for cj's behavior was
that branches were for "exceptional code", so that you computed the
assertion and if it was true you just continued going (saving you the
trouble of popping the value yourself)...

However, it was also those Bristol guys that tried to make the
assembly language of the transputer a trade secret too, maybe
there's some reason they wanted to keep it secret.. ;-)

-slew

######
From: Neil Franklin
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 13 Nov 1999 00:25:30 +0100
Organization: My own Private Self
Message-ID: <6ubt8z5m39.fsf@chonsp.franklin.ch>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com>

s_d_lew@my-deja.com writes:
>
> Jan Vorbrueggen wrote:
> > Neil Franklin writes:
> >
> > > > zero, the branch was taken and the stack was undisturbed, however,
> > > > if the branch was not taken, the stack was popped and A disappeared
> > > > making it a pain to write boolean logic code... :-(
> > >
> > > Yes, annoying. They should have always popped.
> >
> > I think the better solution was to pop on branch, i.e., remove the known
> > value 0, or pop never. Always removing TOS kills that valuable value you
> > might just have calculated laboriously...

Usually this laboriously calculated value consists of: a flag to
decide whether you want to branch or not, which you have used for its
purpose in the cj. In those cases where you want to re-use it later,
either store it and later load it, or use a dup to get two copies.

> > at least the T8 and follow-on had
> > DUP and REV.

Hmmmm? I just dived into "Transputer Instruction Set - A compiler
writers guide", the official Inmos publication. No reference to dup
being T8xx only. Are they capable of even screwing up their manuals?

> The reason those Bristol guys always gave me for cj's behavior was
> that branches were for "exceptional code" so that you computed the
> assertion and if it was true you just continue going (saving you the
> trouble of popping the value yourself)...

Accepted. Why should you pop it, you do not for anything else.
That is the whole principle of stack machines. They auto-pop. If you
want to keep something, dup it.

The problem here is more in _not_ auto-popping when the branch is taken,
necessitating a pop there where you land. Bad if that is code that can
also be executed by directly running into that code.

> However, it was also those Bristol guys that tried to make the
> assembly language of the transputer a trade secret too, maybe
> there's some reason they wanted to keep it secret.. ;-)

I think that was more an "Assembly is dead" ideological thing. Who
would want anything else than our great HLL (Occam)?

At least they later did issue the compiler writers' guide.

--
Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/

######
From: s_d_lew@my-deja.com
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Sat, 13 Nov 1999 01:59:50 GMT
Organization: Deja.com - Before you buy.
Message-ID: <80igml$n7e$1@nnrp1.deja.com>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com> <6ubt8z5m39.fsf@chonsp.franklin.ch>

In article <6ubt8z5m39.fsf@chonsp.franklin.ch>, Neil Franklin wrote:
> s_d_lew@my-deja.com writes:
> >
> > Jan Vorbrueggen wrote:
> > > Neil Franklin writes:
> > >
> > > > > zero, the branch was taken and the stack was undisturbed, however,
> > > > > if the branch was not taken, the stack was popped and A
> > > > > disappeared
> > > > > making it a pain to write boolean logic code... :-(
> > > >
> > > > Yes, annoying. They should have always popped.
> > >
> > > I think the better solution was to pop on branch, i.e., remove the known
> > > value 0, or pop never. Always removing TOS kills that valuable value you
> > > might just have calculated laboriously...
>
> Usually this laboriously calculated value consists of: a flag to
> decide whether you want to branch or not, which you have used for its
> purpose in the cj. In those cases where you want to re-use it later,
> either store it and later load it, or use a dup to get two copies.
>
> > > at least the T8 and follow-on had
> > > DUP and REV.
>
> Hmmmm? I just dived into "Transputer Instruction Set - A compiler
> writers guide", the official Inmos publication. No reference to dup
> being T8xx only. Are they capable of even screwing up their manuals?

Well this is truly trivia, but on the T414 there was no DUP
instruction, but there was a REV instruction. When the T800
instruction set was being designed, the compiler writers
complained so much that they had to put it in. So DUP appeared
in all future transputers (T425, T225, T805, ST20, etc...)...

Bonus points to anyone who remembers what the compiler writers
were complaining about... ;-)

Extra credit to anyone who can explain the difference between
ROT and POP (or was it just a documentation error?)

> > The reason those Bristol guys always gave me for cj's behavior was
> > that branches were for "exceptional code" so that you computed the
> > assertion and if it was true you just continue going (saving you the
> > trouble of popping the value yourself)...
>
> Accepted. Why should you pop it, you do not for anything else. That
> is the whole principle of stack machines. They auto-pop. If you want
> to keep something, dup it.
>
> The problem here is more in _not_ auto-popping when the branch is taken,
> necessitating a pop there where you land.
> Bad if that is code that can
> also be executed by directly running into that code.
>
> > However, it was also those Bristol guys that tried to make the
> > assembly language of the transputer a trade secret too, maybe
> > there's some reason they wanted to keep it secret.. ;-)
>
> I think that was more an "Assembly is dead" ideological thing. Who
> would want anything else than our great HLL (Occam)?
>
> At least they later did issue the compiler writers' guide.

Only under duress ;-)

-slew

######
From: Andreas Kaiser
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Sat, 13 Nov 1999 21:28:15 +0100
Organization: Ananke
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com> <6ubt8z5m39.fsf@chonsp.franklin.ch> <80igml$n7e$1@nnrp1.deja.com>

On Sat, 13 Nov 1999 01:59:50 GMT, s_d_lew@my-deja.com wrote:

>Well this is truly trivia, but on the T414 there was no DUP
>instruction, but there was a REV instruction. When the T800
>instruction set was being designed, the compiler writers
>complained so much that they had to put it in.

IMHO the sole reason why it appeared in the T800 was the floating point
stuff - some address operands were used twice in conversion sequences.
INMOS was minimalistic here - the T414 did not need DUP when programmed
in OCCAM, and these beasts were definitely not designed with C in mind.

>Bonus points to anyone who remembers what the compiler writers
>were complaining about... ;-)

I've built a compiler system based on Johnson's "Portable C Compiler"
(with code generation completely rewritten, of course). But the lack of
a proper instruction set for the C language was the easier part. It was
fairly easy to replace DUP by store and load, although slow.

However, if you needed synchronization between more than 2 processes
running at any priority level, things were getting bizarre. Or more
generally, if you needed producer/consumer channels between more than 2
processes (if these had been available, they could have been used for
synchronization).

>Extra credit to anyone who can explain the difference between
>ROT and POP (or was it just a documentation error?)

Can't tell, because they didn't exist at that time (T212/414/800).

######
From: Andreas Kaiser
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Sat, 13 Nov 1999 21:28:16 +0100
Organization: Ananke
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com>

On Thu, 11 Nov 1999 07:19:00 GMT, s_d_lew@my-deja.com wrote:

>However, a quirky thing is that if A (the top of stack) was
>zero, the branch was taken and the stack was undisturbed, however,
>if the branch was not taken, the stack was popped and A disappeared
>making it a pain to write boolean logic code... :-(

I beg to differ. It was just a bit different, but not at all painful.
Actually, compilers got simpler this way, and the result exceptionally
compact.

Classic boolean expressions like

        condition1 AND condition2 AND condition3

compiled as:

        condition1
        CJ l1
        condition2
        CJ l1
        condition3
    l1:

leave a proper 0/1 state on TOS. Quite a bit more complex if CJ drops
the zero operand. And if the non-zero operand is left on TOS,
conditions 2 and 3 have trouble accessing stack operands pushed before
condition 1.

Other logical expressions were rewritten to fit into this scheme:

        condition1 AND (condition2 OR condition3)

was rewritten as

        condition1 AND NOT ((NOT condition2) AND (NOT condition3))

and compiled as

        condition1
        CJ l1
        not condition2
        CJ l2
        not condition3
    l2: EQC 0           (EQuals Constant)
    l1:

The C conditional operator ?: was a little harder though, further
complicated because an unconditional jump could not be used, as it had
the potential of voiding the operand stack. But the transputers were
designed for OCCAM and OCCAM did not have such an operator.
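
The two compilation schemes above can be checked with a toy interpreter.
This is a sketch in Python under stated assumptions, not a Transputer
emulator: the tuple-based program encoding, the `run`, `and3` and
`and_or` names, and the use of "ldc 0/1" to stand in for evaluating a
condition are all invented for illustration, and the real 3-register
depth limit is not modelled. Only the cj rule comes from the thread: if
A is 0, jump and leave the stack alone; otherwise pop A and fall through.

```python
def run(program):
    """Run a tiny Transputer-like program; return the final stack (top last)."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "label"}
    stack = []                         # stack[-1] plays the role of the A register
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "ldc":                # load constant: push onto the stack
            stack.append(arg)
        elif op == "eqc":              # equals constant: A := 1 if A == arg else 0
            stack[-1] = int(stack[-1] == arg)
        elif op == "cj":
            if stack[-1] == 0:
                pc = labels[arg]       # taken: the 0 stays on the stack
            else:
                stack.pop()            # not taken: the non-zero value is dropped
        # "label" entries are jump targets only and do nothing when executed
    return stack

def and3(c1, c2, c3):
    """condition1 AND condition2 AND condition3"""
    return run([("ldc", c1), ("cj", "l1"),
                ("ldc", c2), ("cj", "l1"),
                ("ldc", c3),
                ("label", "l1")])

def and_or(c1, c2, c3):
    """condition1 AND (condition2 OR condition3), via the NOT/AND rewrite"""
    return run([("ldc", c1), ("cj", "l1"),
                ("ldc", c2), ("eqc", 0), ("cj", "l2"),   # NOT condition2
                ("ldc", c3), ("eqc", 0),                 # NOT condition3
                ("label", "l2"), ("eqc", 0),             # outer NOT
                ("label", "l1")])

for bits in [(1, 1, 1), (1, 0, 1), (0, 1, 1), (1, 0, 0)]:
    print(bits, and3(*bits), and_or(*bits))
```

In every case exactly one 0/1 value is left on the stack, which is the
point being made: because the taken branch keeps its zero, the false
result is already in place at l1 and no extra instruction is needed to
materialise it.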

######
From: Bernd Paysan
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: Sun, 14 Nov 1999 01:30:39 +0100
Organization: Bernd Paysan, 81477 Muenchen, Germany
Message-ID: <382E02AF.5C82DA83@gmx.de>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com> <6ubt8z5m39.fsf@chonsp.franklin.ch> <80igml$n7e$1@nnrp1.deja.com>

s_d_lew@my-deja.com wrote:
> Extra credit to anyone who can explain the difference between
> ROT and POP (or was it just a documentation error?)

I'm not sure if the later Transputer instructions followed Forth naming
conventions (at least for DUP, they did), but then, ROT and POP are
inverses of each other (ROT in Forth gets the third element as new top
of stack, while POP obviously means that the top element becomes the
third one, with the Transputer's limit of 3 stack elements).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

######
From: Jan Vorbrueggen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 14 Nov 1999 21:04:01 +0100
Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com> <6ubt8z5m39.fsf@chonsp.franklin.ch>

Neil Franklin writes:

> Usually this laboriously calculated value consists of: a flag to
> decide whether you want to branch or not, which you have used for its
> purpose in the cj. In those cases where you want to re-use it later,
> either store it and later load it, or use a dup to get two copies.

Nope. Imagine a piece of unrolled code, and you have a variable
governing the remaining number of iterations. It will be counted down to
zero, and you branch out-of-line when that is the case. Popping it when
continuing is nonsense. DUP before has the disadvantage that it's a
two-byte opcode, IIRC, and that it might push a valuable integer off the
execution stack. In fact, if it were to leave the zero on the stack
after branching, a simple ADD would remove it.

> Hmmmm? I just dived into "Transputer Instruction Set - A compiler
> writers guide", the official Inmos publication. No reference to dup
> being T8xx only. Are they capable of even screwing up their manuals?

There was life before the T8, believe me, and without DUP and POP.

Jan

######
From: Jan Vorbrueggen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 14 Nov 1999 21:06:28 +0100
Organization: Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum, Germany
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80d3j9$8e2$1@nntp6.u.washington.edu> <80dql1$8t4$1@nnrp1.deja.com> <6ud7tglobp.fsf@chonsp.franklin.ch> <80hmqb$43l$1@nnrp1.deja.com> <6ubt8z5m39.fsf@chonsp.franklin.ch> <80igml$n7e$1@nnrp1.deja.com>

s_d_lew@my-deja.com writes:

> Extra credit to anyone who can explain the difference between
> ROT and POP (or was it just a documentation error?)

I thought ROT never officially existed. The difference is that ROT
rotates the stack, in principle, thus retaining all three values, while
POP removes one. All implementations before the T9 left the same value
in the C "register" as it was; I can imagine that the T9 with its
superscalar and renaming architecture would behave differently.

Jan

######
From: gah@ugcs.caltech.edu (glen herrmannsfeldt)
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: CPU's with booleans in general purpose registers [new thread]
Date: 18 Nov 1999 01:22:41 GMT
Organization: California Institute of Technology, Pasadena
Message-ID: <80vkd1$odd@gap.cco.caltech.edu>
References: <80cuno$u0e$1@newssvr03-int.news.prodigy.com> <80eng4$q7i$1@autumn.news.rcn.net> <80hde3$9pu$1@autumn.news.rcn.net>

"Mike Duffy" writes:

(snip)

>Perhaps I don't understand why a Branch instruction reading from
>a processor register is necessarily slower than one reading from a general
>register.

>Or did I miss the real point?
>Is it because the individual Alpha Compare instructions are concerned
>with only one relationship (equality, less-than, etc) rather than a VAX
>Compare operation setting all the bits for multiple relationships and
>then letting the Branch instruction decide which bit is important?
>I can certainly see a speed implication there.

For a multiple issue processor, which already has to keep track of
which registers are used, and maybe not do any register renaming if
there are already enough registers, there is an advantage in not having
to keep track of multiple sources and destinations for condition code
bits. The compiler should be sure to use different registers for
instructions that could possibly execute at the same time.

Even with register renaming it should still simplify the processor logic.

-- glen
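
The point about tracking can be illustrated with a small model. This is
a sketch in Python with made-up instruction encodings, not a description
of any real pipeline: the mnemonics, the read/write sets, and the
`name_conflicts` and `cross_pair` helpers are assumptions for
illustration, and control dependencies from the branches themselves are
ignored. It lists which instruction pairs touch the same register name
when two independent compare-and-branch pairs sit next to each other.

```python
def name_conflicts(instrs):
    """Return (earlier, later, kind) for each pair that shares a register name.
    Each instruction is (text, set_of_names_read, set_of_names_written)."""
    found = []
    for j, (_, reads_j, writes_j) in enumerate(instrs):
        for i in range(j):
            _, reads_i, writes_i = instrs[i]
            if writes_i & reads_j:
                found.append((i, j, "RAW"))   # later reads what earlier wrote
            if reads_i & writes_j:
                found.append((i, j, "WAR"))   # later overwrites what earlier reads
            if writes_i & writes_j:
                found.append((i, j, "WAW"))   # both write the same name
    return found

def cross_pair(instrs, split=2):
    """Conflicts between the first compare/branch pair and the second."""
    return [c for c in name_conflicts(instrs) if c[0] < split <= c[1]]

# Shared condition-code register: both compares write the single name "cc".
cc_style = [
    ("cmp r1,r2",      {"r1", "r2"}, {"cc"}),
    ("branch-if-ge",   {"cc"},       set()),
    ("cmp r3,r4",      {"r3", "r4"}, {"cc"}),
    ("branch-if-ge",   {"cc"},       set()),
]

# Booleans in general registers: the compiler picks a different target each time.
gpr_style = [
    ("set r5 if geq(r1,r2)", {"r1", "r2"}, {"r5"}),
    ("jump if r5",           {"r5"},       set()),
    ("set r6 if geq(r3,r4)", {"r3", "r4"}, {"r6"}),
    ("jump if r6",           {"r6"},       set()),
]

print("cc :", cross_pair(cc_style))
print("gpr:", cross_pair(gpr_style))
```

With one condition-code name the second compare collides with the first
pair, so the hardware must either serialise them or rename "cc" as well;
with distinct general registers the two pairs share nothing beyond what
the ordinary register scoreboard already tracks.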