From: "Jan Gray"
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com>
Subject: Alto in an FPGA (was CPU's directly executing HLL's)
Date: Wed, 07 Jul 1999 17:22:50 GMT

Paul Wallich wrote in message ...
>It's a little amusing to note that the emulator, the thread executing
>the user's program, was actually the lowest-priority thread. (Also
>amusing to think that Alto micromachine was something like 1600 gates
>-- you could build dozens of them on a single FPGA).

Perhaps, but if you count the register files and constant and microcode
memory it was much larger than 1600 gates.

A while back (around Alto's 25th anniversary) I briefly considered
implementing an Alto in a Xilinx XC4000 FPGA. A 1979 era Alto processor,
*excluding microcode memory*, requires approximately 400 configurable logic
blocks (CLBs):

CLBs    What
----    ----
16      32x16-bit R registers
128     8x32x16-bit S registers (1979 Alto)
(16     32x16-bit S registers (1974 Alto))
128     256x16-bit constant memory
64?     rest of datapath
64?     control
(4096   4096x32-bit microcode control memory)
----
~400    CLBs + lots of TBUFs (the 16-bit "processor bus" is driven by 9+
        sources)

This would probably fill a 24x24 CLB Xilinx XCS30XL. Perhaps you could
include processor and equivalent I/O controllers in an XCS40XL.

Now Xilinx has introduced their Virtex device family, which features 8+
256x16 dual port embedded SRAM blocks. You could implement the S registers
in one block ram, the constant memory in another. A 2KW subset of the 4KW
control memory would require 16 more, but would still fit in one of the
larger Virtex devices.

ref: Thacker et al, Alto: A Personal Computer, chapter 33 in Siewiorek et
al, Computer Structures: Principles and Examples, McGraw-Hill, 1982

BTW, you can theoretically build dozens of simple CPUs in a single FPGA: see
discussion thread at http://deja.com/getdoc.xp?AN=277216882 (XC4085XL) and
also http://deja.com/getdoc.xp?AN=444640841 (Virtex).

Jan Gray

######

From: pw@panix.com (Paul Wallich)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: Wed, 07 Jul 1999 16:20:48 -0400
Organization: PANIX Public Access Internet and UNIX, NYC
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com>

In article , "Jan Gray" wrote:
>Paul Wallich wrote in message ...
>>It's a little amusing to note that the emulator, the thread executing
>>the user's program, was actually the lowest-priority thread. (Also
>>amusing to think that Alto micromachine was something like 1600 gates
>>-- you could build dozens of them on a single FPGA).
>
>Perhaps, but if you count the register files and constant and microcode
>memory it was much larger than 1600 gates.

But should you? One of the things that's pretty clear is that the
micromachine remained relatively constant while the register
files and control stores got mucked around. Given the bandwidth
requirements (60 MHz by essentially 3 x 32 bits) you could put
everything offchip easily enough. And at the time (MSI) the
partitioning seemed clear...

(I'm only sort of kidding -- the question of what the CPU is goes
right along with the question of what language it "directly
executes".)

>A while back (around Alto's 25th anniversary) I briefly considered
>implementing an Alto in a Xilinx XC4000 FPGA. A 1979 era Alto processor,
>*excluding microcode memory*, requires approximately 400 configurable logic
>blocks (CLBs):
>
>CLBs    What
>----    ----
>16      32x16-bit R registers
>128     8x32x16-bit S registers (1979 Alto)
>(16     32x16-bit S registers (1974 Alto))
>128     256x16-bit constant memory
>64?     rest of datapath
>64?     control
>(4096   4096x32-bit microcode control memory)
>----
>~400    CLBs + lots of TBUFs (the 16-bit "processor bus" is driven by 9+
>        sources)
>
>This would probably fill a 24x24 CLB Xilinx XCS30XL. Perhaps you could
>include processor and equivalent I/O controllers in an XCS40XL.
>
>Now Xilinx has introduced their Virtex device family, which features 8+
>256x16 dual port embedded SRAM blocks. You could implement the S registers
>in one block ram, the constant memory in another. A 2KW subset of the 4KW
>control memory would require 16 more, but would still fit in one of the
>larger Virtex devices.
>
>ref: Thacker et al, Alto: A Personal Computer, chapter 33 in Siewiorek et
>al, Computer Structures: Principles and Examples, McGraw-Hill, 1982

Or: in Lavendel et al, eds., A Decade of Research, (pp 224-238),
R.R. Bowker, 1980.

######

From: gillies@cs.ubc.ca (Donald Gillies)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: 14 Jul 1999 19:18:45 -0700
Organization: Computer Science, University of British Columbia, Canada
Message-ID: <7mjge5$pqt$1@cascade.cs.ubc.ca>
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com>

pw@panix.com (Paul Wallich) writes:

>In article , "Jan Gray" wrote:
>>Paul Wallich wrote in message ...
>>>It's a little amusing to note that the emulator, the thread executing
>>>the user's program, was actually the lowest-priority thread. (Also
>>>amusing to think that Alto micromachine was something like 1600 gates
>>>-- you could build dozens of them on a single FPGA).
>>
>>Perhaps, but if you count the register files and constant and microcode
>>memory it was much larger than 1600 gates.

>But should you? One of the things that's pretty clear is that the
>micromachine remained relatively constant while the register
>files and control stores got mucked around. Given the bandwidth
>requirements (60 MHz by essentially 3 x 32 bits) you could put
>everything offchip easily enough. And at the time (MSI) the
>partitioning seemed clear...

>(I'm only sort of kidding -- the question of what the CPU is goes
>right along with the question of what language it "directly
>executes".)

Actually, if you're willing to take the Alto one step higher and build
a wildflower-class D-machine with the MESA instruction set, this might
have the ideal instruction set for a massively parallel FPGA
processor. The code density of this machine was something like 2-3
byte-opcodes per high-level language statement. In other words, it
was probably the code density champ of all time. That's why Xerox
kept it secret and shot itself in the foot.

Of course, I'm only sort of kidding - MESA was sort of icky with one
foot in the 16-bit world and one foot in the 32-bit world. LONG
POINTERs were no fun...


Don Gillies - t_dgilli.x@qualcomm.x.com - Planetwide Software, Inc.
(consultant) / Globalstar Satellite CDMA Project, Qualcomm Inc.,
6455 Lusk Blvd San Diego, California 92121 - phone: 619-651-2326.
Adjunct Professor of EE, UBC, Vancouver BC Canada V6T 1Z4
http://www.ee.ubc.ca/home/staff/faculty/gillies/etc/www/index.html
(remove x's to reply by email)

######

Date: Wed, 14 Jul 1999 21:27:37 -0700
From: curbow@best.com (Dave Curbow)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com> <7mjge5$pqt$1@cascade.cs.ubc.ca>

In article <7mjge5$pqt$1@cascade.cs.ubc.ca>, gillies@cs.ubc.ca (Donald
Gillies) wrote:

> Actually, if you're willing to take the Alto one step higher and build
> a wildflower-class D-machine with the MESA instruction set, this might
> have the ideal instruction set for a massively parallel FPGA
> processor. The code density of this machine was something like 2-3
> byte-opcodes per high-level language statement. In other words, it
> was probably the code density champ of all time. That's why Xerox
> kept it secret and shot itself in the foot.
>
> Of course, I'm only sort of kidding - MESA was sort of icky with one
> foot in the 16-bit world and one foot in the 32-bit world. LONG
> POINTERs were no fun...
>
>
> Don Gillies - t_dgilli.x@qualcomm.x.com - Planetwide Software, Inc.
> (consultant) / Globalstar Satellite CDMA Project, Qualcomm Inc.,
> 6455 Lusk Blvd San Diego, California 92121 - phone: 619-651-2326.
> Adjunct Professor of EE, UBC, Vancouver BC Canada V6T 1Z4
> http://www.ee.ubc.ca/home/staff/faculty/gillies/etc/www/index.html
> (remove x's to reply by email)

Actually Xerox built multiprocessor D-class machines. The first version I
knew of was done by the printer people in El Segundo, CA -- around 1986.

Interpress, the precursor to PostScript, was defined so that one (printed)
page was described completely independent from the others --
unlike PostScript where you can describe an image and then use
multiple (Print? it's been too long since I wrote PostScript) to blast
the image. Thus, it was possible and desirable to have multiple Mesa
processors, so each one could be creating the raster image for another
page. That's one reason why Interpress printers were (are?) so much
faster than PostScript ones. (Also, Interpress was always binary, unlike
PostScript 1 which was ASCII that had to be interpreted.)

Anyway, the printer people took the 6085 (second generation Xerox Star
hardware) and made a 4(?) processor system. They also modified the Pilot
operating system. This design was used for many years -- in fact,
I was recently told that some Xerox printers still use Mesa. (When Xerox
was pulling the plug on the final support of Star, they got some cries
of protest from a group that needed some compiler fixes!!! for their
project. This was a 6 pass compiler written before '82.)

In 1987-ish, PARC was working on the Dragon processor -- a multiprocessor
version of the Dorado box. (Dorado was an ECL processor that was, I think,
4 times faster than the Dlion, or 8010/6085 design.) Then Xerox did a
deal with Sun -- licensing the Star user interface to Sun for OpenLook,
doing some kind of deal so Sun got the Dragon design and even the design
team (temporarily) while Xerox got the right to resell Sun SPARC systems
with a Xerox nameplate. Xerox also agreed to port Star to the Sun (SunOS)
platform -- this was known internally as the Salient project.
I've been told that the Dragon project was the precursor to Sun's current
very successful clustering technology.

Dave Curbow (I wrote a lot of Mesa)
Xerox '83-'90

######

From: huge@nospam.demon.co.uk (Huge)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: 15 Jul 1999 13:02:05 GMT
Organization: Piglet's Pickles and Preserves
Message-ID: <7mkm4d$akp@axalotl.demon.co.uk>
Reply-To: huge@nospam.demon.co.uk

In article , curbow@best.com (Dave Curbow) writes:

>I was recently told that some Xerox printers still use Mesa. (When Xerox

AFAIK, DocuTech still does. I'm pretty sure that the UI reuses a lot of
XPIW (although this is from observation, rather than from seeing the
code). (I left 6 years ago, and this was true then...)

>Dave Curbow (I wrote a lot of Mesa)
>Xerox '83-'90

Hugh. (I worked on Internationalisation of the DocuTech UI, in InterLISP,
on 1186s, and spent a lot of time waiting for 6085s to reboot.
Xerox '81-'93)

-- 
"The road to Paradise is through Intercourse."
The uk.transport FAQ; http://www.axalotl.demon.co.uk/transport/FAQ.html
[Substitute "axalotl" for "nospam" to email me]

######

From: bruce@hoult.actrix.gen.nz (Bruce Hoult)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: Thu, 15 Jul 1999 19:19:17 +1200
Organization: The Internet Group Ltd
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com> <7mjge5$pqt$1@cascade.cs.ubc.ca>

In article <7mjge5$pqt$1@cascade.cs.ubc.ca>, gillies@cs.ubc.ca (Donald
Gillies) wrote:

> a wildflower-class D-machine with the MESA instruction set, this might
> have the ideal instruction set for a massively parallel FPGA
> processor. The code density of this machine was something like 2-3
> byte-opcodes per high-level language statement. In other words, it
> was probably the code density champ of all time.

I'm not sure that that follows.

What do you call a typical "high-level language statement"? I'd say that

  a = b
  a += b
  a = b + c

are pretty typical.

Assuming that everything is in registers (a good bet with VAX or 68K with
16 registers, almost certain with a typical RISC with 32) then those take
3/3/4 bytes on VAX, 2/2/4 bytes on 68K and 4/4/4 on most any RISC.

Oh, and 2/3/4 bytes on Java VM.
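
The byte counts quoted above can be put side by side with the "2-3 byte-opcodes per statement" figure claimed for MESA. What follows is only a minimal sketch: it copies the numbers exactly as posted, and it assumes the three statements are equally frequent, which the post does not claim. The names QUOTED_BYTES, STATEMENTS and density_table are illustrative, not from the thread.

```python
# Encoded size in bytes of three "typical" statements, copied verbatim from
# the post above. These are the poster's figures, not measurements.
STATEMENTS = ("a = b", "a += b", "a = b + c")

QUOTED_BYTES = {
    "VAX": (3, 3, 4),
    "68K": (2, 2, 4),
    "typical RISC": (4, 4, 4),
    "Java VM": (2, 3, 4),
}


def mean_bytes_per_statement(sizes):
    """Unweighted mean; assumes all three statements are equally common."""
    return sum(sizes) / len(sizes)


def density_table(quoted=QUOTED_BYTES):
    """Return (name, sizes, mean) rows, densest encoding first."""
    rows = [(name, sizes, mean_bytes_per_statement(sizes))
            for name, sizes in quoted.items()]
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for name, sizes, mean in density_table():
        per_stmt = ", ".join(f"{s}: {n}" for s, n in zip(STATEMENTS, sizes))
        print(f"{name:13s} mean {mean:.2f} bytes/statement  ({per_stmt})")
```

Under that equal-weighting assumption every machine listed lands between roughly 2.7 and 4 bytes per statement, so the comparison with a 2-3 byte figure depends heavily on what counts as a typical statement.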

On x86 they are 1/1/2 bytes if everything is in registers, but that's
reasonably unlikely in general, in which case I think it devolves to
rather more than the RISCs (result/first operand in register and 2nd
operand in memory are quite likely on x86, which gives numbers comparable
to 68K or Java).

-- Bruce

######

From: bruce@hoult.actrix.gen.nz (Bruce Hoult)
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: Thu, 15 Jul 1999 19:21:10 +1200
Organization: The Internet Group Ltd
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com> <7mjge5$pqt$1@cascade.cs.ubc.ca>

In article , curbow@best.com (Dave Curbow) wrote:

> Interpress the precursor to Postscript was defined so that one (printed)
> page was described completely independent from the others --
> unlike Postscript where you can describe an image and then use
> multiple (Print? it's been too long since I wrote Postscript) to blast
> the image.

copypage/showpage.

PostScript also carries state from one page to the next, unless you
explicitly use save/restore around each page's contents -- which much
machine-generated code, and anything following the DSC (Document
Structuring Conventions), does.

-- Bruce

######

From: Zalman Stern
Newsgroups: alt.folklore.computers,comp.arch,comp.arch.fpga
Subject: Re: Alto in an FPGA (was CPU's directly executing HLL's)
Date: 15 Jul 1999 20:27:28 GMT
Organization: Netcom
Message-ID: <7mlg7g$gt7@dfw-ixnews11.ix.netcom.com>
References: <7lassu$p2q$1@mail.pl.unisys.com> <7lccq8$tv$1@mermaid.ucc.gu.uwa.edu.au> <7ld5uj$2h2$4@autumn.news.rcn.net> <377A0DF8.1E1FAFE9@trailing-edge.com> <37829F73.8B3EEF9F@intel.com> <7lu9ab$n0k@dfw-ixnews7.ix.netcom.com> <7mjge5$pqt$1@cascade.cs.ubc.ca>

In comp.arch Bruce Hoult wrote:
: In article ,

[Comments on "page independence" of Interpress and PostScript document
descriptions.]

Adobe's high-end printing technology (Supra?) has a front end which
converts PostScript to PDF, splits the PDF into page batches and submits
the page batches to parallel printing subsystems. One of the motivations
of this is the page independence of PDF. From my point of view, this is
about as elegant as translating Java byte code to x86 code so it can be
fed to an out of order front end and finally be executed by a pipelined
datapath. But hey, elegance is pretty worthless as a metric of technology
success.

I also heard from someone outside Adobe who tried to use this technology
that it had lots and lots of implementation problems. I'm not sure if
there are shipping products based on it even now. (A few years after it
was announced.) I thought Xerox was supposed to have one though.

-Z-
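
The CLB budget in Jan Gray's post at the top of the thread can be re-derived with a few lines of arithmetic. This is a rough sketch rather than a fitting report. The 32 bits of RAM per XC4000 CLB is an assumption inferred from the table itself (32x16-bit R registers costing 16 CLBs), the two 64-CLB entries are the post's own guesses, and names such as clbs_for_ram and block_rams_for are made up for illustration.

```python
import math

# Assumption inferred from the table: a 32x16-bit register file costs 16 CLBs,
# which works out to 32 bits of RAM per XC4000 CLB.
RAM_BITS_PER_CLB = 32

# From the post: one Virtex embedded SRAM block holds 256x16 bits.
BLOCK_RAM_BITS = 256 * 16


def clbs_for_ram(words, width_bits):
    """CLBs needed to hold a words x width_bits memory in CLB RAM."""
    return math.ceil(words * width_bits / RAM_BITS_PER_CLB)


def block_rams_for(words, width_bits):
    """256x16 block RAMs needed for a words x width_bits memory."""
    return math.ceil(words * width_bits / BLOCK_RAM_BITS)


budget = {
    "R registers (32x16)": clbs_for_ram(32, 16),
    "S registers, 1979 Alto (8x32x16)": clbs_for_ram(8 * 32, 16),
    "constant memory (256x16)": clbs_for_ram(256, 16),
    "rest of datapath (the post's guess)": 64,
    "control (the post's guess)": 64,
}

total_clbs = sum(budget.values())
microcode_clbs = clbs_for_ram(4096, 32)            # excluded from the total
microcode_block_rams_2kw = block_rams_for(2048, 32)
xcs30xl_clbs = 24 * 24                             # array size quoted in the post

if __name__ == "__main__":
    for item, clbs in budget.items():
        print(f"{clbs:5d}  {item}")
    print(f"{total_clbs:5d}  total, against {xcs30xl_clbs} CLBs in a 24x24 array")
    print(f"{microcode_clbs:5d}  CLBs if the 4096x32 microcode store lived in CLB RAM")
    print(f"{microcode_block_rams_2kw:5d}  block RAMs for a 2KW x 32-bit microcode subset")
```

The sketch reproduces the post's figures: about 400 CLBs excluding microcode, 4096 CLBs for the full microcode store, and 16 block RAMs for the 2KW subset.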