
Spec for an XC2S600E based ATX size FPGA-PC Board

This is a preliminary spec, intended only for discussing the feature set.

At present no detailed pinouts or pin counts have been investigated. Ditto, there have been no checks whether all the desired functions will fit the available pins, nor any decisions about functions sharing pins.

Situation
FPGA Section
Memory
Form Factor, Case and Power
Expansion and Bus
User IO Devices
Storage Devices
Configuring the FPGA

Situation

Prehistory

For my PDP-10 clone I was initially going to use a prototype board with self-made memory and IO connector modules. This, though, has the problem of constructing and manufacturing these modules (mainly time), and of people copying my design having to repeat all the manufacturing work.

Despite this I had not decided on making a custom hardware board, because this would be even more work/time, in designing, manufacturing and processing orders for the board. I was hoping to find someone who would come up and do this for me and the other users.

But in case I did decide to make a board, I had collected various ideas for this in the Hardware file. It turned out that there was actually not much PDP-10 clone specific stuff in that list, so I had already decided I wanted a general-purpose FPGA-PC board (named so after FPGA-CPUs), usable by other cloners, to save them wasting time repeating nearly identical design work.

Newer Development

Now I have received an offer from Andrew Grillet, who makes the Quickstart QST0201 board, to take part in co-speccing an XC2S600E based general-purpose FPGA-PC board that he is intending to design and sell.

It must definitely run ESA's Leon Sparc clone. This is Andrew's reason for designing this board, and so is his part of the spec. But it can also run anything else which has no feature set conflicts with this. I see no collision for my PDP-10 or any other cloner.

The aim of this board is to be a "PC without a PC". That is, it is to use, apart from the special board itself, only readily available and non/low-troublesome PC components, such as cases and power supplies, memory modules, disk drives, interfaces and peripherals (keyboard, mouse, monitor, etc), but to leave out the troublesome or disliked PC stuff, such as the x86 CPU and PC architecture chips (north-/southbridge), replacing these with the FPGA, which is used to implement the user's desired CPU and system architecture. This makes it easy and cheap for users to put together an FPGA-PC system without soldering, possibly reusing existing components.

This shall be a general-purpose large systems (= 32/36/64bit, as opposed to small systems = 8/12/16/18/20/24bit) computer cloning board. Possible uses for this board are:

This board is assumed to be usable for daily PC style work, with a full OS installed, modern memory size and speed, up to server disk space/speed, up to workstation level resolution/depth graphic output, and Ethernet access.

Small Board

While speccing this board I also had a few ideas for a smaller XC2S150 based FPGA-PC board, with smaller FPGA and small SRAM memory, for small (= 8/12/16/18/20/24bit) systems (or even simple low memory 32/36 stuff). [Andrew: This may be interesting for your PDQ8, instead of making a PDQ8 specific extension board for QST0201. Combine the work time for a dual usable result. OTOH perhaps run PDQ8 on the large board, if that is not massive overkill. I also regard this complete FPGA-PC board as easier to sell than the QST0201 experimenting board, which has established competition]

This is NOT a project by Andrew Grillet, unless he adopts it when I tell him about it. In the meantime I have asked him and it seems unlikely (not enough sales volume, better to concentrate on one board).

FPGA Section

Chip Selection

The plan is to use a Xilinx Spartan-IIE chip, as it has lots of space at low cost. The board will use the top of range part, as the chip cost is small relative to the rest of the design (all other components, board making, assembly, handling). This choice of chip is also given by Andrew. It fits my plans, apart from needing to replace my non "E" capable software tools (something I had planned to replace for other reasons anyway, but do later on in the project).

Top of range, according to the Xilinx data sheet, is the XC2S600E part (with 2*48x2*72 LUTs and 6x12 BRAMs), largest in an FG676 case (gives 514 user IOs), using 3'961'632 (=4M PROM) config bits. This is a (fine) ball grid case, but Andrew's board manufacturer can process BGAs, which is better than anything I could design, lay out or get built.

The problem with this chip is that the free Webpack software does not support this size. So it would require the ISE software, making use of the board very expensive, too much for many users. So possibly fall back to the XC2S300E part (with 2*32x2*48 LUTs (less than 1/2) and 2x8 BRAMs (less than 1/4)) in an FG456 case (gives 329 user IOs (about 2/3)) using 1'875'648 (=2M PROM) config bits. The XCV300E would be supported and doubles the BRAMs, but has even fewer user IOs (only 312) and is far more expensive.
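The size ratios between the two candidate parts can be checked with a few lines of Python. This is only a sketch using the figures quoted above; the dictionary layout and the function name are illustrative assumptions, not part of any Xilinx tool.

```python
from fractions import Fraction

# Figures as quoted in the text above; the dictionary layout is illustrative.
CHIPS = {
    "XC2S600E": {"luts": (2 * 48) * (2 * 72), "brams": 6 * 12, "ios": 514},
    "XC2S300E": {"luts": (2 * 32) * (2 * 48), "brams": 2 * 8, "ios": 329},
}

def ratio(resource):
    """XC2S300E resource count as an exact fraction of the XC2S600E count."""
    return Fraction(CHIPS["XC2S300E"][resource], CHIPS["XC2S600E"][resource])

if __name__ == "__main__":
    for resource in ("luts", "brams", "ios"):
        small = CHIPS["XC2S300E"][resource]
        large = CHIPS["XC2S600E"][resource]
        print(f"{resource}: {small} of {large} = {float(ratio(resource)):.2f}")
```

The LUT ratio comes out as 4/9 and the BRAM ratio as 2/9, which matches the "less than 1/2" and "less than 1/4" remarks.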

Using such a high pinout device suggests the requirement of a many-layer board, so other high pinout parts can also be designed in, but should be restricted to where necessary (because of layout time, and possibly higher board cost).

Clocking Circuits

The Spartan-IIE can take up to 4 global clocks. The board should offer the use of all 4 of these, for maximal flexibility. This costs only 4 user IOs, leaving 510 (or 325 for XC2S300E). That is less than 1% of 514, and provides for a lot of functionality. Drive them by:

Cooling

Most likely a naked FPGA of this size will get too hot in uses where lots of LUTs and FFs switch at high speed. So a set of cooling fins should be provided. At least make sure they can be user fitted (space for them, mounting?).

Preferably no fan, as that makes noise. And even for 200MHz, with today's FPGA transistor sizes and many static configuration memory transistors, there should not be enough heat to require one (1993's 486DX/2-66s still ran at 66MHz without one).

Memory

Memory Word Width

For 32/36/64bit systems max n*36bit memory is needed. The extra 4 bits for 36bit can be (mis-)used in 32bit systems for other purposes, such as tagged memory (Lisp machine data types, memory management, debugging flags/traps).

This board can be used for serious workstations and servers, so memory words should be wide enough for at least parity, and better allow ECC.

Parity in 32/64bit systems requires a 9th bit for every 8 bits (giving the same 36 bits needed for 36bit designs). Ideal would be byte-level parity with 4*9bit wide writes, but word-level writing is acceptable.

Parity in 36bit systems is most likely one bit for the entire word (or 1/2 or 1/4 word), giving 37/38/40bit words. Definitely word-level writing is acceptable.

ECC in 32bit or 36bit systems requires 8 bits more, depending on data bus width and algorithm. So 32bit needs 40bit and 36bit would require 44bit.
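The word widths discussed above follow from simple arithmetic, sketched here in Python. The function names are hypothetical, and the fixed 8 check bits for ECC are simply the figure used in the text, not a derivation from any particular ECC algorithm.

```python
def parity_width(data_bits, granule_bits):
    """Word width with one parity bit per `granule_bits` of data."""
    if data_bits % granule_bits != 0:
        raise ValueError("granule must divide the data width")
    return data_bits + data_bits // granule_bits

def ecc_width(data_bits, check_bits=8):
    """Word width with `check_bits` of ECC (8 is the figure assumed in the text)."""
    return data_bits + check_bits

if __name__ == "__main__":
    print("32bit, byte parity:        ", parity_width(32, 8))
    print("36bit, word parity:        ", parity_width(36, 36))
    print("36bit, half word parity:   ", parity_width(36, 18))
    print("36bit, quarter word parity:", parity_width(36, 9))
    print("32bit with ECC:            ", ecc_width(32))
    print("36bit with ECC:            ", ecc_width(36))
```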

Memory Line Width

With today's fast processors memory has become the worst bottleneck in any system. Despite multi-level caches, Intel reports in its Hyperthreading documentation that a P4 runs actual work for about 5% of time! With the expected 100-200MHz of an FPGA-PC this problem will be smaller, but still existent. So wide memory lines for fast cache (re)loading are one thing that can be useful.

This is also ideal for cache killing large data set processing (servers). This can also be used to partially offset the slow clock frequency of FPGA-CPUs, at least in some applications.

And the load caused by on-board VGA will slow down the processor less on a wide data bus (shorter or less frequent stall times, and the processor needs memory less often).

So go for quad wide 128/144bit data memory, with 16bit for parity or ECC, giving 144/160bit on XC2S600E or half of that on XC2S300E. [Andrew: What are your intended Leon uses? Is this relevant for them? I would like it. One of my users comments that wanting this shows that I am an ex-mainframe (actually ex-minicomputer) guy]

Memory Slots

As this is intended as a large processor FPGA-PC board for PC style uses, it should also support a large address space, and for this offer large memory chips. So it needs to use (S)DRAM.

There are multiple variants:

So it looks like DDR SDRAM is the way to go, as in all newer PCs, unless these are seriously more difficult, in which case normal SDRAM will do. But many newer prototyping boards have DDR, so it seems to be easy.

Possibly put in (banks*width) 2*2 = 4 slots. 3*2 = 6 is overkill, even modern medium level servers do not do that any more. For XC2S300E with one bank width, put in 2 or 3, not more. So there is no need for using the slower registered SDRAMs. Above pin counts are for 1 bank of DIMM slots. For 2 banks duplicate some of the control signals and most likely also S and CLK (which so grow with width and depth one set for each DIMM slot) (#lookup: which signals need duplicating).

Memory Cache

SDRAM is fairly slow to access, so cache it in SRAM.

The XC2S600E has 72 BRAMs (= 36kByte), usable for internal (L1) cache, data and tags, so that will allow max 32kByte size (so long as no other use of BRAMs is intended!). This may be too small for a 100+MHz system, so the board may want to offer an external (L2) cache. The XC2S300E has got only 16 BRAMs (= 8kByte), so that tops out far lower.

OTOH PCs introduced L2 caches in the days of 486DX33es loading over an 33MHz*32/8 = 133MByte/s bus from FP DRAMs and with only 8kByte L1. In the days of 100-166MHz PCs these were 586-1xxes with 66MHz*64/8 = 528MByte/s bus from EDO DRAMs and 16kByte L1. Here we have 100-200MHz and memory up at 2128..6400MByte/s bus from SDRAM and can have up to 32k of L1.
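The bus bandwidth figures above are all transfer rate times bus width. A small Python sketch, assuming a 128bit data bus for the SDRAM cases and transfer rates of 133 and 400 million transfers per second for PC133 SDR and PC3200 DDR (the function and table names are illustrative):

```python
def bus_mbyte_per_s(mtransfers_per_s, width_bits):
    """Peak bus bandwidth in MByte/s for a given transfer rate and width."""
    return mtransfers_per_s * width_bits / 8

# (description, million transfers per second, bus width in bits)
EXAMPLES = [
    ("486DX33, 32bit FP DRAM bus", 33.33, 32),
    ("586, 64bit EDO DRAM bus", 66, 64),
    ("PC133 SDR at 128bit", 133, 128),
    ("PC3200 DDR at 128bit", 400, 128),
]

if __name__ == "__main__":
    for name, rate, width in EXAMPLES:
        print(f"{name}: {bus_mbyte_per_s(rate, width):.0f} MByte/s")
```

These are peak figures only; setup and refresh overhead are not modelled.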

And with a 4*4word cache reload granularity, that gives a setup wait of perhaps 4 to 8 clocks every 16 clocks, so only 1/5 to 1/3 slowdown when L1 fails and we need to load direct from SDRAM.
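The 1/5 to 1/3 figure can be reproduced under one reading of the sentence above: the setup wait is added on top of the 16 clocks, and the slowdown is the share of the total spent waiting. That interpretation is an assumption; the Python sketch below just shows that it yields the quoted fractions.

```python
from fractions import Fraction

def miss_slowdown(setup_clocks, work_clocks=16):
    """Share of total time lost to setup wait (assumed reading of the text)."""
    return Fraction(setup_clocks, setup_clocks + work_clocks)

if __name__ == "__main__":
    for setup in (4, 8):
        print(f"{setup} setup clocks per 16: slowdown {miss_slowdown(setup)}")
```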

So perhaps save the cost (layout time, parts, board space and design complexity) of L2 cache. [Andrew: I don't believe that L2 cache is actually worth it. Better implement a good L1 cache. OTOH on an XC2S300E there is a lot less L1 space, and L2 on that is only half the width, so it may be worth it there]

If L2 cache does go in, there are 2 variants:

I suggest using 8 256kx18 (gives 4MByte at 128bit used), which are common on PC motherboards. Possibly only 4 128kx36bit (gives 2MByte) chips, but this is inferior, so do it only if space or cost crunched. [Andrew: bad cache to memory size ratio is one reason PCs are inferior to workstations or real servers, so make the cache large if possible. And if we put in cache, make it large to get the maximum from it]

This gives 144bit, which allows byte granularity parity on 32/64bit systems (the reason these chips were made 18bit), but 36bit systems will have to go without cache parity or take bits from the tag bits (reducing cachable address space), which is not too bad if one uses word granularity parity (costs only 1bit, only halves address space). ECC is senseless in cache: it is too slow, and one can simply reload from main memory in case of a parity error.

L2 cache will have to share the 128/144/160bit data pins (SDRAM is only used when the cache has failed to return data, and the cache loads what SDRAM delivers). But so DMA from in-FPGA video or disk controllers will also kill same-pin L2 accesses.

Many PC processors use separate frontside (main SDRAM) and backside (L2 cache) buses. There are not enough pins for 144/160+144bit (see the 320bit discussion above), so this would go down to 72/80+80bit. That gives slower L2 to L1 reloads, and also slower SDRAM to L2. So separate frontside and backside is only good for PC processors, where only the 64+8bit SDRAM to L2 load goes over the few external pins of small PGA370 chip cases and the 128bit L2 to L1 path is inside the chip. With FG676 and no chip-internal L2, just one 160bit bus is faster (and simpler).

Tag bits for a simple set associative cache need to be address bits of main memory minus address bits of cache, so 30|32|36 - 17|18 = 12..19bits. Plus a few for "row valid" bits. So use a 9th 256kx18 chip, which gives 18 tag/control bits. For tag writing while still reading ECC on memory to L2 loading provide extra tag data pins, do not share ECC pins (of 160bit memory if used). Also separate tag address (18 or 17) and control (few) pins. (#lookup: 256kx18 pinout, in particular control signals) Tag bits for an n-way set associative cache are n times larger, so too expensive for such a large cache.
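The 12..19bit range follows directly from the rule stated above (main memory address bits minus cache address bits). A minimal Python sketch, with a hypothetical function name, that enumerates the combinations from the text:

```python
def tag_bits(memory_address_bits, cache_address_bits):
    """Tag width by the rule in the text: memory address bits minus cache address bits."""
    return memory_address_bits - cache_address_bits

# All combinations named in the text: 30|32|36 memory bits, 17|18 cache bits.
TAG_WIDTHS = {
    (mem, cache): tag_bits(mem, cache)
    for mem in (30, 32, 36)
    for cache in (17, 18)
}

if __name__ == "__main__":
    for (mem, cache), width in sorted(TAG_WIDTHS.items()):
        print(f"{mem}bit memory, {cache}bit cache address: {width} tag bits")
    print("range:", min(TAG_WIDTHS.values()), "to", max(TAG_WIDTHS.values()))
```

Valid bits and other control bits come on top of these widths.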

This L2 cache is also enough memory space for some smaller-memory designs to be possible with no DIMMs at all. It is simpler to use for such designs (but how much?). If they want 160bit width for 144bit+ECC, misuse the now unused tag bits for this (gives them 162bit).

This L2 cache can also be used as microcode store for a horizontally microcoded processor, if the BRAMs are not enough for this (but even the larger model B KL-10 had only 2kx96bit microcode, which is 48 BRAMs, so that would fit into 2/3 of the BRAMs). This can also use the full 162 cache+tag bits. On XC2S300E again this is more likely to be useful.
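The BRAM count for a microcode store can be estimated as below. The 4096 bits per BRAM is derived from the "72 BRAMs = 36kByte" figure given earlier; the sketch assumes the BRAMs can be configured so that no bits are wasted, and the function name is illustrative.

```python
from fractions import Fraction

BRAM_BITS = 36 * 1024 * 8 // 72  # 4096 bits each, from "72 BRAMs = 36kByte"

def brams_needed(words, width_bits):
    """Number of BRAMs for a words x width store, assuming no wasted bits."""
    total_bits = words * width_bits
    return -(-total_bits // BRAM_BITS)  # ceiling division

if __name__ == "__main__":
    kl10 = brams_needed(2048, 96)
    print("KL-10 microcode:", kl10, "BRAMs")
    print("share of XC2S600E (72 BRAMs):", Fraction(kl10, 72))
    print("fits XC2S300E (16 BRAMs):", kl10 <= 16)
```

With these assumptions the 2kx96bit store needs 48 of the 72 BRAMs on the XC2S600E and does not fit the 16 BRAMs of the XC2S300E.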

Form Factor, Case and Power

Board Type

This board should be in ATX format, and use a standard ATX power supply. This is given by Andrew's part of the spec. There are multiple possible ATX based board formats: [Andrew: For me MicroATX or FlexATX are the preferable sizes, depending on the space required. Slightly more preferred of these is FlexATX]

Power Supply

ATX supplies provide high-load 5V and 12V and lower-load 3.3V and a bit of small stuff (5V standby, -5V, -12V).

An on-board voltage converter 5V to 1.8V (or 12V to 1.8V) is needed for VCCINT (FPGA core). Preferably from 5V, leaving the 12V for drives. Do not use 3.3V to 1.8V, as 5V and 12V are on most newer ATX power supplies way more plentiful (about half of power each) and more stable (primary switching regulator input, 3.3V is often only auxiliary power, just a simple linear regulator from 5V).

VCCO (IO) and memory 3.3V can be taken directly from the ATX power supply, it is good enough for that.

So this design should require only 12/5/3.3V parts. 1.8V only for VCCINT, else parts may interfere with the FPGA core operations, and they would require 1.8V availability away from the FPGA also.

The board should be capable of controlling the ATX power supply.

Front Panels

There is no operators console / front panel (desirable for hardware debugging and teaching computing fundamentals) on an ATX case. So no loss from choosing this format here. This even makes the front panel optional and user selectable, saving cost for those users who do not want one.

Use LEDs and switches. To fit the 17 or 6 IO pin count, this will have to be a scanned matrix arrangement. In the FPGA grab all data lines and shift them out, 1 bit at a time. In parallel to these data bits, provide their column number to an external TTL or CPLD based column decoder. Or drop the number and use an external column counter and reset/clock signals to drive the decoders. Or use a row of FFs with a bit travelling along them. At the same time read the switches into a shift register, and transfer them to the action circuits when all are in.

Expansion and Bus

Peripheral Component Interconnect (PCI) Bus Slots

Expansion for cards will be PCI bus/slot(s). This is also given by Andrew's part of the spec.

PCI allows both complete commercial cards and prototyping cards (both for simple random circuits or FPGA based ones) for user designed additional devices.

PCI is ideal for cloning computers that have PCI busses, as it allows the actual PCI cards that these systems use. PCI is also decent for any 32bit computer, this just requires adding PCI to its design, and gives a lot of cards in return for including this, just write drivers for them.

Wide (64bit) PCI is most likely not important enough (only used in high performance servers) to justify its inclusion in this board. OTOH it may be a good alternative to the CDEC below.

Fast (66MHz) PCI is not important enough (only used in high performance servers) to justify its inclusion in this board. Also it requires 3.3V which may be incompatible with the PCI bus driver chips (would need switchable dual voltage operatable parts).

Signal wise PCI has 32 multiplexed D and A signals, plus quite a few control signals. Some controls seem to be individual-slot. (#lookup: PCI signals, what needs to be individual-slot and what can be all-slots) Make at least 2 of each slots 4 INT pins individual-slot, for decent interrupt designs, real plug and play.

For 36bit systems PCI cards can be used, simply wasting 4 bits of the IO data bus, like the original KS-10 which used 16bit Unibus PDP-11 cards. Put an 8*36 to 9*32 converter into the DMA circuits in the FPGA.
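The 8*36 to 9*32 conversion works because both sides hold 288 bits. The Python sketch below models one possible packing; the bit ordering (first 36bit word in the most significant position) and the function names are assumptions, since the text only fixes the 8:9 ratio.

```python
MASK32 = (1 << 32) - 1
MASK36 = (1 << 36) - 1

def pack_36_to_32(words36):
    """Pack eight 36bit words into nine 32bit words (assumed MSB-first order)."""
    if len(words36) != 8:
        raise ValueError("need exactly eight 36bit words")
    acc = 0
    for word in words36:
        if not 0 <= word <= MASK36:
            raise ValueError("word does not fit in 36 bits")
        acc = (acc << 36) | word          # build one 288bit value
    return [(acc >> (32 * i)) & MASK32 for i in range(8, -1, -1)]

def unpack_32_to_36(words32):
    """Inverse of pack_36_to_32: nine 32bit words back into eight 36bit words."""
    if len(words32) != 9:
        raise ValueError("need exactly nine 32bit words")
    acc = 0
    for word in words32:
        if not 0 <= word <= MASK32:
            raise ValueError("word does not fit in 32 bits")
        acc = (acc << 32) | word
    return [(acc >> (36 * i)) & MASK36 for i in range(7, -1, -1)]

if __name__ == "__main__":
    data = [0x123456789, 0xFFFFFFFFF, 0, 1, 0xABCDEF012, 0x800000000, 0x7FFFFFFFF, 0x0F0F0F0F0]
    packed = pack_36_to_32(data)
    print([f"{w:08X}" for w in packed])
    print(unpack_32_to_36(packed) == data)
```

In hardware this would be a shift register or multiplexer network rather than a big integer, but the round trip shows that no bits are lost.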

AGP is a single slot PCI variant, which is only usable for video cards. There are no prototyping boards available to misuse it for any self built stuff. [Andrew: Or do you know of such boards?] As it is such a large amount of single-purpose pins (unless shared with an ISA connector based CDEC, see below), better provide on-board VGA.

Industry Standard Architecture (ISA) Bus Slots

As it is most likely necessary to define pins for the ISA bus (for driving ATA), possibly also provide some actual ISA connectors in the PCI slot space, as they can mechanically share a slot with PCI. This also allows using the plentifully available ISA cards, including data acquisition stuff, some of which does not exist in PCI.

Signal wise these can share pins with the PCI bus anyway, as an IO instruction will only address one of these buses at any one time. Dual bus requires 2 sets of bus drivers, so that the ISA stuff does not slow down PCI with its capacitive load. Apart from that it is then just connector and wiring cost.

Signal wise ISA has 16 D and 24 A, and various controls. All ISA signals are all-slots, none individual-slot. So no per-slot separate wiring for decent interrupt designs. (#lookup: ISA signals)

Put ISA D0-15 on PCI AD0-15 to save FPGA internal multiplexers on data lines. One possibility is to (mis-)use PCI AD16-31 for ISA A0-15 and possibly some PCI controls for ISA A16-23 (if included). But better latch the ISA A0-23 lines from PCI AD0-23 onto external registers in the A bus drivers (which is what happens on the PCI cards anyway), as this saves FPGA internal different multiplexing of PCI and ISA address signals (2 times address in different places) and is so easier to use.

It would actually be nice if the board offered all once often used PC/AT "legacy" interfaces, such as ISA, PS/2, RS232, LPT, floppy, as this makes the board (with an x86/PC FPGA configuration) usable as a legacy PC, when the industry kills off these interfaces in its zeal to cut dumb user support costs at the cost of power/laboratory users. This creates an additional, possibly large, market for the board.

32bit ISA variants such as EISA and VESA VLB are not important enough to warrant supporting them. Anything requiring that power will have to go over a PCI slot (and exist for PCI).

PC/104 is just an ISA variant that uses exactly the same signalling/pins, and can even share the ISA bus drivers; it just has a different (IDC) connector. So adding this is low cost, and it may be sensible to include it, to make embedded systems usage easier. Definitely no PC/104-Plus, which is the upgraded PCI over IDC version, as next to no cards need it (less than 10% can even use it).

Own Custom Expansion Bus Slots

PCI and ISA on an FPGA-PC really only fix the connector type, the positions of hardwired pins (such as power) and the slot mechanics. They dictate that most/all signal lines are the same FPGA pin for all slots. Problem may be with buffers and in what sets they are direction switchable. (#lookup: bus buffering interfering)

So one can use any own/custom bus signalling on the PCI or ISA signal pins, so long as the bus buffers don't get in the way. This should be good enough for any bus signalling design. Then simply use commercial PCI or ISA prototyping cards (which don't fix data pins either) and implement whatever device/interface one wants, possibly using a 2nd FPGA on the card. [Andrew: One of my users even suggests that this may create an aftermarket for add-in boards]

PCI has 6 "reserved" lines, of which 4 can be used for 32bit to 36bit extension. This use loses existing card compatibility, but allows a true 36bit "PCI-36" bus. So this board should pin out all 6 of these (4 all-slot as addresses, 2 individual-slot for flexibility, or all individual-slot if enough pins).

Custom Direct Expansion Connector

Any bus using shared lines for all slots on most pins requires decoder circuits on the cards, not in the FPGA. This makes minimal IO cards more work than just simple analog converters and connectors. But so long as only one card is used, one can ignore this issue and (mis-)use a single PCI or ISA connector as a set of direct (apart from bus drivers) IO pins from the FPGA to that one card. This usage type is quite possible with ATX on-board IO and FG676 reducing slot card usage/necessity. With the XC2S300E and its loss of 185 IOs (and only saving 88 or 123 on quad->dual memory) the CDEC as 2nd independent bus is most likely the second victim.

But the above kills off all the PCI and ISA slots, which is quite a high price to pay for one simpler card. And the "dangling slots" may disturb high speed signals. Also any PCI or ISA bus drivers will interfere.

Preferably do the CDEC on the first slot, for shortest double wiring and no laying out of many traces through the rest of slot space. Possibly use a different colour PCI (or ISA) connector for the CDEC (unless this drives up part costs too much).

User IO Devices

User IO is a near infinite field, so too specialised stuff will need to go over PCI or ISA. But all standard stuff should be on-board, so that operation is possible with no need for add-in cards (for lower cost, less space and power usage).

The board should have all IO that a typical PC-experienced user would want to use. It should offer all stuff that is built into a typical ATX board, unless there are good reasons to override this. Typical is assumed to be:

All ATX area user accessible IO (PS/2, LPT), and possibly all internal user wirable IO (LEDs&switches, speaker, so only 7 pins), that runs directly to the FPGA should have: anti-shorting resistors (against user miswiring) and over-/under-voltage protection (against ESD or user miswiring), to prevent the FPGA (and so the entire board) being destroyed.

IO with unsolderable/replaceable components in between such as VGA (DACs), RS232 (drivers), Ethernet (chip), USB (chip), sound (DACs) can be without, if it adds too much work/cost. But preferred with (if no signal trouble, as may be at VGA speed). [Andrew: This is a user request, I second it] Also protection, so that they are hot pluggable without killing stuff. [Andrew: Another user request]

PC case LEDs (Power/Turbo/HDD) & Switches (Turbo/Reset/Keylock)

Signals are simply 3 in and 3 out pins, running to a block of male IDC connectors. The LEDs and switches are in the case anyway, so provide this next to zero cost function. Can be used not just for their official functions, but also for any type of 6 single bit usages, such as using Turbo LED for network traffic, or even entirely strange stuff such as front panel. [Andrew: One of my users commented this line with: "what is strange about a front-panel ;-)"]

PS/2 connectors

For console terminal keyboard and mouse. Signals are simply 2 bidirectional IO pins (data and clock) and 5V and GND pins. Are input when sending keys, outputs when PC resets/initialises device, or when setting keyboard LEDs. All FPGA prototyping boards with LVTTL capable chips seem to just wire these to 2 IO pins. (#lookup: what does PC do against collisions).

Very low cost, so no reason for putting in only 1. For these use one of the standard "vertically stacked" violet+green connector pairs.

VGA connector

For console terminal video monitor, or for PC/workstation type graphical video. Signals are 3 analog out channels and 2 out pins for H/V signalling and 3 in pins for monitor ID (these can be ignored, no one uses them today with multisync monitors). Blue HD15 connector. Increasing colour depth allows: This uses up 3*(2..12)+2 pins, but video output is essential to nearly any FPGA-PC design.

For the actual bits to signal DACs use one of (#lookup: available DAC chips):

VGA video out of the FPGA comes from main memory, so it is a UMA (unified memory architecture) system. But memory bandwidth is no problem here. The maximal 21" usable 1600x1200@70Hz gives (1600*5/4)*(1200+32)*70 = 172.48MHz pixels. At 24bit/pixel from minimal 128bit/cycle this gives max 172.48*24/128 = 32.34MHz memory. That is only 24.3% of PC133 SDR and 12.15% of PC2100 DDR and 8.1% of PC3200 DDR. It stays way under the 50% of memory bandwidth, that could kill off processor power. Note though that VGA memory access not only blocks main memory, but also the L2 cache, which uses the same data bus pins, so the 50% limit may be optimistic. [Andrew: I regard the 50% as realistic, as the processor only needs 25% of PC133 SDR when running worst-case cache missing code]
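The video bandwidth estimate above can be reproduced with a short Python sketch. The blanking model (horizontal total = visible * 5/4, vertical total = visible + 32 lines) is taken from the formula in the text; the function names and the memory rates of 133, 266 and 400 million transfers per second for PC133, PC2100 and PC3200 are assumptions of this sketch.

```python
def pixel_clock_mhz(h_visible, v_visible, refresh_hz,
                    h_total_factor=5 / 4, v_blank_lines=32):
    """Pixel clock in MHz using the simple blanking model from the text."""
    h_total = h_visible * h_total_factor
    v_total = v_visible + v_blank_lines
    return h_total * v_total * refresh_hz / 1e6

def memory_rate_mhz(pixel_mhz, bits_per_pixel=24, bus_bits=128):
    """Memory transfers (in MHz) needed to feed the pixel stream."""
    return pixel_mhz * bits_per_pixel / bus_bits

# Assumed transfer rates in million transfers per second.
MEMORY_RATES = {"PC133 SDR": 133, "PC2100 DDR": 266, "PC3200 DDR": 400}

if __name__ == "__main__":
    pixels = pixel_clock_mhz(1600, 1200, 70)
    memory = memory_rate_mhz(pixels)
    print(f"pixel clock: {pixels:.2f} MHz, memory rate: {memory:.2f} MHz")
    for name, rate in MEMORY_RATES.items():
        print(f"{name}: {100 * memory / rate:.1f}% of bandwidth")
```

Calling `memory_rate_mhz` with a smaller `bits_per_pixel` or a narrower `bus_bits` shows how the load changes for the XC2S300E half-width case.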

An alternative is to only provide a small video section (up to 4 or 5bit and <100MHz) and use a premade video card (or the CDEC) for anything more. Problem is that PCI video card supply is not safe in today's AGP dominated video world, and AGP video cards require a dedicated AGP slot which is only good for them (unless paralleled with an ISA based CDEC). And losing CDEC for just getting VGA is not good either.

VGA is not in the list of "typical" ATX stuff above. So this will require dropping other connectors. Game is the most likely, sound also possible. Preferably not one of the RS232 ports.

2nd VGA connector

A 2nd VGA connector would be good for dual headed display, or showing an operators console / front panel on a 2nd monitor (instead of special panel hardware), or even together with (mis-)using the mouse PS/2 for a 2nd keyboard to make 2 terminals (one can then still use 2 RS232 mice for the 2 terminals with full KVM user IO). [Andrew: I would like dual video. But this is quite specialised stuff, perhaps too much. One user would like triple video (has multiple smaller monitors), and I would not mind that either (I have got 3 monitors)]

This costs a second full set of 3*(2..12)+2 pins and the stuff behind them. Because of the high transfer rate they are not sharable. RAMDAC programming pins, if needed, are sharable at least. Memory bandwidth is now also doubled, to max 48.6% of PC133 SDR or 24.3% of PC2100 DDR or 16.2% of PC3200 DDR. And there needs to be ATX space for 2 VGA connectors.

An alternative would be to use the CDEC if users want 2nd VGA. If making such an expansion board (not in base board price any more), it should provide 2 (2nd/3rd) VGA connectors on it. This means the CDEC connector will need at least 2 times 3*(2..12)+2 pins (or more if RAMDACs, but these 2 RAMDACs can share most programming pins) on it, which is the case if it is a "2nd PCI bus" or an "ISA connector CDEC parallelling AGP". [Andrew: This looks like the better version. Single VGA as standard, 2nd/3rd as extension for those users that want them]

PC Beep Speaker

For terminal bell/beep. Signals are simply 1 out pin with simple transistor amplifier via LEDs&switches IDC to speaker. The speaker is in the case anyway, so providing this function is next to zero cost.

RS232 Serial Interface(s)

For external terminal(s) for system console or user. Or for talking to any random RS232 device. Also use for simulating paper tape reader/puncher with a PC at the other end or even for real PTR/PTP devices.

Signals are simply 2 data lines and GND pin. Strongly preferred not just 3-wire, but with full pinout, with control lines (= 3 outs, 5 ins, so this only uses 8 pins), so that modems and RS232 mice can work. Drive data and control lines by 1488+1489 or MAX232/MAX233 or similar. Using a commercial U(S)ART chip is not worth its cost, and fixes the register interface, making historical cloning difficult. For this use the standard turquoise/cyan female DE9 connectors (DB25 uses too much connector space, converters exist).

Is a 2nd RS232 useful? Today with PS/2 mice, and switching to ADSL/CATV Internet the RS232 interfaces are less and less being used. OTOH, even newest commercial motherboards still have 2 RS232 on them (unless they have on-board VGA displacing one) and RS232 uses only a small connector and 8 IO pins. And the historic computing people want RS232 a lot. So put 2nd in if space and pins allow. [Andrew: Actual user request that 2 go in. I second it. I would prefer the on-board VGA to displace the standard ATX game or sound connectors, so that 2 RS232 can stay]

LPT/Printer Port/General User Port

For printing. Signals are simply 17 straight IO pins and 8 GND pins. For this use one of the standard red "printer up top, above RS232/VGA" DB25 connectors (36pin is way outdated, converters exist).

This is also (mis-)usable as a general user port (see parallel port disk drives). This is actually a very universal port, because there are no special circuits (not even buffers) behind it, just 17 pure (LV)TTL signals direct from the FPGA. It is sometimes also called the geek port for this.

For a general user port it lacks power supply pins, so possibly make some of the pins switchable (FPGA-driven relay or power transistors or just DIP switches?) to become power pins. I suggest 1*12V + 2*5V + 2*3.3V. [Andrew: 2 users have commented on this feature being a great idea, both users that definitely want to buy the board, so it needs to go in]

Using 5 of the 8 GND pins saves losing IO pins, but risks shorting by plugging in a standard LPT cable, if it combines all 8 GNDs (a PC power supply should just switch off). Using some of the IO pins can not short and so not crash the FPGA-PC via power failure, but loses them as IOs. [Andrew: I prefer using GND pins, as IOs are too valuable, that is also the reason for 1+2+2 power pins, so that 3 GND pins remain usable]

[Andrew: Such a user port may also be a good alternative for your PDQ8 and Minibus ideas. It has 17 data lines (one less than 18), without wasting another 17 pins on duplicates (more for the PDQ8's local IO). Comes on a DB25 male plug, ideal for a long 25pin RS232 cable to a DB25 female on the (first) target peripheral, with there a 2nd DB25 male for further targets (like external SCSI cabling, but smaller cables/plugs). And then just a DB25 female plug with terminators in it (if the bus needs terminating) so no SCSI like trouble. And it can even provide power for small peripherals, so saving power wiring mess]

Additional LPT/Printer Port/General User Ports

Given the universality of such a user port, possibly put in multiple of them. These are not part of typical ATX, so sacrifice other connectors: [Andrew: At the moment the 2 user port version, sacrificing the game port for 2nd, VGA taking the sound ports, seems to be the best]

Real Time Clock (RTC)

For setting the system clock at boot time, as any computer designed in the past 15 years can do. [Andrew: look around what exists in battery powered clock or time counter chips, I really do not know enough about the available parts here]

Signals, if it needs separate data/address pins connect A to the ISA A bus, after its bus driver chip (and so decoded address), if common A/D connect it direct to the common PCI/ISA FPGA pins, data anyway connect there, separate just chip select and alarm interrupt pins. [Andrew: There exist quite a few I2C (only 2 pins) based clock devices. But I2C is patented, which would require organising and paying per-copy license costs, either by any person implementing the FPGA-side I2C driver in their design (impossible for open source, as no per-copy money to pay it with) or by you as some form of "cover all uses" license. As no other I2C using chips (temperature or voltage sensors) are being used, this makes I2C not worth its costs, unless they are very low]

For simplicity power the RTC from a replaceable Lithium battery, no rechargeables and charging circuits.

Battery backed CMOS RAM

For storing setup data. The RTC chip CMOS RAM is limited in size (usually 128-512bytes). Even in newer PCs additional CMOS (extended setup) had to be added, because the RTC CMOS was too small. Macs have large ROM patch CMOSes.

Such an large battery backed CMOS SRAM can also be used to pretend that it is an ROM or flash ROM, for designs that want to see such an chip for firmware, be that BIOSen (as in PCs) or operating systems in ROM (as in older Macs or Amigas or Atari ST) or comfortable boot monitors (as in newer Macs, or in Suns and Vaxen).

So put in an larger battery backed CMOS SRAM.

Signals: it needs separate data/address, so connect A to the ISA bus, after its bus driver chip, data direct to the PCI/ISA FPGA pins, and separate just the chip select pins. But better wire the CMOS parallel to the main memory data and addresses. This gives no collisions with IO (good for CMOS as main memory), but may require multiplexers that slow down DIMM memory accesses.

For simplicity power the CMOS from the same source as the RTC.

Sound

This would have to be CD quality (2*44.1kHz*16bit) to make it worth doing. Circuits just for phone quality 8kHz*8bit are definitely not worth it. And for simple mono sound effects one can use the already provided PC beep speaker. [Andrew: Are your users expected to want sound? One of mine wants to listen to music, is thinking of making an noise-less (no disk and no ventilators) bedroom .mp3/.ogg/.shn box, with an cron job as alarm clock replacement!]
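As a rough check of what the two formats mean in raw data rate, the arithmetic can be sketched as below. This is only an illustration, the function name is made up here and not part of any design.

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Raw (uncompressed) PCM data rate in bit/s."""
    return channels * sample_rate_hz * bits_per_sample


cd = pcm_bit_rate(2, 44100, 16)    # CD quality: 2*44.1kHz*16bit
phone = pcm_bit_rate(1, 8000, 8)   # phone quality: 8kHz*8bit, mono

print(cd, cd // 8)        # bit/s and byte/s for CD quality
print(phone, phone // 8)  # bit/s and byte/s for phone quality
```

So CD quality needs about 176 kByte/s per direction, which is small compared to any of the buses on this board.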

Standard PC usage today seems to be line-out DAC (light green), line-in ADC (light blue) and mic-in (pink) jacks. So in addition to DAC+ADC perhaps also put in an mic amplifier and analog switch before the ADC.

[Andrew: Sound is the place where I am the least decided whether it should go in or not. This has been put into the "Rejected" section twice, and retrieved again from there both times. I am slightly against it, preferring the more FPGA-ish 2nd printer/user port variant]

Ethernet

Good for network/remote connections, can be used just for an full NIC interface, or for simulated RS232 terminal network logins on old non-Ethernet systems. Also today the increasingly preferred way to the Internet, needed for ADSL or CATV but also often used for ISDN or even POTS. Can also be used to connect Ethernet-accessible peripherals (printers, disks, custom stuff). Ethernet was invented as "the successor to RS232".

This is quite a specialised interface, with hardware only usable for networking. And not every user will use this, but many will: workstation users, X terminals, stationary MP3 players. [Andrew: Do you intend your Leons to be networked? I want it, for workstation and server usage. The above sound user wants it to fetch his music. Another user wants it to connect real historic tape drives with small Ethernet/microcontroller/controller combos in them!]

Definitely 10Mbit/s, but also 100Mbit/s as the chips for this are not much more expensive. Gigabit Ethernet is too exotic for this board, if it is wanted for a server use a PCI card. The old 10base2 Ethernet, or even worse AUI/10base5, is today too seldom used to bother with, if required use a PCI or ISA card or external converters.

[Andrew: I prefer the PHY Ethernet version. As it allows both PHY (on board) and full (on PCI card) Ethernet variants, without users needing to do soldering. Full on board adds work and cost to the board without much real advantage (PCI cards are cheap and plentiful, so only space saving) and loses PHY without soldering (unless a pre-made module is offered). No Ethernet also loses PHY without soldering, and needs PCI, and does not save much, so no gain either, and is the worst variant]

2 or 3 (or even more, slot hole fits about 6) Ethernet interfaces would allow making routers or bridges or even switches. This would make an noiseless (runs 24h!) router. But this is very specialised, just like 2nd/3rd VGA connectors. So put in only 1 interface and do others as CDEC or PCI. [Andrew: One user is thinking along these lines (router)]

Universal Serial Bus (USB)

Used by quite a few newer low speed peripherals. Can replace PS/2 for keyboards and mice, RS232 for modems, LPT for printers, dedicated sound and even support many non-traditional devices, such as connections to cameras, flash or harddisk drives, smartcard readers, etc.

But like Ethernet this is a specialised interface. It requires a specialised chip, and the connector can only be used for USB, not anything else. But also some/many users will want this. [Andrew: I presently do not use any USB. One of my users commented that I have too many devices, and that he regards sound, Ethernet and USB as candidates for removing. Another user wants USB (and suggests even dropping PS/2 and just using USB). I think that PS/2 and USB should go in, PS/2 being easier and cheap, USB doing lots]

USB 2.0 seems to have no use outside of fast flash/disk interfaces. OTOH USB 2.0 is most likely only insignificantly more expensive, so put this in, unless it is too much more expensive.

Put in 2 ports, as many uses for them. Together with Ethernet use one of the standard "Ethernet above 2 USB" connector sets. If the USB chip offers 4 ports put the other 2 USBs on internal connector(s), for wiring them to the case front connectors increasingly common these days. (#lookup: what internal connectors are standard for this)

[Andrew: Here I strongly prefer the full USB version. Only none if also no Ethernet and going for "4 printer/user port with still VGA" design (which is unlikely). The PHY USB version is IMHO inferior on all counts, user will need to solder their own module if they want this unlikely configuration]

Rejected Interfaces

Analog (DB15, PC) Game/Joystick/MIDI connector(s): Joysticks are seldom used, and users that want them can use USB devices. Also DB15 joystick interfaces are simple. Signals are just a few pins and a bit of analog stuff, and can be implemented as a user port plug-on module, by anyone who wants one.

But the better alternative is to use the limited ATX connector space to provide the far more useful and flexible combination of on-board VGA and 2 RS232 and still having 2 printer/user ports. Use the "above sound" space for a 2nd "on top" printer/user port connector. [Andrew: What is the exact target audience for your Leons? Do they game? 2 printer/user ports and full 2 RS232 and on-board VGA are all way more important than joysticks or sound for me, and my users]

Digital (DB9, Atari/C64) Joystick connector(s): Users that want digital joysticks can misuse the printer port (without module), as these are only 5 switches and open/closed input pins. This is the normal way on a PC anyway (see the Linux "multisystem" joystick driver for the wiring).

S/P-DIF digital sound out/in connectors: Normal PCs don't have these either. Also I do not know what circuits/signals these require, as I have never seen one. We do not have ATX connector space for them. And there exist PCI cards, which can be used. And this also can be done as an printer/user port plug-on module. [Andrew: Are serious sound processors to be expected to be among your users? Mine not]

MIDI audio control connectors: Would be part of the DB15 game/joystick/MIDI connector on a PC. Signals are a few 3-wire RS232es as far as I know (never done MIDI). PC usage actually requires an external "break out" cable (as the game DB15 is shared with the joystick) as far as I know. This can also be done by misusing a user port connector.

Oscilloscope display output: These were used in computers before the 1970s. They have a program cyclically plotting the picture, with no video memory (then expensive) to be scanned out. Signals are X/Y coordinate and Z or ZR/ZG/ZB intensity. Simplest is to misuse the existing VGA connectors RG as 8bit X/Y and B as 8bit Z. But this gives only 256x256 positions and B/W 256 gray levels. If 3*10..12bit VGA the positions go up, but colours stay B/W only. For full 256^3 colours use VGA RGB for ZR/ZG/ZB and then add 2 separate 10..12bit X/Y via the CDEC or a printer/user port module.

Composite or SVHS video output: This would be for standard TV/video monitor output. Users that want any of these will have to use an PCI card, with also the needed analog stuff and video modulator on it. Same also for any video input digitiser and tuner. Or this also can be done as an user port plug-on module.

IEEE1394 (Firewire): Today only of importance for digital video in/out. Possibly also for fast external disks, like USB2. If someone wants this specialised feature then use the PCI bus and put a card in there.

IEEE488/IEC625 connector or other industrial stuff: Used too seldom for standard users. Use an PCI (or is ISA needed for this old stuff?) card if one wants any of these, or use the CDEC to directly drive such an system. Alternatively use an user port plug-on module (are 8 shielded data and 8 control signals IIRC).

Modem or ISDN or other comms equipment: Very specialised, badly documented signalling, regulatory problems. Existing devices are plenty and good enough, and can be connected by existing RS232 or Ethernet.

Storage Devices

Storage devices are an even worse infinite field of designs than user IO, in particular for people cloning historical systems. So the board should just offer a small and sensible set, usable in many designs. For all the rest use the PCI bus, a printer/user port or use the CDEC.

Floppy connector

Still the standard drive for small amounts of user data, low volume install and backup, and cheap media for giving away. This is also usable to load an FPGA-CPUs microcode (and that is what floppies were invented for), but that is only needed for microcode in external memory, as internal BRAM memory is loaded together with the FPGA configuration.

Also use the floppy for simulating DECtape or even paper tape reader/puncher or similar small media.

Signals are simply 17 straight IO pins (2,4,..,34) and 17 GND pins (1,3,..,33) with a little bit of analog stuff. Specialised FDC chips are not advisable, as they fix data format, bad for GCR or non-2^n bit systems.
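The pin split of the 34pin floppy header described above can be written down in a few lines, for example for generating a pin table. A minimal sketch, the variable names are just made up for this:

```python
# 34pin floppy header: odd pins are GND, even pins carry the signals.
gnd_pins = list(range(1, 34, 2))      # 1, 3, ..., 33
signal_pins = list(range(2, 35, 2))   # 2, 4, ..., 34

print(len(gnd_pins), "GND pins:", gnd_pins)
print(len(signal_pins), "signal pins:", signal_pins)
```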

40pin ATA connector(s)

For ATA disks or ATAPI CD/tape drives. Cheaper drives than SCSI. [Andrew: important for some users, including some of mine] Also possibly more similar to some old controllers, easier to clone them at register level. And is of course what PCs and many others today expect.

Signals not investigated yet, but less than 40. (#lookup: ATA signals, after looking at ISA) Sensibly put in 2 ATA connectors, for 2*2 drives, additional signals are most likely just 2 separate select pins.

The original ATA was just an set of bus driver chips coming from the ISA bus, ATA just being an ISA bus subset. So it can share FPGA pins with the ISA bus slots.

Later, for higher speed, ATA moved to the PCI bus with address latch and DMA with cycle combining (2*16 to 32bit). So it can share FPGA pins with the PCI bus slots, preferably still with the same layout as ISA as it needs separate address lines and only 16 data lines.

ATA needs a separate set of bus drivers, so that ISA and PCI are not hindered by the long ATA cable, so PCI/ISA/ATA need 3 driver sets. Also clocking at ISA 4MHz limits bandwidth to 2*4 = 8MByte/s (original ATA), and clocking at PCI 33MHz limits it to 2*33 = 66MByte/s (ATA66), so driving ATA at a higher clock to get up to full ATA133 also requires separate bus drivers.
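The bandwidth figures here are just bus width times clock. A small sketch of that arithmetic, assuming (as the text does) one 16bit transfer per clock; the function names are made up for illustration:

```python
ATA_WIDTH_BYTES = 2  # ATA has 16 data lines


def bus_mbyte_per_s(width_bytes, clock_mhz):
    """Peak rate, assuming one transfer per clock cycle."""
    return width_bytes * clock_mhz


def required_clock_mhz(target_mbyte_per_s, width_bytes):
    """Clock needed to reach a target rate at one transfer per clock."""
    return target_mbyte_per_s / width_bytes


isa_limit = bus_mbyte_per_s(ATA_WIDTH_BYTES, 4)     # ISA clocking
pci_limit = bus_mbyte_per_s(ATA_WIDTH_BYTES, 33)    # PCI clocking
ata133_clock = required_clock_mhz(133, ATA_WIDTH_BYTES)

print(isa_limit, pci_limit, ata133_clock)
```

So ATA133 needs about twice the PCI clock on the 16 data lines, which is why it can not simply run off the PCI bus drivers.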

Also provide control signals (if any) so that UDMA mode is possible, as ATA is really slow without this (PIO mode). (#lookup: what does UDMA use, is it just PIO with ATA adapter doing the reads, or special DMA control signals)

Serial ATA (SATA) is too new and not widespread enough to make it worth including. Also SATAs announced 150MByte/s transfer rate requires 1200MBit/s of payload (1500MBit/s on the wire, as SATA uses 8b/10b coding) and Virtex-E/Spartan-IIE FPGA IOs top out at 625MBit/s. SATA will require Virtex-IIpro style RocketIO to implement, or using a dedicated SATA chip. Better use a PCI card if this is required.
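The serial bit rate argument can be checked with a few lines. This is a sketch only: the 625MBit/s IO limit is the figure from the text, and the 8b/10b line coding (10 bits on the wire per payload byte) is how SATA encodes its data.

```python
FPGA_IO_MAX_MBIT = 625  # Virtex-E/Spartan-IIE IO limit, as stated in the text


def payload_mbit(mbyte_per_s):
    """Payload bit rate for a given byte rate."""
    return mbyte_per_s * 8


def line_mbit_8b10b(payload):
    """Bit rate on the wire when every 8 payload bits are sent as 10."""
    return payload * 10 // 8


sata_payload = payload_mbit(150)
sata_line = line_mbit_8b10b(sata_payload)

print(sata_payload, sata_line, sata_line > FPGA_IO_MAX_MBIT)
```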

68pin SCSI-W connector

For disks, CD/tape drives and nearly everything else. Allows up to 15 devices, and high-power (but also high-price) ones, and better for servers.

Signals are AFAIK 1/2 of the 68, so 34, the other half being GND or inverted signal. (#lookup: SCSI signals and the various standards)

Up to SCSI-UW there were 2 incompatible single ended (SE) and differential signallings. SCSI-U2W and further up have an 3rd LVD signalling. The oldest 8bit narrow SCSI offers no advantage (and there exist 16 to 8bit converters anyway). [Andrew: This is a mess. Best provide switchable the more common slow SE and the fast LVD, as most modern cards seem to do]

A SCSI card on PCI bus would block that bus during transfers, so this can also share data pins with PCI, ISA and ATA, just needs its own control signals. The same caveats about bus speeds as with ATA apply, and already hitting from 2*40 (SCSI-U2W) operation on. Needs high power bus drivers different to ATA because of driving terminators. (#lookup: SCSI driver chips) So as SCSI also needs its own bus drivers, we need one driver set for each PCI/ISA/ATA/SCSI bus, using common data/address bus FPGA pins.

If an external SCSI-W connector is wanted, run an internal cable to a slot. Either use the second end of the cable, which allows hardwired terminator resistors on board, or far better an extended first end, which requires removable or better automatically switchable terminator resistors. (#lookup: terminators in driver chips? Tekram DC-390U3W has 3 UCC5930AMWP chips on LVD/SE and 3 DS21T07S on SE, the latter is by Dallas and are 9bit active SE terminators, the former UCC unknown but Dallas has some UCC chips in its "replacement" list)

[Andrew: I know you like SCSI, like I do. But what about doing it here? Are you aiming at client or server or embedded or multiple of these uses? What users will want the quite considerable cost of multi-drivers and support stuff? SCSI today is minority stuff. Perhaps make SCSI only available via a CDEC plug-in board (like 2nd/3rd VGA or Ethernet)? Or better go for a commercial PCI SCSI card (leaves CDEC for 2nd/3rd VGA), which is the least work (none), and no cost for the majority of ATA users]

2nd independent ATA and/or SCSI-W bus(es)

With 2 separate sets of bus pins and drivers, good for RAID 1 arrays (disk mirroring) with bus mirroring (else bus contention on writes loses speed). [Andrew: Are your users interested in high power redundant servers?]

For ATA this gives 2*(2*2) drives, for SCSI 2*15. Costs a 2nd full set of independent pins for the 2 buses, drivers and connectors (and for SCSI also termination).

[Andrew: I prefer the 2nd/3rd ATA or SCSI using the CDEC, not on board, as it is too specialised, like 2nd/3rd VGA or Ethernet. Have 1 each of VGA/Ethernet/ATA on board, CDEC for 2nd/3rd of one type. No SCSI on board (use PCI) and so CDEC still free for 2nd/3rd VGA is also OK]

Rejected Interfaces

MFM, RLL, SMD, ESDI or similar direct HD interfaces: These are only interesting for implementing old controllers down to the register set and sector format level. Would be the hard disk equivalent to how PS/2, VGA, RS232, LPT, Ethernet and floppy are implemented (just analog stuff, rest all in FPGA). And actually quite similar to floppy to implement. But such drives are difficult to get (only antique ones) and slow (factor 100 below todays), so ATA or SCSI is a lot better. If wanted, use an old ISA based controller card (no PCI cards for this anyway). Or use the CDEC connector for an own card. Or even some remote controller via printer/user port, RS232, or even USB or Ethernet.

Any form of direct tape or cassette interface: Too specialised, even more so than direct HD interfaces. Use ISA, printer/user port, CDEC, etc.

PCMCIA or CompactFlash (CF) slots: Used for memory cards or flash disks or microdrives. Would allow a fully solid state system, compact and noiseless operation, good for embedded systems (board with just CF flash disk and small power supply, for full IO capable X terminal or stationary MP3 player). These are basically yet another ISA bus variant with a different plug.

Offering only PCMCIA requires an adapter card for putting in CF cards, while offering only CF disallows PCMCIA cards entirely (but CF is today far more widely used) and offering both adds cost and layout work for 2 seldom used sockets.

But it is a complex and costly connector, particularly if one uses one of the ejectable ones. And connector plus slot space takes up quite a bit of room/boardspace. Commercial motherboards do not have them, they are mainly notebook stuff. And there do exist ISA or PCI PCMCIA adapter cards, even ATA ones. Also there exist USB adapters for PCMCIA or CF. And embedded systems can use ATA flash disks or an USB "flash plug".

SSFDC/SmartMedia or MemoryStick or Smartcards: No importance outside of data transfer to/from portable devices (such as cameras, MP3 players) or authentication stuff. Use RS232 or USB transfer. Or a USB based card reader, if the device has no interfaces. Or a PCI PCMCIA adapter with then an adapter for them.

FibreChannel or SSA or Firewire or any other fast serial device bus: While these are nice, they are difficult to do. And they have bit rates the FPGA IO can not do, the same case as against SATA. If Firewire (most likely the largest demand) were to go in, it would have to be an specialised chip, like for USB. But unlike USB here PCI cards are easy to get and will stay so for a while, so no reason to design one in. And they are unlikely to be used, as they are high end stuff. If any of these is wanted, use PCI cards.

Configuring the FPGA

External Configuring from Development System

Like any normal prototyping board this board shall offer direct configuring from an development systems parallel port.

The programming circuit should be built in (like in QST0201, good feature) with just an connector for the cable to the development system. Ideally the same (or expanded) printer port pinout and signalling as QST0201, to reuse the same configuration/control software.

But the config connector should not be in the ATX connector area, to save scarce space there for user IO ports, and make the FPGA-ness invisible to non-developer users and admins. So the connector should be internal, direct on the board. Preferably use a 26pin IDC header like that used for pre-ATX LPT slot mount ports for on-board printer ports, not DB25.

If possible from the printer port pin usage, the programming circuit interface should offer support for debugging of the large FPGA and the complex designs it allows. See the software/debug clock 3 signal section above. Ditto remote-forcing the M0-2 mode to "slave" and programming the clock 0 frequency would be good. [Andrew: What capabilities does the QST0201 circuit actually have in this respect, what printer port pins are used up? Same as in the Xilinx Parallel Cable III?]

For seeing configuration happen, put in green "done" and red "error" LEDs. And also parallel them with IDC connectors for misusing the PC case Power and Turbo LEDs for this (these are separate from the "Power/Turbo as output of the user design" IDC connectors, users can plug the LEDs for their preferred function).

Internal Configuring from EEPROM

Once the user has a running design, he will want to run it independently of any development system. This requires a board-local FPGA config mechanism. Simplest is a config EEPROM. For XC2S600E the 4MBit XC18V04 is the right part, for XC2S300E the 2MBit XC18V02 is right.

For switching external/internal config put FPGA M0-2 pins on DIP switches, like on QST0201, as fumbling and losing jumpers is annoying. Make clock 0 also settable by DIP switches (or use clock chip with internal EEPROM?). Switches should be socketed (DIP socket is cheap), so that they can be replaced with an DIP-on-IDC plug and cable for automated external setting. [Andrew: Socket is actual user request, I second it. Or if they can all be set from the external config IDC 26, then no need for this]

Users may want to have multiple configurations. Or may even want to use the FPGA-PC as development system to improve itself, generating experimental configs and needing fallback to working ones. Both of these require config time selection of which config to use.

Simplest is to swap the config EEPROM. This doesn't require any more support than socketing it, so use a PC44 case for it. But this requires chip pulling, and so stresses chips and sockets. And loading from the running FPGA-PC would even require pulling the EEPROM used for configuring while the system is running.

Possibly include an second EEPROM socket for an 2nd EEPROM for programming it. With 2 sockets make only 2nd writable, or at least the 1st write protectable. Protects main design from any accidents or sabotage.

This should be combined with an switch to select which socket to config from. Also put in IDC connectors to allow such switching from PC case Turbo switch (or Keylock) and triggering reconfig from Reset switch (these are also separate from the "Turbo/Reset as input of the user design" IDC connectors, let user decide on which use). Possibly put in circuits for the FPGA to itself demand an reconfig. This is apparently possible, somehow have some IOs trigger some of the config signals.

Alternatively misuse the parallel port and an external adapter for EEPROM writing, for more access to its socket (but is this needed with EEPROMs?). Such a parallel port EEPROM writer is also sharable with the development PC or another FPGA-PC board (but only small insignificant cost saving). But this may require more complicated switching stuff to config from there, so that other printer/user port usage is not blocked by it. So preferably not.

Internal Configuring from Flash by Microcontroller

Large config EEPROMs are expensive, even worse if multiple EEPROMs are needed for multiple configurations. Alternative is to offer config from an cheap standard flash chip. Such an chip needs to be addressed/controlled externally, so use an microcontroller (uC, such as 8051 or 68HC11 or AVR), or an CPLD. Preferably not an CPLD, as their speed is unneeded and no open source tools for programming them exist, nor data for reverse engineering.

Either use a 29Fxxx parallel flash chip, as used for PC BIOSes, which would require a 40pin uC to drive it. Or use an 8pin 45Fxxx serial flash, plus an 8pin uC. The latter was mentioned on c.a.f to be <20% of the cost of the same size XC18xx EEPROM, for the 1Mbit case. [Andrew: I note that the Xess Virtex boards both use standard parallel flash (with a CPLD to config), while their older XC4000 used OTP PROMs. They avoid the expensive EEPROMs entirely. We should also]

Either use an minimal 4MBit (= 512kByte) flash chip. Or better put in an larger 16MBit (= 2MByte) flash chip, and offer space for 4 configurations with only one flash chip (and the uC).

Because of the space for 4 configs in the latter, no multiple flash chips are needed, and no sockets or switches. Use the uC address initialisation software to select which config to then use, no hardware for this either. Swapping flash or socketing the flash chip is not needed either. To control selection use 2 switches. Same as above with separate IDC connectors for the PC Turbo switch (or Keylock) and triggering reconfig from the Reset button. No need for FPGA driven self-reconfig either, simply ask the uC to trigger stop and reconfig. This also allows selecting any of the 4 configs by the FPGA, with the right signalling pins.
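How the uC address initialisation could pick one of the 4 configs from the 2 switches is sketched below. This is only an illustration of the address arithmetic (written in Python rather than uC code), assuming 4 equal slots of 512kByte laid out back to back in the 2MByte flash; the names and the switch-to-slot mapping are assumptions, not a decided design.

```python
FLASH_BYTES = 2 * 1024 * 1024           # 16MBit flash chip
SLOT_BYTES = 512 * 1024                 # 4MBit per configuration
NUM_SLOTS = FLASH_BYTES // SLOT_BYTES   # how many configs fit


def slot_from_switches(sw1, sw0):
    """Combine the 2 select switches into a slot number 0..3 (assumed mapping)."""
    return (int(bool(sw1)) << 1) | int(bool(sw0))


def slot_base_address(slot):
    """Flash address at which the uC starts reading the chosen config."""
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError("no such config slot: %r" % (slot,))
    return slot * SLOT_BYTES


for sw1 in (0, 1):
    for sw0 in (0, 1):
        slot = slot_from_switches(sw1, sw0)
        print(sw1, sw0, slot, hex(slot_base_address(slot)))
```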

The FPGA can now also write the configs via the uC (just requires some signalling pins). Use the uC firmware to write protect or allow writing any of the 4 configs, without any hardware needed for this either.
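Write protection purely in uC firmware could look roughly like the following. It is a sketch of the idea only, modelled in Python; the class, its methods and the choice of slot 0 as the protected main design are assumptions made for illustration.

```python
class ProtectedConfigs:
    """Model of 4 config slots with per-slot write protection in firmware."""

    def __init__(self, num_slots=4, protected=(0,)):
        self.data = [b""] * num_slots
        self.protected = set(protected)   # slots the FPGA may not overwrite

    def set_protected(self, slot, flag):
        """Firmware-level switch, no hardware jumper involved."""
        if flag:
            self.protected.add(slot)
        else:
            self.protected.discard(slot)

    def write(self, slot, bitstream):
        """Store a config, refusing writes to protected slots."""
        if slot in self.protected:
            raise PermissionError("config slot %d is write protected" % slot)
        self.data[slot] = bytes(bitstream)


configs = ProtectedConfigs()
configs.write(1, b"experimental design")      # allowed
try:
    configs.write(0, b"overwrite main design")
except PermissionError as err:
    print(err)
```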

In addition to configuring the FPGA, the uC should also be able to set M0-2, and also control the clock 0 programmable oscillator. Being able to drive the development system drivable FPGA clock 3 would also be nice. So best offer the uC access to the same circuits that the development systems printer port would use. So use tri-stateable pins for this, or better a parallel port driven multiplexer, which can override a badly crashed uC.

[Andrew: I prefer this usage to specialised EEPROMs, more configs, less hardwired, more flexibility, for less cost. Just needs uC software writing, once and then it is done. What could be better?]

Parallel to the interface from uC to FPGA for configuring, also have one from FPGA to uC for run-time downloading of initial or updated firmware to the uC internal flash. The FPGA will initially, or after a failed uC upgrade, need to be configured from a development systems parallel port. Alternatively have the uC only reloadable from the IDC 26 config connector, but this is inferior.

Possibly, run all the Turbo/Reset/Keylock switches from only one set of IDC connectors to the uC, and have an uC controllable IDC/uC switch for the lines to the FPGA. So the FPGA can get these direct or filtered by the uC. Same Power/Turbo/HD LEDs from FPGA to uC and then switch FPGA/uC to the IDC connectors, for filtered output. This allows changing uC vs FPGA usage by reloading uC software, without opening case and replugging. [Andrew: I am not really sure this feature is worth the extra work]

Internal Configuring from Devices via the FPGA by Microcontroller

With above ability of an running config to write new configs, together with the ability to demand reconfig from that config, it is possible to make an minimal config that just fetches an second actual config from any FPGA accessible storage device, writes it, and then reconfigures with it. This allows unlimited configs, and fast switching to experimental ones.
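The fetch, write and reconfigure sequence can be modelled as below. This is a toy simulation in Python, not real uC or FPGA code: the controller class, the dictionary standing in for a storage device, the file names and the choice of slot 1 as scratch slot are all assumptions for illustration.

```python
class ConfigController:
    """Stand-in for the uC: holds the config slots and the active slot."""

    def __init__(self, num_slots=4):
        self.slots = [None] * num_slots
        self.active = 0

    def write_config(self, slot, bitstream):
        self.slots[slot] = bytes(bitstream)

    def reconfigure(self, slot):
        if self.slots[slot] is None:
            raise RuntimeError("config slot %d is empty" % slot)
        self.active = slot


def boot_manager(uc, storage, name, scratch_slot=1):
    """What the minimal config does: fetch, write, then demand reconfig."""
    bitstream = storage[name]                 # fetch from an FPGA accessible device
    uc.write_config(scratch_slot, bitstream)  # write it via the uC
    uc.reconfigure(scratch_slot)              # reconfigure with it
    return uc.active


uc = ConfigController()
uc.write_config(0, b"minimal boot manager config")
storage = {"pdp10.bit": b"PDP-10 design", "leon.bit": b"Leon design"}
print(boot_manager(uc, storage, "leon.bit"))
```

The boot manager config in slot 0 is never overwritten, so a broken experimental config can always be replaced by going back to it.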

This also allows using any FPGA accessible user IO device to offer the user an selection of configs, as in an boot manager.

As rewriting flash too often (every config!) breaks the chips, possibly put in an single config 4MBit (= 512kByte) SRAM chip, for such temporary configs. Such SRAMs are not available in serial, so need an 40pin uC, so definitely also use an parallel flash chip.

If 256kx18 chips were used for L2 cache or CMOS, then possibly use one of those here, if reducing part list size saves cost. Use this one definitely in 9 bit wide wiring to save uC pins, as only 8bit data are used anyway.

As there is battery backed CMOS SRAM on this board, possibly battery back this SRAM also. This suggests an alternative flash-less design, with only SRAM for 4 configs. But this requires 16MBit of SRAM, which will cost 4 chips, unless 16MBit (2048kx8 or 1024kx18) SRAM chips now exist.
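The SRAM sizes involved can be checked with a little arithmetic. The sketch below assumes that a 256kx18 chip wired 9 bit wide is seen as 512k words, of which only 8 bits each are used for config data; the names are made up for illustration.

```python
import math

K = 1024
CONFIG_BITS = 4 * K * K             # one 4MBit (= 512kByte) config

chip_bits = 256 * K * 18            # raw size of one 256kx18 SRAM
words_9bit = 256 * K * 2            # same chip wired 9 bit wide (assumption)
usable_bits = words_9bit * 8        # only 8 of the 9 bits carry config data

configs_per_chip = usable_bits // CONFIG_BITS
chips_for_4_configs = math.ceil(4 * CONFIG_BITS / usable_bits)

print(chip_bits, usable_bits, configs_per_chip, chips_for_4_configs)
```

So one such chip holds exactly one config, and the flash-less design with 4 configs needs 4 of them.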

Partial Reconfigurability

FPGAs can be reconfigured, even while in operation (short freeze while loading). This can also be a partial reconfig, which is even faster, and leaves the main part of the running design in a state from which it can continue running. This can be used like loading kernel modules in an OS. Particularly IO device circuits could be loaded/unloaded this way, leaving an FPGA-CPU running, including the OS on top of it. Parallel/user port being the obvious candidate for this. [Andrew: One of my users even had the idea of a variable section of the FPGA that gets swapped by the OS on every process reschedule, giving every process its own FPGA extension! I managed to convince him that performance is too low for that, and that loadable modules for one or a few (2^n-1) processes, and an OS driven switch for using them is better]

In Virtex family chips this is done by replacing vertical stripes of CLBs and the IOs at the top/bottom of them. This requires the FPGA to give the config circuit the new CLBs and then the config circuit to stop the FPGA, write the stripe, and restart. This requires an uC+SRAM design, config EEPROMs will not suffice for this.

Replacing the CLBs also replaces the IOs above and below them. So possibly arrange IOs so that replacing one of multiple reserved "module" stripe(s) of CLBs does not hit important IOs, such as memory, as this loses their IO FF contents. But such a replaceable stripe gives trouble with routing going through it to the IOs on the other vertical side of the FPGA.

This can be solved by sending them around the VersaRing lines in the IOs. But even this requires "saving" the IOs and adding them to the new stripe of CLBs to be loaded. So this must also save the IO FFs. So it looks like partial reconfiguration is a non-issue, as far as IO layout goes. And designers can use any arrangement of replaceable CLB stripes.

Loading IO circuits also requires routing from them to the IOs used. So the IOs can not be copied without changes. Same also happens with connecting the circuits to the rest of the design. Replacing just these routes, but not the rest may be a problem for the uC. Pre-routing dummy routes just to the first CLB of the stripe should solve this. Or use global lines from outside into the circuit.


[Andrew: This section and all following were designed in the 2.5 months (2002.12.08-2003.02.23) before I came up with the flash+uC and then its flash+uC+SRAM-via-FPGA boot manager variant.

Up until this time it was either: EEPROM without uC (no multi-design space for "reload and boot" logic), or uC with only devices and no flash (no place for an boot manager design to be loaded from, so needs direct device access).

Since coming up with the uC+SRAM combination, all the uC direct access to devices and user IO stuff is not needed any more. But I have left it here in the file, just for you to see what was long intended. Also perhaps there is something interesting in here.

If someone still wants an FEP style design, they could put an "small board" inside the PC case, with an normal 26pin cable from small boards printer port to this boards config connector. They would then need to use SCSI drives with both boards on an single SCSI cable for shared disks. This would require adding SCSI to the small board (by add-on card?)]

Internal Configuring from Devices by Microcontroller direct

Offer to configure the FPGA from some storage device (disk or flash). For all these forms of configuring the uC must access the device and write the configuration into the still dead FPGA. Doing this requires an uC, for this CPLDs are not powerful enough.

FPGA config is actually similar in concept to loading CPU microcode (which is what floppies were invented for), so offer floppy use for this. But also allow config from ATA/SCSI drives for faster performance and more comfort.

Share access to the same devices of these types that the FPGA has access to, so no need for separate devices and the FPGA can place configs there for the uC to then use. This suggests that the uC should actually have pins shadowing the FPGAs floppy and 1st ATA/SCSI pins one-on-one. This requires an large 64pin or even 84pin uC. This also requires the first SCSI to be on board and not on the CDEC.

Only one of FPGA and uC can use the devices at any one time (the other must tri-state its shared pins), with the exception of SCSI where both can share a bus (as switched tri-state is in the bus definition), so long as they do not modify the same file system or they use some other exclusion system. The FPGA is tri-stated after power up or reconfig pin, so the uC needs to switch to tri-state before letting the FPGA run.

Optionally, if there are enough pins available on the uC (unlikely), also add control pins to access any other FPGA PCI/ISA bus device. If SCSI is only PCI, then this is needed. In this case, if there is on-board full chip USB, also add access to it. This is good as USB is external (unlike ATA/SCSI), and so allows the use of a pluggable USB disk or even a "flash plug".

Even more optionally, in todays world of Ethernet OS boots, also possibly Ethernet FPGA configs, if on-board full chip Ethernet is included. Also allow access to that, if there are enough pins and code space in the uC to use it (assuming then that a DHCP+TFTP-only single connection UDP/IP client is enough). But this is really over the top, while USB may still be useful.

For selection have the standard config come from fixed ATA/SCSI and the experimental configs from removable floppy or USB, with the ATA/SCSI as fallback. But this is bad for multiple standard configs. Or simply have 1 config = 1 floppy or USB, but this loses the use of ATA/SCSI HDs, which may already be available.
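The first selection scheme (removable media first, fixed drive as fallback) could look roughly like this. The names `REMOVABLE`, `FIXED` and `pick_config_source` are hypothetical, and probing floppy before USB is an assumption, as the text gives no order.

```python
REMOVABLE = ("floppy", "usb")  # probe order is an assumption, not from the spec
FIXED = "ata/scsi"


def pick_config_source(media_present):
    """Return the device to load the config from.

    media_present maps a removable device name to True if media is inserted.
    Experimental configs on removable media win, fixed ATA/SCSI is the fallback.
    """
    for device in REMOVABLE:
        if media_present.get(device, False):
            return device
    return FIXED
```

With no media inserted, `pick_config_source({})` falls through to the fixed drive, which then holds the one standard config.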

Front End Processor for Selecting of Configurations

Unlike FPGA config EEPROMs, which hold one config per chip and so can be selected by swapping/switching chips, devices can contain multiple (or even many) configs in one, and in the case of HDs are un-swappable. So selecting between these is desirable, and a selection interface needs to be provided.

The simplest method would be a set of 8 DIP switches (again no fumbly jumpers, and also DIP socketed for remote control), readable by the uC, for selecting the config device type/number, the partition/file number on it, and the fallback policy if a removable media device is empty. Like the external config connector, having these DIP switches externally accessible would cost ATX connector area and show FPGA-ness, so make them internal.
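One possible way of splitting the 8 switches over these three selections, as a sketch only. The bit layout, the device type list and the names `ConfigSelection` and `decode_dip` are all assumptions made for illustration, nothing of this is decided in the spec.

```python
from dataclasses import dataclass

# Hypothetical bit layout, not part of the spec:
#   bits 7..6  device type
#   bit  5     device number (0 = first, 1 = second)
#   bits 4..1  partition/file number (0..15)
#   bit  0     fallback policy if removable media is empty
DEVICE_TYPES = ("floppy", "ata", "scsi", "usb")


@dataclass(frozen=True)
class ConfigSelection:
    device_type: str
    device_number: int
    file_number: int
    fallback_to_fixed: bool


def decode_dip(switches):
    """Decode the 8 bit value read from the DIP switches by the uC."""
    if not 0 <= switches <= 0xFF:
        raise ValueError("expected an 8 bit switch value")
    return ConfigSelection(
        device_type=DEVICE_TYPES[(switches >> 6) & 0b11],
        device_number=(switches >> 5) & 0b1,
        file_number=(switches >> 1) & 0b1111,
        fallback_to_fixed=bool(switches & 0b1),
    )
```

In this layout a switch setting of `0b01000111` would select file 3 on the first ATA drive, with fallback enabled.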

But internal switches are difficult to access for selecting on every config, which is often needed while experimenting. Better than DIP switches would be to have the uC offer the user some sort of prompt via the standard user output ports/devices and then take input, auto-configuring on some timeout. This is like a PC's boot manager for selecting BSD or Linux kernels or other operating systems.
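The prompt with timeout can be sketched as a polling loop. This assumes the uC polls its keyboard once per tick; the function `choose_config` and its parameters `poll_key` and `timeout_ticks` are hypothetical names, and selecting by a single digit key is likewise just an assumption.

```python
from typing import Callable, Optional, Sequence


def choose_config(options: Sequence[str], default_index: int,
                  poll_key: Callable[[], Optional[str]],
                  timeout_ticks: int) -> str:
    """Return the config picked by the user, or the default after the timeout.

    poll_key returns one character if a key was pressed in this tick, else None.
    Keys that are not a valid option digit are ignored.
    """
    for _ in range(timeout_ticks):
        key = poll_key()
        if key is None or len(key) != 1 or key not in "0123456789":
            continue
        if int(key) < len(options):
            return options[int(key)]
    return options[default_index]
```

For testing on a development system, `poll_key` can simply be fed from a list, for example `keys = iter([None, "1"])` and `poll_key=lambda: next(keys, None)`.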

Unlike boot managers, which run on the same processor and IO as the final OS and kernel will, the FPGA-PC has at this time only an unconfigured "dead" FPGA, not a runnable processor. [Andrew: here the above "boot manager config" hits, making the FPGA usable for fetching and of course also for selection]

This is equivalent to the situation of a microcoded processor without microcode loaded. Historically this problem was solved with a separate hardwired front end processor (FEP) computer, like a PDP-11/40 (in the KL-10) or an 8080 microprocessor (in the KS-10). Use the microcontroller, or perhaps even a full microprocessor (uP, such as Z80/Z180) for the FEP, depending on the functionality desired.

To do this, the FEP needs its own access to user I/O. For this also share a subset of the FPGA-PC's input and output devices with it.

With the FEP on 1st PS/2+VGA and the main system using 2nd PS/2+2nd VGA (if it exists), one can even have dual independent consoles, mouseless or with RS232 mice. But this requires 2 keyboards and monitors (and mice), which many users will not want to provide; this is the same space problem as with a terminal.

For merging FEP and main system output onto the same VGA display (FEP as status line at the bottom, or even as window) have the FEP VGA output also go to an FPGA input. This allows a digitiser (like BT848/878 type devices) to be implemented, writing into FPGA video memory. This gives both outputs at the same time, at full main system VGA quality, on one monitor, but it adds more FPGA design work.

This also allows the FEP to "flip" 2/3/4 screen data sets at 60Hz to the digitiser for having 2/3/4 FEP windows, with each display updated at 30/20/15Hz. With "flipping" the FEP becomes even more complex.

For getting the larger amount of software needed for all this into the FEP, instead of using one large flash part for firmware, have only a minimal boot ROM and RAM for software in the FEP. Have it boot its software from the same devices that it can later take FPGA configs from. This can be a full OS (such as CP/M if a Z80/Z180 is used). Also use the OS's file system for storing the config files. The FEP has then become a full (micro-)computer, like the historical FEP computers actually were.

Pluggable Front End Processor

The above FEP has become a quite complex full-grown 8bit microcomputer. Its design/layout size/complexity and parts cost time and money. In particular with the desirable full power VGA, RAM-loaded OS and in-window additions, it is outgrowing the rest of the board in design/layout time. And it is also a hard-wired, non user alterable thing. Design bugs from its complexity are also hardwired and expensive to fix.

Possibly make the FEP a facultative, pluggable component. This gives users the choice of simple on-board config by EEPROM (or still uC and devices), or using a pluggable FEP for complex/expensive user selectable configuring. [Andrew: How many users do you think would want an FEP? And those that don't want an FEP, how many do you think would still want to use uC+devices vs just go for EEPROMs?]

The easiest variant is to just provide all signals any FEP would want, run to one (or a set of) IDC connectors, and leave it at that. This though requires non-EEPROM/device users to make a special FEP computer design, hardwired or FPGA based.

Better to offer something to be plugged in as FEP. But that brings back the design time cost, even if the parts costs are now optional for the user.

FPGA based Front End Processor

As an alternative to a uP based FEP and a growing amount of hardwired logic around it, use a small FPGA for the FEP. This reduces board complexity, layout time and unmodifiable inflexible circuits. This FPGA would be fixed configured out of EEPROM. As alternatives to EEPROM, the FEP can only be configured from a development system or the running main FPGA system. [Andrew: This is analogous to your "small FPGA for floppy config" module for the QST0201 idea]

Such an FPGA FEP allows any user-selectable FPGA-CPU in it. For Sparcs one may use 8051 or Z80 or even PDQ8. For PDP-10s one could use a PDP-11/40 as in the real KL-10 ones, or even a minimal PDP-10 like the KA-10, as used for the Foonlys. So the FEP FPGA should be no smaller than XC2S150, which can still get by with a small XC18V01 1Mbit EEPROM. Sensibly use something derived from the QST0201 design for this.

The FPGA configuration either includes the FEP firmware in BRAMs (but that is size limited), or one gives the FEP a bit of external RAM and uses its BRAMs only as boot ROM. This would result in a FEPconfig(EEPROM) then FEPboot(disk-direct) then main-config(disk-via-FEP) then main-boot(disk-direct) start up sequence. This is fairly involved and quite a bit of (too much?) programming work, but ultimately (and unnecessarily?) flexible, with little hardware layout and fixing (definitely good). [Andrew: One of my users says he regards it as too much work for each user to first get/install the FEP FPGA config, then get/install an FEP OS and only then actually be able to get/install the main FPGA design. I regard the "main FPGA from EEPROM" variant as minimal "start immediately" operation, and the FEP as only facultative. But perhaps deliver the FEPs with a simple FEP config and non disk based OS pre-installed in their EEPROMs? This lets the users treat the FPGA FEP as if it were a hardwired uC/uP FEP, or modify it if they want to]
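The four step start up sequence can be written down as an ordered table with a runner that stops at the first failing stage. This is a sketch only: the names `STAGES` and `start_up` and the `run_stage` callback are hypothetical, and stopping at the first failure is an assumption, as the text does not say what should happen on errors.

```python
from typing import Callable, List

# Stage names and sources follow the start up order given in the text.
STAGES = (
    ("FEP config", "EEPROM"),
    ("FEP boot", "disk, direct"),
    ("main config", "disk, via FEP"),
    ("main boot", "disk, direct"),
)


def start_up(run_stage: Callable[[str, str], bool]) -> List[str]:
    """Run the stages in order and return the names of those that completed.

    run_stage(name, source) returns True on success. Later stages depend on
    earlier ones, so the sequence stops at the first stage that fails.
    """
    completed = []
    for name, source in STAGES:
        if not run_stage(name, source):
            break
        completed.append(name)
    return completed
```

An FEP delivered with config and non disk based OS pre-installed in EEPROM, as suggested in the comment, would in this picture simply make the first two stages always succeed without user work.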

An FPGA FEP allows FEP IO options of any complexity, so long as the needed devices are wired in. With QFP208 (and so 140 user IO pins) this allows floppy and PCI/ISA and ATA/SCSI and USB for config file fetching, and LEDs&switches and 2*PS/2+VGA+PCspeaker and RS232 for config file selection.

This would also allow serious VGA level output in the FEP, for a good console terminal in the FEP. It also allows the FEP, when not displaying directly, to send custom formatted data to a simpler "digitiser" in the main FPGA.

Pluggable FPGA for the Front End Processor

Of course this FPGA FEP still adds the 2nd FPGA's cost to the board. So better to simply provide a set of IDC connectors and then a 2nd FPGA FEP board to plug in.

One possibility is inserting the already existing QST0201 board for FEP usage. Then simply wire the various memory, IO devices and storage devices to the QST0201, fitting its 3*40+50 IDC connector pins. This separates out the FPGA and all its development/EEPROM config stuff, and it reduces layout work (but how much?) by reusing QST0201.

QST0201 still needs the large board to provide it with clocks (share the main FPGA ones?), memory (suggested 1 256kx18, as that part is already used for cache) and all the FEP drive and user IO analog stuff (share some?). So this only provides non-optimal (but quite good) facultative part cost.

But the QST0201 has a DB25 for configuring, not an IDC for data from the main FPGA, and needs a power brick plug, not power from some IDC. [Andrew: These are the only weaknesses I have found so far in QST0201. It may be sensible to change them in a future revision of QST0201, to make it a true pluggable component, delivered with a DB25 to IDC programming cable and a power brick to IDC power cable adapter]

Pluggable FPGA based Front End Processor

An alternative to using QST0201 is to use a separate full XC2S150 based FPGA-PC board, in its pluggable industrial/embedded variant. On this large board provide counterparts for all its IDCs to connect up to. It deliberately has the right feature set for this (although its design predates this usage, it was later influenced by this possible use).

This gives optimal facultative part cost. And a side effect of doing this variant is that the FEP IO stuff design and layout time is shared with getting a separate small FPGA-PC board (and so a small+large board series), so double use of the design and layout time.

As the small board has all male IDCs, it needs female IDCs on the main board. So for the small board's male IDC printer port, the main board config connector needs to be a female IDC (do male on-cable IDCs exist for the config cable from the development system?). With the small board config connector being the same female IDC, the main FPGA must use a male IDC for the internal printer port to config the small board. Without the FEP board inserted, this male 26pin can also be used with a standard slot mounted LPT connector as a further printer/user port, so also give it a set of power switching stuff.

This page is by Neil Franklin, last modification 2003.06.07