AVR CPU Bus Interfaces to Microcoded IO Cards
author Neil Franklin, last modification 2012.05.11


IO Slots/Cards/Devices

simulate these also by using microcontrollers, also microcoded, like CPUs
  possibly reinforced with further IO hardware, TTL or PAL/GAL or special

further development of Microcontroller_CPU_Emulator files ideas
  enables arbitrary 1-chip IO chips, to be added to CPU memory
  example video display as done by SoftVGA project, just controller+DACs
  same possible for keyb/mouse, joyst/paddles, sound, floppy, RS232, LPT, USB

for simple 3-AVR design key/mouse(/joyst/paddle)+CPU+video
  can use normal RS232 in this variant
  should fit all onto one 160x100 Euro card
  jumper cable to in-built programmer, USB and FTDI, reuse existing tools
    first stage bitbang Arduino-style bootloader, second stage fast stuff

for more need to connect open-ended amount of multiple peripheral AVRs
  will be placed on separate IO cards, plugged into slots, for HW expandability
  or even as external units, plugged into expansion port(s) or chained as devices
    VGA, sound, 2 * PS/2, 2 * joyst/paddle, 2 * CF or SD cards
  for experiments/development make card by card, use flexibility
    then possibly for non-solderers a derived prebuilt single board

preferably geographic addressing as used in AppleII or HP75
  no switches or jumpers setting addresses, which need planning, else collisions

each card own AVR programming connector, for external plug-on programmer
  for this each card local reset, and from bus global reset, button on CPU


Data Transfer

old traditional way would be parallel system bus
  use CPU controller PIOs as D and A busses like for memory accesses
    just different control signal PIO bits for IO access vs memory access
  but this requires registers (or register-like structures) on each card
    would be possible with 8042-like bus interface, register-like of 2 PIOs
    other controllers do not have this, requires a real PIO on each card
      doubles chip count, hardwires specific PIO chips and address decoding
  better use an emulated approach to IO device addresses and registers
    for this an underlying AVR to AVR communications infrastructure
    and then on top of this a byte stream protocol like SCSI or USB
    no external many-pins parallel bus, no multi-chip TTL or PIO interfaces
  also requires a lot of wiring, wide slot connector plugs

today better use the already existing serial infrastructure in AVRs
  for main controller to card controllers use one of the on-AVR serial busses
  this can be done with SPI, USART with MPCM or I2C/TWI interfaces

SPI, separate wire full duplex byte exchange ring
  runs fast synchronous MOSI/MISO/SCK signalling
    can SCK master at fosc/2, but slave max below fosc/4 (metastable sync)
    requires at least guaranteed 2 fosc clk per SCK clock phase, better 3 fosc
    max 20MHz/(2*3) = 3.33Mbit/s, @ 8bit/byte = 0.4MByte/s
      2/5 of parallel CPU mem access, about 4 times parallel CPU data transfer
    possibly entire system one quartz, then 20MHz/(2*2) = 5Mbit/s = 0.6MByte/s
  requires a per slot/card /SS signal, forces geographic addressing of cards
    this requires a "motherboard" style design, not passive bus backplane
      because each slot/card has a private select line going to it
      or better if bus backplane, one with one larger slot for the CPU card
        for all the separate lines to go to the backplane
    also fixes standard in/out (keyboard/video) cards addr to standard slot IDs
      and also other "only one of these, do default connection without asking"
      this is not a large problem, unless different fixed cards start colliding
    requires many /SS, one for each slot, not enough CPU controller pins for this
      better save/reuse pins, use 1 or 2 138 for 8 or 16 slots + 1 IO to G for /SS
      the 3 or 4 138 select signals reuse normal A2..0/A3..0 or A10..8/A11..8 pins
        this is like addressing a slot/card, either as 1 or 256 address blocks
      will allow backplane with 8 or 16 addresses
        use these for CPU card incl 1 IO, and 7 or 15 IO cards
    using such /SS analogous to memory /CS, the SPI PIO port can also be data bus
      this saves pins for more control bus or even address bus above 16bit
    alternative broadcast address to all slots, plus select on card, possibly slot ID
      but this requires more hardware, comparator, n 2-XOR + n-OR
  SPI is always bidirectional exchange, may impact on user visible protocol
    and so fail to be as simple to use as old home computer CPU busses
    but this may also be hidable in microcode protocol, with user layer above
  open question is by what means card->CPU signals handshake when byte digested
    requires status continuously placed in SPI register, timing limits?
    or software knowing how long to wait, may also impact on user visible protocol

USART, separate wire full duplex separate lines
  either 8bit only data, card select by card controller PIO or INT pin
    allows geographic addressing (if wanted) without SPI exchange behaviour
    switching is independent of actual transfer, overhead may cost time
  or MPCM 9bit in-card address decoding, card knows address, what to respond to
    if multiple identical cards needs switches or jumpers to select
      basically same addressing as I2C/TWI or PDP-8 IOT I8..3 or ISA A9..4
    alternative use SlotID pins, read by PIO pins, compare address with that
  slower asynchronous U(S)ART baudrate
    max fosc/8 (precise clock) or even fosc/16 (any clock)
  but async can be replaced with TxD/RxD + XCK sync US(A)RT variant
    can XCK master at fosc/2, but slave max below fosc/4 (metastable sync)
    therefore US(A)RT clock and speed issues identical with SPI
  this may be the best variant, as also RxD usable as card "ready" handshake wire
    RxD also to ext INT, to fast detect "FF byte" startbit as handshake
    rest of byte and stop bit will be finished before next byte handshake
  strictly TxD/RxD PIO port can also be data bus, but reconfiguring costs time

I2C/TWI, common wire half duplex pure bus
  requires in-card-controller address decoding, as part of every block transfer
    is built directly into protocol, no possibility of geographic addressing
    if multiple identical cards needs switches or jumpers
    or provide the address by SlotID pins read by PIO pins
  very slow 400kbit/s bit rate, despite sync transmission, and quite complex
    there exist defined faster speeds, but AVR does not support over fosc/16
    max 20MHz/16 = 1.25Mbit/s, @ 1+8+1bit/byte = 0.125MByte/s
      1/8 of parallel CPU mem access, 1.25 times parallel CPU data transfer
    for command/data byte stream is still far more than fast enough
      possibly no simulated VRAM copy-on-write over it

or possibly own software bitbanged protocol, exactly what system wants
  but slower than USART/SPI, though possibly still faster than I2C/TWI
  and susceptible to problem with non-interruptable IO card AVRs
    which work solidly for entire video lines (>30us) or even frames (>10ms)
    requires bit level handshaking instead of only byte level
    but that most likely only hits the first bit, after that all in one go
  prefer hardware shifter, entire byte there, only wait for handshake
    but how to allow doing something else, in particular other bus accesses


Interrupts

for device controllers to main controller interrupt signalling
  and the consequent discovery where the interrupt came from

simplest method one interrupt line, and then software polling of all cards
  serial data bus no possibility to do 8-card parallel poll, because bit timing
  so needs to poll all cards serially, select each and ask, this takes time
    in particular when waiting for read from a non-interruptable card
    and worst behaviour card dominates worst-case interrupt latency
  can be solved by "pinging" each card, until interrupter frees the line
    but this results in "last to be pinged" of multiple taking interrupt
    can be reduced by round robin pinging, start one after last found one
    can also be reduced by only asking potential interrupt sources
      and this sorted by known usual response times, still blocks interrupts
  alternative have interrupt block bus after current transaction
    then signal that, anyone demanding interrupt must remain responsive
    this is built directly into protocol for I2C/TWI, with multimastering
  at least after card found, in-card ask which device it was, that is parallel
    gives a 2-level search, serial poll only cards, not all devices in system

better have private interrupt request lines from each slot/card
  CPU can directly look which card is demanding something
  this also requires "motherboard" style design, or large slot for CPU card
    is same issue as with having a private select line to each slot/card
    so this fits best with SPI or 8bit non-MPCM USART designs
    and also both fixed std in/out and std "one of these" cards to one slot
  not enough ExtINT pins for each slot, so all /INT via wide OR to CPU ExtINT
    requires additional special hardware on motherboard to read which slot/card
      either 8or16:3 encoder + 3 tristate with 1 IO to test 1 /INT line at a time
      or 1 or 2 138 + 3 tristate with 1 IO to test 1 /INT line at a time
      or 1 or 2 8-bit tristate 240/244 + 1 or 2 gate to test all /INT lines at once
    or alternatively separate common-INT and slot-private ID lines, saves OR
    read card details via this 240/244 onto data bus, then priority process
  and so slightly violates entire Soft CPU idea, no specialised hardware
    but only small amount and a lot better, so it is acceptable here


Slot Signals

just the few serial data bus signals (SPI 3, U(S)ART 2, US(A)RT 3, I2C/TWI 2)
  possibly plus handshake ready/acknowledge (0 or (SPI 1, UART|USRT 2, I2C/TWI 0|1))
  plus card/slot select (geographic 1, in-device addressing 0)
  plus interrupt (geographic 1, possibly 2nd INT ACK, multimaster 0)
  plus IO card to CPU reset (1) and possibly NMI/monitor/warmstart (0 or 1)
    can be used by keyboard or terminal RS232 cards for system control
      Alt-SysRq for NMI/monitor/warmstart
      Ctrl-Alt-SysRq for reset
  plus power supply GND and 5V (2)
    in all cases power supply to main/CPU board, from there to card/IO boards
    as only one CPU card is possible in system, from bus protocol
  all on around 5-10 pin single-file jumper pins

explicitly no slot pins for SPI+reset for card controller flashing
  for this better on each card a separate standard 10pin or 6pin ISP interface
    allows flashing from external PC or from CPU of other microcoded system
  and then on CPU a single ISP driver interface, cable to relevant card/sys


Bus Connection

geographic addressing, slot decoder
  138 allows 8 slots, 2 138 allow 16 slots
    reserve 1 for CPU-internal IO devices or other stuff, same also interrupt
    gives therefore 7 for slots, should be enough for decent systems
  while at it also use signals for 8th slot for accessing SPI S-Flash module
    or better standard 10pin for above ISP, for SPI S-Flash misuse reset->/SS
    S-Flash can run full fosc/2, max 20MHz/2 = 10Mbit/s, @8bit = 1.25MByte/s
      allows copying max module size 32kByte in 0.025s, 1/4 of max allowed 1/10s

pure bus, no card specific signals, address decoding on card

wiring as backplane board (or ribbon cable), or stackable plugs like in PC104
  or even as external chain of devices bus, each with cable and further port


Minimal System

1 Euro format card with CPU + terminal + diskspace + expansion
  1 microcoded 16bit addr 8bit data CPU + 32kx8 ROM + 32kx8 RAM + select
    2*8pin Addr, 1*8bit Data, few mem control
    from SPI 8bit, SPI bus, 138, 240
  1 microcoded SoftVGA videogenerator, behind auxiliary Buffer/Beep/Keyboard
  1 microcoded CF card controller as diskspace
  6 unused 138 driven slot (or better port) connectors


User Software Interface

for device selection something resembling a write/read register

for device communication either something resembling device registers
  data write and read (with possibly long waitstate) + interrupt direct from device
or something resembling a handshaked PIO port
  data write and read PIO (fast) + handshake status registers with interrupt
or better have device communication protocol as byte streams/blocks
  at functional level of device driver interfaces, driver executed in device
  result will look somewhat like SCSI or USB device byte block protocols


Device Drivers

no ROMs on IO cards available this way, as no parallel bus to read them
  but inside controllers there is Flash ROM, store driver code image there
  provide protocol to read from them into read-only "ROM" space in RAM
  but this makes device-internal ROM images CPU type dependent
    possibly multiple CPU type images, select them by index
    or CPU independent format, boot/load time compile it
  or byte streams standardised, only generic drivers in CPU needed


Multi Device Cards

apart from multiple slots/cards and addressing, also sub-addressing inside
  allows one card/controller to offer multiple devices, multifunction cards

have user software see 8 cards * 8 devices * 4 registers/functions
  or perhaps more sensible 8 cards * 4 devices * 8 registers/functions
  or alternatively split it to 8|16 cards * 32|16 devices, and 256 functions
    this would then require devsel(var) and functsel(const) instructions
  or make this 8 cards * 16 devices, and 128 functions, 1 instr(only-var)
  result will look somewhat like SCSI device+LUN or USB device+endpoint addr

ROM images in controller, if present, can be read as one of the card's devices
  or even add multiple images, possibly as mult devices or mult funct in one device
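

Sketch: packing a card/device/register address

The "8 cards * 4 devices * 8 registers/functions" split gives 8*4*8 = 256 combinations, so one such address would fit in a single byte. The following is only a minimal sketch of that idea in Python, not part of the design itself. The field order (card in the top 3 bits, device in the middle 2, register in the low 3) and the names pack_io_addr and unpack_io_addr are assumptions made for illustration.

```python
# Hypothetical 8-bit IO address layout: ccc dd rrr
#   ccc = card/slot (0..7), dd = device on card (0..3), rrr = register (0..7)
# The field order and widths are assumptions, chosen to match the
# "8 cards * 4 devices * 8 registers/functions" variant in the notes.
CARD_BITS, DEV_BITS, REG_BITS = 3, 2, 3


def pack_io_addr(card, dev, reg):
    """Combine card, device and register numbers into one address byte."""
    if not 0 <= card < (1 << CARD_BITS):
        raise ValueError("card out of range")
    if not 0 <= dev < (1 << DEV_BITS):
        raise ValueError("device out of range")
    if not 0 <= reg < (1 << REG_BITS):
        raise ValueError("register out of range")
    return (card << (DEV_BITS + REG_BITS)) | (dev << REG_BITS) | reg


def unpack_io_addr(addr):
    """Split an address byte back into (card, device, register)."""
    if not 0 <= addr <= 0xFF:
        raise ValueError("address must fit in one byte")
    card = addr >> (DEV_BITS + REG_BITS)
    dev = (addr >> REG_BITS) & ((1 << DEV_BITS) - 1)
    reg = addr & ((1 << REG_BITS) - 1)
    return card, dev, reg


if __name__ == "__main__":
    addr = pack_io_addr(card=5, dev=2, reg=3)
    print(hex(addr), unpack_io_addr(addr))
```

With this layout the card field could drive the 138 slot decoder directly, while device and register would be sent to the selected card as part of the byte stream. The other splits mentioned above (for example 8 cards * 8 devices * 4 registers) would only change the three width constants.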