16Bit Stack+nRegister Computer - 8bit 0Address Instruction Set
author Neil Franklin, last modification 2008.10.30


design features

similar to 1+3*5bit Instruction Set but more traditional n*8bit
  gives larger opcodes, slower, but compensate with more opcodes, less instr
stacks+specialregisters+ALU 16bit data
  0-address system without addressable registers
  data stack for evaluation and call parameters / return values
    hardware stack in CPU, not in memory, wraparound, hidden stack pointer
  return stack separate from this, with push>R and R>pop
    hardware stack in CPU, not in memory, wraparound, hidden stack pointer
  not a pure stack architecture, special purpose registers
    this avoids the typical performance killing stack top thrashing
    1 16bit count register for DJNZ style loops, with >C and C>
      decrement-and-jump, like Z80 and 8048/8051 DJNZ, save separate DEC
        in particular fast block copy, together with auto-postincrement
memory 16bit address
  16bit program counter, instructions must be aligned to every second address
    all branches (conditional) w 8bit or 16bit offset, fitting jumps (uncond)
  full 16bit direct/absolute addressing, and possibly 8bit short addressing
  2 16bit address registers, with >S and S> and >D and D>
    multiple register-indirect memory accesses, to prevent slow memory-indirect addr
      and also prevent the stack thrashing of pure-stack architectures
    address registers with autoincrement, save separate INC/DEC time
      possibly always/only with autoincr, as non-inc addr not used any more
  1 16bit locals base register, addresses with 8bit offset
    unless TOR used for this, gives automatic push, new from malloc
IO space 8bit address
  separate IO space, to not cut off small stuff from large memory
instructions 8bit opcodes
  bytewide, no packed bundle of 1+5+5+5bit, less compact, but more traditional
    gives more opcodes, more variants, can save some instructions and time
    in particular LDI plus operation combinations save stack push and pop
  derived from 16bit_StackReg_1+3*5bit_0Addr, but more traditional 8bit instr
    because of this also use traditional byte addressed memory, half the size
    in particular better if implemented as a bytewide microcode machine
result still somewhat similar to Charles Moore Forth stack processors
  but with various register processor things where those are better
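The fast block copy mentioned above (S and D with auto-postincrement, C as
DJNZ style loop counter) can be sketched as a small Python model. This is only
an illustration under assumptions: the function name block_copy, copying one
memory cell per iteration, and returning the final register values are all
hypothetical and not part of the design notes.

```python
MASK16 = 0xFFFF  # all registers are 16bit and wrap around


def block_copy(mem, src, dst, count):
    """Model of a copy loop: load via S, store via D, loop on C.

    Assumes count >= 1; like DJNZ, a count of 0 would run 65536 times.
    """
    S, D, C = src & MASK16, dst & MASK16, count & MASK16
    while True:
        mem[D] = mem[S]        # one cell moved per iteration
        S = (S + 1) & MASK16   # auto-postincrement, no separate INC
        D = (D + 1) & MASK16
        C = (C - 1) & MASK16   # decrement-and-jump, no separate DEC
        if C == 0:
            break
    return S, D, C


mem = bytearray(65536)
mem[0x100:0x104] = b"abcd"
print(block_copy(mem, 0x100, 0x200, 4), bytes(mem[0x200:0x204]))
```

The loop body contains no explicit increment or decrement instruction, which
is the saving the notes aim for.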


registers and stacks

n*16bit ? data stack, implemented with separate TOS register
n*16bit R return stack, possibly implemented with separate TOR
16bit S and D source and destination address registers
16bit C loop counter register (or better use TOR for this)
16bit L locals base/frame pointer (unless TOR used for this, or none)
16bit PC program counter
1bit F flag(s), bit 0 is C (carry), changed by Arith/Shift ops
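A wraparound hardware stack with a separate TOS register and hidden stack
pointer could behave like the following Python model. It is a sketch under
assumptions: the class name WrapStack and the depth of 8 cells are
hypothetical, as the notes leave the depth n open.

```python
class WrapStack:
    """In-CPU stack: TOS register plus a small circular cell array."""

    def __init__(self, depth=8):
        self.cells = [0] * depth  # cells below TOS (depth is an assumption)
        self.sp = 0               # hidden pointer, wraps modulo depth
        self.tos = 0              # separate top-of-stack register

    def push(self, value):
        self.cells[self.sp] = self.tos            # spill old TOS
        self.sp = (self.sp + 1) % len(self.cells)
        self.tos = value & 0xFFFF                 # 16bit data

    def pop(self):
        value = self.tos
        self.sp = (self.sp - 1) % len(self.cells)
        self.tos = self.cells[self.sp]            # refill TOS
        return value


s = WrapStack(depth=4)
for v in (1, 2, 3):
    s.push(v)
print(s.pop(), s.pop(), s.pop())
```

On overflow the oldest entries are silently overwritten and there is no
memory access, which is what "wraparound" and "not in memory" imply here.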


memory

32k*2*8bit memory, byte addressed, possibly auto-ByteSWAP
  word accesses with A0=1 are automatic byte swap, 2nd byte just invert A0
  unlikely 64k*16bit general purpose program and data memory, word addressed
    gives 128kByte memory without separate 64k+64k code and data address space
256*16bit separate IO space, only 8bit immediate addressing
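The auto-ByteSWAP rule (second byte fetched from the address with A0
inverted) can be illustrated with a short Python sketch. The function name
read_word and the choice that the first byte forms the low half of the word
are assumptions, the notes do not fix the byte order.

```python
def read_word(mem, addr):
    """Read a 16bit word: first byte at addr, second at addr with A0 inverted."""
    first = mem[addr & 0xFFFF]
    second = mem[(addr ^ 1) & 0xFFFF]   # just invert A0, no address adder
    return first | (second << 8)        # assumed: first byte is the low half


mem = bytearray([0x34, 0x12, 0x78, 0x56])
print(hex(read_word(mem, 0)), hex(read_word(mem, 1)))
```

An access with A0=0 returns the word as stored, an access with A0=1 returns
the same word with its two bytes swapped, without crossing a word boundary.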


instructions

8bit, optionally with 8bit or 8+8bit constants following
0ooooooo
  Variant 1 (compact constants)
    000ooooo = 32 5bit instructions, without const or with following 8/16bit
      actual instructions same/similar as in 16bit_StackReg_1+3*5bit_0Addr
    001ccccc = LDI, load with 5bit signed immediate constant
      all larger constants as call with constant after it
    010ooooo = JMP, jump with 5bit signed offset
    011ooooo = BRZ, branch on zero with 5bit signed offset
  Variant 2 (merged constants)
    000ooooo = 32 5bit instructions using only stack
    001ooooo cccccccc cccccccc = same as 000 but with direct 16bit constant
      or alternatively only cccccccc 8bit signed, larger with call and const
    010ooooo = JMP, jump with 5bit signed offset
    011ooooo = BRZ, branch on zero with 5bit signed offset
  Variant 3 (variable constants, no small branches)
    000ooooo = 32 5bit instructions using only stack
    001ooooo cccccccc cccccccc = same as 000 but with direct 16bit constant
    010ooooo cccccccc = same as 000 but with direct 8bit zeroextended constant
    011ooooo cccccccc = same as 000 but with direct 8bit signextended constant
  Variant 4 (less variable constants, more instructions)
    000ooooo = 32 5bit instructions not using stack or only top of stack
    001ooooo = further 32 5bit instructions using 2, top and second of stack
    010ooooo cccccccc = same as 001 but with direct 8bit extended constant
    011ooooo cccccccc cccccccc = same as 001 but direct 16bit constant
  Variant 5 (less variable constants, DUP combi instructions)
    000ooooo = 32 5bit instructions using only stack
    001ooooo = same as 000 but implied DUP (DUP+DROP=NOP, possibly this 000ooooo)
    010ooooo cccccccc = same as 000 but with direct 8bit extended constant
    011ooooo cccccccc cccccccc = same as 000 but direct 16bit constant
1aaaaaaa aaaaaaaa = CALL, with 15bit address (*2) in 2 Bytes
or possibly reversed as ooooomm0 and aaaaaaa1
  has advantage of no 1bit shift of Ireg6..0+Mem7..0 to PC15..9+PC8..1 needed
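As a worked illustration of the encodings above, the following Python sketch
decodes the first byte for Variant 1 and for the non-reversed CALL form
(Ireg6..0 to PC15..9, Mem7..0 to PC8..1, PC0 = 0). The function names decode
and sign_extend5 and the returned tuples are hypothetical, chosen only for
this example.

```python
def sign_extend5(value):
    """Interpret the low 5 bits as a signed number in -16..15."""
    value &= 0x1F
    return value - 0x20 if value & 0x10 else value


def decode(first, second=0):
    """Decode one instruction: CALL (2 bytes) or a Variant 1 opcode byte."""
    if first & 0x80:
        # 1aaaaaaa aaaaaaaa: 15bit address, shifted left 1 bit (*2)
        target = (((first & 0x7F) << 8) | (second & 0xFF)) << 1
        return ("CALL", target)
    group, field = (first >> 5) & 0x3, first & 0x1F
    if group == 0:
        return ("OP", field)            # 000ooooo: one of 32 instructions
    name = {1: "LDI", 2: "JMP", 3: "BRZ"}[group]
    return (name, sign_extend5(field))  # 5bit signed constant or offset


print(decode(0x80, 0x01), decode(0x3F), decode(0x70))
```

The explicit shift in the CALL branch is the 1bit shift that the reversed
encoding is meant to avoid.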