[large thread snipped, but this is nifty]

######

From: Terje Mathisen
Newsgroups: comp.arch,alt.folklore.computers
Subject: Re: "Bootstrap"
Date: Sat, 24 Mar 2001 00:49:18 +0100
Organization: Hydro
Lines: 97
Message-ID: <3ABBE0FE.2B869F9F@hda.hydro.com>
References: <998tlv$ja1$1@taliesin.netcom.net.uk> <99apn2$gao$1@news.btv.ibm.com> <99av49$2f5a8$1@fido.engr.sgi.com>
NNTP-Posting-Host: iabrahmspc.nho.hydro.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U)
X-Accept-Language: en
Path: chonsp.franklin.ch!pfaff.ethz.ch!news-zh.switch.ch!news.ifi.unizh.ch!news.imp.ch!psinet-eu-nl!newsfeeds.belnet.be!news.belnet.be!news.tele.dk!148.122.208.68!news2.oke.nextra.no!nextra.com!hydro.com!not-for-mail
Xref: chonsp.franklin.ch alt.folklore.computers:77392

Peter Moylan wrote:
>
> John Francis wrote:
> >In article <99apn2$gao$1@news.btv.ibm.com>, hack wrote:
> >>The way a mainframe computer (e.g. IBM S/390) performs IPL (Initial Program
> >>Load, IBMese for Bootstrap) is a good illustration.
> >
> >Not really. A better example would be the technique for starting a more
> >primitive computer (such as a PDP-1) that didn't have dedicated hardware
> >to do DMA from sophisticated bus-mastering I/O channels. Everything had
> >to be done (or at least initiated) under direct control of the CPU.
> >
> >What you (or, rather, I) did was:
> >
> > Load in a few instructions (around ten or so, depending on system).
> > This was usually done by means of the switches on the front console.
> > It was quite common for those of us who did this often to be able
> > to do it from memory - otherwise there was generally a piece of paper
> > taped to the console giving the necessary information.
> >
> > This was an extremely rudimentary loader (DEC terminology was "RIM
> > loader", for Read-In Mode), which could just read a short program
> > (typically from the paper tape reader) and transfer control to it.
>
> The PDP-8 one was most ingenious, since it didn't have any obvious
> way of transferring control to the thing being loaded in. It was
> just a little loop transferring bytes from paper tape to memory,
> apparently forever. If you looked at the instructions, it was
> not at all obvious that this little program could do anything
> useful -- it didn't even have a way to stop.
>
> The trick lay in the fact that what was being loaded in was
> overwriting the original loader. After a certain point, then, the
> instructions being executed weren't the ones you had put in
> by hand.
>
> Another trick lay in the fact that what was on the paper tape
> was in two different formats. The first little section was the
> code for a primitive loader, which then went on to load a
> better loader on the latter part of the same tape.

This setup is actually quite close to the code I wrote some years ago,
before the Internet, to allow me to send arbitrary binary data to users
who only had basic email, with no attachments or FTP transfers:

The primary bootstrap was written using only those 70+ ASCII characters
that the MIME standard declares as universally portable, even across
different language code pages, or even a visit to an IBM mainframe
(with a temporary conversion to EBCDIC and back).

This first part uses self-modification, based on the fact that DOS .COM
programs are always loaded with a 16-bit zero on top of the stack, to
allow a RET opcode to exit the program, which is a 'feature' DOS
inherited from CP/M.

After modifying a single instruction, the code then does a forward jump
into the (now functional) secondary loader, still using only MIME ASCII
chars.

This secondary loader then goes on to generate the third step by
combining pairs of input chars into arbitrary binary values, before
jumping into this code.

The third layer is a MIME Base64 decoder that takes groups of 4 input
chars (in the [A-Za-z0-9+/] set), converting each group into 3 output
bytes.
(This code is almost certainly the shortest Base64 decoder on the x86
platform!)

After converting the actual payload to its original format, the last
step of the bootstrap process will then either save it to disk, or
execute the code in place, after relocating it so it overwrites the
bootstrap code.

Here are the first two parts:

ZRYPQIQDYLRQRQRRAQX,2,NPPa,R0Gc,.0Gd,PPu.F2,QX=0+r+E=0=tG0-Ju E=
EE(-(-GNEEEEEEEEEEEEEEEF 5BBEEYQEEEE=DU.COM=======(c)TMathisen95

Terje

PS. Just to make it more interesting, the code above is also somewhat
self-relocating, in that it is possible to reflow/reformat the code and
it will still work: the CRLF combination that joins the two lines above
can be replaced with any two-, one- or zero-character string. You must
not split the lines at any other location, though.

The rest of the bootstrap code can tolerate arbitrary reformatting,
since all whitespace/control characters are skipped.

PPS. If anyone would like to check this out, I'd be happy to send you
the source code (in C) of the encoding program.

--
- Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"
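
For readers who want to see what the third layer described in the post
does, here is a minimal sketch in Python. It is not the x86 code from
the post and makes no claim about how that code is written; it only
illustrates the described behaviour: groups of 4 characters from the
[A-Za-z0-9+/] set become 3 output bytes, and whitespace/control
characters are skipped so the text survives reflowing. The names
ALPHABET, VALUE and decode_base64 are made up for this example, and
treating '=' as the end of the payload is an assumption taken from
standard Base64 padding, not from the post.

```python
# Illustrative sketch only; names and '=' handling are assumptions.
ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}  # char -> 6-bit value


def decode_base64(text):
    """Decode Base64 text, ignoring whitespace and control characters."""
    sextets = []
    for ch in text:
        if ch <= " " or ch == "\x7f":
            continue              # skip whitespace/control characters
        if ch == "=":
            break                 # assumed: padding ends the payload
        if ch not in VALUE:
            raise ValueError("not a Base64 character: %r" % ch)
        sextets.append(VALUE[ch])

    out = bytearray()
    for i in range(0, len(sextets), 4):
        group = sextets[i:i + 4]
        n = len(group)
        if n == 1:
            raise ValueError("truncated Base64 input")
        bits = 0
        for s in group:
            bits = (bits << 6) | s        # pack 6 bits per character
        bits <<= 6 * (4 - n)              # left-align a short final group
        # 4 chars -> 3 bytes, 3 chars -> 2 bytes, 2 chars -> 1 byte
        out += bits.to_bytes(3, "big")[:n - 1]
    return bytes(out)


if __name__ == "__main__":
    # The same payload, reflowed with a CRLF in the middle of a group.
    print(decode_base64("TW\r\nFu"))
```

Because the skipping happens before the characters are grouped, a line
break may fall anywhere in the encoded text, which matches the post's
remark that the rest of the bootstrap tolerates arbitrary reformatting.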