My teen years: Porting Small-C to transputer and developing my operating system

SOM32 running on transputer
This isn't your modern C compiler booting on a recent processor. Instead, you'll read about how I bootstrapped a tiny C compiler on a 1987 transputer processor, while providing it with a basic operating system, a text editor, and an assembler! This happened in 1995, when I was 16 years old.
This article is possible because I was so proud of my achievement that I saved most of the original files and source code on a single floppy disk. In fact, I had completely forgotten about the disk, although I kept fond memories of porting the C compiler and of my 32-bit operating system, which grew fairly sophisticated. Recently I wrote a transputer emulator, and while looking through my archives I found, to my complete surprise, a pretty early version of my toolset. But let's take it step by step.

C is very hard

El lenguaje de programación C. Book cover picture.
Nowadays this is a common Reddit question. I learned the hard way that jumping from BASIC to a more modern language wasn't as easy as I thought! I had great trouble going from line-oriented BASIC to structured programming with Pascal and C.
Accustomed to reading source code from top to bottom, it took me a long time to understand that a Pascal program starts at the end, while in C the main() function can appear anywhere. Worse, the free-form style of both languages drove me nuts because I was too fixated on having a full IF statement on a single line.
I tried many times to read The C Programming Language by Brian Kernighan and Dennis Ritchie, maybe as early as 1989, when I was barely 10 years old. Although I could understand the "Hello World" program, the next program they chose to showcase the language (the Fahrenheit to Celsius conversion table) was absolutely terrible for learning, just because it was too much information for my young mind to process. It didn't help that the next example put the same program together in only two lines, an early example of obfuscated C.
It wasn't until I understood Pascal that I could get back to C, and I finally got enlightened. The curly braces are Pascal's BEGIN and END, and the syntax diagrams don't need to be in neat boxes (like Pascal's) but can be lines neatly ordered with recursive references (which in turn made awesome sense of Appendix A of The C Programming Language).

The Small-C compiler

Dr. Dobb's Journal Vol. 5, picture of book cover
In the eighties, my father bought basically every good technical and programming book available at our local American Book Store in Ciudad Satelite, a town just north of Mexico City. The store brought USA technical books into Mexico, even though its main business line was English student texts for local schools. Remember there was no Internet at the time: knowledge was disseminated exclusively through books, magazines, and bulletins. If you wanted to learn something about programming, there was a good chance my father's library had a book with enough information to get you started. Once I knew that the English language was a requirement, I taught myself by reading entire articles of Compute! Magazine (available at Sanborns chain stores) with an English-Spanish dictionary for reference.
Around the same time I was discovering the similarities between Pascal and C, I found the article "A Small C compiler for the 8080s" in a book curiously named Dr. Dobb's Programming Journal volume 5. Who was Dr. Dobb? I had no idea, but I found it fun that a guy called Dr. Dobb wrote up his programming developments, although curiously there was no article by Dr. Dobb himself. Instead, the C compiler was signed by Ron Cain (who wrote a great article about his Small-C inspiration). Of course, Dr. Dobb was a running joke of the magazine.
My mind was blown. This was a C compiler written in the C language, and it was able to compile itself! It wasn't anything like BASIC. BASIC interpreters are written in assembly language, and to this day I haven't heard of a BASIC compiler written in BASIC. Then there was the idea of portability: writing a program in C and running it on multiple machines. It was like someone telling me that all my programs could be written to run on all the world's computers, an authentic holy grail (but of course, pretty misleading).
It threw me into a crusade to run this C compiler on the homebrew Z280 computer built by my father. This computer had only been brought into working order around 1991. At that point, I had a primitive operating system written in Z280 machine code, providing a directory and an allocation table, and it could read and write whole files.
Being a more capable programmer by then, my first attempt was typing the whole C compiler into an old Televideo computer running MS-DOS and Turbo C. It took me days because the font used for the source code was unreadable at times. Once it was working, I proceeded to write a text editor for the Z280 computer in machine code, and also a Z280 assembler to process the code generated by Small-C.
At some point in early 1992, I managed to get the Small-C compiler running on the Z280 computer, complete with editor and assembler. I was kind of disappointed, because the unoptimized generated code bloated the output, and I needed to write support routines for everything. Worse, it was barely compatible with any other C code because it lacked a lot of the syntax of the real C language.

Transputer Summer

It was 1992 when my father built the 32-bit transputer add-on board. I ported the Tiny Pascal compiler to it and, inspired by Small-C, I rewrote the compiler in Pascal, going so far as to make an almost full Pascal compiler. The transputer had the big advantage of a linear memory space, which eased the development of compilers. Furthermore, I had just mastered expression tree generation, so for the first time I could generate reasonably optimized code. You can read a complete article about this Pascal compiler, made in 1993, here.
So with Pascal fully mastered, I decided I could provide the transputer with a C compiler, and I turned back to Small-C. I felt I had already lost too much time (it was 1995 already), so I went for a quick and dirty conversion to transputer assembly language. Given my previous experience, I did this in two days, again using Turbo C in MS-DOS as the base.
I changed all the int word sizes to 4 bytes, and replaced all the 8080 assembler code with transputer assembler code. Here is a brief example of how I changed the assembler target:
Small-C original code excerpt

/* Add the primary and secondary registers */
/*      (results in primary) */
zadd()
{
    ol("add");
}
/* Subtract the primary register from the secondary */
/*      (results in primary) */
zsub()
{
    ol("rev");
    ol("sub");
}
/* Multiply the primary and secondary registers */
/*      (results in primary) */
mult()
{
    ol("mul");
}
/* Divide the secondary register by the primary */
/*      (quotient in primary, remainder in secondary) */
div()
{
    ol("rev");
    ol("div");
}
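To make the retargeting idea concrete, here is a minimal sketch of a Small-C style emitter, showing that each code generator routine only has to print different mnemonics. The buffer `asm_out`, the helper `emit_line` (standing in for `ol`), and the `gen_add`/`gen_sub` names are assumptions made up for this illustration; the real compiler writes to its output file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory output; the real compiler writes to a file. */
char asm_out[256];

/* Stand-in for Small-C's ol(): append one tab-indented assembler line. */
void emit_line(const char *text)
{
    size_t used = strlen(asm_out);

    snprintf(asm_out + used, sizeof asm_out - used, "\t%s\n", text);
}

/* Transputer target: the same sequences as the excerpt above. */
void gen_add(void)
{
    emit_line("add");
}

void gen_sub(void)
{
    emit_line("rev");
    emit_line("sub");
}
```

Calling `gen_add()` and then `gen_sub()` leaves three assembler lines in `asm_out`. Porting to another processor means rewriting only the bodies of routines like these, while the parser stays untouched.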
The next step was running it on the transputer. This first version was big and slow (although 21 kilobytes of code wasn't so big thanks to the optimized RISC instructions of the transputer), but I wasn't bothered, because I knew I could implement an expression tree generator and optimizer similar to the one I had developed for my Pascal compiler to reduce the final compiler size. It felt wonderful when I managed to get the C compiler down to 16 kilobytes.
This article could have ended here. I could have made the Small-C compiler work the same way as my Pascal compiler, feeding it source code through one transputer channel and receiving the assembler code through the other. However, this story takes a very different path here.

A Dutch Operating System

Cover of Sistemas Operativos: Diseño e Implementación. Andrew S. Tanenbaum
Flashback to 1993: I was talking with a friend about how my C compiler for the Z280 was too basic, and how I didn't understand how to implement the bigger features like struct and union, so I was looking for a book to help me learn how to write C compilers.
He brought me Operating Systems: Design & Implementation, by Andrew Tanenbaum, and... I was disappointed: the book didn't contain a C compiler. But after reading it, I found something so awesome that it opened a whole window in my mind: a full operating system written in the C language. The code shown for MINIX was way beyond what I could understand at that moment, but it taught me how operating systems worked, their basic concepts, and the difference between monolithic and kernel-like systems. Tanenbaum is a great teacher.
At the time, my transputer development was based on a driver program running on the Z280 host machine. I was kind of tired of redoing my driver program, as each time I needed to adapt it to the program I was going to run (similar to how the Eniac II was rewired for each new task). I had a driver for the Ray Tracing program, another for the Pascal compiler, and another for compiled programs.
I was also writing a CP/M emulator around 1994. The CP/M operating system has a concept that I liked very much: the operating system was always the same, and only the basic input/output routines (BIOS) changed or were provided by the computer manufacturer. This was instrumental in the early success of CP/M, as this separation made it possible for Digital Research to sell CP/M unchanged to every manufacturer, charging $70 USD for each copy, while the manufacturers were in charge of writing the BIOS at no extra cost for Digital Research.
These three things mixed in my mind in 1995. I was writing a C compiler running on the transputer board, but I could also make an operating system with my current subset of the C language. It would invoke the basic input/output routines through the transputer link channels, and the host computer would provide everything. The BIOS wouldn't just be separated from the operating system, it would be running on another computer.
This way, the host computer (the Z280) would provide disk access, video output, and graphics output (everything I had handled in separate driver programs), and the transputer would run the operating system.
Of course, many people around the world had already had this idea for bootstrapping systems, but for me, it was an absolute revelation!

Bootstrapping an operating system

I started to code a new driver program to provide the host services through the transputer links, and I coded my first 32-bit operating system using the adapted Small-C compiler. This operating system would handle memory management, its own File Allocation Table system, file names still using 8.3 syntax, and provide file handling services to the user programs. Technically, I was simply making yet another monolithic operating system, but I didn't stop to think about it.
As I was creating my operating system, I also wrote a new text editor in C, and decided on an ANSI screen driver to allow colors and cursor movement over the screen. Furthermore, I recoded the transputer assembler in C. In a testament to my ingenuity, the assembler is almost unchanged from that time.
It took me a few weeks, but I was delighted when I successfully booted the first iteration of this standalone operating system, one where you could edit your source code, compile it, and update the operating system without going back to compile things on the host machine.

Archeology on your own disk

I have the floppy disk dump with the files for this early iteration of my 32-bit operating system. Barely any documentation, so some archeology work on it is required because I had forgotten how it was built. These are the files in the floppy disk:
The transputer files from my operating system development disk
Jun/02 to Jul/12 was barely more than one month to do all this. To boot my first operating system disk, I probably wrote the first sectors of the 1.44 MB floppy disk by hand using the monitor program I had on the Z280 machine. Anyway, these are the steps I planned for bringing this back to life:
  1. The transputer emulator will be running MAESTRO.CMG (MAnejo de Entrada y Salida de Transputer por Oscar, or simply input/output handling for transputer by Oscar)
  2. Set up a 1.44 MB disk image file: dd if=/dev/zero of=disk.img bs=1 count=1474560
  3. Copy ARRANQUE.CMG (the boot sector) into the first 512 bytes of the disk image.
  4. Build an initial FAT in sector 2 with 160 bytes (one for each track side, 80 tracks x 2 sides)
  5. Build an initial directory image of ten sectors, with each entry measuring 32 bytes.
  6. Add an initial entry SOM32.BIN that contains the basic operating system (SOM stands for Sistema Operativo Mexicano, or Mexican Operating System)
  7. Add a second entry INTERFAZ.CMG containing the command-line processor. Problem here: The compiled version is missing from my archive.
  8. Add the files EDITOR.CMG, ENSG10.CMG and TC2.CMG to complete the environment. Again, I'm missing the EDITOR.CMG file, but I have the C source code.
  9. Add all the source files so these can be compiled from inside the environment.
It turns out it isn't so easy. I started by developing a buildboot.c program to create the disk image file in the format of my operating system, because I would be rebuilding the disk image several times as I progressed in making everything run again.
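As an illustration of the first steps of such an image builder, here is a minimal sketch that prepares a blank 1.44 MB image in memory and installs a boot sector. This is not the actual buildboot.c: the function name `make_blank_image` and the constants are assumptions for this sketch, and it assumes the image tool writes the $12 $34 boot signature at offset 510 itself. The FAT and directory contents are left out because their exact layout isn't described here; only their sizes are computed.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE  512
#define IMAGE_SIZE   1474560L   /* 80 tracks x 2 sides x 18 sectors x 512 bytes */
#define FAT_BYTES    (80 * 2)   /* one byte per track side */
#define DIR_SECTORS  10
#define DIR_ENTRY    32
#define DIR_ENTRIES  (DIR_SECTORS * SECTOR_SIZE / DIR_ENTRY)

/*
 * Allocate a zero-filled disk image and copy the boot code into sector 0.
 * Returns NULL if the boot code does not leave room for the signature
 * or if memory cannot be allocated. The caller frees the result.
 */
uint8_t *make_blank_image(const uint8_t *boot, size_t boot_len)
{
    uint8_t *image;

    if (boot_len > SECTOR_SIZE - 2)
        return NULL;
    image = calloc(IMAGE_SIZE, 1);
    if (image == NULL)
        return NULL;
    if (boot_len > 0)
        memcpy(image, boot, boot_len);
    image[510] = 0x12;          /* boot signature, first byte */
    image[511] = 0x34;          /* boot signature, second byte */
    return image;
}
```

The ten directory sectors hold 10 x 512 / 32 = 160 entries, which is what `DIR_ENTRIES` evaluates to. Writing the buffer to disk with `fwrite` would produce the same size of file as the `dd` command in step 2.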
Basic structure of a disk image of my transputer operating system
First three sectors of a working disk image for my transputer operating system. Notice my boot signature $12 $34 at offset 510.
The internal structure of the disk was deduced from the source code of my operating system. Now for the current state of things:
  1. I need to modify my transputer emulator to provide access to the disk image file.
  2. I need to add several transputer instructions and some "parallelism" because the system depends on the transputer switching processes.
  3. I need to rebuild the missing INTERFAZ.CMG file to provide the command-line interface for the operating system.
I couldn't start with step 1 because it required step 2 to be working in order to be tested, a kind of chicken-and-egg case. As it happens, process management in the transputer is one of its best guarded secrets, and that's why I hadn't emulated my operating system earlier. However, Michael Brüstle, maintainer of transputer.net, was kind enough to publish the T414 internal documents, where I could see how the internal structures for processes were handled.
Process is a kind of misleading word; in the transputer these are more like threads. Anyway, thanks to these documents I could emulate the process handling (namely the instructions startp, endp, runp, and stopp), along with the tin instruction in a simplified implementation. The transputer has very complicated code to sort the timers, and I only did the bare minimum because my operating system has only two timers.
After a quick glance at the source code, I saw I have four processes in this basic operating system:
  1. Screen refresh: Takes a buffer, checks for changes, sends line updates to the host screen.
  2. Middle-level BIOS: processes the BIOS calls, and sends back a slew of messages to the host system, probably because I wanted to isolate changes in the driver program.
  3. Main execution: This boots the operating system and calls it.
  4. Sleeper: A low level process that simply sleeps.
Is it easy? Not really. I was stuck for three days trying to get the basics to boot in my transputer emulator. For something so primitive, it is amazingly complicated. I got it to read the boot sector, and then it got stuck again, and that was after discovering that the startp instruction doesn't start a process immediately; instead, it adds the process to the list of processes to be run later.
I was still stuck until I noticed this debug sequence:

Iptr=800002f3 A=ffffffff B=ffffffff C=ffffffff Wptr=8001ddb0 ldlp 16
Iptr=800002f6 A=8001ddf0 B=ffffffff C=ffffffff Wptr=8001ddb0 ldc 510
Iptr=800002f7 A=000001fe B=8001ddf0 C=ffffffff Wptr=8001ddb0 bsub
Iptr=800002f8 A=8001dfee B=ffffffff C=ffffffff Wptr=8001ddb0 lb
Iptr=800002fa A=00000034 B=ffffffff C=ffffffff Wptr=8001ddb0 eqc 18
Iptr=800002fc A=00000000 B=ffffffff C=ffffffff Wptr=8001ddb0 cj 22
It read the first byte of the boot signature to check it against $12, but it got $34 instead. The sector data was off by one byte, because the boot code expects the first byte of the buffer to be the status of the disk read. I had misunderstood my own code! Once that was solved, it jumped correctly into the boot sector, and proceeded to read sector -1... So, another bug? Not really. It turns out I had inserted that request myself to reset the internal floppy disk cache on my host system.
Also, there was a striking difference between MAESTRO.LEN and ARRANQUE.LEN: one was based on track/head/sector, and the other was based on logical sectors. However, the SOM32.c source code I have still follows the track/head/sector convention, and I wanted to compile the operating system unchanged. I found that MAESTRO.LEN contained the LEESECTOR (read sector) subroutine that I needed for doing the conversion to track/head/sector.
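For readers unfamiliar with the two addressing schemes, here is a minimal sketch of the usual conversion from a logical sector number to track/head/sector. It assumes the standard 1.44 MB geometry (18 sectors per track, 2 heads), 0-based logical sectors, and 1-based sector numbers within a track; the `struct chs` type and the function name are hypothetical, and the LEESECTOR routine may number things differently.

```c
/* Assumed standard 1.44 MB floppy geometry. */
#define SECTORS_PER_TRACK 18
#define HEADS             2

struct chs {
    int track;   /* 0..79 */
    int head;    /* 0..1 */
    int sector;  /* 1..18 */
};

/* Convert a 0-based logical sector number into track/head/sector. */
struct chs logical_to_chs(int logical)
{
    struct chs pos;

    pos.track  = logical / (SECTORS_PER_TRACK * HEADS);
    pos.head   = (logical / SECTORS_PER_TRACK) % HEADS;
    pos.sector = logical % SECTORS_PER_TRACK + 1;
    return pos;
}
```

Under these assumptions, logical sector 18 is the first sector on the second side of track 0, and logical sector 2879 is the last sector of the disk (track 79, head 1, sector 18).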
After modifying and reassembling ARRANQUE.CMG with the rolled back code (and in the process modifying my transputer assembler to support strings between single quotation marks, and making yet another disk image), it was able to read sector 2 (containing the directory). Then I realized the first file I had added was called SOM32.CMG, not SOM32.BIN as required, so I had to make yet another disk image.
I was delighted to see it reading the sectors that make up the operating system, and then the emulation ended because it asked for another service code I hadn't yet implemented: 08. It just sets the cursor shape, very common in the age of CGA/EGA/VGA graphics cards, where you could set the text cursor to be a big rectangle or just a blinking underline. I implemented it along with service 07 (setting the cursor position), and after doing so I expected to see something on the screen (because the command-line processor file INTERFAZ.CMG wasn't yet in the disk image), but I noticed it crashed while trying to load the operating system again, and then tried to read sector -12.
The error could wait, but I knew something should appear on the screen. The screen refresh process wasn't working. I noticed the timer comparison was done on unsigned int values, but in the C language an unsigned int value can never be less than 0. This was solved with a typecast to int. And once the timer was working, everything else ceased to work.
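The signed typecast trick can be sketched as follows. This is a minimal illustration rather than the emulator's actual code: the function name `timer_after` is hypothetical, and it assumes 32-bit timer values on a two's-complement host, where converting the unsigned difference to `int32_t` yields a negative number when the deadline is still ahead.

```c
#include <stdint.h>

/*
 * Return 1 when the wrapping 32-bit clock value `now` is later than
 * `deadline`, 0 otherwise. The subtraction is done unsigned (so it
 * wraps cleanly) and only the result is reinterpreted as signed.
 * Without the cast, the difference could never be negative.
 */
int timer_after(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) > 0;
}
```

Because the comparison looks at the difference instead of the raw values, it keeps working when the clock wraps around from 0xFFFFFFFF to 0, as long as the two times are less than half the clock range apart.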
I thought I had written a function for reading 32-bit memory (read_memory_32) in the emulator. Instead, it was a macro, and it wasn't parenthesized. So it executed v | v << 8 | v << 16 | v << 24 - value. Can you see the mistake? Subtraction binds tighter than the shifts, so the last term became v << (24 - value). I added parentheses around the macro result, and I was greeted with this message:

    I?serte u? disc? c?? siste?a ?perativ?? y pu?se u?a tec?a???
Et voilà! It was finally working. I was mystified by the wrong letters, but then I discovered that my hexadecimal routine from 1995 was never a real hexadecimal routine. It only converted values from 0 to 15 into ASCII 48-63, and my emulator driver software was expecting real ASCII hexadecimal. The next test ran right!

    Inserte un disco con sistema operativo, y pulse una tecla...  
Or "Insert a disk with an operating system and press a key". Since I had fed it a zeroed disk image file, the boot signature the code was expecting wasn't there. Now I could feed it the proper operating system, and it worked! But it expected the missing INTERFAZ.CMG for the command-line interpreter. How to compile it?
With a little trick. I built a disk image with INTERFAZ.C and the compiler TC2.CMG renamed as INTERFAZ.CMG, along with an actual copy of TC2.CMG, and ENSG10.CMG. This way the operating system would run the C compiler, allowing me to generate INTERFAZ.LEN (the compiled transputer assembler file).
However, the operating system flatly refused to load INTERFAZ.CMG. After reading more dumps, I discovered the source file I had for the operating system was old and the binary was slightly more recent: it took the directory from the second block of the disk instead of sector 3. I adjusted my disk image generator accordingly, and after reading two sectors of INTERFAZ.CMG it got stuck. After peering through some thousands of debug lines, I found that executing an "in" instruction incorrectly triggered another "in" instruction; I solved this by counting the internal timers better, as the service routine caused a race condition.
Finally, the transputer emulator generated a 143 MB file of debug lines, loaded the complete C compiler into memory, and crashed because of a mishandled service.

The big surprise

It was a bucket of cold water on my head when I noticed the compiler was designed to run just like the Pascal compiler: without an operating system.
It needed another driver program to handle opening files. That's why it was sending a different protocol over the link channels, and why the operating system crashed. Incidentally, it also explained why I had three STDIO libraries: one was for running the compiler with the host computer using TC.CMG, the second was the same but for TC2.CMG (because the label prefix was changed to q instead of qz), and the third was the bootstrapped one for running programs inside my transputer operating system.
After I wrote the driver for the C compiler in my transputer emulator, I was able to recompile everything, including SOM32.c, because the binary apparently had trouble running Interfaz.c. This is the way to invoke the C compiler from the host side:
    ./tem -cc os/tc2.cmg
    ./tasm os/tc2.len os/cc.cmg os/stdio3.len
It will ask for the input and output file names. The assembly is done with tasm. Still, it loaded interfaz.cmg and crashed. But now I had a disk image file with everything ready! I could now figure out how I had bootstrapped the transputer C compiler into my own transputer operating system:
  1. Create the service handler to interface with the host machine (MAESTRO.LEN)
  2. Create the boot sector in transputer assembler code (ARRANQUE.LEN)
  3. Compile the C compiler using Turbo-C on MS-DOS.
  4. Have the compiler compile itself under MS-DOS, and assemble the generated file (TC.CMG) with the STDIO.LEN library.
  5. Create a driver program in the host Z280 machine to handle C files for the transputer board.
  6. Enhance the C compiler to use an expression tree generator (TC2.CMG), and assemble it with STDIO2.LEN.
  7. Compile SOM32.c and assemble with MENSAJES.LEN. Use this as SOM32.BIN inside the OS.
  8. Compile Interfaz.c and assemble with STDIO3.LEN. Use this as INTERFAZ.CMG inside the OS.
  9. Compile Editor.c and assemble with STDIO3.LEN. Use this as EDITOR.CMG inside the OS.
  10. Compile TC2.c and assemble with STDIO3.LEN. Use this as CC.CMG inside the OS.
  11. Compile ENSG10.c and assemble with STDIO3.LEN. Use this as ENSG10.CMG inside the OS.
  12. Now you can do development inside the transputer OS.
The main question is why I didn't have all these files ready to rebuild a floppy. The answer is easy: I probably overwrote the files while I was bootstrapping the operating system. Once you get to the last step, you don't need the host files anymore, because you have just found out how to fly free.
I'm pretty impressed by how my younger self figured out that STDIO.LEN was a Basic Input/Output System in its own right, and that it could be changed from talking to the driver on the host machine to calling services from the operating system inside the transputer.
Again I found a difference of opinion between the boot sector code, the disk image file, and the operating system. A small correction, and I was so glad to read this on the screen:

    Sistema Operativo Multitarea de 32 bits. v1.00                      
    >>>>> (c) Copyright 1995 Oscar Toledo G. <<<<<
    
    A>                                                                          
I was only missing the keyboard input code. I coded it in the emulator as fast as I could, tried to type D (for DIR), and it crashed after displaying the letter. Faced with a 182.5 MB debug file, you can say *sigh* or you can keep working while singing Save Your Tears (Remix).
I saw you across the room, one hour later (I mean the bug). The timer interrupt could happen in the middle of an instruction with prefixes. I had to add a whole new variable to track this before accepting timers, and the command line came to life, but no command was accepted. Everything was interpreted as the name of a file to execute.
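To show why prefixes matter here, the following is a minimal sketch of transputer operand decoding with a flag that marks an unfinished prefix sequence. Each instruction byte carries a 4-bit function code and 4 bits of data; pfix (function 2) and nfix (function 6) build up larger operands in the operand register before the real instruction arrives. The `struct decoder` layout and the function names are hypothetical and much simpler than a real emulator; the only point is that a timer should be accepted when `in_prefix` is 0.

```c
#include <stdint.h>

/* Hypothetical, stripped-down decoder state for this sketch. */
struct decoder {
    uint32_t oreg;      /* operand register built up by prefixes */
    int in_prefix;      /* 1 while a pfix/nfix sequence is unfinished */
    int function;       /* function code of the last complete instruction */
    int32_t operand;    /* operand of the last complete instruction */
};

enum { FN_PFIX = 0x2, FN_NFIX = 0x6 };

/* Feed one instruction byte; returns 1 when an instruction is complete. */
int decode_byte(struct decoder *d, uint8_t byte)
{
    int fn = byte >> 4;

    d->oreg |= (uint32_t)(byte & 0x0f);
    if (fn == FN_PFIX) {
        d->oreg <<= 4;              /* make room for the next nibble */
        d->in_prefix = 1;
        return 0;
    }
    if (fn == FN_NFIX) {
        d->oreg = (~d->oreg) << 4;  /* complement for negative operands */
        d->in_prefix = 1;
        return 0;
    }
    d->function = fn;
    d->operand = (int32_t)d->oreg;
    d->oreg = 0;
    d->in_prefix = 0;
    return 1;
}

/* A timer switch is only safe between complete instructions. */
int may_accept_timer(const struct decoder *d)
{
    return !d->in_prefix;
}
```

For example, the `ldc 510` seen in the earlier debug trace is encoded as the three bytes 0x21, 0x2F, 0x4E. Switching processes after the first or second byte without saving the partial operand would corrupt the instruction that follows.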
While I was looking into this, I ran a regression test on the emulator to make sure everything was working properly, and I tried to compile the Pascal compiler again. It didn't work... I noticed I had added time slicing to the j and lend instructions, and the emulator worked again when I disabled the time slicing. After a careful look, a C macro was the culprit again. I invoked the macro write_memory_32, which expands into 4 separate memory assignments, but the macro reference wasn't inside curly braces, so only the first byte assignment was governed by the else, and the other three assignments always ran and overwrote the process list. This is the code before fixing:

#define write_memory_32(addr, word) memory[(addr) & MEMORY_MASK] = (word) & 0xff; \
memory[((addr) + 1) & MEMORY_MASK] = ((word) >> 8) & 0xff; \
memory[((addr) + 2) & MEMORY_MASK] = ((word) >> 16) & 0xff; \
memory[((addr) + 3) & MEMORY_MASK] = ((word) >> 24) & 0xff;

    #define Enqueue(ProcPtr, Fptr, Bptr) \
        if (*(Fptr) == NotProcess_p) \
            *(Fptr) = ProcPtr;\
        else \
            write_memory_32(*(Bptr) - 8, ProcPtr);\
              \
        *(Bptr) = ProcPtr;
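One common way to make such macros safe is sketched below: a multi-statement macro is wrapped in do { ... } while (0) so it behaves as a single statement, and an expression macro gets outer parentheses so surrounding operators cannot regroup it. This is not necessarily the exact fix applied in the emulator. The `MEMORY_MASK` value, the `store_if` helper, and the body of `read_memory_32` are assumptions for this sketch.

```c
#include <stdint.h>

#define MEMORY_MASK 0xffffu             /* assumed memory size for this sketch */
uint8_t memory[MEMORY_MASK + 1];

/* Four assignments that expand to exactly one statement. */
#define write_memory_32(addr, word) do { \
        memory[(addr) & MEMORY_MASK] = (word) & 0xff; \
        memory[((addr) + 1) & MEMORY_MASK] = ((word) >> 8) & 0xff; \
        memory[((addr) + 2) & MEMORY_MASK] = ((word) >> 16) & 0xff; \
        memory[((addr) + 3) & MEMORY_MASK] = ((word) >> 24) & 0xff; \
    } while (0)

/* Outer parentheses keep "read_memory_32(a) - x" from regrouping. */
#define read_memory_32(addr) \
    ((uint32_t)memory[(addr) & MEMORY_MASK] \
     | (uint32_t)memory[((addr) + 1) & MEMORY_MASK] << 8 \
     | (uint32_t)memory[((addr) + 2) & MEMORY_MASK] << 16 \
     | (uint32_t)memory[((addr) + 3) & MEMORY_MASK] << 24)

/* Hypothetical helper: a brace-less if, as in the Enqueue macro. */
void store_if(int enabled, uint32_t addr, uint32_t word)
{
    if (enabled)
        write_memory_32(addr, word);
}
```

With the unwrapped macro, `store_if(0, ...)` would still execute the last three assignments. With the do/while wrapper, nothing is written unless `enabled` is true, and the trailing semicolon at the call site still parses cleanly before an else.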
Once the emulator was compiled, this time I could do DIR in my operating system:

Sistema Operativo Multitarea de 32 bits. v1.00                                  
>>>>> (c) Copyright 1995 Oscar Toledo G. <<<<<                                  
                                                                                
A>dir                                                                           
                                                                                
SOM32   .BIN         4,143  23-ene-19<5 10:59:00                                
INTERFAZ.CMG         2,369  23-ene-19<5 10:59:00                                
CC      .CMG        16,952  23-ene-19<5 10:59:00                                
INTERFAZ.C           5,885  23-ene-19<5 10:59:00                                
EDITOR  .C          17,093  23-ene-19<5 10:59:00                                
ENSG10  .C          28,617  23-ene-19<5 10:59:00                                
EDITOR  .CMG         5,295  23-ene-19<5 10:59:00                                
ENSG10  .CMG        10,876  23-ene-19<5 10:59:00                                
                                                                                
            8 archivos.                                                         
    1,336,320 bytes libres.                                                     
                                                                                
A>                                                                              
Yes! Yes! Yes! The operating system I developed 30 years ago was working again. Of course, my command-line directory listing was suffering from the year 2000 problem, and the month was built incorrectly by my disk image tool.
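The odd "19<5" in the listing is consistent with a routine that prints a fixed "19" followed by two digits computed from the years elapsed since 1900. The sketch below is a hypothetical reconstruction, not the code from INTERFAZ.C, and both function names are made up for the illustration: for 2025 the tens "digit" is 12, and '0' + 12 is the character '<'.

```c
#include <stdio.h>

/*
 * Hypothetical reconstruction of a pre-2000 year formatter:
 * a fixed "19" followed by two digits of the years since 1900.
 */
void format_year_1900(int years_since_1900, char out[5])
{
    out[0] = '1';
    out[1] = '9';
    out[2] = (char)('0' + years_since_1900 / 10);
    out[3] = (char)('0' + years_since_1900 % 10);
    out[4] = '\0';
}

/* Formatting the full year avoids the problem. */
void format_year_full(int years_since_1900, char out[5])
{
    snprintf(out, 5, "%04d", 1900 + years_since_1900);
}
```

Under this assumption, 95 years since 1900 gives the expected 1995, while 125 years gives 19<5, matching the directory listing above.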

Some polish required

The transputer emulator gets the color screens from my operating system and replicates them using ANSI escape sequences on the terminal. You'll see my text editor is a homage to Turbo-C. You can edit the source files, compile them inside the operating system, assemble them, and re-run the programs. For easier usage, I've translated the terminal keys (macOS, Linux, and Windows) to the internal codes used in my operating system for function and arrow keys.
Alternatively, you can compile the source files using the transputer emulator, and get the output directly into your directory.

Using the programs

You can boot the predesigned disk using this command line or run_os.sh:

./tem -os maestro.cmg disk.img    
Alternatively, build_disk.sh is provided to create a disk with your preferred file setup.
None of the programs use command-line arguments. The command-line interpreter accepts a few commands like DIR, VER, MEM, and FIN, and runs executables when you simply type their name.
Text editor running on transputer
The editor EDITOR.CMG is completely visual. If you are using macOS, you'll be able to edit files easily, and press Fn+F1 to see help. This is F1 on Windows, and your mileage may vary on Linux because sometimes F1 is reserved for help by the window manager.
For the C compiler TC2.CMG, you should answer N two times (no stop on errors, and no C source code displayed), then enter the input file name (for example, TC2.C), then the output file name (for example, TC2.LEN), and then press Enter to exit back to the operating system.
For the assembler ENSG10.CMG, you should enter the input file name (for example, TC2.LEN), then the library file name (STDIO3.LEN), press Enter alone, and then enter the output file name (for example, TC2.CMG).
Instead of typing all these file names (as I did in 1995), you can use the file assemble_os.sh, which assembles all the files using my tasm assembler.

Final thoughts

This early version of my operating system was incredibly small. The operating system was composed of 793 lines, the command-line interface barely 298 lines, the editor 830 lines, the assembler 1473 lines, and the C compiler was the biggest with 3018 lines.
It was my biggest achievement of that year: a pure 32-bit operating system working through service communication, with self-sustained integrated applications. It went far beyond only hosting a Pascal compiler and a ray tracer.
Of course, this story continued with my operating system growing bigger, with support for CD-ROM access using the High-Sierra and ISO-9660 standards. My C compiler grew to full K&R C (I could even compile some obfuscated C!), and my editor got syntax coloring, but that will be another story.
The source code is released at https://github.com/nanochess/transputer
Enjoy it! Did you like this article? Buy me a coffee on ko-fi!


Last modified: Feb/23/2025