Building Z80 ROMs with Rust: A Modern Approach to Retro Computing
There's something deeply satisfying about watching a nearly 50-year-old CPU execute code you just compiled. The Z80 processor, introduced by Zilog in 1976, powered everything from the TRS-80 to the ZX Spectrum to countless CP/M machines. With roughly 8,500 transistors, it's almost incomprehensibly simple by modern standards; a high-end Intel i9 has around 17 billion. Today, thanks to projects like the RetroShield, you can plug one of these vintage processors into an Arduino and run real 8-bit code.
But here's the thing: actually writing Z80 programs is painful. Traditional approaches involve either hand-assembling hex codes, wrestling with decades-old assemblers that barely run on modern systems, or writing raw bytes into binary files. I wanted something better. What if I could write Z80 programs in Rust, using a fluent API that generates correct machine code without the mental overhead of remembering opcode encodings?
The result is the retroshield-z80-workbench, a Rust crate that powers three substantial retro applications: a dBASE II database clone, a WordStar-compatible text editor, and a VisiCalc-style spreadsheet. The workbench emerged from patterns I discovered while building earlier projects like a C compiler and LISP interpreter. This post explains how it works and what it's enabled.
The Problem with Traditional Z80 Development
I first encountered Z80 assembly in the 1990s, writing programs on a TI-85 graphing calculator. The process was painfully tedious: hand-assemble each instruction to hex using a reference card, type the bytes into the calculator's memory editor, run it, watch it crash, and start over. There was no debugger, no error messages, just a frozen screen or a memory clear if you were unlucky. I spent more time looking up opcodes than thinking about algorithms.
Writing Z80 assembly by hand means memorizing hundreds of opcodes. LD A, B is 0x78. JP NZ, addr is 0xC2 followed by a 16-bit address in little-endian format. Conditional returns, indexed addressing, and the various Z80-specific instructions like LDIR and DJNZ all have their own encodings. One wrong byte and your program jumps into garbage.
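To make the little-endian detail concrete, here is a minimal sketch of how JP NZ is laid out in memory. The function name `encode_jp_nz` is purely illustrative and is not part of any crate discussed here.

```rust
/// Encode `JP NZ, addr`: opcode 0xC2 followed by the 16-bit address,
/// low byte first (little-endian).
/// `encode_jp_nz` is an illustrative name, not an existing API.
fn encode_jp_nz(addr: u16) -> [u8; 3] {
    let [lo, hi] = addr.to_le_bytes();
    [0xC2, lo, hi]
}
```

So JP NZ, 0x1234 becomes the bytes C2 34 12, with the address halves swapped relative to how you would read them aloud.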
Traditional assemblers solve this, but they come with their own problems. Many only run under CP/M or DOS. Modern cross-assemblers exist, but they're another tool to install, another syntax to learn, another build step to manage. And when you're generating code programmatically, like when building a compiler that targets Z80, an external assembler becomes a significant complication.
There are also modern C compilers for the Z80, most notably SDCC (Small Device C Compiler), which is actively maintained and produces decent code. But when your goal is to generate Z80 machine code from Rust, perhaps as the backend of a compiler or code generator, you want something that integrates directly into your Rust toolchain.
What I wanted was the ability to write something like this in Rust:
    rom.ld_a(0x42);          // LD A, 0x42
    rom.call("print_hex");   // CALL print_hex
    rom.ret();               // RET
And have it emit the correct bytes: 0x3E 0x42 0xCD xx xx 0xC9.
The Workbench Architecture
The retroshield-z80-workbench crate is built around three core concepts: emit, label, and fixup.
Emit: The Foundation
At the lowest level, everything is just bytes being appended to a buffer:
    use std::collections::HashMap;

    pub struct CodeGen {
        rom: Vec<u8>,
        labels: HashMap<String, u16>,
        fixups: Vec<(usize, String)>,
        config: RomConfig,
    }

    impl CodeGen {
        pub fn emit(&mut self, bytes: &[u8]) {
            self.rom.extend_from_slice(bytes);
        }
    }
Every Z80 instruction ultimately calls emit(). The ld_a() method is just:
    pub fn ld_a(&mut self, n: u8) {
        self.emit(&[0x3E, n]); // Opcode 0x3E is LD A, n
    }
This pattern scales to cover the entire Z80 instruction set. The crate provides over 80 instruction helpers, from simple register loads to complex block transfer instructions.
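Many Z80 opcodes don't need a lookup table at all, because the register numbers are packed into the opcode bits. As a sketch of how one helper could cover a whole family, the register-to-register loads follow the pattern 01 ddd sss. The `Reg8` enum and `encode_ld_r_r` function below are my own illustrative names, not the crate's API.

```rust
/// 3-bit register codes used inside Z80 opcodes.
/// Code 6 means (HL) and is left out here, since 0x76 is HALT rather than LD (HL),(HL).
#[derive(Clone, Copy)]
enum Reg8 {
    B = 0,
    C = 1,
    D = 2,
    E = 3,
    H = 4,
    L = 5,
    A = 7,
}

/// Encode `LD dst, src` for two 8-bit registers: binary 01 ddd sss.
/// (Illustrative helper, not part of the workbench crate.)
fn encode_ld_r_r(dst: Reg8, src: Reg8) -> u8 {
    0x40 | ((dst as u8) << 3) | (src as u8)
}
```

With this, `encode_ld_r_r(Reg8::A, Reg8::B)` yields 0x78, the same LD A, B byte quoted earlier.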
Labels: Named Positions
Labels mark positions in the code that can be referenced by jumps and calls:
    pub fn label(&mut self, name: &str) {
        let addr = self.config.org + self.rom.len() as u16;
        self.labels.insert(name.to_string(), addr);
    }
When you write rom.label("main"), the current position gets recorded. Later, when you write rom.jp("main"), the crate knows exactly where to jump.
Fixups: Forward References
The clever part is handling forward references. When you write rom.call("print_string") before print_string is defined, the crate can't know the address yet. Instead, it records a fixup:
    pub fn call(&mut self, label: &str) {
        self.emit(&[0xCD]);  // CALL opcode
        self.fixup(label);   // Record that we need to fill in this address
    }

    pub fn fixup(&mut self, label: &str) {
        self.fixups.push((self.rom.len(), label.to_string()));
        self.emit_word(0x0000); // Placeholder
    }
At the end, resolve_fixups() walks through all recorded fixups and patches in the correct addresses:
    pub fn resolve_fixups(&mut self) {
        for (pos, label) in &self.fixups {
            let addr = self.labels.get(label)
                .expect(&format!("Undefined label: {}", label));
            self.rom[*pos] = *addr as u8;
            self.rom[*pos + 1] = (*addr >> 8) as u8;
        }
    }
This simple mechanism enables natural code organization where you can reference routines before defining them.
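To see the three pieces working together, here is a self-contained toy version. `MiniGen` and `build_demo` are stand-ins I made up for this sketch. It mirrors the emit/label/fixup idea but is not the crate's actual code, and it returns an error for an undefined label instead of panicking.

```rust
use std::collections::HashMap;

/// A stripped-down stand-in for a code generator (illustrative only).
struct MiniGen {
    org: u16,
    rom: Vec<u8>,
    labels: HashMap<String, u16>,
    fixups: Vec<(usize, String)>,
}

impl MiniGen {
    fn new(org: u16) -> Self {
        MiniGen { org, rom: Vec::new(), labels: HashMap::new(), fixups: Vec::new() }
    }

    /// Record the current address under `name`.
    fn label(&mut self, name: &str) {
        let addr = self.org + self.rom.len() as u16;
        self.labels.insert(name.to_string(), addr);
    }

    /// Emit CALL with a placeholder address and remember where to patch it.
    fn call(&mut self, name: &str) {
        self.rom.push(0xCD);
        self.fixups.push((self.rom.len(), name.to_string()));
        self.rom.extend_from_slice(&[0x00, 0x00]);
    }

    fn ret(&mut self) {
        self.rom.push(0xC9);
    }

    /// Patch every placeholder with its label's address, low byte first.
    fn resolve(&mut self) -> Result<(), String> {
        for (pos, name) in &self.fixups {
            let addr = *self
                .labels
                .get(name)
                .ok_or_else(|| format!("undefined label: {}", name))?;
            self.rom[*pos..*pos + 2].copy_from_slice(&addr.to_le_bytes());
        }
        Ok(())
    }
}

/// CALL a routine that is only defined afterwards, then resolve.
fn build_demo() -> Vec<u8> {
    let mut g = MiniGen::new(0x0000);
    g.call("later"); // forward reference, address unknown here
    g.ret();
    g.label("later"); // lands at address 0x0004
    g.ret();
    g.resolve().expect("all labels defined");
    g.rom
}
```

The resulting bytes are CD 04 00 C9 C9: the placeholder after the CALL opcode has been overwritten with 0x0004, the address recorded when the label was finally defined.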
Building Blocks: The Standard Library
Raw instruction emission is powerful but verbose. The workbench includes pre-built routines for common tasks that any Z80 program needs.
Serial I/O
Our modified RetroShield firmware emulates an MC6850 ACIA for serial communication (the official RetroShield uses an Intel 8251). The standard library provides blocking read/write routines:
    pub fn emit_getchar(&mut self) {
        self.label("getchar");
        self.in_a(0x80);           // Read status register
        self.and_a(0x01);          // Test RX ready bit
        self.emit(&[0x28, 0xFA]);  // JR Z, -6 (loop until ready)
        self.in_a(0x81);           // Read data register
        self.ret();
    }
This generates a 9-byte routine (four two-byte instructions plus a one-byte RET) that any program can call with rom.call("getchar"). The character comes back in the A register, exactly as you'd expect from a standard library function.
Similar routines handle putchar, print_string (for null-terminated strings), and newline (CR+LF).
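The hard-coded 0xFA in getchar is worth unpacking. A JR displacement is measured from the address just past the two-byte JR instruction, so jumping from offset 4 back to offset 0 is 0 - (4 + 2) = -6, which is 0xFA as a signed byte. Here is a sketch of that calculation; `jr_displacement` is a hypothetical helper name, not something I'm claiming the crate exports.

```rust
/// Compute the displacement byte for JR (or DJNZ).
/// The offset is relative to the address *after* the two-byte instruction.
/// Returns None if the target is outside the signed 8-bit range.
fn jr_displacement(jr_addr: u16, target: u16) -> Option<u8> {
    let delta = target as i32 - (jr_addr as i32 + 2);
    if (-128..=127).contains(&delta) {
        Some(delta as i8 as u8)
    } else {
        None
    }
}
```

Returning None for an out-of-range target is one way to surface the case where a relative jump has to be replaced with an absolute JP.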
VT100 Terminal Control
Every program I've written needs cursor positioning, screen clearing, and other terminal operations. The standard library includes VT100 escape sequences:
    pub fn emit_clear_screen(&mut self) {
        self.label("clear_screen");
        self.ld_hl_label("_cls_seq");
        self.call("print_string");
        self.ret();
    }

    // Later, in data section:
    rom.label("_cls_seq");
    rom.emit_string("\x1B[2J\x1B[H"); // ESC[2J ESC[H
The cursor_pos routine is more complex, converting binary row/column values to the ASCII digits that VT100 expects. It's about 50 bytes of Z80 code that no one wants to write more than once.
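For reference, this is the conversion the routine has to perform, modeled in host-side Rust rather than Z80 code. It's only a sketch of the logic: the function names are mine, and the real routine may differ in details such as whether it emits leading zeros.

```rust
/// Build the VT100 cursor-position sequence: ESC [ row ; col H.
/// Rows and columns are 1-based on the terminal.
fn cursor_pos_sequence(row: u8, col: u8) -> Vec<u8> {
    let mut out = vec![0x1B, b'['];
    push_decimal(&mut out, row);
    out.push(b';');
    push_decimal(&mut out, col);
    out.push(b'H');
    out
}

/// Append `n` as ASCII decimal digits without leading zeros,
/// splitting into hundreds, tens and ones as an 8-bit routine would.
fn push_decimal(out: &mut Vec<u8>, n: u8) {
    let hundreds = n / 100;
    let tens = (n / 10) % 10;
    let ones = n % 10;
    if hundreds > 0 {
        out.push(b'0' + hundreds);
    }
    if hundreds > 0 || tens > 0 {
        out.push(b'0' + tens);
    }
    out.push(b'0' + ones);
}
```

On the Z80 each of those divisions turns into a subtract-and-count loop, which is where most of the bytes go.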
Math Routines
The Z80 has limited math capabilities, especially for 16-bit operations. The standard library provides:
- print_byte_dec: Convert and print A register as decimal (000-255)
- div16: 16-bit division with remainder
- negate_hl: Two's complement negation
These become critical building blocks for anything involving numbers.
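As a rough model of what two of these routines compute, here is a Rust sketch of two's complement negation and shift-and-subtract division. I'm assuming the classic one-bit-per-iteration algorithm here. The Z80 versions work on register pairs and may be organized differently.

```rust
/// Two's complement negation of a 16-bit value: invert all bits, then add one.
fn negate16(x: u16) -> u16 {
    (!x).wrapping_add(1)
}

/// Unsigned 16-bit shift-and-subtract division, one quotient bit per pass.
/// Returns (quotient, remainder), or None when dividing by zero.
fn div16(dividend: u16, divisor: u16) -> Option<(u16, u16)> {
    if divisor == 0 {
        return None;
    }
    let mut quotient: u16 = 0;
    let mut remainder: u32 = 0; // wider than 16 bits so the shift cannot overflow
    for bit in (0..16).rev() {
        // Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> bit) & 1) as u32;
        quotient <<= 1;
        if remainder >= divisor as u32 {
            remainder -= divisor as u32;
            quotient |= 1;
        }
    }
    Some((quotient, remainder as u16))
}
```

Sixteen passes of shift, compare, and conditional subtract is the price of not having a divide instruction.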
Pseudo-Assembly as Building Blocks
The real power emerges when you combine these primitives into higher-level constructs. Instead of thinking in individual Z80 instructions, you start thinking in chunks of functionality.
Consider implementing a text editor. You need a routine to insert a character at the cursor position. In pseudo-assembly, this is:
- Get the current line pointer
- Shift all bytes from cursor to end of buffer right by one
- Insert the new character
- Update cursor position
- Redraw
Each of these steps becomes a Rust method that emits a sequence of Z80 instructions:
    fn emit_insert_char(&mut self) {
        self.label("insert_char");

        // Save the character to insert
        self.ld_addr_a(TEMP_A);

        // Get current line pointer
        self.ld_a_addr(CURSOR_ROW);
        self.call("get_line_ptr");  // HL = line start

        // Add cursor column offset
        self.ld_de_addr(CURSOR_COL);
        self.add_hl_de();           // HL = insert position

        // Calculate bytes to shift...
        // (many more instructions)

        // Use LDDR for the actual shift
        self.emit(&[0xED, 0xB8]);   // LDDR

        // Insert the character
        self.ld_a_addr(TEMP_A);
        self.ld_hl_ind_a();

        // Update counters and redraw
        self.call("increment_cursor");
        self.call("draw_current_line");
        self.ret();
    }
This method generates about 80 bytes of Z80 machine code. By building up from primitives to routines to complete functions, complex programs become manageable.
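If LDDR is unfamiliar, it helps to model it on the host. The sketch below simulates the instruction on a plain byte slice and uses it to open a one-byte gap, which is the heart of insert_char. The function names and the flat-slice memory model are my simplifications, not the editor's actual code.

```rust
/// Model of Z80 LDDR on a flat memory slice: copy (HL) to (DE),
/// decrement HL, DE and BC, and repeat until BC reaches zero.
/// Assumes bc >= 1; on real hardware BC = 0 means 65,536 iterations.
fn lddr(mem: &mut [u8], mut hl: usize, mut de: usize, mut bc: usize) {
    while bc > 0 {
        mem[de] = mem[hl];
        hl = hl.wrapping_sub(1);
        de = de.wrapping_sub(1);
        bc -= 1;
    }
}

/// Open a one-byte gap at `pos` in a buffer holding `len` used bytes,
/// then store `ch` there. Copying from the top down means no byte is
/// overwritten before it has been moved.
fn insert_char(mem: &mut [u8], len: usize, pos: usize, ch: u8) {
    let count = len - pos;
    if count > 0 {
        lddr(mem, len - 1, len, count);
    }
    mem[pos] = ch;
}
```

The direction matters: the source and destination ranges overlap, so a forward copy with LDIR would smear the first byte across the whole tail.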
Programs Built with the Workbench
The real test of any framework is what you can build with it. Here's what's running on the RetroShield today.
kz80_db: A dBASE II Clone
dBASE II was the database that launched a thousand businesses in the early 1980s. Before SQL became dominant, dBASE gave microcomputer users their first taste of structured data management. My clone implements the authentic 1981 file format: 8-byte headers, 16-byte field descriptors, fixed-length records with delete flags.
The file format is documented in the code itself:
    DBF Header (8 bytes):
      Byte 0:      Version (0x02 for dBASE II)
      Bytes 1-2:   Number of records (16-bit little-endian)
      Bytes 3-4:   Month, Day of last update
      Bytes 5-6:   Year of last update
      Byte 7:      Record length (including delete flag)

    Field Descriptors (16 bytes each, terminated by 0x0D):
      Bytes 0-10:  Field name (11 bytes, null-padded)
      Byte 11:     Field type (C=Character, N=Numeric, L=Logical)
      Byte 12:     Field length
      Byte 13:     Decimal places (for N type)
      Bytes 14-15: Reserved
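To show how compact that header is, here is a host-side sketch that decodes the 8 bytes exactly as laid out above. The struct and function names are illustrative, and the byte order of the two-byte year field is an assumption on my part (little-endian, matching the record count).

```rust
/// The 8-byte header, field for field as listed above.
#[derive(Debug, PartialEq)]
struct DbfHeader {
    version: u8,
    record_count: u16,
    month: u8,
    day: u8,
    year: u16,
    record_len: u8,
}

/// Decode a header, rejecting short input or a version byte other than 0x02.
fn parse_header(bytes: &[u8]) -> Option<DbfHeader> {
    if bytes.len() < 8 || bytes[0] != 0x02 {
        return None;
    }
    Some(DbfHeader {
        version: bytes[0],
        record_count: u16::from_le_bytes([bytes[1], bytes[2]]),
        month: bytes[3],
        day: bytes[4],
        // Assumption: the year is stored low byte first.
        year: u16::from_le_bytes([bytes[5], bytes[6]]),
        record_len: bytes[7],
    })
}
```

A tool like this on the host side is handy for checking files written by the Z80 before trusting them to other software.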
The implementation includes:
- CREATE to define new database structures with up to 16 fields
- USE to open existing .DBF files from the SD card
- APPEND to add records interactively
- LIST to display all records in columnar format
- EDIT to modify existing records with field-by-field prompts
- DELETE and PACK for soft-delete and physical removal
- GO TOP/BOTTOM and GO n for record navigation
- DISPLAY STRUCTURE to show field definitions
The generated ROM is about 4KB, fitting comfortably in the RetroShield's 8KB ROM space. It reads and writes real .DBF files that you can open in modern database tools like LibreOffice Calc or even current versions of dBASE.
Building this required implementing a command parser that handles the dot-prompt interface, string comparison routines for command matching, file I/O through the SD card interface with seek operations, and the full dBASE command set. Each command is a Rust method that emits the appropriate Z80 code:
    fn emit_list_command(&mut self) {
        self.label("cmd_list");

        // Check if database is open
        self.ld_a_addr(DB_OPEN);
        self.or_a_a();
        self.jp_z("no_db_open");

        // Print column headers from field descriptors
        self.call("print_headers");

        // Loop through all records
        self.ld_hl(1);
        self.ld_addr_hl(CURRENT_REC);
        self.label("list_loop");
        self.call("read_record");
        self.call("print_record");

        // Increment and check against record count
        self.ld_hl_addr(CURRENT_REC);
        self.inc_hl();
        self.ld_addr_hl(CURRENT_REC);

        // ... 150+ more lines
    }
The SD card interface deserves special mention. The RetroShield includes an SD card reader accessible through I/O ports. Commands like open, read, write, seek, and close are sent through a command register, with data transferred byte-by-byte through a data register. The workbench makes this tolerable by wrapping the low-level port operations in reusable routines.
kz80_ws: A WordStar Clone
WordStar defined text editing for a generation of writers. George R.R. Martin famously still uses it. The diamond cursor movement (^E ^S ^D ^X arranged like arrow keys on the keyboard), the block operations (^KB ^KK ^KC), the search functions, the word wrap, the careful attention to 80-column displays: all of this became muscle memory for millions of users.
The clone implements:
- Full cursor movement with ^E/^S/^D/^X and ^A/^F for word movement
- Insert and overwrite modes with ^V toggle
- Block operations: mark begin (^KB), mark end (^KK), copy (^KC), delete (^KY)
- File operations: save (^KS), save and exit (^KD), quit without saving (^KQ)
- Search (^QF), word wrap at configurable right margins
- Line operations: delete line (^Y), insert line break (^N)
- Quick movement: top of file (^QR), end of file (^QC), line start/end (^QS/^QD)
- VT100 terminal output with proper status line showing line/column/mode
The memory layout is carefully designed for the 8KB RAM constraint:
    RAM (8KB):
      0x2000-0x201F  State variables (cursor, view, margins)
      0x2100-0x21FF  Input buffer
      0x2200-0x22FF  Filename buffer
      0x2800-0x3BFF  Text buffer (5KB)
      0x3C00-0x3DFF  Line index table
      0x3E00-0x3FFF  Stack
The word wrap implementation is particularly satisfying. When the cursor passes the right margin (default column 65), the editor scans backward to find the last space, then uses the Z80's LDDR instruction to shift the buffer and insert a CR/LF pair. The cursor repositions on the new line at exactly the right column to continue typing the wrapped word. All of this happens fast enough that the user just sees smooth text flow.
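The backward scan is simple enough to model in a few lines. This sketch finds the break point and splits the line in two. It's a host-side illustration with made-up function names, and it drops the space rather than inserting a CR/LF pair into a shared buffer the way the editor does.

```rust
/// Find where to break a line that has run past `margin`:
/// scan backward from the margin for the last space.
/// Returns the index of that space, or None if there is no space to break at.
fn find_wrap_point(line: &[u8], margin: usize) -> Option<usize> {
    let end = margin.min(line.len());
    line[..end].iter().rposition(|&b| b == b' ')
}

/// Split `line` at the wrap point. The space itself is dropped, and the
/// partial word after it moves to the start of the next line.
fn wrap_line(line: &[u8], margin: usize) -> Option<(Vec<u8>, Vec<u8>)> {
    let at = find_wrap_point(line, margin)?;
    Some((line[..at].to_vec(), line[at + 1..].to_vec()))
}
```

The None case is the awkward one in practice: a single word longer than the margin has nowhere to break, so the caller has to decide whether to split it or let it overflow.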
The screen update strategy matters on a 4MHz processor. Rather than redrawing the entire screen on each keystroke, the editor tracks what changed and only redraws the affected line. The VT100 "clear to end of line" escape sequence handles trailing garbage. This keeps the interface responsive despite the hardware limitations.
kz80_calc: A VisiCalc-Style Spreadsheet
VisiCalc was the "killer app" that made personal computers business tools. Dan Bricklin and Bob Frankston's 1979 creation turned the Apple II from a hobbyist toy into something accountants would buy. My version brings that experience to the Z80:
- 1024 cells (16 columns A-P by 64 rows) in 6KB of RAM
- 8-digit packed BCD arithmetic for accurate decimal math
- Formula support with cell references (A1+B2*C3)
- Operator precedence (* and / before + and -)
- Range functions: @SUM, @AVG, @MIN, @MAX, @COUNT
- Automatic recalculation when cells change
- Arrow key navigation and GOTO command for jumping to cells
- Cell types: numbers, labels, formulas, and repeating characters
The cell storage format uses 6 bytes per cell:
    Cell format (6 bytes):
      byte 0:    type (0=empty, 1=number, 2=formula, 3=error, 4=repeat, 5=label)
      byte 1:    sign (0x00=positive, 0x80=negative)
      bytes 2-5: 8-digit packed BCD (d7d6 d5d4 d3d2 d1d0)
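With fixed-size cells, finding a cell is pure arithmetic: 16 columns by 64 rows by 6 bytes is exactly 6,144 bytes. Here is a sketch of the address calculation. The row-major ordering is an assumption I'm making for illustration, and `cell_offset` is a hypothetical name.

```rust
const COLS: usize = 16; // columns A..=P
const ROWS: usize = 64; // rows 1..=64
const CELL_SIZE: usize = 6; // bytes per cell

/// Byte offset of a cell such as "B2" within the cell table.
/// Row-major ordering is an assumption made for this sketch.
fn cell_offset(name: &str) -> Option<usize> {
    let mut chars = name.chars();
    let col_ch = chars.next()?.to_ascii_uppercase();
    if !('A'..='P').contains(&col_ch) {
        return None;
    }
    let col = col_ch as usize - 'A' as usize;
    // Everything after the column letter must be the row number.
    let row: usize = chars.as_str().parse().ok()?;
    if row < 1 || row > ROWS {
        return None;
    }
    Some(((row - 1) * COLS + col) * CELL_SIZE)
}
```

Because COLS is a power of two, the multiply by 16 is just four left shifts on the Z80, and the multiply by 6 can be done as a shift plus a couple of adds.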
The BCD math was the hardest part. Binary floating-point would give wrong answers for financial calculations (the classic 0.1 + 0.2 != 0.3 problem). Packed BCD stores two decimal digits per byte, and the Z80's DAA (Decimal Adjust Accumulator) instruction corrects the result of a single-byte addition back into valid BCD. But building 8-digit (4-byte) multiplication and division out of byte-at-a-time DAA steps takes hundreds of carefully sequenced instructions.
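Addition is the easy case, and it shows the basic shape of everything else. The sketch below adds two 8-digit packed BCD values one byte at a time, carrying between bytes, with the per-byte correction written out by hand instead of relying on DAA. It assumes both inputs contain only valid digits (0-9 in every nibble), and the function names are mine.

```rust
/// Add two packed-BCD bytes plus a carry-in, returning (sum, carry_out).
/// The two correction steps mirror what DAA does after a binary ADC.
/// Assumes every nibble of `a` and `b` is a valid digit (0-9).
fn bcd_add_byte(a: u8, b: u8, carry_in: bool) -> (u8, bool) {
    let mut lo = (a & 0x0F) + (b & 0x0F) + carry_in as u8;
    let mut hi = (a >> 4) + (b >> 4);
    if lo > 9 {
        lo -= 10;
        hi += 1; // decimal carry from the low digit into the high digit
    }
    let carry_out = hi > 9;
    if carry_out {
        hi -= 10;
    }
    ((hi << 4) | lo, carry_out)
}

/// Add two 8-digit packed-BCD numbers stored most significant byte first
/// (d7d6 d5d4 d3d2 d1d0). Returns the sum and an overflow flag.
fn bcd_add8(a: [u8; 4], b: [u8; 4]) -> ([u8; 4], bool) {
    let mut out = [0u8; 4];
    let mut carry = false;
    for i in (0..4).rev() {
        let (sum, c) = bcd_add_byte(a[i], b[i], carry);
        out[i] = sum;
        carry = c;
    }
    (out, carry)
}
```

For example, 00001234 + 00008766 comes out as 00010000, with the carry rippling through two bytes. Multiplication then becomes repeated calls to this kind of add, which is where the instruction count explodes.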
The formula parser handles expressions like =A1+B2*C3-@SUM(D1:D10). This required implementing recursive descent parsing in Z80 machine code, which the workbench made tractable by letting me focus on the algorithm rather than opcode encodings. The parser breaks formulas into tokens, builds a simple AST in memory, and evaluates it with proper operator precedence.
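For readers who haven't written a recursive descent parser, here is the core idea in Rust. This is a deliberately reduced sketch: integers only, no cell references or @ functions, and it evaluates while parsing instead of building an AST. The grammar has one function per precedence level, which is what gives * and / priority over + and -.

```rust
/// Minimal recursive-descent evaluator.
///   expr := term (('+' | '-') term)*
///   term := number (('*' | '/') number)*
struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { src: src.as_bytes(), pos: 0 }
    }

    fn peek(&self) -> Option<u8> {
        self.src.get(self.pos).copied()
    }

    /// Parse one or more decimal digits.
    fn number(&mut self) -> Option<i64> {
        let start = self.pos;
        let mut value: i64 = 0;
        while let Some(d @ b'0'..=b'9') = self.peek() {
            value = value.checked_mul(10)?.checked_add((d - b'0') as i64)?;
            self.pos += 1;
        }
        if self.pos == start { None } else { Some(value) }
    }

    /// Higher-precedence level: multiplication and division.
    fn term(&mut self) -> Option<i64> {
        let mut acc = self.number()?;
        while let Some(op @ (b'*' | b'/')) = self.peek() {
            self.pos += 1;
            let rhs = self.number()?;
            acc = if op == b'*' { acc.checked_mul(rhs)? } else { acc.checked_div(rhs)? };
        }
        Some(acc)
    }

    /// Lower-precedence level: addition and subtraction.
    fn expr(&mut self) -> Option<i64> {
        let mut acc = self.term()?;
        while let Some(op @ (b'+' | b'-')) = self.peek() {
            self.pos += 1;
            let rhs = self.term()?;
            acc = if op == b'+' { acc.checked_add(rhs)? } else { acc.checked_sub(rhs)? };
        }
        Some(acc)
    }
}

/// Evaluate a whole expression, rejecting trailing input and division by zero.
fn evaluate(src: &str) -> Option<i64> {
    let mut p = Parser::new(src);
    let value = p.expr()?;
    if p.pos == src.len() { Some(value) } else { None }
}
```

Each of these functions maps naturally onto a Z80 subroutine: the call stack does the bookkeeping that would otherwise need an explicit operator stack.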
Beyond the Workbench
The workbench proved its value for these three substantial applications. But I've also built other Z80 projects that predate the workbench or use their own code generation approaches:
- kz80_c: A C compiler with its own emit infrastructure, developed before the workbench was extracted as a reusable crate
- kz80_lisp: A LISP interpreter with mark-and-sweep garbage collection
- kz80_prolog: Logic programming with unification and backtracking
- kz80_ml: An ML compiler with Hindley-Milner type inference
- kz80_fortran: FORTRAN77 subset for scientific computing nostalgia
- kz80_lua, kz80_smalltalk, kz80_chip8: Various interpreters and emulators
The experience building these earlier projects is what led to extracting the common patterns into the workbench. The emit/label/fixup pattern appeared independently in several codebases before I recognized it as a reusable abstraction.
Looking back at kz80_c, for instance, I can see the proto-workbench emerging. There's a CodeGen struct with an emit() method, a labels hashmap, and fixup resolution. The same pattern appears in kz80_lisp. Eventually it became clear that this infrastructure should be its own crate, tested once and reused everywhere.
The workbench also benefited from hindsight. Early projects had ad-hoc solutions for things like unique label generation (essential for compiling nested control structures) and relative jump calculation. The workbench handles these correctly from the start, saving debugging time on every subsequent project.
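Unique label generation is a good example of how small these fixes are once you know you need them. A counter is enough. The sketch below shows the idea. `LabelGen` and `fresh` are illustrative names and not necessarily how the workbench spells it.

```rust
/// Hands out labels that cannot collide: "_if_0", "_if_1", "_while_2", ...
/// A single shared counter keeps every label distinct, whatever the prefix.
struct LabelGen {
    next: u32,
}

impl LabelGen {
    fn new() -> Self {
        LabelGen { next: 0 }
    }

    /// Return a new label and advance the counter.
    fn fresh(&mut self, prefix: &str) -> String {
        let label = format!("_{}_{}", prefix, self.next);
        self.next += 1;
        label
    }
}
```

A compiler emitting nested IF statements asks for a fresh "else" and "end" label at each level, so the inner and outer jumps can never be confused with one another.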
The Hardware: RetroShield Z80
For those unfamiliar with the RetroShield project, it's worth a brief explanation. The RetroShield is an Arduino shield designed by 8BitForce that lets you run real vintage CPUs. You plug an actual Z80 (or 6502, or 6809, or 8085) into a socket on the shield. The Arduino provides clock, reset, and memory by intercepting the CPU's bus signals.
The Z80 variant gives you:
- ROM at 0x0000 (size depends on your binary)
- 6KB RAM at 0x2000-0x37FF
- MC6850 ACIA for serial I/O at ports 0x80-0x81
The original RetroShield Z80 emulated the Intel 8251 USART for serial communication. In 2023, with help from RetroShield creator Erturk Kocalar, I added MC6850 ACIA emulation to run John Hardy's Forth interpreter. The MC6850 is what most CP/M software expects, making it the better choice for running vintage software. The Arduino sketch with MC6850 emulation is available in my RetroShield firmware collection on GitLab.
I added an SD card interface at ports 0x10-0x15, which isn't part of the standard RetroShield but integrates cleanly with the Arduino firmware. This gives the dBASE and WordStar clones persistent file storage.
This constrained environment is actually liberating. You can't reach for a 100MB framework or spawn threads. Every byte matters. The programs you write are complete, self-contained, and comprehensible. The entire WordStar clone is about 4KB of machine code. You can read a hex dump of the ROM and, with patience, trace exactly what every byte does.
The RetroShield connects to an Arduino Mega via two rows of 18 pins, or alternatively to a Teensy 4.1 using a special carrier board. Either way, you interact with your Z80 programs through a terminal emulator over USB serial. The VT100 and VT220 escape sequences that the workbench's terminal routines emit work perfectly in modern terminals like iTerm2 or the venerable screen command, connecting 1970s display protocols to 2020s software.
Why Rust?
Rust brings several advantages to this domain:
Type Safety: The compiler catches mistakes like passing a label where an address is expected, or using the wrong register size. This matters when generating machine code where a single wrong byte corrupts everything.
Zero Runtime: The generated ROMs contain only Z80 code, no runtime, no garbage collector. Rust runs only on the host as the code generator, so none of its abstractions end up in the ROM.
Excellent Tooling: Cargo handles dependencies, testing, and publishing. The workbench is on crates.io; adding it to a project is one line in Cargo.toml.
Performance: Code generation is fast. Even the complex projects compile in under a second.
Expressiveness: Rust's type system lets me encode Z80 concepts cleanly. A label is a String, an address is a u16, and the compiler keeps them straight.
Lessons Learned
Building the workbench and using it for real projects taught me several things:
Start with the primitives right: The emit/label/fixup core hasn't changed since the first version. Getting the foundation solid paid dividends.
Standard library matters: Having I/O and terminal routines ready to call eliminated boilerplate from every project. I probably use call("print_string") a hundred times across all the projects.
Let the host do the work: Complex string manipulation, parsing, and data structure management happen in Rust on the host computer. The Z80 code just handles the runtime behavior. This split makes everything easier.
Readability over brevity: A Z80 program written in the workbench is longer than the equivalent hand-assembled hex, but it's readable and maintainable. When I need to fix a bug in the WordStar word wrap routine, I can read the Rust code and understand it.
Getting Started
The workbench is available on crates.io:
    [dependencies]
    retroshield-z80-workbench = "0.1"
A minimal program:
    use retroshield_z80_workbench::prelude::*;

    fn main() {
        let mut rom = CodeGen::new();

        rom.emit_startup(0x3FFF);
        rom.call("clear_screen");
        rom.ld_hl_label("msg");
        rom.call("print_string");
        rom.halt();

        rom.label("msg");
        rom.emit_string("Hello from Z80!\r\n");

        rom.include_stdlib();
        rom.resolve_fixups();
        rom.write_bin("hello.bin").unwrap();
    }
Load hello.bin onto a RetroShield (or run it in a Z80 emulator), and you'll see the greeting on your terminal.
Conclusion
The Z80 is nearly 50 years old, but it's still fun to program. The retroshield-z80-workbench brings modern development practices to vintage hardware: type-safe code generation, proper dependency management, fast iteration, and readable source.
Whether you want to build a clone of classic software, implement your own programming language for 8-bit hardware, or just understand how computers work at the machine code level, having the right tools makes all the difference. And there's still nothing quite like watching your code run on a chip that predates most programmers alive today.
The code for the workbench and all the kz80_* projects is available on GitHub under BSD-3-Clause licenses. PRs welcome.







