<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>TinyComputers.io (Posts about projects)</title><link>https://tinycomputers.io/</link><description></description><atom:link href="https://tinycomputers.io/categories/cat_projects.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 A.C. Jokela 
&lt;!-- div style="width: 100%" --&gt;
&lt;a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"&gt;&lt;img alt="" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /&gt; Creative Commons Attribution-ShareAlike&lt;/a&gt;&amp;nbsp;|&amp;nbsp;
&lt;!-- /div --&gt;
</copyright><lastBuildDate>Mon, 06 Apr 2026 22:12:59 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Building DirtScout: A Land Acquisition Platform with Claude Code</title><link>https://tinycomputers.io/posts/building-dirtscout-a-land-acquisition-platform-with-claude-code.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;/div&gt;

&lt;p&gt;Nine years ago, I built something similar.&lt;/p&gt;
&lt;p&gt;It was 2017, St. Louis County, Minnesota. I wanted to find raw undeveloped land from the county's delinquent tax rolls. The county had an ArcGIS service, but the APIs were primitive compared to what they offer now. I stood up a PostgreSQL database with PostGIS extensions, wrote Ruby scripts to scrape parcel data from the county's map server, geocoded addresses, and built a Ruby on Rails frontend to browse the results. The whole thing lived on a single VPS. It worked for one county. The data model was rigid, the scraping was fragile, and every time St. Louis County changed their GIS service, something broke.&lt;/p&gt;
&lt;p&gt;That project died the way side projects do: I got what I needed from it and moved on.&lt;/p&gt;
&lt;p&gt;In March 2026, I came back to the idea. The landscape had changed. ArcGIS REST APIs are now standardized and reliable. Wisconsin publishes a statewide parcel dataset covering all 72 counties through a single endpoint. Minnesota counties expose delinquent tax data through queryable feature services. AWS Lambda and DynamoDB mean I don't need to manage a database server. And I had a tool that didn't exist in 2017: &lt;a href="https://claude.ai/claude-code"&gt;Claude Code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;DirtScout is the result. It's a full-stack land acquisition platform at &lt;a href="https://dirtscout.land"&gt;dirtscout.land&lt;/a&gt; that searches delinquent tax parcels across 21 Minnesota counties and browses raw land across 72 Wisconsin counties. It has AI-powered investment analysis, environmental and soil assessments, a deal pipeline with offer letter generation, tax forfeit auction tracking, and automated monitoring with email alerts. The codebase is about 29,000 lines across Python, TypeScript, and infrastructure-as-code.&lt;/p&gt;
&lt;p&gt;I built it with Claude Code. Not "Claude Code assisted me" or "Claude Code helped with the boilerplate." Claude Code wrote the code. I directed the architecture, made the decisions, and did the debugging when things broke. But the actual lines of code came from conversations, not from me typing in an editor.&lt;/p&gt;
&lt;h3&gt;The Architecture&lt;/h3&gt;
&lt;p&gt;The 2017 version was PostgreSQL + PostGIS + Ruby on Rails on a single server. The 2026 version:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js 16, static export, Tailwind CSS, react-leaflet for maps. Hosted on S3 behind CloudFront. The entire frontend is pre-rendered HTML and JavaScript; there's no server-side rendering. CloudFront serves it from edge locations. A URL rewrite function handles dynamic routes for deal detail pages and shared parcel links.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Backend:&lt;/strong&gt; Python FastAPI running on a single AWS Lambda function behind API Gateway. Mangum adapts the ASGI app to Lambda's event format. Every API request hits the same Lambda, which cold-starts in about 3.5 seconds and handles subsequent requests in under a second. The function has 512MB of memory and a 5-minute timeout.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Data:&lt;/strong&gt; Two DynamoDB tables. The main table stores user data, flagged parcels, deals, preferences, notes, attachments, saved searches, tax list imports, and auction tracking. The cache table stores land cover analysis, environmental data, soil analysis, and geometry with TTLs. No PostgreSQL. No PostGIS. No database server to manage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Infrastructure:&lt;/strong&gt; AWS CDK in Python. One &lt;code&gt;cdk deploy&lt;/code&gt; command creates the Lambda, API Gateway, DynamoDB tables, S3 buckets, SQS queues, EventBridge schedules, Route 53 records, CloudFront distributions, and ACM certificates. The entire infrastructure is version-controlled and reproducible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On-premises worker:&lt;/strong&gt; A service running on a local AMD Strix Halo machine (Ryzen AI MAX+ 395, 128GB RAM) processes delinquent tax list PDFs using pdfplumber for text extraction and a local Qwen3 32B model via Ollama for structured data extraction. It polls an SQS queue for jobs.&lt;/p&gt;
&lt;p&gt;This is a fundamentally different architecture than what I could have built in 2017. No servers to patch aside from the Strix Halo. No database to back up. No PostGIS extensions to compile. The Lambda handles the compute, DynamoDB handles the storage, and the on-prem machine handles the jobs that need a real browser or a local LLM.&lt;/p&gt;
&lt;h3&gt;What Claude Code Actually Did&lt;/h3&gt;
&lt;p&gt;I want to be specific about this because the "AI-assisted development" conversation is usually vague. People say "I used AI to help me code" and it could mean anything from autocomplete suggestions to full application generation. Here's what actually happened.&lt;/p&gt;
&lt;p&gt;I started with a Rust TUI. The original project was a terminal application that queried a handful of Minnesota county ArcGIS services and displayed delinquent parcels in a text interface. It had county configurations, a query client, land cover analysis via USGS NLCD, and a flagging system. Claude Code built this from my descriptions of what I wanted: "query this ArcGIS service for parcels where the delinquent flag is set, filter by acreage and land use, show me the results in a table with navigation."&lt;/p&gt;
&lt;p&gt;Then I decided to make it a web app. I described the architecture I wanted: FastAPI on Lambda, Next.js on S3, DynamoDB for storage. Claude Code ported the Rust query logic to Python, built the FastAPI routes, created the React components, wrote the CDK infrastructure, and handled the deployment. Each feature was a conversation: "add Google OAuth," "add a deal pipeline with stages - make it look like Kanban," "generate offer letter PDFs," "add an AI investment summary using the Claude API."&lt;/p&gt;
&lt;p&gt;The codebase grew to 29,000 lines across 113 files in the initial commit. Later sessions added another 60 files and 5,000 lines for Wisconsin support, soil analysis, tax list imports, auction tracking, spatial search, and saved searches.&lt;/p&gt;
&lt;p&gt;I didn't write these lines. I directed them. There's a difference, and it matters.&lt;/p&gt;
&lt;p&gt;When I say "directed," I mean I made every architectural decision. I chose DynamoDB over PostgreSQL because I didn't want to manage a database. I chose Lambda over ECS because I didn't want to manage containers. I chose static export over SSR because I didn't want to manage a Node.js server. I chose a local LLM for PDF parsing instead of the Claude API because the parsing is structured data extraction that doesn't need frontier-model quality.&lt;/p&gt;
&lt;p&gt;Claude Code implemented these decisions. When something broke, I described the symptom and Claude Code diagnosed the cause. When I wanted a new feature, I described the behavior and Claude Code wrote the code. The feedback loop was: describe what I want, review what I get, deploy, test, describe what's wrong, iterate.&lt;/p&gt;
&lt;p&gt;Some things broke in interesting ways. DynamoDB doesn't accept Python floats; you have to convert everything to Decimal. The county field maps are reverse-keyed from what you'd expect (ArcGIS field names are the keys, common names are the values). Google OAuth redirect URIs need a trailing slash. CloudFront caches aggressively and you have to invalidate after every deploy. The Census TIGER API for county boundaries is painfully slow, so we downloaded the GeoJSON once and serve it as a static file. Each of these was discovered in production and fixed in conversation.&lt;/p&gt;
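&lt;p&gt;The float problem is the easiest of these to show. What follows is a minimal sketch, assuming items are plain dicts and lists built from ArcGIS responses; the helper name &lt;code&gt;to_dynamo&lt;/code&gt; is illustrative and not taken from the DirtScout codebase.&lt;/p&gt;

```python
from decimal import Decimal


def to_dynamo(value):
    """Recursively replace Python floats with Decimal before a DynamoDB write.

    Hypothetical helper for illustration; not the project's actual code.
    """
    if isinstance(value, float):
        # Going through str() keeps 40.5 as Decimal("40.5") instead of the
        # long binary expansion that Decimal(40.5-like floats) can produce.
        return Decimal(str(value))
    if isinstance(value, dict):
        return {key: to_dynamo(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_dynamo(item) for item in value]
    # Strings, ints, bools, None, and existing Decimals pass through unchanged.
    return value


parcel = {"parcel_id": "123-456", "acres": 40.5, "balances": [1200.75, 310.0]}
item = to_dynamo(parcel)
```

&lt;p&gt;Running every item through one converter at the write boundary means the rest of the code can keep using ordinary floats.&lt;/p&gt;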
&lt;h3&gt;The Data Sources&lt;/h3&gt;
&lt;p&gt;The interesting part of DirtScout isn't the web framework. It's the data integration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Minnesota parcel data&lt;/strong&gt; comes from 11 different county ArcGIS REST services, each with its own field names, query syntax, and data quality. St. Louis County has &lt;code&gt;DELINQUENT_TAX_FLAG&lt;/code&gt; and &lt;code&gt;BAL_DUE&lt;/code&gt;. Aitkin has &lt;code&gt;DELINQUENT_FLAG&lt;/code&gt; (text: "YES"/"NO") and &lt;code&gt;BALDUE&lt;/code&gt;. Hennepin stores acreage in square feet (divide by 43,560). Goodhue stores acreage as a string that requires CAST in the SQL WHERE clause. Each county is a separate configuration with field mappings, WHERE clause templates, and normalization logic.&lt;/p&gt;
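&lt;p&gt;A rough sketch of what one of these per-county configurations could look like, with ArcGIS field names as keys and common names as values. Only &lt;code&gt;DELINQUENT_FLAG&lt;/code&gt; and &lt;code&gt;BALDUE&lt;/code&gt; come from the county data described above; the acreage field names, the &lt;code&gt;area_divisor&lt;/code&gt; key, the set of truthy flag values, and the &lt;code&gt;normalize&lt;/code&gt; function are assumptions made for illustration.&lt;/p&gt;

```python
SQFT_PER_ACRE = 43_560
TRUTHY = {"YES", "Y", "TRUE", "1"}  # assumed flag encodings, not verified per county

# Hypothetical configs: ArcGIS field name -> common name.
COUNTY_CONFIGS = {
    "aitkin": {
        "fields": {"DELINQUENT_FLAG": "delinquent", "BALDUE": "balance_due", "ACRES": "acres"},
        "area_divisor": 1,
    },
    "hennepin": {
        # Area arrives in square feet, so divide to get acres.
        "fields": {"AREA_SQFT": "acres"},
        "area_divisor": SQFT_PER_ACRE,
    },
    "goodhue": {
        # Acreage arrives as a string; float() below handles the conversion.
        "fields": {"ACREAGE": "acres"},
        "area_divisor": 1,
    },
}


def normalize(county, attributes):
    """Map one raw ArcGIS attribute dict onto common field names."""
    cfg = COUNTY_CONFIGS[county]
    row = {}
    for arcgis_name, common_name in cfg["fields"].items():
        if arcgis_name in attributes:
            row[common_name] = attributes[arcgis_name]
    if "delinquent" in row:
        row["delinquent"] = str(row["delinquent"]).strip().upper() in TRUTHY
    if "acres" in row:
        row["acres"] = float(row["acres"]) / cfg["area_divisor"]
    return row


two_acres = normalize("hennepin", {"AREA_SQFT": 87120})
```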
&lt;p&gt;&lt;strong&gt;Minnesota tax lists&lt;/strong&gt; come from 15 county PDFs and Excel files. Itasca County publishes an Excel file updated monthly. The rest publish PDF legal notices. The PDFs are processed by either the Claude API (Haiku model, cheapest tier) or a local Qwen3 32B running on the Strix Halo machine. The AI extracts parcel IDs, owner names, delinquent amounts, and addresses from the unstructured PDF text and returns structured JSON.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wisconsin parcel data&lt;/strong&gt; comes from a single statewide ArcGIS feature service maintained by the State Cartographer's Office. One endpoint, all 72 counties, standardized fields. Owner names, mailing addresses, assessed values, acreage, property class. No delinquent tax data in the GIS, but we supplement with 9 county-level PDF lists of tax-delinquent and tax-forfeited properties.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Environmental analysis&lt;/strong&gt; layers: FEMA NFHL for flood zones, NWI for wetlands, NHD for water bodies, Minnesota DNR County Well Index for well data, MPCA for contamination sites. Each is a separate ArcGIS REST service query using the parcel's centroid.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Soil analysis&lt;/strong&gt; comes from the USDA Soil Data Access REST API (SSURGO). A SQL query with the parcel's centroid returns soil components, drainage class, hydric rating, slope, farmland classification, and capability class. We compute a "buildability" score from these factors.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Land cover&lt;/strong&gt; comes from the USGS MRLC WMS service, querying the NLCD 2021 Land Cover layer. We sample the parcel area and return a breakdown by cover type: forest, agriculture, water, wetlands, developed, barren.&lt;/p&gt;
&lt;p&gt;Each of these integrations was built in conversation with Claude Code. "Add flood zone analysis using FEMA's service." "The NWI wetlands query needs table-prefixed field names." "The SSURGO soil query needs a WKT point geometry."&lt;/p&gt;
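&lt;p&gt;The centroid lookups share one shape, so a sketch of it may help. The parameter names below are the standard ArcGIS REST query parameters; the function names and the example layer URL are placeholders, and the real services each need their own field lists and quirks.&lt;/p&gt;

```python
from urllib.parse import urlencode


def centroid_query_url(layer_url, lon, lat, out_fields="*"):
    """Build a point-intersects query against an ArcGIS REST layer.

    Hypothetical helper: layer_url is whatever layer endpoint is being queried.
    """
    params = {
        "geometry": f"{lon},{lat}",
        "geometryType": "esriGeometryPoint",
        "inSR": 4326,  # centroid given as WGS84 longitude/latitude
        "spatialRel": "esriSpatialRelIntersects",
        "outFields": out_fields,
        "returnGeometry": "false",  # attributes only, which keeps responses small
        "f": "json",
    }
    return f"{layer_url}/query?{urlencode(params)}"


def wkt_point(lon, lat):
    """WKT point string of the kind a SQL-style soil query can embed."""
    return f"POINT({lon} {lat})"


# Placeholder URL, not a real service endpoint.
url = centroid_query_url("https://example.com/arcgis/rest/services/Demo/MapServer/0", -92.1, 46.8)
```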
&lt;h3&gt;What Changed Since 2017&lt;/h3&gt;
&lt;p&gt;The PostgreSQL + PostGIS + Ruby on Rails stack I used nine years ago was the right choice for 2017. PostGIS let me do spatial queries locally. I had to store the parcel data because the ArcGIS services weren't reliable enough to query in real time. Rails rendered server-side because that's what Rails did.&lt;/p&gt;
&lt;p&gt;None of that is necessary anymore. The ArcGIS services are fast and reliable enough to query live. DynamoDB handles the persistence without a schema to manage. Lambda eliminates server management. Static export means the frontend is just files on S3.&lt;/p&gt;
&lt;p&gt;There's a personal angle here too. In graduate school, I spent an entire semester manually developing land cover classifications for a final project — hand-labeling training data, running supervised classification algorithms, validating results against ground truth. It was weeks of work for one study area. For DirtScout, I told Claude Code "add a buildability score based on soil drainage, hydric percentage, slope, and capability class" and had a working assessment in minutes. The SSURGO soil data query, the scoring logic, the frontend panel with color-coded ratings — all from a single conversation. The knowledge that took a semester to develop is now a commodity you can describe and deploy.&lt;/p&gt;
&lt;p&gt;But the bigger change is the development process. In 2017, I wrote every line of Ruby and SQL by hand. I designed the PostGIS schema, wrote the scraping scripts, built the Rails views, configured the Nginx proxy, set up the SSL certificates, and wrote the systemd service files. It took months of evenings and weekends for a single-county tool.&lt;/p&gt;
&lt;p&gt;In 2026, I built a two-state, multi-service platform with AI analysis, auction tracking, deal management, and offer letter generation in a series of conversations over a few days. The code isn't hand-crafted. I'm not interested in hand-crafted code when that's not the point. The point is finding undervalued rural land from delinquent tax records and making offers to motivated sellers. The code is the means. Claude Code made the means faster.&lt;/p&gt;
&lt;h3&gt;The On-Prem Angle&lt;/h3&gt;
&lt;p&gt;A tax list import worker runs on a Bosgame M5 mini PC in my basement.&lt;/p&gt;
&lt;p&gt;The worker exists because I didn't want to pay for Claude API calls to parse 24 county PDFs every week. The AMD Strix Halo has 128GB of RAM and runs Qwen3 32B through Ollama. The worker downloads each PDF, extracts text with pdfplumber (a Python library that does the PDF-to-text conversion locally, no model needed), then sends the extracted text to the local Ollama instance for structured JSON extraction. Each 2-page chunk takes 5-7 minutes on the 32B model. It's slower than a cloud API. It's also free.&lt;/p&gt;
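&lt;p&gt;The chunking step is simple enough to sketch. This assumes the page texts have already been extracted (for example with pdfplumber's &lt;code&gt;page.extract_text()&lt;/code&gt;); the function name and the newline join are illustrative choices, not the worker's actual code.&lt;/p&gt;

```python
def chunk_pages(page_texts, pages_per_chunk=2):
    """Group per-page text into fixed-size chunks for the local model."""
    chunks = []
    for start in range(0, len(page_texts), pages_per_chunk):
        group = page_texts[start:start + pages_per_chunk]
        # A missing page (None) is treated as empty text.
        chunks.append("\n".join(text or "" for text in group))
    return chunks


pages = ["page one text", "page two text", "page three text"]
chunks = chunk_pages(pages)
```

&lt;p&gt;Small chunks keep each prompt short, at the cost of more round trips to the model.&lt;/p&gt;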
&lt;p&gt;The worker is a systemd service that starts on boot and polls an SQS queue continuously. A weekly systemd timer enqueues an "import all" message every Monday morning.&lt;/p&gt;
&lt;p&gt;This is the &lt;a href="https://tinycomputers.io/posts/the-economics-of-owning-your-own-inference.html"&gt;economics of owning your own inference&lt;/a&gt; in practice. The frontier model handles the quality-sensitive work (AI investment analysis, parcel chat). The local model handles the batch extraction work. The split happens naturally based on the task requirements.&lt;/p&gt;
&lt;h3&gt;What It Does Now&lt;/h3&gt;
&lt;p&gt;The production site at dirtscout.land:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Searches delinquent tax parcels across 21 Minnesota counties (8 Tier 1 with full ArcGIS data, 2 Tier 2 with partial data, 2 Tier 3 with minimal data, 9 Tier 4 with imported tax list data)&lt;/li&gt;
&lt;li&gt;Browses raw land parcels across all 72 Wisconsin counties via the statewide parcel service&lt;/li&gt;
&lt;li&gt;Scores each MN parcel on a 0-100 scale (grades A through F) based on financial opportunity, road access, environmental factors, and land character&lt;/li&gt;
&lt;li&gt;Generates AI investment summaries using Claude Sonnet with full context: parcel data, land cover, environmental analysis, soil data, owner's other properties, and attached documents&lt;/li&gt;
&lt;li&gt;Tracks deals through a pipeline (prospecting, offer sent, negotiating, under contract, closed, dead) with offer letter PDF generation using three templates&lt;/li&gt;
&lt;li&gt;Monitors for new delinquent parcels daily via EventBridge-triggered Lambda scans, with email alerts&lt;/li&gt;
&lt;li&gt;Tracks tax forfeit auction dates across 8 Minnesota counties, with a floating widget showing upcoming auctions&lt;/li&gt;
&lt;li&gt;Imports delinquent tax lists from 15 MN and 9 WI county PDFs/Excel files weekly&lt;/li&gt;
&lt;li&gt;Provides environmental analysis (flood zones, wetlands, water bodies, wells, contamination), land cover classification (NLCD 2021), and soil analysis (SSURGO) for each parcel&lt;/li&gt;
&lt;li&gt;Shows parcel boundaries on satellite imagery, with an interactive explore map that loads parcel shapes at high zoom levels&lt;/li&gt;
&lt;li&gt;Manages parcel notes, file attachments (via S3 presigned URLs), shareable parcel links, and saved searches&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;What I'd Do Differently&lt;/h3&gt;
&lt;p&gt;I'd add geometry caching earlier. Every map view that shows parcel boundaries makes a live ArcGIS query with &lt;code&gt;returnGeometry=true&lt;/code&gt;, which is slower than querying attributes only. Caching the geometry in DynamoDB with a TTL would make the explore map significantly faster.&lt;/p&gt;
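&lt;p&gt;A sketch of what that cache entry might look like. DynamoDB's TTL feature expects an epoch-seconds number in a designated attribute; everything else here (the key format, the attribute names, the 30-day default) is an assumption for illustration.&lt;/p&gt;

```python
import json
import time

SECONDS_PER_DAY = 86_400


def geometry_cache_item(parcel_id, geometry, ttl_days=30, now=None):
    """Build a cache item whose expires_at attribute can drive DynamoDB TTL."""
    now = int(time.time()) if now is None else int(now)
    return {
        "pk": f"GEOM#{parcel_id}",  # hypothetical key format
        # Storing the geometry as a JSON string sidesteps float-to-Decimal conversion.
        "geometry": json.dumps(geometry),
        "expires_at": now + ttl_days * SECONDS_PER_DAY,
    }


def is_fresh(item, now=None):
    """TTL deletion is not immediate, so reads should still check expiry."""
    now = int(time.time()) if now is None else int(now)
    return item["expires_at"] > now


ring = {"rings": [[[-92.1, 46.8], [-92.0, 46.8], [-92.0, 46.9], [-92.1, 46.8]]]}
item = geometry_cache_item("123-456", ring, ttl_days=30, now=1_000_000)
```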
&lt;p&gt;I'd standardize the county configurations into a more declarative format. Right now each county is a Python dataclass with hand-tuned field mappings. A JSON configuration file that Claude Code could modify more easily would reduce the friction of adding new counties.&lt;/p&gt;
&lt;p&gt;I'd separate the frontend into a proper monorepo with shared types between the API client and the backend models. The current setup has TypeScript interfaces in the frontend that mirror Pydantic models in the backend, and they get out of sync when fields are added.&lt;/p&gt;
&lt;p&gt;But these are optimizations, not regrets. The system works. It finds land. It makes the research process faster. And it was built in conversations, not in sprints.&lt;/p&gt;</description><category>ai</category><category>arcgis</category><category>aws</category><category>claude code</category><category>dynamodb</category><category>fastapi</category><category>gis</category><category>infrastructure</category><category>land investing</category><category>leaflet</category><category>minnesota</category><category>next.js</category><category>python</category><category>react</category><category>real estate</category><category>wisconsin</category><guid>https://tinycomputers.io/posts/building-dirtscout-a-land-acquisition-platform-with-claude-code.html</guid><pubDate>Thu, 26 Mar 2026 01:00:00 GMT</pubDate></item><item><title>Redesigning a PCB with Claude Code and Open-Source EDA Tools (Part 1)</title><link>https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;/div&gt;

&lt;div class="sponsor-widget"&gt;
&lt;div class="sponsor-widget-header"&gt;&lt;a href="https://baud.rs/youwpy"&gt;&lt;img src="https://tinycomputers.io/images/pcbway-logo.png" alt="PCBWay" style="height: 22px; vertical-align: middle; margin-right: 8px;"&gt;&lt;/a&gt; Sponsored Hardware&lt;/div&gt;
&lt;p&gt;This project was made possible by &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt;, who sponsored the fabrication of the redesigned GigaShield v0.2 level converter board. PCBWay offers PCB prototyping, assembly, CNC machining, and 3D printing services, from one-off prototypes to production runs. If you have a PCB design ready to go, check them out at &lt;a href="https://baud.rs/youwpy"&gt;pcbway.com&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img id="pcb-top-img" src="https://tinycomputers.io/images/giga-shield/giga-shield-v02-top.png" alt="GigaShield v0.2 PCB top view: routed two-layer board with 9 SN74LVC8T245PW level shifters, generated with Python and autorouted with Freerouting" style="float: right; max-width: 420px; margin: 0 0 1em 1.5em; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); cursor: zoom-in;"&gt;&lt;/p&gt;

&lt;p&gt;In January, I &lt;a href="https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html"&gt;spent $468 on Fiverr&lt;/a&gt; to have a professional design an &lt;a href="https://baud.rs/poSQeo"&gt;Arduino Giga R1&lt;/a&gt; shield with level shifters. It was a good design. Nine &lt;a href="https://baud.rs/y9JJt9"&gt;TXB0108PW&lt;/a&gt; bidirectional level translators, 72 channels of 3.3V-to-5V shifting, a clean two-layer board ready for fabrication. And then I started testing it with the &lt;a href="https://baud.rs/87wbBL"&gt;RetroShield Z80&lt;/a&gt;, and the auto-sensing level shifters fell apart.&lt;/p&gt;
&lt;p&gt;The TXB0108 is a clever chip. It detects signal direction automatically, so you don't need to tell it whether a pin is input or output. For most applications, that's a feature. For a Z80 bus interface, it's a fatal flaw. During bus cycles, the Z80 tri-states its address and data lines. The outputs go high-impedance. They're not driving high or low, they're floating. The TXB0108 can't determine drive direction from a floating signal. It guesses wrong, or it doesn't drive at all, and the Arduino on the other side sees garbage. The board was blind to half of what the Z80 was doing.&lt;/p&gt;
&lt;p&gt;The fix was clear: replace the TXB0108s with &lt;a href="https://baud.rs/zQqo34"&gt;SN74LVC8T245PW&lt;/a&gt; direction-controlled level shifters. The SN74LVC8T245 has an explicit DIR pin: you tell it which direction to translate, and it does exactly that, regardless of whether the signals are being actively driven. No guessing, no ambiguity, deterministic behavior during tri-state periods. The trade-off is that you need a direction control signal for each shifter IC, but that's a small price for reliability.&lt;/p&gt;
&lt;p&gt;What wasn't clear was how to execute the redesign. I could go back to Fiverr for another $400-500. I could spend weeks learning KiCad properly. Or I could try something that had worked surprisingly well on a &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;previous project&lt;/a&gt;: use AI and open-source command-line EDA tools to design the board from a terminal, without ever opening a graphical PCB editor.&lt;/p&gt;
&lt;p&gt;This is part one of a two-part series. This piece covers the design and toolchain: how I used &lt;a href="https://baud.rs/Z6Oq4k"&gt;Claude Code&lt;/a&gt;, the gEDA ecosystem, pcb-rnd, and &lt;a href="https://baud.rs/bdZw62"&gt;Freerouting&lt;/a&gt; to go from a failed design to production-ready Gerber files. Part two will cover the physical boards, assembly, and testing against the Z80.&lt;/p&gt;
&lt;h3&gt;The Toolchain Problem&lt;/h3&gt;
&lt;p&gt;The original Fiverr design was done in KiCad 9.0. My first instinct was to modify it directly: swap the TXB0108 footprints for SN74LVC8T245, update the pin mappings, add the DIR control header, and re-route. But there was a problem. My preferred command-line PCB tool, &lt;a href="https://baud.rs/1J64T5"&gt;pcb-rnd&lt;/a&gt;, is version 3.1.4 on Ubuntu. KiCad 9.0 uses a file format version (20241229) that pcb-rnd's &lt;code&gt;io_kicad&lt;/code&gt; plugin doesn't support. When I tried to open the KiCad PCB:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;unexpected layout version number (perhaps too new)
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Hard stop. No conversion path exists from KiCad 9.0 to pcb-rnd. The formats aren't just different versions. KiCad's S-expression format and pcb-rnd's text-based format are fundamentally different syntaxes.&lt;/p&gt;
&lt;p&gt;I could have started KiCad and used its GUI. But I'd already proven to myself with the &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;dual Z80 RetroShield project&lt;/a&gt; that text-based, AI-assisted PCB workflows are not only possible but sometimes preferable. The gEDA/pcb-rnd file format is human-readable. AI can parse it, reason about it, and generate it. A Python script can manipulate it. You can &lt;code&gt;diff&lt;/code&gt; two boards and see exactly what changed. None of that is true for a graphical-only workflow.&lt;/p&gt;
&lt;p&gt;So the plan became: extract everything useful from the KiCad source files, then rebuild the board from scratch in pcb-rnd's native format using Python. Sound insane? It kind of is. But it worked.&lt;/p&gt;
&lt;h3&gt;Extracting the DNA&lt;/h3&gt;
&lt;p&gt;Even though pcb-rnd couldn't read the KiCad files directly, the KiCad files contained all the design intelligence I needed. Component positions, net assignments, pin mappings, board dimensions. It was all there, just in a format I couldn't import.&lt;/p&gt;
&lt;p&gt;KiCad's CLI tools (&lt;code&gt;kicad-cli&lt;/code&gt;) could export what I needed:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Component positions (X, Y, rotation for each part)&lt;/span&gt;
kicad-cli&lt;span class="w"&gt; &lt;/span&gt;pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pos&lt;span class="w"&gt; &lt;/span&gt;--format&lt;span class="w"&gt; &lt;/span&gt;csv&lt;span class="w"&gt; &lt;/span&gt;AlexJ_bz_ArduinoGigaShield.kicad_pcb&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;giga_pos.csv

&lt;span class="c1"&gt;# Netlist connectivity&lt;/span&gt;
kicad-cli&lt;span class="w"&gt; &lt;/span&gt;pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;ipcd356&lt;span class="w"&gt; &lt;/span&gt;AlexJ_bz_ArduinoGigaShield.kicad_pcb&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;giga_netlist.d356
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The schematic file (&lt;code&gt;AlexJ_bz_ArduinoGigaShield.kicad_sch&lt;/code&gt;) was an S-expression text file I could parse to extract the signal mappings: which Giga pin connects to which 5V header pin through which level shifter channel. This was the most critical piece: getting the net assignments wrong would mean the board physically connects but logically doesn't work.&lt;/p&gt;
&lt;p&gt;This is where Claude Code earned its keep. I described the KiCad schematic structure and asked it to help me parse out the signal mappings. The KiCad schematic uses hierarchical sheets with positional net connections, which isn't the simplest format to work with manually, but straightforward for an AI that can read S-expressions and track net names across sheets. Within an hour, I had a complete mapping of all 72 signal channels across the 9 shifter ICs.&lt;/p&gt;
&lt;h3&gt;Generating the Board with Python&lt;/h3&gt;
&lt;p&gt;With positions and nets extracted, I wrote &lt;code&gt;build_giga_shield.py&lt;/code&gt;, a single Python script that generates the entire pcb-rnd board from scratch. No GUI involved. Every component footprint, every pin, every net connection is defined programmatically.&lt;/p&gt;
&lt;p&gt;The script is structured around four generator functions:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;tssop24_element()&lt;/code&gt;&lt;/strong&gt; generates the SN74LVC8T245PW footprint. TSSOP-24 is a precise geometry: 0.65mm pin pitch, 6.4mm pad-to-pad span, 24 pins. The function calculates pad positions mathematically: 12 pins on the left, 12 on the right, with pin 1 marked as square per convention. Getting the pin numbering right was critical. The SN74LVC8T245's datasheet shows pins 1-12 on the left (DIR, A1-A4, GND, A5-A8, OE#, GND) and pins 13-24 on the right counting bottom-to-top (B8-B5, VCCB, B4-B1, VCCA, VCCA, VCCB).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;pin_header_element()&lt;/code&gt;&lt;/strong&gt; handles through-hole pin headers with rotation support. The Arduino Giga R1 has an unusual form factor: the long pin headers run along the board edges horizontally, not vertically. In the original KiCad design, these were placed with 90-degree or -90-degree rotation. Without matching that rotation, a 26-pin header at y=84mm would extend 63.5mm downward (25 pin gaps at 2.54mm pitch) to about y=147.5mm, well past the 90mm board edge. The rotation transform was simple once identified:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;rotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rot&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;rot&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;smd_0603_element()&lt;/code&gt;&lt;/strong&gt; creates the 0603 footprint shared by all 27 decoupling capacitors and 9 pull-down resistors. Small SMD parts, simple geometry.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;mounting_hole_element()&lt;/code&gt;&lt;/strong&gt; places the four 3.2mm mounting holes that align with the Arduino Giga's standoff positions.&lt;/p&gt;
&lt;p&gt;The coordinate system was the trickiest part. KiCad uses an arbitrary origin; in this design, x=106mm, y=30.5mm. pcb-rnd uses (0,0). Every KiCad coordinate had to be translated:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;KX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;106.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;30.5&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;kpos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ky&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kx&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;KX&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ky&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;KY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
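&lt;p&gt;The snippet above relies on an &lt;code&gt;mm()&lt;/code&gt; helper that isn't shown. As a rough sketch, assuming &lt;code&gt;mm()&lt;/code&gt; simply formats a millimeter value as a string with an explicit unit suffix (the real helper may work differently), the translation behaves like this:&lt;/p&gt;

```python
KX, KY = 106.0, 30.5  # KiCad origin offset from the article


def mm(value):
    """Hypothetical helper: format millimeters with an explicit unit suffix."""
    return f"{value:.4f}mm"


def kpos(kx, ky):
    """Translate a KiCad coordinate into board-relative coordinates."""
    return (mm(kx - KX), mm(ky - KY))


# The KiCad origin maps to the board origin ...
print(kpos(106.0, 30.5))
# ... and the opposite corner of a 155mm x 90mm board maps to (155, 90)
print(kpos(261.0, 120.5))
```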

&lt;p&gt;The &lt;code&gt;build_pcb()&lt;/code&gt; function ties everything together: place components, assign nets, build the symbol table, generate the layer stack, and write out a valid pcb-rnd &lt;code&gt;.pcb&lt;/code&gt; file. Running the script produces a complete, unrouted board: components placed, netlist defined, silkscreen text positioned, board outline drawn. Ready for routing.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;build_giga_shield.py
Generated&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb
Board:&lt;span class="w"&gt; &lt;/span&gt;155mm&lt;span class="w"&gt; &lt;/span&gt;x&lt;span class="w"&gt; &lt;/span&gt;90mm
9x&lt;span class="w"&gt; &lt;/span&gt;SN74LVC8T245PW&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;TSSOP-24&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;level&lt;span class="w"&gt; &lt;/span&gt;shifters
DIR&lt;span class="w"&gt; &lt;/span&gt;control&lt;span class="w"&gt; &lt;/span&gt;via&lt;span class="w"&gt; &lt;/span&gt;J11&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;1x10&lt;span class="w"&gt; &lt;/span&gt;header&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;The Format Wars&lt;/h3&gt;
&lt;p&gt;Getting pcb-rnd to actually accept the generated file was its own adventure. pcb-rnd's parser is strict about things that look optional in the documentation, and its error messages are sometimes misleading. An error in an Element definition might be reported as a syntax error in the Layer section fifty lines later.&lt;/p&gt;
&lt;p&gt;Three format issues bit me hardest:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The &lt;code&gt;"smd"&lt;/code&gt; flag.&lt;/strong&gt; I initially generated elements with &lt;code&gt;Element["smd" "TSSOP24" "U1" ...]&lt;/code&gt;, which seemed logical for surface-mount parts. pcb-rnd rejected it with "Unknown flag: smd ignored," which cascaded into a complete parse failure. The fix: use an empty string &lt;code&gt;Element["" "TSSOP24" "U1" ...]&lt;/code&gt;. The SMD-ness is implicit from using &lt;code&gt;Pad[]&lt;/code&gt; entries instead of &lt;code&gt;Pin[]&lt;/code&gt; entries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bare zeros.&lt;/strong&gt; pcb-rnd is inconsistent about whether &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;0nm&lt;/code&gt; are interchangeable. In some contexts, bare &lt;code&gt;0&lt;/code&gt; works fine. In others, it is accepted without complaint at that point and then surfaces as a syntax error dozens of lines later. The defensive fix: always write &lt;code&gt;0nm&lt;/code&gt;, never bare &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Missing flags on Layer lines.&lt;/strong&gt; The &lt;code&gt;Line[]&lt;/code&gt; entry inside Layer blocks needs 7 fields, not 6. The seventh is a flags string like &lt;code&gt;"clearline"&lt;/code&gt;. My generator omitted it, producing &lt;code&gt;Line[x1 y1 x2 y2 thickness clearance]&lt;/code&gt;. The parser's error message: &lt;code&gt;syntax error, unexpected ']', expecting INTEGER or STRING&lt;/code&gt;, reported at the layer definition, not at the malformed line.&lt;/p&gt;
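&lt;p&gt;Two of these rules are easy to bake into the generator so they can't be forgotten. The sketch below is illustrative rather than the code from the actual script; &lt;code&gt;nm()&lt;/code&gt; and &lt;code&gt;layer_line()&lt;/code&gt; are hypothetical helper names, and treating every coordinate as an integer number of nanometers is an assumption.&lt;/p&gt;

```python
def nm(value):
    """Format a length in nanometers, always with the unit (so 0 becomes 0nm)."""
    return f"{int(value)}nm"


def layer_line(x1, y1, x2, y2, thickness, clearance, flags="clearline"):
    """Emit a Line[] entry with all seven fields, flags string included."""
    fields = [nm(v) for v in (x1, y1, x2, y2, thickness, clearance)]
    return "Line[" + " ".join(fields) + f' "{flags}"]'


# Top board edge of a 155mm-wide board, 0.254mm thickness and clearance
print(layer_line(0, 0, 155_000_000, 0, 254_000, 254_000))
```

&lt;p&gt;Routing every length through one formatter means a bare &lt;code&gt;0&lt;/code&gt; never reaches the file, and making &lt;code&gt;flags&lt;/code&gt; a parameter with a default means the seventh field is always present.&lt;/p&gt;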
&lt;p&gt;I found these bugs using a binary search approach, truncating the file with &lt;code&gt;head -N&lt;/code&gt; and testing each truncation point until I isolated which section introduced the failure. It's crude but effective when error reporting is unhelpful. Claude Code helped enormously here. I'd paste the error and the surrounding file content, and it would spot the structural issue faster than I could.&lt;/p&gt;
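&lt;p&gt;The same bisection can be scripted. This is a sketch under two assumptions: a prefix that fails keeps failing as more lines are added, and an empty file parses. The &lt;code&gt;parses_ok&lt;/code&gt; callback is a stand-in; in practice it would write the prefix to a temporary file and run pcb-rnd on it, and truncation points would need to fall on section boundaries so an unclosed block doesn't cause a failure of its own.&lt;/p&gt;

```python
def first_failing_prefix(lines, parses_ok):
    """Return the smallest n for which the first n lines fail to parse.

    Returns None when the whole file parses.
    """
    if parses_ok(lines):
        return None
    lo, hi = 0, len(lines)  # invariant: lines[:lo] parses, lines[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if parses_ok(lines[:mid]):
            lo = mid
        else:
            hi = mid
    return hi


def toy_parser(chunk):
    """Stand-in for the real parser: any line containing BAD is an error."""
    return not any("BAD" in line for line in chunk)


demo = ["header", "element U1", "BAD element U2", "layer 1"]
print(first_failing_prefix(demo, toy_parser))
```

&lt;p&gt;The returned count is the 1-based number of the first line whose presence breaks the parse, which is where to start reading.&lt;/p&gt;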
&lt;h3&gt;The pcb-rnd Ecosystem&lt;/h3&gt;
&lt;p&gt;For anyone unfamiliar with the tools involved, a brief orientation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gEDA&lt;/strong&gt; (GPL'd Electronic Design Automation) is a suite of open-source tools for electronic design. The original project dates to the late 1990s and includes &lt;code&gt;gschem&lt;/code&gt; (schematic capture), &lt;code&gt;pcb&lt;/code&gt; (PCB layout), and various utilities. The file formats are text-based and human-readable, a deliberate design choice that makes them scriptable and version-control-friendly. The original &lt;code&gt;pcb&lt;/code&gt; program is now deprecated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;pcb-rnd&lt;/strong&gt; is the actively maintained successor to gEDA's &lt;code&gt;pcb&lt;/code&gt; program. It reads and writes the same text-based PCB format, but adds modern features: more export formats, better plugin support, and critically for this project, command-line export of Gerber files, PNG renderings, and Specctra DSN files. It is packaged for Linux (including Ubuntu) but has no Homebrew package for macOS, which is why I ran it over SSH on a remote machine throughout this project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Freerouting&lt;/strong&gt; is a Java-based autorouter that speaks the Specctra DSN/SES interchange format. You feed it a board definition with components and nets but no traces, and it computes the copper routing, finding paths for every net while respecting design rules for trace width, clearance, and via placement. It's the open-source standard for PCB autorouting and has been used in production for decades.&lt;/p&gt;
&lt;p&gt;The workflow chains these tools together:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;build_giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;rnd&lt;/span&gt; &lt;span class="n"&gt;DSN&lt;/span&gt; &lt;span class="n"&gt;export&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                     &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dsn&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Freerouting&lt;/span&gt; &lt;span class="n"&gt;autorouter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                     &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ses&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
              &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;rnd&lt;/span&gt; &lt;span class="n"&gt;SES&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Gerber&lt;/span&gt; &lt;span class="n"&gt;export&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                    &lt;span class="n"&gt;Production&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Every step is a command-line operation. Every intermediate file is text. Every transformation is reproducible. Change a component position in the Python script, re-run the pipeline, get new Gerber files. This is the power of text-based EDA: the entire design is version-controlled, diffable, and automatable.&lt;/p&gt;
&lt;h3&gt;Autorouting: The Machine Does the Tedious Part&lt;/h3&gt;
&lt;p&gt;With the board generated and validated in pcb-rnd, the next step was routing: connecting all 308 nets with actual copper traces across a two-layer board. This is where Freerouting comes in.&lt;/p&gt;
&lt;p&gt;The pipeline starts with exporting the unrouted board to Specctra DSN format. pcb-rnd handles this in batch mode on the remote Linux machine:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;dsn&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The DSN file contains the board geometry, component placements, pad definitions, and netlist, everything the autorouter needs to compute a routing solution. One subtlety I learned the hard way: the DSN's &lt;code&gt;(structure)&lt;/code&gt; section needs explicit &lt;code&gt;(rule)&lt;/code&gt; and &lt;code&gt;(via)&lt;/code&gt; definitions. pcb-rnd's DSN exporter puts the design rules inside the net class section, but Freerouting also expects them in the structure section. Without them, the router can see the nets but can't figure out what trace widths and via sizes are legal, and it silently fails to route most connections. Adding two entries to the structure section fixed this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;(via pstk_1)
(rule
  (width 0.254)
  (clearance 0.254)
)
&lt;/pre&gt;&lt;/div&gt;
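&lt;p&gt;Since the DSN file is regenerated on every run, the patch is worth automating. The sketch below shows one way to do it as plain text processing; the function name is hypothetical, and placing the entries immediately after the opening &lt;code&gt;(structure&lt;/code&gt; line is an assumption about where Freerouting will pick them up, not something taken from its documentation.&lt;/p&gt;

```python
RULE_LINES = [
    "(via pstk_1)",
    "(rule",
    "  (width 0.254)",
    "  (clearance 0.254)",
    ")",
]


def add_structure_rules(dsn_text):
    """Insert the via and rule entries right after the first (structure line."""
    out = []
    inserted = False
    for line in dsn_text.splitlines():
        out.append(line)
        if not inserted and line.strip().startswith("(structure"):
            # Indent the new entries one level deeper than the (structure line
            indent = line[: len(line) - len(line.lstrip())] + "  "
            out.extend(indent + rule for rule in RULE_LINES)
            inserted = True
    return "\n".join(out) + "\n"


# Toy input, not a complete or valid DSN file
toy_dsn = "(pcb demo\n  (structure\n    (boundary)\n  )\n)\n"
print(add_structure_rules(toy_dsn))
```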

&lt;p&gt;Freerouting itself is a Java application with both GUI and command-line modes. On my machine, I'm running a custom build from source. The current &lt;code&gt;main&lt;/code&gt; branch had a few issues I had to fix (a missing &lt;code&gt;static&lt;/code&gt; on the main method, a null pointer on &lt;code&gt;maxThreads&lt;/code&gt; in the GUI initialization, and a Gradle build compatibility issue). The v1.9 codepath was more reliable for headless routing:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;java&lt;span class="w"&gt; &lt;/span&gt;-jar&lt;span class="w"&gt; &lt;/span&gt;freerouting-1.9.0-executable.jar&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-de&lt;span class="w"&gt; &lt;/span&gt;giga_shield.dsn&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-do&lt;span class="w"&gt; &lt;/span&gt;giga_shield.ses
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The autorouter loaded the 308-net board, ran through its passes, and produced a Specctra Session file containing 2911 wire segments and 172 vias. Every net connected. Every design rule satisfied. The routing took about 10 seconds for the initial routing pass, followed by optimization passes.&lt;/p&gt;
&lt;video controls autoplay loop muted playsinline style="max-width: 100%; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); margin: 1em 0;"&gt;
  &lt;source src="https://tinycomputers.io/images/giga-shield/routing-traces.mp4" type="video/mp4"&gt;
&lt;/source&gt;&lt;/video&gt;

&lt;p&gt;Importing the routes back into pcb-rnd was the last step of the routing stage. pcb-rnd can import SES files through its batch mode:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;--gui&lt;span class="w"&gt; &lt;/span&gt;hid_batch&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="s"&gt;ImportSes(giga_shield.ses)&lt;/span&gt;
&lt;span class="s"&gt;SaveTo(LayoutAs, giga_shield_routed.pcb)&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The result: a fully routed PCB with 2911 traces and 172 vias, ready for Gerber export.&lt;/p&gt;
&lt;h3&gt;Running pcb-rnd Over SSH&lt;/h3&gt;
&lt;p&gt;One of the more unusual aspects of this project is that all pcb-rnd operations happened on a remote Ubuntu 24.04 machine accessed over SSH. pcb-rnd isn't available on macOS via Homebrew (I tried; there's a deprecated &lt;code&gt;pcb&lt;/code&gt; package but no &lt;code&gt;pcb-rnd&lt;/code&gt;), and building from source on macOS looked like a rabbit hole I didn't want to enter.&lt;/p&gt;
&lt;p&gt;The remote workflow was straightforward:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Upload the PCB&lt;/span&gt;
scp&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/

&lt;span class="c1"&gt;# Export DSN for routing&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x dsn /tmp/giga_shield.pcb"&lt;/span&gt;

&lt;span class="c1"&gt;# Import SES and save the routed board&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'pcb-rnd --gui hid_batch /tmp/giga_shield.pcb &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="s1"&gt;ImportSes(/tmp/giga_shield.ses)&lt;/span&gt;
&lt;span class="s1"&gt;SaveTo(LayoutAs, /tmp/giga_shield_routed.pcb)&lt;/span&gt;
&lt;span class="s1"&gt;EOF'&lt;/span&gt;

&lt;span class="c1"&gt;# Export production files&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x gerber --gerberfile /tmp/giga_shield /tmp/giga_shield_routed.pcb"&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x png --dpi 600 --photo-mode --outfile /tmp/top.png /tmp/giga_shield_routed.pcb"&lt;/span&gt;

&lt;span class="c1"&gt;# Download results&lt;/span&gt;
scp&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/giga_shield.*.gbr&lt;span class="w"&gt; &lt;/span&gt;.
scp&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/top.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_top.png
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It's more keystrokes than clicking Export in a GUI. But it's scriptable, repeatable, and fits into the same terminal where Claude Code is running. When I needed to iterate (move a component, re-route, re-export) I could do it in a single pipeline without switching contexts.&lt;/p&gt;
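&lt;p&gt;One way to keep those iterations to a single command is to describe the remote steps as data. The sketch below only builds and prints the commands shown above; the function names are hypothetical, the Freerouting run and SES import are left out for brevity, and actually executing each step (for example with &lt;code&gt;subprocess.run(step, check=True)&lt;/code&gt;) is left to the caller.&lt;/p&gt;

```python
import shlex

HOST = "alex@10.1.1.27"  # remote machine from the commands above
REMOTE_DIR = "/tmp"


def remote(command):
    """Wrap a shell command so it runs on the remote machine over SSH."""
    return ["ssh", HOST, command]


def export_steps(board="giga_shield"):
    """Upload, DSN export, and production exports as argv lists."""
    routed = f"{REMOTE_DIR}/{board}_routed.pcb"
    return [
        ["scp", f"{board}.pcb", f"{HOST}:{REMOTE_DIR}/"],
        remote(f"pcb-rnd -x dsn {REMOTE_DIR}/{board}.pcb"),
        remote(f"pcb-rnd -x gerber --gerberfile {REMOTE_DIR}/{board} {routed}"),
        remote(
            "pcb-rnd -x png --dpi 600 --photo-mode "
            f"--outfile {REMOTE_DIR}/top.png {routed}"
        ),
    ]


for step in export_steps():
    print(shlex.join(step))
```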
&lt;h3&gt;Claude Code as a Hardware Design Partner&lt;/h3&gt;
&lt;p&gt;I should be explicit about what Claude Code did and didn't do in this project, because the AI angle is the part people will either find most interesting or most suspicious.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What Claude Code did:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Parsed the KiCad schematic to extract the 72-channel signal mapping across 9 level shifter ICs&lt;/li&gt;
&lt;li&gt;Wrote the initial &lt;code&gt;build_giga_shield.py&lt;/code&gt; generator script, including all four footprint generators and the net assignment logic&lt;/li&gt;
&lt;li&gt;Debugged pcb-rnd format issues by analyzing error messages and file structure&lt;/li&gt;
&lt;li&gt;Managed the remote SSH workflow: uploading files, running pcb-rnd commands, downloading results&lt;/li&gt;
&lt;li&gt;Fixed bugs in the Freerouting build (the &lt;code&gt;static main&lt;/code&gt; issue, the null &lt;code&gt;maxThreads&lt;/code&gt;, the Gradle &lt;code&gt;fileMode&lt;/code&gt; API change)&lt;/li&gt;
&lt;li&gt;Handled iterative changes: "move tinycomputers.io down by a millimeter" became an edit to the Python script, a regeneration, a re-import, and a re-export, all executed as a single flow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;What Claude Code didn't do:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Make architectural decisions. The choice to use SN74LVC8T245 over TXB0108, the DIR control header design, and the decision to use pull-down resistors defaulting to A-to-B direction were all mine, based on understanding the Z80 bus protocol. Selecting the TXB0108 in the first place is also on me&lt;/li&gt;
&lt;li&gt;Verify electrical correctness. I checked the SN74LVC8T245 datasheet pin mapping myself. I verified that OE# tied to GND means always-enabled. I confirmed the 10K pull-down value was appropriate for the DIR pin&lt;/li&gt;
&lt;li&gt;Replace domain knowledge. I knew why the TXB0108 failed during tri-state periods because I understand Z80 bus cycles. Claude Code could have looked up the TXB0108 datasheet, but it couldn't have diagnosed the real-world failure mode from first principles&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The pattern that emerged was: I made design decisions, Claude Code implemented them. I said "the DIR pins need pull-down resistors to default A-to-B direction," Claude Code generated the pcb-rnd Element entries with the correct footprint, position, and net assignments. I said "export renderings at 600 DPI with photo mode," Claude Code ran the right &lt;code&gt;pcb-rnd&lt;/code&gt; command on the remote machine.&lt;/p&gt;
&lt;p&gt;This is the same division of labor I described in the &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;dual Z80 post&lt;/a&gt;: I bring the domain knowledge, the AI handles the format translation. The text-based nature of gEDA files makes this work. If the design lived in a binary format or required mouse interactions, the AI would have been far less useful.&lt;/p&gt;
&lt;h3&gt;The New Design&lt;/h3&gt;
&lt;p&gt;Here's what the redesigned board looks like:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;v0.1 (Fiverr/TXB0108)&lt;/th&gt;
&lt;th&gt;v0.2 (Claude Code/SN74LVC8T245)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Level Shifter IC&lt;/td&gt;
&lt;td&gt;TXB0108PW (TSSOP-20)&lt;/td&gt;
&lt;td&gt;SN74LVC8T245PW (TSSOP-24)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Direction Control&lt;/td&gt;
&lt;td&gt;Auto-sensing&lt;/td&gt;
&lt;td&gt;Explicit DIR pin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Channels&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shifter ICs&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decoupling Caps&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pull-down Resistors&lt;/td&gt;
&lt;td&gt;9 (OE)&lt;/td&gt;
&lt;td&gt;9 (DIR)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DIR Control Header&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;J11 (1x10)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Board Dimensions&lt;/td&gt;
&lt;td&gt;155mm x 90mm&lt;/td&gt;
&lt;td&gt;155mm x 90mm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layers&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Tool&lt;/td&gt;
&lt;td&gt;KiCad 9.0 (GUI)&lt;/td&gt;
&lt;td&gt;Python + pcb-rnd (CLI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Cost&lt;/td&gt;
&lt;td&gt;$468.63&lt;/td&gt;
&lt;td&gt;$0 (open source tools)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Time&lt;/td&gt;
&lt;td&gt;~10 days (outsourced)&lt;/td&gt;
&lt;td&gt;~2 days (with AI)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The J11 header is the key addition. It's a 1x10 pin header with 9 direction control pins (one per shifter IC) and a ground reference. Each DIR pin has a 10K pull-down resistor that defaults the direction to A-to-B (3.3V to 5V). To reverse a shifter's direction (for example, when the Arduino needs to read from the Z80's data bus) you drive the corresponding J11 pin high. The Arduino firmware manages this dynamically during bus cycles.&lt;/p&gt;
&lt;p&gt;The board carries "tinycomputers.io" and "v0.2" on the silkscreen, placed near the bottom edge. Version tracking on the physical board, a lesson learned from the Fiverr experience, where I had to pay $57 for a revision just to add version text to the silkscreen.&lt;/p&gt;
&lt;h3&gt;Generating Production Files&lt;/h3&gt;
&lt;p&gt;With the routed board in hand, the final step was generating files suitable for manufacturing. pcb-rnd handles this with command-line exporters:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Gerber files (9 layers: top/bottom copper, mask, silk, paste, outline, drill, fab)&lt;/span&gt;
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;gerber&lt;span class="w"&gt; &lt;/span&gt;--gerberfile&lt;span class="w"&gt; &lt;/span&gt;giga_shield&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb

&lt;span class="c1"&gt;# Photo-realistic renderings&lt;/span&gt;
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;png&lt;span class="w"&gt; &lt;/span&gt;--dpi&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--photo-mode&lt;span class="w"&gt; &lt;/span&gt;--outfile&lt;span class="w"&gt; &lt;/span&gt;top.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;png&lt;span class="w"&gt; &lt;/span&gt;--dpi&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--photo-mode&lt;span class="w"&gt; &lt;/span&gt;--photo-flip-x&lt;span class="w"&gt; &lt;/span&gt;--outfile&lt;span class="w"&gt; &lt;/span&gt;bottom.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Gerber output includes everything a fab house needs: top and bottom copper, solder mask, silkscreen, paste stencil, board outline, and drill locations. The photo-realistic PNG renderings use pcb-rnd's built-in renderer: green solder mask, gold-plated pads, white silkscreen text. They're useful for documentation and for sanity-checking the layout before sending it to fabrication.&lt;/p&gt;
&lt;p&gt;The BOM and centroid files were generated separately from the Python script's component data. The centroid file lists every SMD component's X/Y position and rotation, which is essential if you're having the boards assembled by a service rather than hand-soldering.&lt;/p&gt;
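&lt;p&gt;As a rough illustration of how a centroid file can fall out of the same component data, here is a minimal sketch. The component records, field names, and CSV column headers are all hypothetical; the actual script's data structures aren't shown here, and assembly services each expect their own column layout.&lt;/p&gt;

```python
import csv
import io

# Hypothetical component records; positions are made up for the example
components = [
    {"ref": "U1", "footprint": "TSSOP24", "x_mm": 40.0, "y_mm": 22.5, "rot": 0, "smd": True},
    {"ref": "C1", "footprint": "0603", "x_mm": 36.0, "y_mm": 22.5, "rot": 90, "smd": True},
    {"ref": "J11", "footprint": "HDR1x10", "x_mm": 75.0, "y_mm": 60.0, "rot": 0, "smd": False},
]


def centroid_csv(parts):
    """Return CSV text listing position and rotation of every SMD part."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Designator", "Footprint", "Mid X (mm)", "Mid Y (mm)", "Rotation"])
    for part in parts:
        if not part["smd"]:
            continue  # through-hole parts are not machine-placed here
        x = "{:.3f}".format(part["x_mm"])
        y = "{:.3f}".format(part["y_mm"])
        writer.writerow([part["ref"], part["footprint"], x, y, part["rot"]])
    return buf.getvalue()


print(centroid_csv(components))
```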
&lt;h3&gt;What's Different About This Approach&lt;/h3&gt;
&lt;p&gt;The standard way to design a PCB in 2026 is: open KiCad or Altium, draw a schematic, assign footprints, lay out the board, route traces (manually or with the built-in autorouter), and export Gerbers. It's a visual, interactive process that works well for most people and most projects.&lt;/p&gt;
&lt;p&gt;What I did is different in a few ways that I think are worth noting, even if they're not universally applicable:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The entire design is a Python script.&lt;/strong&gt; &lt;code&gt;build_giga_shield.py&lt;/code&gt; is the single source of truth. Want to move a component? Change a coordinate in the script. Want to add a net? Add it to the dictionary. Want to change every decoupling cap from 0.1uF to 0.22uF? Change a string. Then re-run the pipeline. There's no "did I save the layout?" ambiguity, no undo history to worry about, no risk of accidentally moving something with a stray mouse click.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Every intermediate file is text.&lt;/strong&gt; The &lt;code&gt;.pcb&lt;/code&gt; file, the &lt;code&gt;.dsn&lt;/code&gt; file, the &lt;code&gt;.ses&lt;/code&gt; file. All text, all diffable, all version-controllable. When I moved a component and re-routed, I could &lt;code&gt;git diff&lt;/code&gt; the PCB file and see exactly what changed. Try that with a binary PCB format.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI can participate meaningfully.&lt;/strong&gt; Because the files are text, Claude Code could read them, modify them, and verify them. It could grep for a component reference in the PCB file, find its coordinates, suggest a new position, and make the edit. It could read the Freerouting log and diagnose why routing failed. This level of AI participation simply isn't possible with graphical-only workflows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The workflow is reproducible.&lt;/strong&gt; I can hand someone the Python script and the Freerouting JAR and they can regenerate the entire board from scratch, on any machine with Python, Java, and pcb-rnd installed. No KiCad version compatibility issues, no plugin dependencies, no "works on my machine" problems.&lt;/p&gt;
&lt;p&gt;The trade-off is obvious: this approach requires understanding file formats at a level that graphical tools abstract away. If pcb-rnd's parser rejects your file with a misleading error message, you need to debug the file format, not just re-click a button. It's a power-user workflow. But for someone comfortable with text editors and command lines (which describes most of the audience reading a blog called tinycomputers.io), it's a viable alternative.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The Gerber files are ready for fabrication. In part two, I'll cover ordering the boards from &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt;, sourcing the SN74LVC8T245PW and passive components, and the moment of truth: plugging the RetroShield Z80 into the new shield and seeing if the Arduino can finally see the Z80's bus cycles clearly.&lt;/p&gt;
&lt;p&gt;I'll also compare the v0.2 board side-by-side with the original Fiverr v0.1 board: the TXB0108 auto-sensing design versus the SN74LVC8T245 driven design. Same board dimensions, same connector layout, fundamentally different level-shifting approach. The comparison should be instructive for anyone choosing between auto-sensing and driven level translators for bus interfaces.&lt;/p&gt;
&lt;p&gt;The Python build script, pcb-rnd source files, Gerber outputs, and all helper scripts are open source:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/pOawfA"&gt;giga-shield&lt;/a&gt;&lt;/strong&gt;: Complete design files, build pipeline, and production outputs&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;This is part one of a two-part series. Part two will cover fabrication, assembly, and testing.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Previous posts in this series: &lt;a href="https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html"&gt;Fiverr PCB Design ($468)&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;Dual Z80 RetroShield&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/cpm-on-arduino-giga-r1-wifi.html"&gt;CP/M on the Giga R1&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/zork-on-retroshield-z80-arduino-giga.html"&gt;Zork on the Giga&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description><category>ai</category><category>arduino</category><category>arduino giga</category><category>claude code</category><category>freerouting</category><category>geda</category><category>hardware</category><category>level shifter</category><category>open-source</category><category>pcb design</category><category>pcb-rnd</category><category>retroshield</category><category>z80</category><guid>https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html</guid><pubDate>Fri, 13 Mar 2026 16:00:00 GMT</pubDate></item><item><title>Part 4: 132 Tests, Zero Failures - Verifying the Sampo CPU on Real Hardware</title><link>https://tinycomputers.io/posts/sampo-fpga-isa-verification.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/sampo-fpga-isa-verification_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;12 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;In &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1&lt;/a&gt;, we designed the Sampo 16-bit RISC architecture. In &lt;a href="https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html"&gt;Part 2&lt;/a&gt;, we synthesized it to an ECP5 FPGA on the ULX3S board. In &lt;a href="https://tinycomputers.io/posts/sampo-llvm-backend-rust-compiler.html"&gt;Part 3&lt;/a&gt;, we built an LLVM backend so Rust could compile for it. But there was a glaring gap in the project: we'd never systematically verified that the hardware actually implements the ISA correctly.&lt;/p&gt;
&lt;p&gt;The "Hello, Sampo!" demo program exercises maybe 10 of the CPU's 66 instructions. The LLVM backend generates code that assumes the hardware matches the spec. If a single instruction is subtly wrong - a carry flag not set, a branch offset miscalculated, a byte load sign-extending when it shouldn't - the entire toolchain is built on sand.&lt;/p&gt;
&lt;p&gt;This post documents the process of building a comprehensive test suite, running it in simulation, finding a real pipeline hazard bug in the CPU, and then the surprisingly treacherous journey of getting those tests running on real FPGA hardware.&lt;/p&gt;
&lt;h3&gt;The Test Strategy&lt;/h3&gt;
&lt;p&gt;The approach is straightforward: write assembly programs that exercise every instruction in the ISA, compare results against known-good values, and report PASS or FAIL over UART. The testbench monitors the serial output, and if it sees "FAIL" anywhere, the test run fails.&lt;/p&gt;
&lt;p&gt;Each test follows the same pattern:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;; Load known inputs&lt;/span&gt;
&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0x1234&lt;/span&gt;
&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0x5678&lt;/span&gt;

&lt;span class="c1"&gt;; Execute the instruction under test&lt;/span&gt;
&lt;span class="nf"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;

&lt;span class="c1"&gt;; Check the result&lt;/span&gt;
&lt;span class="nf"&gt;MOV&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R10&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;; actual value&lt;/span&gt;
&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0x68AC&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;; expected value&lt;/span&gt;
&lt;span class="nf"&gt;JALX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;check_eq&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;; prints PASS or FAIL&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;check_eq&lt;/code&gt; subroutine compares R4 (actual) against R5 (expected) and prints the result over the UART. This makes the test output human-readable and machine-parseable:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;=== ALU Tests ===
ADD basic: PASS
ADD zero: PASS
ADD carry out: PASS
ADD overflow: PASS
SUB basic: PASS
...
Done.
&lt;/pre&gt;&lt;/div&gt;
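&lt;p&gt;The host-side pass/fail check can be quite small. Here is a minimal Python sketch of that idea, not the project's actual testbench logic: the function name &lt;code&gt;summarize&lt;/code&gt; is hypothetical, and the line format is assumed from the sample output above.&lt;/p&gt;

```python
def summarize(output):
    """Count PASS/FAIL result lines in captured UART text.

    Assumes each result line looks like "name: PASS" or "name: FAIL",
    as in the sample output; headers and "Done." are ignored.
    """
    passed, failed = [], []
    for line in output.splitlines():
        name, sep, verdict = line.rpartition(": ")
        if not sep:
            continue  # no ": " separator, so not a result line
        if verdict.strip() == "PASS":
            passed.append(name)
        elif verdict.strip() == "FAIL":
            failed.append(name)
    # A single FAIL anywhere fails the whole run.
    return {"passed": len(passed), "failed": failed, "ok": not failed}


sample = "=== ALU Tests ===\nADD basic: PASS\nADD zero: FAIL\nDone.\n"
result = summarize(sample)
print(result)
```

&lt;p&gt;Returning the names of the failing tests, rather than just a boolean, makes it easier to see which instruction to look at first.&lt;/p&gt;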

&lt;h3&gt;The Test Framework&lt;/h3&gt;
&lt;p&gt;Every test program begins with a block of helper subroutines that handle UART communication and result reporting. The core is a busy-wait loop that polls the MC6850-compatible UART status register:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="na"&gt;.equ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;ACIA_STATUS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0x80&lt;/span&gt;
&lt;span class="na"&gt;.equ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;ACIA_DATA&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0x81&lt;/span&gt;

&lt;span class="nl"&gt;print_char:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;; R5 = character to output&lt;/span&gt;
&lt;span class="nl"&gt;.wait:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;INI&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;ACIA_STATUS&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;; Read status register&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;AND&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R6&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;; Copy to R7&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;ADDI&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-2&lt;/span&gt;&lt;span class="w"&gt;             &lt;/span&gt;&lt;span class="c1"&gt;; Check if TX ready (bit 1)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;BNE&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;.wait&lt;/span&gt;&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="c1"&gt;; Loop until ready&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;OUTI&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;ACIA_DATA&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R5&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;; Send character&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;JR&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="no"&gt;RA&lt;/span&gt;&lt;span class="w"&gt;                 &lt;/span&gt;&lt;span class="c1"&gt;; Return&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;check_eq&lt;/code&gt; helper prints "PASS" or "FAIL" based on a register comparison, and the &lt;code&gt;print_str&lt;/code&gt; helper walks a null-terminated string byte by byte. These routines are duplicated in each test file rather than linked - there's no linker in this toolchain, just a single-file assembler.&lt;/p&gt;
&lt;h3&gt;Test Coverage&lt;/h3&gt;
&lt;p&gt;We organized the tests into 10 programs, each targeting a specific area of the instruction set:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test Program&lt;/th&gt;
&lt;th&gt;Instructions Tested&lt;/th&gt;
&lt;th&gt;Test Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;test_alu&lt;/td&gt;
&lt;td&gt;ADD, SUB, AND, OR, XOR, NEG + flags&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_addi&lt;/td&gt;
&lt;td&gt;ADDI with signed immediates + flags&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_shift&lt;/td&gt;
&lt;td&gt;SLL, SRL, SRA, ROL, ROR, SWAP (1/4/8-bit variants)&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_muldiv&lt;/td&gt;
&lt;td&gt;MUL, MULH, DIV, DIVU, REM, REMU&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_loadstore&lt;/td&gt;
&lt;td&gt;LW, LB, LBU, SW, SB + offset variants&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_branch&lt;/td&gt;
&lt;td&gt;All 16 branch conditions (taken + not taken)&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_jump&lt;/td&gt;
&lt;td&gt;J, JR, JALR, JX, JALX&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_stack&lt;/td&gt;
&lt;td&gt;PUSH, POP, CMP, TEST, MOV&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_misc&lt;/td&gt;
&lt;td&gt;EXX, GETF, SETF, SCF, CCF, NOP&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test_extended&lt;/td&gt;
&lt;td&gt;ADDIX, SUBIX, ANDIX, ORIX, XORIX, SLLX, SRLX, SRAX&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;132&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The branch tests are particularly thorough - each of the 16 conditions (BEQ, BNE, BLT, BGE, BLTU, BGEU, BMI, BPL, BVS, BVC, BCS, BCC, BGT, BLE, BHI, BLS) gets tested both for the taken and not-taken case. We set up flags with arithmetic, then verify the branch goes the right way.&lt;/p&gt;
&lt;h3&gt;Finding a Real Bug: The Pipeline Hazard&lt;/h3&gt;
&lt;p&gt;The first time we ran the full test suite in simulation, 130 of 132 tests passed. Two tests in &lt;code&gt;test_loadstore&lt;/code&gt; were failing: the multi-word store/load test and a load with offset test.&lt;/p&gt;
&lt;p&gt;The failing pattern was consistent: any test that performed a store followed immediately by a load from a different address would read stale data. The load would return the value from the &lt;em&gt;previous&lt;/em&gt; memory operation instead of the current one.&lt;/p&gt;
&lt;p&gt;The root cause was a pipeline hazard between the MEMORY and FETCH states. Here's what was happening:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;Cycle N:   MEMORY state - store completes, mem_ready asserts
Cycle N+1: FETCH state  - new instruction fetch begins
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The problem: &lt;code&gt;mem_ready&lt;/code&gt; is a one-cycle delayed version of &lt;code&gt;mem_valid&lt;/code&gt; (because the RAM is synchronous). When the CPU transitioned from MEMORY straight to FETCH after a store, the &lt;code&gt;mem_ready&lt;/code&gt; signal from that store was still asserted during the first cycle of the next FETCH. The CPU latched the stale &lt;code&gt;mem_rdata&lt;/code&gt; from the previous store operation as if it were the new instruction.&lt;/p&gt;
&lt;p&gt;The fix was to add a WRITEBACK state after every MEMORY operation - not just loads, but stores too. This gives &lt;code&gt;mem_ready&lt;/code&gt; a cycle to deassert before the next FETCH begins:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;Before&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MEMORY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;FETCH&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem_ready&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;still&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;high&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;MEMORY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;WRITEBACK&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;FETCH&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem_ready&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;deasserts&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;during&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;WRITEBACK&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;A one-line change to the next-state logic:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="no"&gt;`ST_MEMORY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;begin&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mem_ready&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;begin&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// Always go through WRITEBACK after MEMORY.&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// For stores: allows mem_ready to deassert before&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// next FETCH (prevents stale rdata latch).&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;next_state&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;`ST_WRITEBACK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is exactly the kind of bug that simulation catches and manual inspection misses. The instruction executes correctly in isolation - it's only the &lt;em&gt;interaction&lt;/em&gt; between consecutive memory operations that triggers the hazard. After the fix, all 132 tests passed in simulation.&lt;/p&gt;
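&lt;p&gt;The timing argument can be replayed with a toy model. The Python sketch below is not derived from the Sampo RTL. It assumes only what the text states, that &lt;code&gt;mem_ready&lt;/code&gt; is &lt;code&gt;mem_valid&lt;/code&gt; delayed by one clock, plus one further assumption of mine: that the CPU drives &lt;code&gt;mem_valid&lt;/code&gt; in the MEMORY and FETCH states and nowhere else. The function name &lt;code&gt;run&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
def run(states):
    """Replay a fixed state sequence against a RAM whose mem_ready is
    mem_valid delayed by one clock. Returns (state, mem_ready) per cycle."""
    trace = []
    prev_valid = 0
    for state in states:
        mem_ready = prev_valid  # one-cycle delayed copy of mem_valid
        # Assumption: mem_valid is driven only in MEMORY and FETCH.
        mem_valid = 1 if state in ("MEMORY", "FETCH") else 0
        trace.append((state, mem_ready))
        prev_valid = mem_valid
    return trace


# Store takes two MEMORY cycles: request, then mem_ready comes back.
before = run(["MEMORY", "MEMORY", "FETCH"])
after = run(["MEMORY", "MEMORY", "WRITEBACK", "FETCH"])

print("before fix:", before)
print("after fix: ", after)
```

&lt;p&gt;In this model the first FETCH cycle sees &lt;code&gt;mem_ready&lt;/code&gt; high when it directly follows MEMORY, and low once a WRITEBACK cycle sits in between, which matches the behavior described above.&lt;/p&gt;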
&lt;h3&gt;Taking It to the FPGA&lt;/h3&gt;
&lt;p&gt;With simulation clean, the next step was running the tests on real hardware. The ULX3S board has an &lt;a href="https://baud.rs/bJSrEK"&gt;FTDI&lt;/a&gt; FT231X USB-serial chip connected to the FPGA, so UART output appears on a serial port at 115200 baud.&lt;/p&gt;
&lt;p&gt;There was an immediate practical problem: the test programs run fast. At 12.5 MHz, the entire 20-test ALU suite completes in about 30 milliseconds. By the time openFPGALoader finishes programming the FPGA and releases the USB port, the test output is long gone. The FTDI chip has a small receive buffer, but 364 characters of test output overflows it before you can open the serial port.&lt;/p&gt;
&lt;p&gt;The solution: patch the hex files to loop instead of halting. Replace the HALT instruction with a delay loop followed by a jump back to the reset vector. The test runs, outputs its results, waits roughly 0.4 seconds, and starts over. You can open the serial port at any time and catch a complete iteration.&lt;/p&gt;
&lt;h4&gt;The Delay Loop Patch&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;hex_loop_patch.py&lt;/code&gt; script performs binary patching on the assembled hex files. It finds the HALT instruction (encoded as &lt;code&gt;0xE100&lt;/code&gt;) and replaces it with a delay loop:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;; Delay ~0.38 seconds at 12.5 MHz&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0x0008&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;; outer counter&lt;/span&gt;
&lt;span class="nl"&gt;outer:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0xFFFF&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;; inner counter = 65535&lt;/span&gt;
&lt;span class="nl"&gt;inner:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;ADDI&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;BNE&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;inner&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;ADDI&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;BNE&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;outer&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;JX&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="mi"&gt;0x0100&lt;/span&gt;&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="c1"&gt;; jump back to reset vector&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The first version of this script &lt;em&gt;inserted&lt;/em&gt; these 10 words at the HALT position. This seemed obviously correct. The tests ran on FPGA. Characters appeared on the serial port.&lt;/p&gt;
&lt;p&gt;They were the wrong characters.&lt;/p&gt;
&lt;h3&gt;The Address Shift Bug&lt;/h3&gt;
&lt;p&gt;The FPGA output for the "Hello, Sampo!" test program was &lt;code&gt;\x08\x08\x08\x08&lt;/code&gt; - four backspace characters, repeating forever. The ALU test suite showed truncated output with roughly 45% of characters missing. Same pattern at 12.5 MHz and 6.25 MHz, ruling out timing violations. Simulation with realistic UART timing (1,080 cycles per byte, matching the hardware baud rate) passed perfectly.&lt;/p&gt;
&lt;p&gt;I spent considerable time investigating the wrong theories. Was the UART transmitter dropping bytes? Was there a clock domain crossing issue? Was &lt;code&gt;$readmemh&lt;/code&gt; in Yosys interpreting the hex file differently from Icarus Verilog? None of these panned out.&lt;/p&gt;
&lt;p&gt;The breakthrough came from staring at &lt;code&gt;\x08&lt;/code&gt;. That's the byte value 8. Where would 8 come from? The "Hello, Sampo!" program loads its message pointer with &lt;code&gt;LIX R4, message&lt;/code&gt; where &lt;code&gt;message&lt;/code&gt; is the label for the string data. In the assembled hex, &lt;code&gt;message&lt;/code&gt; resolves to address &lt;code&gt;0x011E&lt;/code&gt; - the byte immediately after the HALT instruction.&lt;/p&gt;
&lt;p&gt;And there it was. Look at the assembly structure:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="nl"&gt;done:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nf"&gt;HALT&lt;/span&gt;&lt;span class="w"&gt;                    &lt;/span&gt;&lt;span class="c1"&gt;; address 0x011C&lt;/span&gt;
&lt;span class="nl"&gt;message:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="na"&gt;.asciz&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"Hello, Sampo!\n"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;; address 0x011E&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The string data lives immediately after HALT. When &lt;code&gt;hex_loop_patch.py&lt;/code&gt; &lt;em&gt;inserts&lt;/em&gt; 10 words of delay loop code at the HALT position, it pushes the string data down by 20 bytes. But the &lt;code&gt;LIX R4, 0x011E&lt;/code&gt; instruction still points to the original address. At &lt;code&gt;0x011E&lt;/code&gt; there's now the second word of &lt;code&gt;LIX R8, 0x0008&lt;/code&gt; - which contains the value &lt;code&gt;0x0008&lt;/code&gt;. The low byte is &lt;code&gt;0x08&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The CPU faithfully reads byte &lt;code&gt;0x08&lt;/code&gt; from the patched address, outputs it via UART, advances the pointer to &lt;code&gt;0x011F&lt;/code&gt; where the high byte is &lt;code&gt;0x00&lt;/code&gt; (the null terminator), and stops. One &lt;code&gt;\x08&lt;/code&gt; per iteration, four iterations captured. Mystery solved.&lt;/p&gt;
&lt;p&gt;This same address shift corrupted every test program. The test strings ("ADD basic: ", "PASS\n", etc.) all live after HALT and all got displaced. The CPU was reading from locations that now contained delay loop machine code instead of ASCII text. Some fragments of text survived because adjacent strings partially overlapped with their shifted locations, producing the truncated output we saw.&lt;/p&gt;
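&lt;p&gt;The difference between inserting and overwriting is easy to reproduce on a toy memory image. In the Python sketch below, everything except the HALT encoding &lt;code&gt;0xE100&lt;/code&gt; is a placeholder: the patch words, the jump word, and the four-word image are made up for illustration and are not real Sampo encodings.&lt;/p&gt;

```python
HALT = 0xE100
PATCH = [0x1111, 0x0008]  # placeholder two-word patch, not real machine code


def insert_patch(words, patch):
    """Buggy approach: splice the patch in where HALT was, so every
    word after it moves to a higher address."""
    i = words.index(HALT)
    return words[:i] + patch + words[i + 1:]


def overwrite_halt(words, jump_word):
    """Fixed approach: swap one word for one word, so nothing moves."""
    out = list(words)
    out[out.index(HALT)] = jump_word
    return out


# Tiny image: one code word, HALT, then two words of string data
# ("He" and "l" plus a null, little-endian).
image = [0x2222, HALT, 0x6548, 0x006C]
msg_index = 2  # string address fixed at assembly time

shifted = insert_patch(image, PATCH)
fixed = overwrite_halt(image, 0x9F71)  # placeholder jump word

print(hex(shifted[msg_index]))  # what the code now reads as "string" data
print(hex(fixed[msg_index]))    # string data still where the code expects it
```

&lt;p&gt;After the insert, the word at the string's original index is part of the patch rather than text, which is the same failure the FPGA showed.&lt;/p&gt;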
&lt;h4&gt;The Fix&lt;/h4&gt;
&lt;p&gt;The correct approach: don't shift any data. Place the delay loop at address &lt;code&gt;0x0000&lt;/code&gt;, in the 256 bytes of unused memory below the &lt;code&gt;0x0100&lt;/code&gt; reset vector, and replace the single-word HALT with a single-word relative &lt;code&gt;J&lt;/code&gt; (jump) instruction that jumps backward to the loop code. One word replaces one word. No data moves.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Place delay loop at address 0x0000 (unused space)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LOOP_PATCH&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;loop_base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;

&lt;span class="c1"&gt;# Replace HALT with J instruction to address 0x0000&lt;/span&gt;
&lt;span class="c1"&gt;# J encoding: opcode 0x9, 12-bit signed offset&lt;/span&gt;
&lt;span class="c1"&gt;# target = PC + 2 + (sign_extend(offset) &amp;lt;&amp;lt; 1)&lt;/span&gt;
&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_addr&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;halt_addr&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;j_word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mh"&gt;0x9000&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="mh"&gt;0xFFF&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;halt_idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j_word&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;There's a subtle complication: the J instruction shares opcode &lt;code&gt;0x9&lt;/code&gt; with JR (register indirect jump) and JALR (jump and link register). The decoder distinguishes them by specific bit patterns in the offset field. If the calculated offset happens to have &lt;code&gt;bits[3:0] == 0x1&lt;/code&gt; and &lt;code&gt;bits[11:8] != 0xF&lt;/code&gt;, the decoder interprets it as JALR instead of J. The script tries successive target addresses (&lt;code&gt;0x0000&lt;/code&gt;, &lt;code&gt;0x0002&lt;/code&gt;, &lt;code&gt;0x0004&lt;/code&gt;, ...) until it finds one that doesn't collide with the JR/JALR encoding space.&lt;/p&gt;
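&lt;p&gt;A rough Python sketch of that search follows. It is not the code from &lt;code&gt;hex_loop_patch.py&lt;/code&gt;: the function names are hypothetical, and only the JALR pattern quoted above is checked, because the JR pattern is not spelled out here. Modulo and integer division stand in for bit masks.&lt;/p&gt;

```python
LOOP_BYTES = 20  # the 10-word delay loop


def encode_j(halt_addr, target_addr):
    """Encode a relative J from halt_addr to target_addr (byte addresses),
    using target = PC + 2 + (offset * 2) with a 12-bit signed offset."""
    offset = (target_addr - halt_addr - 2) // 2
    if offset > 2047 or -2048 > offset:
        raise ValueError("target out of range for a 12-bit offset")
    field = offset % 0x1000  # low 12 bits, two's complement
    return 0x9000 + field


def looks_like_jalr(word):
    """Collision rule from the text: bits[3:0] == 0x1 and bits[11:8] != 0xF."""
    bits_3_0 = word % 16
    bits_11_8 = (word // 256) % 16
    return bits_3_0 == 0x1 and bits_11_8 != 0xF


def pick_target(halt_addr, limit=0x0100 - LOOP_BYTES):
    """Try 0x0000, 0x0002, ... and return the first safe (target, word).
    The limit leaves room for the loop below the reset vector."""
    for target in range(0, limit + 1, 2):
        word = encode_j(halt_addr, target)
        if not looks_like_jalr(word):
            return target, word
    raise ValueError("no collision-free target found")


target, word = pick_target(0x011C)
print(hex(target), hex(word))
```

&lt;p&gt;Under these assumptions, a HALT at &lt;code&gt;0x011C&lt;/code&gt; can jump straight to &lt;code&gt;0x0000&lt;/code&gt;, while a HALT at &lt;code&gt;0x021C&lt;/code&gt; would collide with the JALR pattern and move the loop to &lt;code&gt;0x0002&lt;/code&gt;.&lt;/p&gt;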
&lt;p&gt;After the fix, the patched hex files have exactly the same number of words as the originals. The only changes are the delay loop code written to the zero page and the HALT word replaced with a backward jump.&lt;/p&gt;
&lt;p&gt;With the corrected patcher, the "Hello, Sampo!" program finally works on the FPGA - looping cleanly with zero character loss:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/sampo-fpga-isa-verification/HelloSampo.png" style="width: 100%; max-width: 720px; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); margin: 1em 0;" loading="lazy" alt="Terminal showing Hello, Sampo! repeating on the ULX3S FPGA via cu serial connection"&gt;&lt;/p&gt;
&lt;h3&gt;The Testbench: Trusting but Verifying&lt;/h3&gt;
&lt;p&gt;One important discovery during this process: the simulation testbench had &lt;code&gt;tx_ready = 1&lt;/code&gt; permanently. The simulated UART never pushed back on the CPU - it accepted every byte instantly. This meant the CPU's busy-wait loop (&lt;code&gt;INI R6, ACIA_STATUS / AND R7, R6, R6 / ADDI R7, -2 / BNE .wait&lt;/code&gt;) was never actually exercised in simulation. The status register always returned "ready," so the poll succeeded on the first pass and the &lt;code&gt;BNE&lt;/code&gt; back to &lt;code&gt;.wait&lt;/code&gt; was never taken.&lt;/p&gt;
&lt;p&gt;On real hardware, the UART transmitter takes about 87 microseconds per byte at 115200 baud. The busy-wait loop runs hundreds of times per character, exercising the INI instruction, the AND/ADDI flag-setting sequence, and the BNE branch in a tight loop. If any of those instructions had a subtle bug, it would only manifest on hardware.&lt;/p&gt;
&lt;p&gt;We added realistic UART timing to the testbench:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;parameter&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TX_BYTE_CYCLES&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;108&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// ~1080 cycles per byte&lt;/span&gt;
&lt;span class="kt"&gt;reg&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mh"&gt;15&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="mh"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;always&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;@(&lt;/span&gt;&lt;span class="k"&gt;posedge&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;clk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;begin&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx_valid&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tx_ready&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;begin&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;tx_ready&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TX_BYTE_CYCLES&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;begin&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx_delay_cnt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;tx_ready&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With this change, simulation exercises the same code paths as the hardware. All 132 tests still pass. The UART flow control logic was correct all along; it just wasn't being tested.&lt;/p&gt;
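&lt;p&gt;The timing figures quoted in this post follow from simple arithmetic, reproduced in the Python sketch below. The clock, baud rate, and 364-character output size come from the text; the 10 bits per byte is an assumption (8N1 framing: one start bit, eight data bits, one stop bit).&lt;/p&gt;

```python
CLOCK_HZ = 12_500_000  # CPU clock from the post
BAUD = 115_200         # serial rate from the post
BITS_PER_BYTE = 10     # assumed 8N1 framing: start + 8 data + stop
ALU_OUTPUT_CHARS = 364 # size of the ALU suite output from the post

cycles_per_bit = CLOCK_HZ / BAUD
cycles_per_byte = cycles_per_bit * BITS_PER_BYTE
byte_time_us = BITS_PER_BYTE / BAUD * 1e6
alu_suite_ms = ALU_OUTPUT_CHARS * BITS_PER_BYTE / BAUD * 1e3

print(f"{cycles_per_bit:.1f} cycles/bit, {cycles_per_byte:.0f} cycles/byte")
print(f"{byte_time_us:.1f} us per byte")
print(f"{alu_suite_ms:.1f} ms to send {ALU_OUTPUT_CHARS} characters")
```

&lt;p&gt;This gives about 108.5 cycles per bit, so the testbench's &lt;code&gt;108 * 10&lt;/code&gt; is a slight round-down of roughly 1,085 cycles per byte. It also suggests the ALU suite's run time of about 30 milliseconds is dominated by UART transmission rather than by the instructions under test.&lt;/p&gt;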
&lt;h3&gt;Running All Tests on the FPGA&lt;/h3&gt;
&lt;video controls style="width: 100%; max-width: 720px; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); margin: 0 0 1em 0;"&gt;
&lt;source src="https://tinycomputers.io/sampo-fpga-test-suite.mp4" type="video/mp4"&gt;
Your browser does not support the video tag.
&lt;/source&gt;&lt;/video&gt;

&lt;p&gt;With the patch bug fixed, we ran the complete suite. Each test requires a separate FPGA build (Yosys synthesis, nextpnr place-and-route, ecppack bitstream generation), programming via JTAG, and serial capture. The Makefile automates the entire pipeline:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="nf"&gt;fpga-%&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;&lt;span class="nv"&gt;BUILD_DIR&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt;/&lt;span class="n"&gt;sampo_&lt;/span&gt;%.&lt;span class="n"&gt;bit&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;openFPGALoader&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;ulx3s&lt;span class="w"&gt; &lt;/span&gt;$&amp;lt;
&lt;span class="w"&gt;    &lt;/span&gt;sleep&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;fpga_capture.py&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SERIAL_PORT&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;SERIAL_BAUD&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;fpga_capture.py&lt;/code&gt; script opens the serial port, discards the first partial iteration (we might join mid-stream), waits for the &lt;code&gt;=== ... ===&lt;/code&gt; header line that starts each test, captures everything until the header repeats, and outputs one clean iteration.&lt;/p&gt;
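&lt;p&gt;The framing logic can be sketched independently of the serial port. The Python below is an illustration of the approach just described, not the contents of &lt;code&gt;fpga_capture.py&lt;/code&gt;: it works on lines that have already been read and decoded, the function name &lt;code&gt;one_iteration&lt;/code&gt; is hypothetical, and it treats any &lt;code&gt;=== ... ===&lt;/code&gt; line as the header.&lt;/p&gt;

```python
import re

HEADER = re.compile(r"^=== .* ===$")


def one_iteration(lines):
    """Return the lines of the first complete iteration in a looping stream.

    Everything before the first header is treated as a partial iteration
    and dropped; capture stops when a header line appears again.
    """
    captured = []
    for line in lines:
        is_header = bool(HEADER.match(line))
        if not captured:
            if is_header:
                captured.append(line)  # start of a clean iteration
            continue
        if is_header:
            return captured  # header repeated: iteration complete
        captured.append(line)
    return None  # stream ended before the header repeated


stream = [
    "basic: PASS",  # tail of an iteration we joined mid-stream
    "Done.",
    "=== ALU Tests ===",
    "ADD basic: PASS",
    "Done.",
    "=== ALU Tests ===",
    "ADD ba",
]
print(one_iteration(stream))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; when the header never repeats lets the caller tell a timeout apart from a short but complete run.&lt;/p&gt;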
&lt;p&gt;The results:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;========================================
=== FPGA: test_alu ===
========================================
=== ALU Tests ===
ADD basic: PASS
ADD zero: PASS
ADD carry out: PASS
...
AND clr C/V: PASS
All tests passed!

========================================
=== FPGA: test_addi ===
========================================
...
All tests passed!

...

========================================
FPGA Test Summary: 10 passed, 0 failed
========================================
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;All 10 test suites pass. All 132 individual tests pass. Zero failures on real hardware.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Test Suite&lt;/th&gt;
&lt;th&gt;Tests&lt;/th&gt;
&lt;th&gt;FPGA Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ALU (ADD, SUB, AND, OR, XOR, NEG)&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADDI (immediate arithmetic)&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shift (SLL, SRL, SRA, ROL, ROR, SWAP)&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MulDiv (MUL, DIV, REM variants)&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Load/Store (LW, LB, LBU, SW, SB)&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Branch (all 16 conditions)&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jump (J, JR, JALR, JX, JALX)&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stack (PUSH, POP, CMP, TEST, MOV)&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Misc (EXX, GETF, SETF, SCF, CCF, NOP)&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extended (ADDIX, SUBIX, SLLX, etc.)&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;All PASS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;132&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;All PASS&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3&gt;What This Means&lt;/h3&gt;
&lt;p&gt;Having all 132 ISA tests pass on hardware is a significant milestone for the project. It means:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Verilog RTL is correct.&lt;/strong&gt; Every instruction in the Sampo ISA produces the right result, sets the right flags, and handles edge cases (zero, overflow, carry, sign extension) correctly. Not just in behavioral simulation, but in synthesized logic on a real FPGA running at 12.5 MHz.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The assembler is correct.&lt;/strong&gt; All 66 instructions encode properly. Branch offsets calculate correctly. Extended instructions (LIX, JALX, OUTX) with their 32-bit encoding work. The &lt;code&gt;sasm&lt;/code&gt; Rust assembler and the Verilog decoder agree on every instruction format.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The LLVM backend has a solid foundation.&lt;/strong&gt; When the Rust compiler generates an &lt;code&gt;ADD&lt;/code&gt; or &lt;code&gt;BNE&lt;/code&gt; or &lt;code&gt;JALX&lt;/code&gt;, the hardware will execute it correctly. The test suite doesn't exercise every possible code generation pattern, but it validates every primitive instruction that the compiler builds upon.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The UART subsystem works end-to-end.&lt;/strong&gt; Status register polling, TX busy-wait, byte transmission, baud rate generation - all verified on hardware. The MC6850-compatible interface works exactly as specified.&lt;/p&gt;
&lt;h3&gt;Lessons Learned&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Test your assumptions.&lt;/strong&gt; The testbench had &lt;code&gt;tx_ready = 1&lt;/code&gt;. It went unnoticed because simulation "worked." The real hardware exercises code paths that simulation shortcuts. Add realistic peripheral timing to your testbenches from day one.&lt;/p&gt;
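&lt;p&gt;To make the point concrete, here is a toy Rust model rather than the project's Verilog testbench. It assumes a transmitter that reports not-ready for a fixed number of accesses after each byte; the function &lt;code&gt;transmit&lt;/code&gt; and its parameters are made up for illustration. With &lt;code&gt;busy_polls = 0&lt;/code&gt; (the equivalent of tying &lt;code&gt;tx_ready&lt;/code&gt; high) a driver that never polls looks fine, and only a non-zero busy time exposes it:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;/// Toy transmitter model (hypothetical, not the Sampo UART RTL).
/// After accepting a character it stays busy for `busy_polls` accesses.
/// `driver_waits` selects a driver that polls until ready before writing.
/// Returns the characters that actually made it onto the wire.
fn transmit(message: String, busy_polls: u32, driver_waits: bool) -> String {
    let mut wire = String::new();
    let mut busy_left = 0u32;
    for ch in message.chars() {
        if driver_waits {
            // Busy-wait: each iteration stands for one status-register read.
            while busy_left != 0 {
                busy_left -= 1;
            }
        }
        if busy_left == 0 {
            // Transmitter is ready: the character is accepted and the
            // transmitter becomes busy shifting it out.
            wire.push(ch);
            busy_left = busy_polls;
        } else {
            // Write hit a busy transmitter: the character is lost, and one
            // access worth of time passes.
            busy_left -= 1;
        }
    }
    wire
}

fn main() {
    // Always-ready testbench: the missing poll goes unnoticed.
    println!("{}", transmit("Hello".to_string(), 0, false));
    // Realistic busy time, driver does not poll: characters are dropped.
    println!("{}", transmit("Hello".to_string(), 2, false));
    // Realistic busy time, driver polls: the full string arrives.
    println!("{}", transmit("Hello".to_string(), 2, true));
}
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In this model a busy time of 2 lets the polling driver deliver the whole string while the non-polling one drops characters, which is exactly the kind of difference an always-ready testbench cannot show.&lt;/p&gt;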
&lt;p&gt;&lt;strong&gt;Binary patching is fragile.&lt;/strong&gt; Inserting bytes into a binary without updating references is a classic relocation bug - the same class of problem that linkers exist to solve. If your patch changes the size of anything, every address reference past the patch point is wrong. The fix - placing the patch in unused address space and using a same-size replacement instruction - avoids the problem entirely.&lt;/p&gt;
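&lt;p&gt;The failure mode is easy to reproduce in miniature. The Rust sketch below uses a made-up six-word image in which word 0 plays the role of a jump operand; it is not the project's patching script, and the constants and function names are purely illustrative. Inserting a word moves the target out from under the operand, while an in-place, same-size patch leaves every address where it was:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;const MARKER: u16 = 0xBEEF;

// Toy image, one u16 per word. Word 0 stands in for a jump operand: it holds
// the absolute index of the word we expect to land on (the marker at index 4).
// Word 5 is unused padding.
const IMAGE: [u16; 6] = [4, 0x1111, 0x2222, 0x3333, MARKER, 0x0000];

/// Patch by inserting a word: everything after the patch point moves up by
/// one, but the operand in word 0 still says 4.
fn landed_after_insert() -> u16 {
    let mut image = IMAGE.to_vec();
    image.insert(2, 0x9999);
    let target = image[0] as usize;
    image[target] // 0x3333: the marker has moved to index 5
}

/// Patch by overwriting one word in place and parking the extra word in the
/// unused slot: nothing moves, so the operand is still right.
fn landed_after_same_size_patch() -> u16 {
    let mut image = IMAGE;
    image[2] = 0x9999; // same-size replacement
    image[5] = 0xAAAA; // extra patch word placed in unused space
    let target = image[0] as usize;
    image[target] // still the marker
}

fn main() {
    println!("insert:    {:#06X}", landed_after_insert());
    println!("same-size: {:#06X}", landed_after_same_size_patch());
}
&lt;/pre&gt;&lt;/div&gt;
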
&lt;p&gt;&lt;strong&gt;Simulation is necessary but not sufficient.&lt;/strong&gt; The pipeline hazard bug was caught by simulation. The address shift bug was invisible to simulation (both used the same patching script, and the original programs - without patching - worked fine). You need both simulation and hardware testing, exercising different code paths and different failure modes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Systematic testing finds bugs that demos don't.&lt;/strong&gt; "Hello, Sampo!" worked on the FPGA from day one. It exercises &lt;code&gt;LIX&lt;/code&gt;, &lt;code&gt;LBU&lt;/code&gt;, &lt;code&gt;CMP&lt;/code&gt;, &lt;code&gt;BEQ&lt;/code&gt;, &lt;code&gt;INI&lt;/code&gt;, &lt;code&gt;OUTI&lt;/code&gt;, &lt;code&gt;ADDI&lt;/code&gt;, and &lt;code&gt;J&lt;/code&gt; - just 8 of the ISA's 66 instructions. The pipeline hazard only manifested when a store was followed by a load to a different address, a pattern that doesn't occur in a simple print loop. You need tests specifically designed to exercise corner cases.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The entire Sampo project - assembler, emulator, Verilog RTL, FPGA build scripts, test suite, and LLVM backend - is open source on &lt;a href="https://baud.rs/r74wA8"&gt;GitHub&lt;/a&gt;. With hardware verification complete, the next steps might be:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Running Rust-compiled code on the FPGA.&lt;/strong&gt; The LLVM backend generates assembly, the assembler produces hex files, and we now know the hardware executes them correctly. Closing this loop - &lt;code&gt;cargo build&lt;/code&gt; to blinking LEDs - is the obvious next milestone.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Adding more peripherals.&lt;/strong&gt; The ULX3S has 32MB of SDRAM, an HDMI output, a microSD slot, and an ESP32 co-processor. Each of these opens up interesting possibilities for a working 16-bit computer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance optimization.&lt;/strong&gt; The CPU currently runs at 12.5 MHz with a multi-cycle FSM (5-8 cycles per instruction). Pipelining could push this significantly higher on the ECP5.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But first: 132 tests, zero failures. The Sampo CPU works.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is Part 4 of the Sampo series. &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1&lt;/a&gt; covers architecture design, &lt;a href="https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html"&gt;Part 2&lt;/a&gt; covers FPGA implementation, and &lt;a href="https://tinycomputers.io/posts/sampo-llvm-backend-rust-compiler.html"&gt;Part 3&lt;/a&gt; covers the LLVM backend.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;Recommended Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/wvPosK"&gt;OrangeCrab ECP5 FPGA Board&lt;/a&gt; - A compact Lattice ECP5 board with DDR3 and USB-C, available on Amazon&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/6U3DBr"&gt;ECP5 FPGA Development Boards&lt;/a&gt; - Other ECP5 boards available on Amazon&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/RGjpAj"&gt;&lt;em&gt;Getting Started with FPGAs&lt;/em&gt;&lt;/a&gt; by Russell Merrick - Beginner-friendly introduction with Verilog and VHDL examples&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/bJSrEK"&gt;FTDI USB Serial Adapters&lt;/a&gt; - Useful for UART debugging with FPGAs&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/dBX5Ij"&gt;USB Logic Analyzers&lt;/a&gt; - Essential for debugging digital signals&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Source Code&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/r74wA8"&gt;github.com/ajokela/sampo&lt;/a&gt;&lt;/strong&gt; - CPU architecture, assembler, emulator, Verilog RTL, test suite, and FPGA build scripts&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/GCQDRa"&gt;github.com/ajokela/llvm-sampo&lt;/a&gt;&lt;/strong&gt; - LLVM backend and Rust target specification&lt;/li&gt;
&lt;/ul&gt;</description><category>cpu design</category><category>ecp5</category><category>fpga</category><category>hardware</category><category>isa</category><category>risc</category><category>sampo</category><category>testing</category><category>uart</category><category>ulx3s</category><category>verification</category><category>verilog</category><guid>https://tinycomputers.io/posts/sampo-fpga-isa-verification.html</guid><pubDate>Sun, 15 Feb 2026 20:00:00 GMT</pubDate></item><item><title>Part 3: Building an LLVM Backend for Sampo - Rust Runs on a Custom 16-bit RISC CPU</title><link>https://tinycomputers.io/posts/sampo-llvm-backend-rust-compiler.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/sampo-llvm-backend-rust-compiler.mp3" type="audio/mpeg"&gt;
Your browser does not support the audio element.
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;14:58 · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;In &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1&lt;/a&gt;, we designed the Sampo 16-bit RISC architecture from scratch. In &lt;a href="https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html"&gt;Part 2&lt;/a&gt;, we brought it to life on an FPGA (sort of). Now, in Part 3, we tackle arguably the most ambitious goal of the project: &lt;strong&gt;making Rust compile for Sampo&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This isn't just about having a working assembler and emulator. It's about integrating a custom CPU architecture into one of the most sophisticated compiler infrastructures in existence (&lt;a href="https://baud.rs/ZLCbHI"&gt;LLVM&lt;/a&gt;) and then building Rust's standard library for a 16-bit target that has never existed before.&lt;/p&gt;
&lt;p&gt;The result? A complete toolchain where you can write:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="cp"&gt;#![no_std]&lt;/span&gt;
&lt;span class="cp"&gt;#![no_main]&lt;/span&gt;

&lt;span class="k"&gt;extern&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"C"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="cp"&gt;#[no_mangle]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;extern&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"C"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;_start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;unsafe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;b'H'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;b'i'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;b'!'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;loop&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;And it compiles to native Sampo assembly that runs on our emulator:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;Sampo Emulator - Loaded 310 bytes
Starting execution at 0x0100

Hi!

CPU halted at 0x0122
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This article documents the journey: the architecture of an LLVM backend, the challenges of targeting a 16-bit architecture with modern compiler infrastructure, and how AI-assisted development with Claude Code made this ambitious project achievable.&lt;/p&gt;
&lt;h3&gt;Why LLVM?&lt;/h3&gt;
&lt;p&gt;Before diving into implementation details, it's worth asking: why LLVM at all? We already have a working assembler (&lt;code&gt;sasm&lt;/code&gt;) written in Rust. Why not just write a simple C compiler that targets that assembler directly?&lt;/p&gt;
&lt;p&gt;The answer is leverage. LLVM is used by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/zotdzv"&gt;Rust&lt;/a&gt;&lt;/strong&gt; (via &lt;code&gt;rustc&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/Zb46XW"&gt;Clang&lt;/a&gt;&lt;/strong&gt; (C/C++/Objective-C)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/GMibYa"&gt;Swift&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/Whdc21"&gt;Julia&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/UlR4Sx"&gt;Zig&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;And dozens of other languages&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By implementing a single LLVM backend, Sampo gains access to &lt;em&gt;all&lt;/em&gt; of these languages. More importantly, we get decades of optimization research (constant folding, dead code elimination, loop unrolling, register allocation) for free. A hand-written C compiler would take years to reach the same quality.&lt;/p&gt;
&lt;p&gt;The tradeoff is complexity. LLVM is a massive codebase (~30 million lines of C++) with a steep learning curve. But with modern AI-assisted development tools, that complexity becomes manageable.&lt;/p&gt;
&lt;h3&gt;Prior Art: LLVM on the Z80&lt;/h3&gt;
&lt;p&gt;This isn't our first attempt at bringing LLVM to unconventional hardware. Before Sampo, we tackled an even more constrained target: the &lt;a href="https://tinycomputers.io/posts/rust-on-z80-an-llvm-backend-odyssey.html"&gt;Zilog Z80&lt;/a&gt;, an 8-bit processor from 1976.&lt;/p&gt;
&lt;p&gt;The Z80 project was, in many ways, a proving ground. We learned:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GlobalISel is the right choice for new backends.&lt;/strong&gt; The older SelectionDAG framework is battle-tested but harder to debug. GlobalISel's modular design made iterative development practical.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Type legalization is where 90% of the work lives.&lt;/strong&gt; An 8-bit processor running code written for 64-bit assumptions requires extensive transformation rules.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AI-assisted development actually works for compilers.&lt;/strong&gt; The Z80 backend was our first serious test of using Claude Code for systems programming. The collaboration model we developed there (human direction, AI implementation, iterative refinement) carried directly into Sampo.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Z80 experience also revealed the limits of targeting truly minimal hardware. With only 64KB of address space, no hardware multiply, and registers measured in single bytes, many Rust abstractions simply couldn't fit. The &lt;a href="https://tinycomputers.io/posts/rust-on-z80-an-llvm-backend-odyssey.html"&gt;full write-up&lt;/a&gt; documents both the successes and the fundamental constraints we hit.&lt;/p&gt;
&lt;p&gt;Sampo, as a 16-bit architecture with hardware multiply/divide and a cleaner register file, sidesteps many of those limitations. The Z80 taught us &lt;em&gt;how&lt;/em&gt; to build LLVM backends; Sampo let us build one that actually works well.&lt;/p&gt;
&lt;h3&gt;The Role of Claude Code&lt;/h3&gt;
&lt;p&gt;This project would not have been feasible without extensive use of &lt;a href="https://baud.rs/iO989C"&gt;Claude Code&lt;/a&gt;, Anthropic's AI-powered coding assistant. I want to be explicit about this: implementing an LLVM backend is traditionally a multi-month effort requiring deep expertise in compiler internals. With Claude Code, the core implementation was completed in intensive sessions over a few days.&lt;/p&gt;
&lt;p&gt;Here's how Claude Code contributed:&lt;/p&gt;
&lt;h4&gt;1. Scaffolding the Backend Structure&lt;/h4&gt;
&lt;p&gt;LLVM backends follow a specific structure with dozens of interrelated files: &lt;code&gt;SampoTargetMachine.cpp&lt;/code&gt;, &lt;code&gt;SampoInstrInfo.td&lt;/code&gt;, &lt;code&gt;SampoRegisterInfo.td&lt;/code&gt;, &lt;code&gt;SampoCallingConv.td&lt;/code&gt;, and many more. Claude Code generated the initial scaffolding based on patterns from existing backends (RISC-V, MSP430, AVR), then systematically customized each file for Sampo's specific requirements.&lt;/p&gt;
&lt;h4&gt;2. Debugging Cryptic LLVM Errors&lt;/h4&gt;
&lt;p&gt;LLVM's error messages can be... opaque. Messages like "unable to legalize instruction: G_TRUNC s12 = G_TRUNC s32" or "SmallVector capacity overflow" don't immediately point to solutions. Claude Code could analyze stack traces, cross-reference them with LLVM's source code, and identify the root causes, often obscure interactions between type legalization rules.&lt;/p&gt;
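&lt;p&gt;For readers who haven't met type legalization before: on a 16-bit target, any wider operation has to be rewritten in terms of 16-bit pieces. The Rust sketch below illustrates the idea for a 32-bit add split into two 16-bit adds joined by a carry. It is an illustration of the concept only, not code from the Sampo backend, and the function name is invented for this example:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;/// Add two 32-bit values using only 16-bit operations and a carry bit,
/// the way a 16-bit target has to once a wide add is split in half.
fn add32_via_u16(a: u32, b: u32) -> u32 {
    // Split each operand into low and high 16-bit halves.
    let (a_lo, a_hi) = (a as u16, (a >> 16) as u16);
    let (b_lo, b_hi) = (b as u16, (b >> 16) as u16);

    // Low halves first; `carry` is true when the 16-bit add wrapped.
    let (lo, carry) = a_lo.overflowing_add(b_lo);

    // High halves plus the carry out of the low add.
    let hi = a_hi.wrapping_add(b_hi).wrapping_add(carry as u16);

    // Reassemble: multiplying by 0x1_0000 moves `hi` into the upper 16 bits.
    (hi as u32) * 0x1_0000 + (lo as u32)
}

fn main() {
    // 0x0001_FFFF + 1 carries out of the low half into the high half.
    println!("{:#010X}", add32_via_u16(0x0001_FFFF, 1));
}
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The legalizer has to arrange this kind of split for every operation and every type width the hardware lacks, which is why a missing or conflicting rule tends to surface as an "unable to legalize" error.&lt;/p&gt;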
&lt;h4&gt;3. Iterative Refinement&lt;/h4&gt;
&lt;p&gt;The development process was highly iterative. We'd attempt to compile a test case, hit an error, fix it, and discover the next issue. Claude Code maintained context across hundreds of these iterations, remembering what had been tried, what worked, and what the current state of each file was.&lt;/p&gt;
&lt;h4&gt;4. Understanding LLVM Internals&lt;/h4&gt;
&lt;p&gt;LLVM has two instruction selection frameworks: SelectionDAG (legacy) and GlobalISel (newer, recommended for new backends). Claude Code explained the tradeoffs, recommended GlobalISel for Sampo, and then implemented the required components: &lt;code&gt;SampoLegalizerInfo&lt;/code&gt;, &lt;code&gt;SampoRegisterBankInfo&lt;/code&gt;, and &lt;code&gt;SampoInstructionSelector&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This isn't to diminish the human element; architectural decisions, design philosophy, and validation all required human judgment. But the mechanical work of writing hundreds of lines of boilerplate C++, TableGen definitions, and CMake configurations was dramatically accelerated.&lt;/p&gt;
&lt;h3&gt;LLVM Backend Architecture&lt;/h3&gt;
&lt;p&gt;An LLVM backend transforms LLVM Intermediate Representation (IR) into target-specific machine code. For Sampo, this involves several stages:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;Rust Source Code
       ↓
   rustc frontend
       ↓
    LLVM IR
       ↓
  Instruction Selection (GlobalISel)
       ↓
  Register Allocation
       ↓
  Prologue/Epilogue Insertion
       ↓
  MC Layer (Machine Code)
       ↓
  Sampo Assembly (.s file)
       ↓
  sasm Assembler
       ↓
  Binary (.bin file)
       ↓
  semu Emulator
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Let's examine the key components we implemented.&lt;/p&gt;
&lt;h4&gt;File Structure&lt;/h4&gt;
&lt;p&gt;A complete LLVM backend requires a few dozen files; Sampo's comes to 36. Here's the structure:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;llvm&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;lib&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Target&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CMakeLists&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;td&lt;/span&gt;&lt;span class="w"&gt;                    &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Top&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TableGen&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoAsmPrinter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Assembly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;generation&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoCallingConv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;td&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Calling&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;convention&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoFrameLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Stack&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;handling&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoFrameLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstrFormats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;td&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Instruction&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;encoding&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstrInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Instruction&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;utilities&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstrInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstrInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;td&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Instruction&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;definitions&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoISelLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DAG&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lowering&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minimal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoISelLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoMCInstLower&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MachineInstr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCInst&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoMCInstLower&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoRegisterInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Register&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;handling&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoRegisterInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoRegisterInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;td&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Register&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;definitions&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoSubtarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoSubtarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoTargetMachine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Entry&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;point&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoTargetMachine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GISel&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoCallLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GlobalISel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;calls&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoCallLowering&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstructionSelector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoLegalizerInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;legalization&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoLegalizerInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoRegisterBankInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoRegisterBankInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCTargetDesc&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoAsmBackend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Object&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;generation&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoELFObjectWriter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoInstPrinter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Assembly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;printing&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoMCAsmInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoMCCodeEmitter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoMCTargetDesc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TargetInfo&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoTargetInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cpp&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;registration&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each file has a specific role. The TableGen files (&lt;code&gt;.td&lt;/code&gt;) are processed at build time to generate C++ code for instruction encoding, assembly printing, and more. The &lt;code&gt;GISel/&lt;/code&gt; directory contains GlobalISel-specific components; this is where most of the interesting logic lives.&lt;/p&gt;
&lt;h4&gt;Target Description (TableGen)&lt;/h4&gt;
&lt;p&gt;LLVM uses &lt;a href="https://baud.rs/k04R4l"&gt;TableGen&lt;/a&gt;, a domain-specific language, to describe target architectures declaratively. For Sampo, we defined:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Registers&lt;/strong&gt; (&lt;code&gt;SampoRegisterInfo.td&lt;/code&gt;):&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R0&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoReg&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;"R0"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c c-SingleLine"&gt;// Zero register&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoReg&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;"R1"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c c-SingleLine"&gt;// Return address&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;SampoReg&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;"R2"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c c-SingleLine"&gt;// Stack pointer&lt;/span&gt;
&lt;span class="c c-SingleLine"&gt;// ... R3-R15&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;RegisterClass&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="s"&gt;"Sampo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i16&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sequence&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"R%u"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;)&amp;gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Instructions&lt;/strong&gt; (&lt;code&gt;SampoInstrInfo.td&lt;/code&gt;):&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;FormatR&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mh"&gt;0x0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0x0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rd&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ins&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rs1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rs2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="w"&gt;                  &lt;/span&gt;&lt;span class="s"&gt;"ADD&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;$rd, $rs1, $rs2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;                  &lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rs1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rs2&lt;/span&gt;&lt;span class="p"&gt;))]&amp;gt;;&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;FormatXNoRs&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mh"&gt;0x8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rd&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ins&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;imm16&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$imm&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="w"&gt;                      &lt;/span&gt;&lt;span class="s"&gt;"LIX&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;$rd, $imm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;                      &lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;GPR&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$rd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;imm16&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;$imm&lt;/span&gt;&lt;span class="p"&gt;)]&amp;gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Calling Convention&lt;/strong&gt; (&lt;code&gt;SampoCallingConv.td&lt;/code&gt;):&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CC_Sampo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CallingConv&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;[&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c c-SingleLine"&gt;// First 4 arguments in R4-R7&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;CCIfType&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;[&lt;/span&gt;&lt;span class="n"&gt;i16&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CCAssignToReg&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;[&lt;/span&gt;&lt;span class="n"&gt;R4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R7&lt;/span&gt;&lt;span class="p"&gt;]&amp;gt;&amp;gt;,&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c c-SingleLine"&gt;// Additional arguments on stack&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;CCIfType&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;[&lt;/span&gt;&lt;span class="n"&gt;i16&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CCAssignToStack&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;]&amp;gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;From these declarative definitions, TableGen generates thousands of lines of C++ code automatically.&lt;/p&gt;
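&lt;p&gt;As a rough illustration of what &lt;code&gt;CC_Sampo&lt;/code&gt; expresses, the Rust sketch below models the assignment rule on its own: the first four &lt;code&gt;i16&lt;/code&gt; arguments land in R4-R7 and every later one takes a 2-byte stack slot. &lt;code&gt;ArgLoc&lt;/code&gt; and &lt;code&gt;assign_arg&lt;/code&gt; are hypothetical names, and the sketch assumes stack offsets count up from zero; it is not the C++ that TableGen emits.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// Where one 16-bit argument ends up (hypothetical illustration type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArgLoc {
    /// Register number: 4 means R4, 5 means R5, and so on.
    Reg(u8),
    /// Byte offset into the outgoing-argument area (assumed to start at 0).
    Stack(u32),
}

/// Assign the argument at `index` (0-based) following the CC_Sampo rules:
/// CCAssignToReg for the first four, then CCAssignToStack with size 2, align 2.
fn assign_arg(index: u16) -> ArgLoc {
    const FIRST_ARG_REG: u16 = 4; // R4
    const NUM_ARG_REGS: u16 = 4; // R4, R5, R6, R7
    const SLOT_BYTES: u32 = 2; // slot size and alignment

    match index.checked_sub(NUM_ARG_REGS) {
        // index is 0..=3: still inside the register window.
        None => ArgLoc::Reg((FIRST_ARG_REG + index) as u8),
        // index is 4 or more: `spill` counts stack arguments from zero.
        Some(spill) => ArgLoc::Stack(u32::from(spill) * SLOT_BYTES),
    }
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In this model, argument index 3 maps to R7, index 4 to stack offset 0, and index 6 to stack offset 4.&lt;/p&gt;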
&lt;h4&gt;GlobalISel: The Modern Instruction Selector&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://baud.rs/69RpUC"&gt;GlobalISel&lt;/a&gt; is LLVM's newer instruction selection framework, designed to be more modular and easier to target than the legacy SelectionDAG approach. It works in phases:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;IRTranslator&lt;/strong&gt;: Converts LLVM IR to Generic Machine IR (gMIR)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Legalizer&lt;/strong&gt;: Transforms illegal operations into legal ones&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RegBankSelect&lt;/strong&gt;: Assigns operands to register banks&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;InstructionSelect&lt;/strong&gt;: Maps gMIR to target instructions&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For a 16-bit architecture like Sampo, the &lt;strong&gt;Legalizer&lt;/strong&gt; is where most complexity lives. LLVM IR freely uses types like &lt;code&gt;i32&lt;/code&gt;, &lt;code&gt;i64&lt;/code&gt;, and even &lt;code&gt;i128&lt;/code&gt;. Sampo's ALU only operates on 16-bit values. The legalizer must transform these:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;// In SampoLegalizerInfo.cpp&lt;/span&gt;
&lt;span class="n"&gt;getActionDefinitionsBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G_ADD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legalFor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="c1"&gt;// i16 add is native&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clampScalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Clamp to 16-bit&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;widenScalarToNextPow2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// Widen smaller types&lt;/span&gt;

&lt;span class="n"&gt;getActionDefinitionsBuilder&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;G_SDIV&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;G_UDIV&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legalFor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;libcallFor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;s32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s64&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// Use libcalls for larger types&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clampScalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s64&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This tells LLVM: "16-bit addition is a single instruction. 32-bit addition needs to be broken into multiple 16-bit operations. 16-bit division is native, while 32-bit and 64-bit division should call a library function."&lt;/p&gt;
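&lt;p&gt;One way to read those rule chains is as a lookup from scalar width to action. The Rust sketch below models a single lookup for the two rule sets shown above, assuming rules are tried top to bottom and that &lt;code&gt;clampScalar&lt;/code&gt; widens anything below its minimum and narrows anything above its maximum. The &lt;code&gt;Action&lt;/code&gt; enum and both functions are hypothetical; the real legalizer repeats the lookup on each rewritten instruction until everything is legal.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// One step the rule set asks for (hypothetical model, not an LLVM type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Legal,
    WidenTo(u32),
    NarrowTo(u32),
    Libcall,
    Unsupported,
}

/// Rule lookup for G_ADD on a scalar that is `bits` wide.
fn legalize_add(bits: u32) -> Action {
    match bits {
        16 => Action::Legal,                 // legalFor({s16})
        b if b > 16 => Action::NarrowTo(16), // clampScalar upper bound
        _ => Action::WidenTo(16),            // clampScalar lower bound
    }
}

/// Rule lookup for G_SDIV and G_UDIV on a scalar that is `bits` wide.
fn legalize_div(bits: u32) -> Action {
    match bits {
        16 => Action::Legal,                 // legalFor({s16})
        32 | 64 => Action::Libcall,          // libcallFor({s32, s64})
        b if 16 > b => Action::WidenTo(16),  // clampScalar lower bound
        b if b > 64 => Action::NarrowTo(64), // clampScalar upper bound
        _ => Action::Unsupported,            // e.g. 24 bits: no rule in this model
    }
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;In this model a 128-bit division first narrows to 64 bits and only then becomes a libcall, while an odd width such as 24 bits matches no rule at all.&lt;/p&gt;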
&lt;h4&gt;Debugging Call Lowering: A Case Study&lt;/h4&gt;
&lt;p&gt;One particularly memorable debugging session illustrates the challenges of LLVM development. When first attempting to compile Rust's &lt;code&gt;libcore&lt;/code&gt;, the compiler crashed with:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;Assertion failed: (idx &amp;lt; size()), function operator[], file SmallVector.h, line 301
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This cryptic error (an out-of-bounds &lt;code&gt;SmallVector&lt;/code&gt; access deep in LLVM's internals) gave no indication of the actual cause. The stack trace pointed to &lt;code&gt;SampoInstPrinter::printOperand&lt;/code&gt;, which prints assembly operands.&lt;/p&gt;
&lt;p&gt;Working with Claude Code, we traced the issue through multiple layers:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The crash occurred when printing a &lt;code&gt;JALR&lt;/code&gt; (indirect call) instruction&lt;/li&gt;
&lt;li&gt;&lt;code&gt;JALR&lt;/code&gt; is defined in TableGen as &lt;code&gt;JALR $rd, $rs1&lt;/code&gt; (two operands)&lt;/li&gt;
&lt;li&gt;Our call lowering code was only providing one operand (the target register)&lt;/li&gt;
&lt;li&gt;The printer tried to access operand index 1, which didn't exist&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The fix was a single-line change, adding the return address destination register:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;// Before (broken):&lt;/span&gt;
&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buildInstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;JALR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addReg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Callee&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getReg&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="c1"&gt;// After (fixed):&lt;/span&gt;
&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buildInstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;JALR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addDef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Return address destination&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addReg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Callee&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getReg&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This pattern repeated throughout development: an opaque error, careful tracing through LLVM's layers, and ultimately a small fix. Without Claude Code's ability to quickly navigate LLVM's massive codebase and maintain context across debugging sessions, each of these issues could have taken days to resolve.&lt;/p&gt;
&lt;h4&gt;The 16-bit Challenge: Type Legalization&lt;/h4&gt;
&lt;p&gt;The most significant technical challenge was handling non-16-bit types. Consider what happens when Rust code uses a &lt;code&gt;u32&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;u32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0x12345678&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;u32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Sampo has no 32-bit registers. LLVM must:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Split the 32-bit value across two 16-bit registers (for example, the pair R4:R5)&lt;/li&gt;
&lt;li&gt;Implement addition with carry propagation&lt;/li&gt;
&lt;li&gt;Track both halves through register allocation&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The legalizer handles this through "narrowing" actions:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;getActionDefinitionsBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G_ADD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legalFor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;narrowScalarFor&lt;/span&gt;&lt;span class="p"&gt;({{&lt;/span&gt;&lt;span class="n"&gt;s32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s16&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Narrow s32 to s16 pairs&lt;/span&gt;
&lt;span class="w"&gt;                     &lt;/span&gt;&lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LegalityQuery&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;                       &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LLT&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;scalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="w"&gt;                     &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;We also encountered issues with unusual type sizes. LLVM's intermediate stages sometimes create types like &lt;code&gt;s12&lt;/code&gt; or &lt;code&gt;s24&lt;/code&gt; (12-bit and 24-bit integers). These aren't power-of-two sizes, which caused crashes in the type legalization framework:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;LLVM ERROR: unable to legalize instruction: %1:_(s12) = G_TRUNC %0:_(s32)
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The fix required careful specification of widening rules:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;getActionDefinitionsBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;G_TRUNC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;widenScalarIf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LegalityQuery&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;getSizeInBits&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;llvm&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;isPowerOf2_32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Non-power-of-2?&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;[](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LegalityQuery&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;getSizeInBits&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;NewSize&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;llvm&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;PowerOf2Ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;std&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;make_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LLT&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;scalar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NewSize&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;legalIf&lt;/span&gt;&lt;span class="p"&gt;([](&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LegalityQuery&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;getSizeInBits&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;=&lt;/span&gt;
&lt;span class="w"&gt;             &lt;/span&gt;&lt;span class="n"&gt;Query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;getSizeInBits&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This tells LLVM: "If you see a non-power-of-2 type, round it up to the next power of 2 first, then proceed with normal legalization."&lt;/p&gt;
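&lt;p&gt;The arithmetic behind that rule is small enough to check by hand. This Rust sketch mirrors the two helpers used in the predicate and the mutation, with &lt;code&gt;is_power_of_two&lt;/code&gt; standing in for &lt;code&gt;isPowerOf2_32&lt;/code&gt; and &lt;code&gt;next_power_of_two&lt;/code&gt; for &lt;code&gt;PowerOf2Ceil&lt;/code&gt;. The function names are hypothetical, and the correspondence is only assumed for nonzero widths.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// Predicate: does a scalar of this width need widening first?
fn needs_widening(bits: u32) -> bool {
    !bits.is_power_of_two()
}

/// Mutation: the width to widen to. Widths that are already
/// powers of two come back unchanged.
fn widened_bits(bits: u32) -> u32 {
    bits.next_power_of_two()
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;So &lt;code&gt;s12&lt;/code&gt; widens to &lt;code&gt;s16&lt;/code&gt; and &lt;code&gt;s24&lt;/code&gt; to &lt;code&gt;s32&lt;/code&gt;, while &lt;code&gt;s16&lt;/code&gt; and &lt;code&gt;s32&lt;/code&gt; pass through unchanged.&lt;/p&gt;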
&lt;h4&gt;Multi-Word Arithmetic&lt;/h4&gt;
&lt;p&gt;When Rust code uses 32-bit or 64-bit integers, Sampo must synthesize these operations from 16-bit primitives. Consider a simple 32-bit addition:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;u32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0x12340000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;u32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mh"&gt;0x00005678&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// 0x12345678&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This compiles to a sequence that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Adds the low 16-bit halves&lt;/li&gt;
&lt;li&gt;Adds the high 16-bit halves with carry propagation&lt;/li&gt;
&lt;li&gt;Manages results across register pairs&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The generated assembly looks like:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;; R4:R5 = first operand (low:high)&lt;/span&gt;
&lt;span class="c1"&gt;; R6:R7 = second operand (low:high)&lt;/span&gt;
&lt;span class="nf"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="no"&gt;R8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R6&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;; Add low halves&lt;/span&gt;
&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="c1"&gt;; Prepare carry&lt;/span&gt;
&lt;span class="c1"&gt;; (carry detection logic)&lt;/span&gt;
&lt;span class="nf"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="no"&gt;R10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R7&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="c1"&gt;; Add high halves&lt;/span&gt;
&lt;span class="nf"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="no"&gt;R10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;R9&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;; Add carry&lt;/span&gt;
&lt;span class="c1"&gt;; Result in R8:R10&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;LLVM's legalizer generates this multi-instruction sequence automatically through "narrowing" rules. We didn't write this expansion manually; we just told LLVM that 32-bit operations should be narrowed to 16-bit pairs.&lt;/p&gt;
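&lt;p&gt;To make the narrowing concrete, here is a minimal Rust sketch of the same arithmetic: a 32-bit add built from two 16-bit adds plus a carry. The function name &lt;code&gt;add32_narrowed&lt;/code&gt; and the (low, high) tuple layout are illustrative assumptions that mirror the register pairs in the listing. It is a model of the expansion, not code from the backend.&lt;/p&gt;

```rust
/// Adds two 32-bit values that are each split into (low, high) 16-bit halves,
/// the way the legalizer splits an i32 across a register pair.
/// Hypothetical helper for illustration; not part of the Sampo backend.
fn add32_narrowed(a: (u16, u16), b: (u16, u16)) -> (u16, u16) {
    // Add the low halves; `carry` is true when the 16-bit sum wrapped.
    let (low, carry) = a.0.overflowing_add(b.0);
    // Add the high halves, then fold in the carry from the low add.
    // Overflow out of the high half is discarded, as in a wrapping i32 add.
    let high = a.1.wrapping_add(b.1).wrapping_add(carry as u16);
    (low, high)
}
```

&lt;p&gt;For example, adding 1 to 0x0000FFFF, written here as (0xFFFF, 0x0000) plus (0x0001, 0x0000), yields (0x0000, 0x0001): the low half wraps to zero and the carry lands in the high half.&lt;/p&gt;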
&lt;h4&gt;Function Calling Convention&lt;/h4&gt;
&lt;p&gt;Getting function calls right was crucial. Sampo uses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;R4-R7&lt;/strong&gt;: First four arguments (caller-saved)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;R1&lt;/strong&gt;: Return address&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;R2&lt;/strong&gt;: Stack pointer&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;R8-R11&lt;/strong&gt;: Temporaries (caller-saved)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;R12-R15&lt;/strong&gt;: Saved registers (callee-saved)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;SampoCallLowering.cpp&lt;/code&gt; file implements this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;SampoCallLowering::lowerCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MachineIRBuilder&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;                                   &lt;/span&gt;&lt;span class="n"&gt;CallLoweringInfo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Copy arguments to their designated registers&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;static&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCPhysReg&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ArgRegs&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;                                       &lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R7&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrigArgs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buildCopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ArgRegs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrigArgs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;Regs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;// Spill to stack&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Build the call instruction&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Callee&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isReg&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// Indirect call: JALR R1, Rs  (save return addr to R1, jump to Rs)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buildInstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;JALR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addDef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addReg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Callee&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getReg&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// Direct call: JALX symbol&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;MIRBuilder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;buildInstr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;JALX&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Callee&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Mark caller-saved registers as clobbered&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// ... implicit defs for R4-R11&lt;/span&gt;
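&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;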
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;One subtle bug took hours to track down. The &lt;code&gt;JALR&lt;/code&gt; instruction (indirect call) expects two operands: the destination register for the return address (R1) and the source register containing the jump target. Initially, we provided only one operand, which caused a crash deep in the assembly printer when it tried to access the nonexistent second operand. The error message was simply "SmallVector capacity overflow," which is not exactly illuminating without context.&lt;/p&gt;
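&lt;p&gt;As a rough model of the argument-passing rule listed above (first four arguments in R4-R7, the rest spilled to the stack), the sketch below maps an argument index to its location. The &lt;code&gt;ArgLocation&lt;/code&gt; enum, the &lt;code&gt;arg_location&lt;/code&gt; function, and the numbering of stack slots from zero are all hypothetical; the listing does not show how spilled arguments are actually laid out.&lt;/p&gt;

```rust
/// Where a call argument lives under the convention described above.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum ArgLocation {
    /// Passed in the general-purpose register with this number (4 means R4).
    Register(u8),
    /// Spilled to the stack; the slot index is an assumed numbering from 0.
    StackSlot(usize),
}

/// Maps a zero-based argument index to its location.
fn arg_location(index: usize) -> ArgLocation {
    match index {
        // The first four arguments go in R4, R5, R6, R7.
        0..=3 => ArgLocation::Register(4 + index as u8),
        // Everything after that is spilled.
        _ => ArgLocation::StackSlot(index - 4),
    }
}
```

&lt;p&gt;With this model, argument 3 lands in R7 and argument 4 becomes the first stack slot.&lt;/p&gt;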
&lt;h4&gt;The Assembly Printer Layer&lt;/h4&gt;
&lt;p&gt;The final stage of code generation converts LLVM's internal machine instructions to textual assembly. This involves two components:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MCInstLower&lt;/strong&gt; converts MachineInstr (high-level) to MCInst (low-level):&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;SampoMCInstLower::Lower&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MachineInstr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;MI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCInst&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;OutMI&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;OutMI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setOpcode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MI&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;getOpcode&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MachineOperand&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;MO&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MI&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;operands&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;MCOperand&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCOp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LowerOperand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MO&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MCOp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isValid&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Skip implicit operands&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="n"&gt;OutMI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addOperand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MCOp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;InstPrinter&lt;/strong&gt; converts MCInst to assembly text:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;SampoInstPrinter::printOperand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCInst&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;MI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;OpNo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;                                    &lt;/span&gt;&lt;span class="n"&gt;raw_ostream&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MCOperand&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MI&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;getOperand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OpNo&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isReg&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;printRegName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getReg&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isImm&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getImm&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isExpr&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;MAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;printExpr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Op&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getExpr&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;TableGen generates most of the printer code automatically from instruction definitions. The assembly string &lt;code&gt;"ADD\t$rd, $rs1, $rs2"&lt;/code&gt; in the TableGen file directly determines the printed format: each &lt;code&gt;$name&lt;/code&gt; placeholder is replaced by the corresponding operand.&lt;/p&gt;
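&lt;p&gt;The substitution that the generated printer performs can be modeled in a few lines of Rust. This is only a sketch: &lt;code&gt;print_add&lt;/code&gt; is a hypothetical function that takes register numbers, and the real printer is C++ generated by TableGen rather than a string replace.&lt;/p&gt;

```rust
/// Fills in the asm string from the ADD definition with register names.
/// Hypothetical model of the generated printer; operands are register numbers.
fn print_add(rd: u8, rs1: u8, rs2: u8) -> String {
    // Same asm string as in the TableGen definition quoted above.
    let template = "ADD\t$rd, $rs1, $rs2";
    let rd_name = format!("R{}", rd);
    let rs1_name = format!("R{}", rs1);
    let rs2_name = format!("R{}", rs2);
    // Replace each placeholder with the printed name of its operand.
    template
        .replace("$rd", rd_name.as_str())
        .replace("$rs1", rs1_name.as_str())
        .replace("$rs2", rs2_name.as_str())
}
```

&lt;p&gt;Calling &lt;code&gt;print_add(8, 4, 6)&lt;/code&gt; produces the mnemonic, a tab, and then &lt;code&gt;R8, R4, R6&lt;/code&gt;, matching the low-half add in the 32-bit example earlier.&lt;/p&gt;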
&lt;h3&gt;Building Rust's Standard Library&lt;/h3&gt;
&lt;p&gt;With the LLVM backend working, the next step was teaching Rust about Sampo. This required:&lt;/p&gt;
&lt;h4&gt;1. Adding the Target Triple&lt;/h4&gt;
&lt;p&gt;In Rust's &lt;code&gt;rustc_target&lt;/code&gt; crate, we added &lt;code&gt;sampo-unknown-none&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;// compiler/rustc_target/src/spec/targets/sampo_unknown_none.rs&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;crate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;target&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;Target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;data_layout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"e-m:e-p:16:16-i8:8-i16:16-i32:16-n16-S16"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;into&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;llvm_target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"sampo-unknown-none"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;into&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;pointer_width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;arch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Arch&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Sampo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;TargetOptions&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;panic_strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;PanicStrategy&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Abort&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;atomic_cas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;max_atomic_width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;c_int_width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="nb"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;data_layout&lt;/code&gt; string is critical: it tells LLVM that pointers are 16 bits wide, how each integer type is aligned, and which integer width is native. Getting this wrong causes subtle miscompilations.&lt;/p&gt;
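&lt;p&gt;Each field of that string, separated by &lt;code&gt;-&lt;/code&gt;, has a specific meaning in LLVM's data layout syntax: &lt;code&gt;e&lt;/code&gt; is little-endian, &lt;code&gt;m:e&lt;/code&gt; selects ELF-style name mangling, &lt;code&gt;p:16:16&lt;/code&gt; gives the pointer size and ABI alignment in bits, &lt;code&gt;i8:8&lt;/code&gt;, &lt;code&gt;i16:16&lt;/code&gt;, and &lt;code&gt;i32:16&lt;/code&gt; give integer alignments, &lt;code&gt;n16&lt;/code&gt; names the native integer width, and &lt;code&gt;S16&lt;/code&gt; is the stack alignment. The sketch below pulls a few of those numbers out of the string. &lt;code&gt;LayoutSummary&lt;/code&gt;, &lt;code&gt;sampo_data_layout&lt;/code&gt;, and &lt;code&gt;summarize_layout&lt;/code&gt; are hypothetical names, and the parser handles only the fields used here.&lt;/p&gt;

```rust
/// A few facts extracted from a data layout string. Hypothetical type.
#[derive(Debug, PartialEq)]
struct LayoutSummary {
    little_endian: bool,
    pointer_bits: u32,
    i32_align_bits: u32,
    native_int_bits: u32,
    stack_align_bits: u32,
}

/// The data layout string from the target definition above.
fn sampo_data_layout() -> String {
    String::from("e-m:e-p:16:16-i8:8-i16:16-i32:16-n16-S16")
}

/// Parses only the fields this sketch cares about; everything else is skipped.
fn summarize_layout(layout: String) -> LayoutSummary {
    let mut summary = LayoutSummary {
        little_endian: false,
        pointer_bits: 0,
        i32_align_bits: 0,
        native_int_bits: 0,
        stack_align_bits: 0,
    };
    for field in layout.split('-') {
        if field == "e" {
            summary.little_endian = true;
        } else if let Some(rest) = field.strip_prefix("p:") {
            // "16:16" is size:alignment; the size comes first.
            let size = rest.split(':').next().unwrap_or("0");
            summary.pointer_bits = size.parse().unwrap_or(0);
        } else if let Some(rest) = field.strip_prefix("i32:") {
            summary.i32_align_bits = rest.parse().unwrap_or(0);
        } else if let Some(rest) = field.strip_prefix("n") {
            summary.native_int_bits = rest.parse().unwrap_or(0);
        } else if let Some(rest) = field.strip_prefix("S") {
            summary.stack_align_bits = rest.parse().unwrap_or(0);
        }
    }
    summary
}
```

&lt;p&gt;Note the &lt;code&gt;i32:16&lt;/code&gt; field: a 32-bit integer only needs 16-bit alignment, which matches how the legalizer splits it across two 16-bit registers.&lt;/p&gt;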
&lt;h4&gt;2. Registering the Target in Rust&lt;/h4&gt;
&lt;p&gt;Rust's build system needs to know about new targets in multiple places:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;// compiler/rustc_target/src/spec/mod.rs&lt;/span&gt;
&lt;span class="n"&gt;supported_targets&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... existing targets ...&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sampo-unknown-none"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sampo_unknown_none&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// compiler/rustc_span/src/symbol.rs&lt;/span&gt;
&lt;span class="n"&gt;Symbols&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... existing symbols ...&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;sampo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;Arch&lt;/code&gt; enum in &lt;code&gt;rustc_target&lt;/code&gt; also needed a new variant. These changes propagate through Rust's bootstrap system, eventually producing a compiler that recognizes &lt;code&gt;--target sampo-unknown-none&lt;/code&gt;.&lt;/p&gt;
&lt;h4&gt;3. Building Core Libraries&lt;/h4&gt;
&lt;p&gt;Rust's &lt;code&gt;#![no_std]&lt;/code&gt; programs still need &lt;code&gt;libcore&lt;/code&gt; (the dependency-free foundation) and &lt;code&gt;compiler_builtins&lt;/code&gt; (intrinsics for operations the hardware doesn't support natively). Building these required:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Point Rust at our custom LLVM&lt;/span&gt;
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;LLVM_CONFIG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/path/to/llvm-sampo/build/bin/llvm-config

&lt;span class="c1"&gt;# Build stage 1 compiler&lt;/span&gt;
./x.py&lt;span class="w"&gt; &lt;/span&gt;build&lt;span class="w"&gt; &lt;/span&gt;--stage&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;

&lt;span class="c1"&gt;# Build libraries for Sampo&lt;/span&gt;
./x.py&lt;span class="w"&gt; &lt;/span&gt;build&lt;span class="w"&gt; &lt;/span&gt;--stage&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;library&lt;span class="w"&gt; &lt;/span&gt;--target&lt;span class="w"&gt; &lt;/span&gt;sampo-unknown-none
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This compiles approximately 50,000 lines of Rust into Sampo assembly, a significant stress test of the backend. The resulting libraries:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;libcore&lt;/code&gt;: 1.1 MB (Rust's core library)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;liballoc&lt;/code&gt;: 211 KB (heap allocation)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;libcompiler_builtins&lt;/code&gt;: 2.3 MB (soft-float, 64-bit arithmetic, etc.)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;4. Handling Missing Features&lt;/h4&gt;
&lt;p&gt;A 16-bit CPU without atomic operations or floating-point hardware needs careful configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;atomic_cas: false&lt;/code&gt;: No compare-and-swap&lt;/li&gt;
&lt;li&gt;&lt;code&gt;max_atomic_width: Some(0)&lt;/code&gt;: No atomic operations at all&lt;/li&gt;
&lt;li&gt;&lt;code&gt;panic_strategy: PanicStrategy::Abort&lt;/code&gt;: No unwinding&lt;/li&gt;
&lt;/ul&gt;
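&lt;p&gt;Rust handles these gracefully. The atomic types in &lt;code&gt;core&lt;/code&gt; are only available on targets that declare support for them, so code that requires atomics simply won't compile for Sampo, with clear error messages.&lt;/p&gt;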
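&lt;p&gt;Portable code can cope with this by gating on &lt;code&gt;cfg(target_has_atomic)&lt;/code&gt;, which rustc sets only for atomic widths the target supports. The sketch below is an illustration, not code from the project: &lt;code&gt;next_id&lt;/code&gt; is a hypothetical function, and the fallback branch assumes a single core with no interrupt handler calling it, which is an assumption about the target rather than something stated above.&lt;/p&gt;

```rust
/// Returns a fresh ID on each call, using an atomic counter where available.
#[cfg(target_has_atomic = "ptr")]
fn next_id() -> usize {
    use core::sync::atomic::{AtomicUsize, Ordering};
    static NEXT: AtomicUsize = AtomicUsize::new(0);
    // fetch_add returns the value before the increment.
    NEXT.fetch_add(1, Ordering::Relaxed)
}

/// Fallback for targets with no atomics at all.
#[cfg(not(target_has_atomic = "ptr"))]
fn next_id() -> usize {
    static mut NEXT: usize = 0;
    // SAFETY: assumes a single core and no concurrent callers
    // (an assumption of this sketch, not a documented Sampo guarantee).
    unsafe {
        let id = NEXT;
        NEXT = id.wrapping_add(1);
        id
    }
}
```

&lt;p&gt;On a host with pointer-width atomics the first definition is compiled; for a target configured with &lt;code&gt;max_atomic_width: Some(0)&lt;/code&gt; only the second one is.&lt;/p&gt;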
&lt;h3&gt;The Complete Pipeline&lt;/h3&gt;
&lt;p&gt;Let's trace through what happens when compiling our "Hi!" program:&lt;/p&gt;
&lt;h4&gt;Stage 1: Rust to LLVM IR&lt;/h4&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sc"&gt;b'H'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Becomes:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;call&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="vg"&gt;@putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;i8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;zeroext&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;72&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Stage 2: LLVM IR to Generic Machine IR&lt;/h4&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;%0:gpr = G_CONSTANT i16 72&lt;/span&gt;
$&lt;span class="n"&gt;r4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;COPY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;%0&lt;/span&gt;
&lt;span class="n"&gt;JALX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;@putc,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;implicit&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;$r4,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;implicit-def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;$r1,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Stage 3: Instruction Selection&lt;/h4&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;%0:gpr = LIX 72&lt;/span&gt;
$&lt;span class="n"&gt;r4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;COPY&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;%0&lt;/span&gt;
&lt;span class="n"&gt;JALX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;@putc,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Stage 4: Register Allocation&lt;/h4&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="n"&gt;r4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;72&lt;/span&gt;
&lt;span class="n"&gt;JALX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;@putc&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Stage 5: Assembly Output&lt;/h4&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="nf"&gt;LIX&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="no"&gt;R4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;72&lt;/span&gt;
&lt;span class="nf"&gt;JALX&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;putc&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Stage 6: Binary&lt;/h4&gt;
&lt;p&gt;Our &lt;code&gt;sasm&lt;/code&gt; assembler produces the final binary, which runs on &lt;code&gt;semu&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;The Development Process: Iterating with AI&lt;/h3&gt;
&lt;p&gt;Traditional compiler development follows a deliberate pace: study the codebase for weeks, implement a small feature, spend days debugging, repeat. With Claude Code, this cycle compressed dramatically.&lt;/p&gt;
&lt;p&gt;A typical session looked like:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Describe the goal&lt;/strong&gt;: "I need to implement call lowering for indirect function calls"&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Receive implementation&lt;/strong&gt;: Claude Code generates &lt;code&gt;SampoCallLowering.cpp&lt;/code&gt; with appropriate patterns&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test&lt;/strong&gt;: Compile a test case, observe failure&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debug together&lt;/strong&gt;: Share the error, get analysis and fixes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Iterate&lt;/strong&gt;: Sometimes 10-20 cycles for a single feature&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The key insight is that Claude Code doesn't just generate code; it explains &lt;em&gt;why&lt;/em&gt; that code is correct (or incorrect). When the call lowering crashed, Claude Code walked through:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How MachineInstrs represent instructions&lt;/li&gt;
&lt;li&gt;The difference between explicit and implicit operands&lt;/li&gt;
&lt;li&gt;Why the TableGen definition expected two operands&lt;/li&gt;
&lt;li&gt;What the MCInstLower layer does with each operand type&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This contextual understanding accelerates learning far beyond copy-paste coding.&lt;/p&gt;
&lt;h4&gt;Code Quality Considerations&lt;/h4&gt;
&lt;p&gt;AI-generated code requires the same scrutiny as human-written code. During this project, we found:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Things Claude Code did well:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Boilerplate that follows established patterns&lt;/li&gt;
&lt;li&gt;TableGen definitions (highly formulaic)&lt;/li&gt;
&lt;li&gt;Explaining LLVM concepts and architecture&lt;/li&gt;
&lt;li&gt;Debugging from error messages and stack traces&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Things requiring human judgment:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Architectural decisions (GlobalISel vs SelectionDAG)&lt;/li&gt;
&lt;li&gt;Performance tradeoffs in instruction selection&lt;/li&gt;
&lt;li&gt;Edge cases in type legalization&lt;/li&gt;
&lt;li&gt;Testing strategy and coverage&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The final codebase reflects this collaboration. Claude Code generated perhaps 80% of the initial code, but human review and iteration refined it into something production-quality.&lt;/p&gt;
&lt;h3&gt;Lessons Learned&lt;/h3&gt;
&lt;h4&gt;1. Start with GlobalISel&lt;/h4&gt;
&lt;p&gt;For new backends, GlobalISel is significantly easier to work with than SelectionDAG. The modular design means you can implement and test each phase independently.&lt;/p&gt;
&lt;h4&gt;2. Type Legalization is the Hard Part&lt;/h4&gt;
&lt;p&gt;For non-standard word sizes (16-bit, 8-bit), most complexity lives in the legalizer. Plan to spend 60%+ of your effort here.&lt;/p&gt;
&lt;h4&gt;3. Test Early and Often&lt;/h4&gt;
&lt;p&gt;We maintained a suite of LLVM IR test files that exercised specific features:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;; test_call.ll - Function calling&lt;/span&gt;
&lt;span class="k"&gt;define&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="vg"&gt;@_start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;call&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="vg"&gt;@putc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;i8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;72&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c"&gt;; 'H'&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;ret&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each bug fix was validated against this suite before proceeding.&lt;/p&gt;
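&lt;p&gt;A suite like this is straightforward to automate. The sketch below is not the project's actual harness. It assumes an &lt;code&gt;llc&lt;/code&gt; binary built from the Sampo fork is on the path and accepts the &lt;code&gt;sampo-unknown-none&lt;/code&gt; triple, and the helper names &lt;code&gt;llc_command&lt;/code&gt; and &lt;code&gt;run_suite&lt;/code&gt; are hypothetical. It only checks that each IR file compiles without error, not that the generated code behaves correctly.&lt;/p&gt;

```python
import subprocess
from pathlib import Path


def llc_command(ir_file, out_dir, llc="llc", triple="sampo-unknown-none"):
    """Build the llc invocation that compiles one IR file to an object file.

    The triple is an assumption taken from the Rust target name in this post.
    """
    out_file = Path(out_dir) / (Path(ir_file).stem + ".o")
    return [llc, f"-mtriple={triple}", "-filetype=obj",
            "-o", str(out_file), str(ir_file)]


def run_suite(test_dir, out_dir, run=subprocess.run):
    """Compile every .ll file in test_dir and return the names that failed.

    The run parameter exists so the loop can be exercised without llc installed.
    """
    failures = []
    for ir_file in sorted(Path(test_dir).glob("*.ll")):
        result = run(llc_command(ir_file, out_dir),
                     capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(ir_file.name)
    return failures
```

&lt;p&gt;Calling &lt;code&gt;run_suite("tests", "build")&lt;/code&gt; after each fix would return an empty list when every file still compiles. The directory names are placeholders, and the output directory must already exist.&lt;/p&gt;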
&lt;h4&gt;4. AI-Assisted Development Changes Everything&lt;/h4&gt;
&lt;p&gt;Traditional LLVM backend development requires months of ramp-up time just to understand the codebase. Claude Code's ability to explain concepts, generate boilerplate, and debug issues compressed this dramatically. The key is knowing what questions to ask and validating the outputs.&lt;/p&gt;
&lt;h4&gt;5. LLVM's Abstractions Are Worth It&lt;/h4&gt;
&lt;p&gt;Despite the complexity, LLVM's abstractions pay dividends. Register allocation, instruction scheduling, and numerous optimizations come for free. A hand-written code generator would take years to match this quality.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;With Rust compiling for Sampo, several exciting possibilities open up:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Operating System Development&lt;/strong&gt;: Sampo now has enough tooling to write a simple operating system. A minimal kernel with task switching, memory management, and device drivers becomes feasible. Rust's ownership model could make this a particularly safe OS, even on a minimal 16-bit platform.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Language Ports&lt;/strong&gt;: Since we implemented an LLVM backend (not just Rust support), Clang should work with minimal additional effort. C and C++ for Sampo would enable porting existing retrocomputing software. Imagine &lt;a href="https://baud.rs/3YiduS"&gt;CP/M&lt;/a&gt; utilities or classic games recompiled for modern Sampo hardware.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Hardware Verification&lt;/strong&gt;: Running Rust-generated code on the FPGA implementation will provide end-to-end validation of both the hardware and software toolchains. Any discrepancy between the emulator and hardware would become immediately visible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Educational Materials&lt;/strong&gt;: A complete, working compiler toolchain for a simple CPU is valuable for teaching. Students can trace code from high-level Rust through every compilation stage to final execution. The relative simplicity of a 16-bit architecture makes the concepts accessible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Performance Optimization&lt;/strong&gt;: The current backend generates correct code, but there's room for improvement. Instruction scheduling, better register allocation hints, and peephole optimizations could improve code density and speed.&lt;/p&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;Building an LLVM backend for a custom CPU is one of those projects that sounds impossible until you're in the middle of it, then sounds impossible again when you hit your third cryptic linker error at 2 AM. But it's achievable, especially with modern AI-assisted development tools.&lt;/p&gt;
&lt;p&gt;The Sampo project now spans:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Architecture design&lt;/strong&gt;: A clean 16-bit RISC with Z80-inspired features&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware implementation&lt;/strong&gt;: Verilog RTL targeting an ECP5 FPGA (need to order hardware first!)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Assembler and emulator&lt;/strong&gt;: Written in Rust, fully functional&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;LLVM backend&lt;/strong&gt;: Complete GlobalISel-based code generator&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Rust support&lt;/strong&gt;: &lt;code&gt;libcore&lt;/code&gt;, &lt;code&gt;liballoc&lt;/code&gt;, and &lt;code&gt;compiler_builtins&lt;/code&gt; for &lt;code&gt;sampo-unknown-none&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From Finnish mythology, the &lt;a href="https://baud.rs/GbaMVL"&gt;Sampo&lt;/a&gt; was a magical mill that produced endless riches. Our Sampo is more modest; it just produces machine code. But there's something magical about typing &lt;code&gt;cargo build --target sampo-unknown-none&lt;/code&gt; and watching a high-level language compile down to instructions for a CPU that didn't exist a few months ago.&lt;/p&gt;
&lt;p&gt;The complete source code is available on GitHub:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/GCQDRa"&gt;llvm-sampo&lt;/a&gt;&lt;/strong&gt; - The LLVM backend and Rust target specification&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/r74wA8"&gt;sampo&lt;/a&gt;&lt;/strong&gt; - CPU architecture, assembler, emulator, and FPGA RTL&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Whether you're interested in compiler development, CPU design, or just want to see how deep the rabbit hole goes, I hope this series has been illuminating.&lt;/p&gt;
&lt;h3&gt;Recommended Books&lt;/h3&gt;
&lt;p&gt;If you're interested in learning more about LLVM, Rust, or computer architecture, these books are excellent resources:&lt;/p&gt;
&lt;h4&gt;LLVM &amp;amp; Compiler Development&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/cpTnhU"&gt;Learn LLVM 17&lt;/a&gt; by Kai Nacke - Comprehensive guide to LLVM internals, including backend development&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/N610Db"&gt;LLVM Techniques, Tips, and Best Practices&lt;/a&gt; by Min-Yih Hsu - Practical patterns for working with LLVM&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/YtLncr"&gt;LLVM Code Generation&lt;/a&gt; - Focused coverage of code generation, instruction selection, and register allocation&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Rust Programming&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/kAPJDa"&gt;&lt;em&gt;The Rust Programming Language, 3rd Edition&lt;/em&gt;&lt;/a&gt; by Steve Klabnik &amp;amp; Carol Nichols - The definitive Rust guide, updated for 2024&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/R1fDfb"&gt;&lt;em&gt;Programming Rust, 2nd Edition&lt;/em&gt;&lt;/a&gt; by Jim Blandy et al. - Deep dive into Rust's systems programming capabilities&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Computer Architecture&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/Y0TnVh"&gt;Computer Architecture: A Quantitative Approach&lt;/a&gt; by Hennessy &amp;amp; Patterson - The classic text on CPU design&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/UUtBki"&gt;Digital Design and Computer Architecture&lt;/a&gt; by Harris &amp;amp; Harris - From gates to processors, excellent for CPU design&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/pbDcC6"&gt;The RISC-V Reader&lt;/a&gt; - Modern RISC architecture principles (many Sampo design decisions were informed by RISC-V)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Source Code&lt;/h3&gt;
&lt;p&gt;All code is available under open source licenses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/GCQDRa"&gt;github.com/ajokela/llvm-sampo&lt;/a&gt;&lt;/strong&gt; - LLVM backend (Apache 2.0 + LLVM Exception)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/r74wA8"&gt;github.com/ajokela/sampo&lt;/a&gt;&lt;/strong&gt; - Assembler, emulator, FPGA RTL&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Acknowledgments&lt;/h3&gt;
&lt;p&gt;This project wouldn't have been possible without the LLVM community's extensive documentation and the examples provided by existing backends. The &lt;a href="https://baud.rs/s53YsX"&gt;MSP430&lt;/a&gt;, &lt;a href="https://baud.rs/ASLcbC"&gt;AVR&lt;/a&gt;, and &lt;a href="https://baud.rs/1SpI9N"&gt;RISC-V&lt;/a&gt; backends were particularly useful references for handling small word sizes.&lt;/p&gt;
&lt;p&gt;Claude Code, developed by Anthropic, was instrumental in navigating LLVM's complexity. While AI-assisted development is sometimes viewed skeptically, this project demonstrates its potential for tackling genuinely difficult engineering challenges. The key is treating AI as a collaborator rather than a replacement; it accelerates the mechanical aspects while humans provide direction and judgment.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This is Part 3 of the Sampo series. &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1&lt;/a&gt; covers the architecture design, and &lt;a href="https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html"&gt;Part 2&lt;/a&gt; covers the FPGA implementation.&lt;/em&gt;&lt;/p&gt;</description><category>ai-assisted development</category><category>claude code</category><category>code generation</category><category>compiler</category><category>globalisel</category><category>llvm</category><category>retrocomputing</category><category>risc</category><category>rust</category><category>sampo</category><guid>https://tinycomputers.io/posts/sampo-llvm-backend-rust-compiler.html</guid><pubDate>Wed, 04 Feb 2026 16:00:00 GMT</pubDate></item><item><title>Part 2: Implementing Sampo on the ULX3S FPGA</title><link>https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;p&gt;After designing the &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Sampo RISC architecture&lt;/a&gt; on paper (complete with a working assembler and emulator) it's time to bring it to life in silicon. Or at least, in programmable logic. This post documents the hardware selection and implementation planning for synthesizing Sampo on an FPGA.&lt;/p&gt;
&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/sampo-fpga-implementation-ulx3s_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;7 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;The Story So Far&lt;/h3&gt;
&lt;p&gt;If you haven't read &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1 of this series&lt;/a&gt;, here's the quick version: Sampo is a 16-bit RISC CPU designed to bridge the gap between clean RISC design principles and Z80-friendly features. It has 16 general-purpose registers, ~66 instructions, port-based I/O, block operations (LDIR, LDDR), alternate registers for fast interrupt handling, and hardware multiply/divide.&lt;/p&gt;
&lt;p&gt;The project already includes working tools written in Rust:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;sasm&lt;/strong&gt; - A full assembler&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;semu&lt;/strong&gt; - An emulator with TUI debugger (step, breakpoints, memory inspection)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And for hardware implementation, we now have two complete RTL implementations:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Amaranth HDL&lt;/strong&gt; (&lt;code&gt;/rtl/&lt;/code&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cpu.py&lt;/code&gt;, &lt;code&gt;alu.py&lt;/code&gt;, &lt;code&gt;decode.py&lt;/code&gt;, &lt;code&gt;regfile.py&lt;/code&gt;, &lt;code&gt;soc.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Python-based, excellent for rapid iteration&lt;/li&gt;
&lt;li&gt;Generates Verilog for synthesis&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;AI-Assisted Hand-Written Verilog&lt;/strong&gt; (&lt;code&gt;/verilog/rtl/&lt;/code&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cpu.v&lt;/code&gt;, &lt;code&gt;alu.v&lt;/code&gt;, &lt;code&gt;decode.v&lt;/code&gt;, &lt;code&gt;regfile.v&lt;/code&gt;, &lt;code&gt;shifter.v&lt;/code&gt;, &lt;code&gt;uart.v&lt;/code&gt;, &lt;code&gt;ram.v&lt;/code&gt;, &lt;code&gt;soc.v&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Readable, portable, works with any toolchain&lt;/li&gt;
&lt;li&gt;Includes testbenches for Icarus Verilog and Verilator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now it's time to synthesize it to real hardware.&lt;/p&gt;
&lt;h3&gt;Choosing an FPGA Platform&lt;/h3&gt;
&lt;p&gt;The FPGA world is split between proprietary toolchains (Xilinx Vivado, Intel Quartus) and the growing open source ecosystem. For a project like Sampo, where understanding every layer of the stack matters, open source tooling is the clear choice.&lt;/p&gt;
&lt;h4&gt;Open Source FPGA Options&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FPGA Family&lt;/th&gt;
&lt;th&gt;Capacity&lt;/th&gt;
&lt;th&gt;Toolchain&lt;/th&gt;
&lt;th&gt;Maturity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gowin GW1N/GW2A&lt;/td&gt;
&lt;td&gt;1K-55K LUTs&lt;/td&gt;
&lt;td&gt;Project Apicula&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lattice iCE40&lt;/td&gt;
&lt;td&gt;1K-8K LUTs&lt;/td&gt;
&lt;td&gt;Project IceStorm&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lattice ECP5&lt;/td&gt;
&lt;td&gt;12K-85K LUTs&lt;/td&gt;
&lt;td&gt;Project Trellis&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Xilinx 7-series&lt;/td&gt;
&lt;td&gt;10K-200K+ LUTs&lt;/td&gt;
&lt;td&gt;Project X-Ray (partial)&lt;/td&gt;
&lt;td&gt;Experimental&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The basic Sampo CPU is estimated at &lt;strong&gt;~1,500-2,500 LUTs&lt;/strong&gt;, so even the smaller FPGAs have more than enough capacity. But if we want room to grow (adding caches, more peripherals, maybe even multi-core experiments), a larger device makes sense.&lt;/p&gt;
&lt;h3&gt;The ULX3S Board&lt;/h3&gt;
&lt;p&gt;The &lt;a href="https://baud.rs/Ij7oaR"&gt;ULX3S&lt;/a&gt; is an open hardware development board built around the ECP5 FPGA. It's designed by &lt;a href="https://baud.rs/v9aiPd"&gt;Radiona.org&lt;/a&gt; and has become a popular choice for open source FPGA development.&lt;/p&gt;
&lt;h4&gt;Specifications&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Specification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FPGA&lt;/td&gt;
&lt;td&gt;Lattice ECP5 (LFE5U-85F/45F/12F-6BG381C)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LUTs&lt;/td&gt;
&lt;td&gt;12K / 44K / 84K (depending on variant)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;USB&lt;/td&gt;
&lt;td&gt;FTDI FT231XS (500 kbit/s JTAG, 3 Mbit/s serial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPIO&lt;/td&gt;
&lt;td&gt;56 pins (28 differential pairs), PMOD-compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM&lt;/td&gt;
&lt;td&gt;32 MB SDRAM @ 166 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flash&lt;/td&gt;
&lt;td&gt;4-16 MB Quad-SPI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;microSD slot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LEDs&lt;/td&gt;
&lt;td&gt;11 total (8 user, 2 USB, 1 WiFi)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Buttons&lt;/td&gt;
&lt;td&gt;7 (4 direction, 2 fire, 1 power)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio&lt;/td&gt;
&lt;td&gt;3.5mm jack (stereo + digital/composite)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video&lt;/td&gt;
&lt;td&gt;GPDI (HDMI-compatible) with level shifter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display&lt;/td&gt;
&lt;td&gt;Header for 0.96" SPI OLED (SSD1331)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wireless&lt;/td&gt;
&lt;td&gt;ESP32-WROOM-32 (WiFi/Bluetooth, standalone JTAG)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADC&lt;/td&gt;
&lt;td&gt;8 channels, 12-bit, 1 MS/s (MAX11125)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clock&lt;/td&gt;
&lt;td&gt;25 MHz onboard, differential input available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Power&lt;/td&gt;
&lt;td&gt;3 switching regulators (1.1V, 2.5V, 3.3V)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sleep&lt;/td&gt;
&lt;td&gt;5 µA standby, RTC wake-up with battery backup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dimensions&lt;/td&gt;
&lt;td&gt;94mm × 51mm&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h4&gt;Why ULX3S for Sampo&lt;/h4&gt;
&lt;p&gt;The ULX3S isn't just an FPGA breakout board; it's a complete system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;32MB SDRAM&lt;/strong&gt;: Real memory, not just block RAM. Essential for running actual programs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HDMI output&lt;/strong&gt;: Video terminal without external hardware.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;microSD slot&lt;/strong&gt;: Load programs, implement a filesystem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ESP32 co-processor&lt;/strong&gt;: WiFi-based JTAG debugging from any device.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Buttons and LEDs&lt;/strong&gt;: Instant I/O for testing without wiring anything.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audio output&lt;/strong&gt;: Even supports composite video through the audio jack.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Budget Alternative: Tang Nano 9K&lt;/h3&gt;
&lt;p&gt;Before we dive into the ULX3S, it's worth mentioning a much cheaper option. The &lt;strong&gt;Tang Nano 9K&lt;/strong&gt; (~$15 on AliExpress) uses a Gowin GW1NR-9 FPGA with 8,640 LUTs, more than enough for Sampo:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;8,640 LUTs&lt;/li&gt;
&lt;li&gt;64 Mbit (8 MB) PSRAM, enough to cover the full 64KB address space and then some&lt;/li&gt;
&lt;li&gt;HDMI output for a video terminal&lt;/li&gt;
&lt;li&gt;USB-C programming&lt;/li&gt;
&lt;li&gt;Fully supported by open-source toolchain (Yosys + nextpnr-gowin)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For initial development and testing, the Tang Nano 9K is hard to beat on price. But the ULX3S offers more I/O, more RAM, and a richer peripheral set, making it the better choice for a more complete Sampo system.&lt;/p&gt;
&lt;h3&gt;LUT Budget Planning&lt;/h3&gt;
&lt;p&gt;The Sampo RTL implementation is designed to be compact. Here's the resource breakdown:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Estimated LUTs (FFs where noted)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;16 × 16-bit registers&lt;/td&gt;
&lt;td&gt;~256 FFs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ALU (16-bit)&lt;/td&gt;
&lt;td&gt;200 - 400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control logic&lt;/td&gt;
&lt;td&gt;500 - 1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instruction decode&lt;/td&gt;
&lt;td&gt;300 - 500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sampo CPU core&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1,500 - 2,500&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UART (115200 baud)&lt;/td&gt;
&lt;td&gt;200 - 300&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPI controller (SD card)&lt;/td&gt;
&lt;td&gt;300 - 500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPIO controller&lt;/td&gt;
&lt;td&gt;200 - 400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Basic system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~2,500 - 4,000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDRAM controller&lt;/td&gt;
&lt;td&gt;500 - 1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instruction cache&lt;/td&gt;
&lt;td&gt;1,000 - 2,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data cache&lt;/td&gt;
&lt;td&gt;1,000 - 2,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Full system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~6,000 - 10,000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;These estimates are based on typical RISC CPU implementations. The actual numbers will depend on optimization choices and synthesis settings.&lt;/p&gt;
&lt;h4&gt;Variant Recommendations&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;12K LUTs&lt;/strong&gt; (ULX3S-12F): Plenty for basic Sampo + peripherals, tight for caches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;44K LUTs&lt;/strong&gt; (ULX3S-45F): Comfortable. Full CPU with cache, room for experiments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;84K LUTs&lt;/strong&gt; (ULX3S-85F): Luxurious. Multi-core experiments, extensive peripherals.&lt;/li&gt;
&lt;/ul&gt;
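&lt;p&gt;To make these recommendations concrete, the following sketch divides the upper-bound subtotals from the budget table by each variant's approximate LUT count from the specifications table. It treats "12K" as 12,000 LUTs and so on, which is a simplification, and the function names are illustrative only.&lt;/p&gt;

```python
# Upper-bound LUT estimates, taken from the subtotal rows of the budget table.
SYSTEM_LUTS = {
    "CPU core": 2500,
    "Basic system": 4000,
    "Full system": 10000,
}

# Approximate LUT capacity per ULX3S variant (assumes 1K = 1,000 LUTs).
VARIANT_LUTS = {
    "12F": 12000,
    "45F": 44000,
    "85F": 84000,
}


def utilization_percent(luts_needed, luts_available):
    """Return the share of the device consumed, rounded to a whole percent."""
    return round(100 * luts_needed / luts_available)


def report():
    """Build one summary line per (variant, system) pair."""
    lines = []
    for variant, capacity in VARIANT_LUTS.items():
        for system, luts in SYSTEM_LUTS.items():
            pct = utilization_percent(luts, capacity)
            lines.append(f"{variant}: {system} uses about {pct}% of {capacity} LUTs")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

&lt;p&gt;Under these assumptions the worst-case full system takes roughly 83% of the 12F, which is why caches are a tight fit there, while the same design occupies well under a quarter of the 45F.&lt;/p&gt;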
&lt;h3&gt;Toolchain Setup&lt;/h3&gt;
&lt;p&gt;The ECP5 toolchain is fully open source:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# macOS (Homebrew)&lt;/span&gt;
brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;yosys&lt;span class="w"&gt; &lt;/span&gt;nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;ecpprog&lt;span class="w"&gt; &lt;/span&gt;fujprog

&lt;span class="c1"&gt;# Ubuntu/Debian&lt;/span&gt;
apt&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;yosys&lt;span class="w"&gt; &lt;/span&gt;nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;ecpprog

&lt;span class="c1"&gt;# Amaranth HDL (for our existing RTL)&lt;/span&gt;
pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;amaranth&lt;span class="w"&gt; &lt;/span&gt;amaranth-boards

&lt;span class="c1"&gt;# Or build FPGA tools from source for latest features&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/yosys
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/nextpnr
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/prjtrellis
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Tool Roles&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amaranth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Python-based HDL (generates Verilog)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Yosys&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Verilog synthesis (RTL → netlist)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;nextpnr-ecp5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Place and route (netlist → textual config)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Project Trellis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ECP5 bitstream documentation and tools, including &lt;code&gt;ecppack&lt;/code&gt; (config → bitstream)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ecpprog/fujprog&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Upload bitstream to board&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h4&gt;Amaranth Build Flow&lt;/h4&gt;
&lt;p&gt;For the Amaranth version of Sampo's RTL, the build flow starts with Python:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Generate Verilog from Amaranth&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;rtl/
python&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;amaranth&lt;span class="w"&gt; &lt;/span&gt;generate&lt;span class="w"&gt; &lt;/span&gt;soc.py&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;sampo.v

&lt;span class="c1"&gt;# Then synthesize with standard tools&lt;/span&gt;
yosys&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"synth_ecp5 -top sampo_soc -json sampo.json"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;sampo.v
nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;--85k&lt;span class="w"&gt; &lt;/span&gt;--package&lt;span class="w"&gt; &lt;/span&gt;CABGA381&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;--lpf&lt;span class="w"&gt; &lt;/span&gt;ulx3s.lpf&lt;span class="w"&gt; &lt;/span&gt;--json&lt;span class="w"&gt; &lt;/span&gt;sampo.json&lt;span class="w"&gt; &lt;/span&gt;--textcfg&lt;span class="w"&gt; &lt;/span&gt;sampo.config
ecppack&lt;span class="w"&gt; &lt;/span&gt;sampo.config&lt;span class="w"&gt; &lt;/span&gt;sampo.bit

&lt;span class="c1"&gt;# Program the board&lt;/span&gt;
fujprog&lt;span class="w"&gt; &lt;/span&gt;sampo.bit
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Hand-Written Verilog Implementation&lt;/h4&gt;
&lt;p&gt;In addition to the Amaranth RTL, we now have a complete AI-assisted, hand-written Verilog implementation at &lt;code&gt;/verilog/&lt;/code&gt;. While Amaranth can generate Verilog, the auto-generated output isn't particularly readable. The hand-written version is designed for clarity and portability:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;verilog&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;rtl&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sampo_pkg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vh&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Opcodes, constants, state definitions&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;alu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# 16-bit ALU with all operations&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shifter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# Barrel shifter (1/4/8-bit shifts, rotates)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;regfile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# 16 registers + alternate set (EXX)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;# Instruction decoder&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# FSM-based CPU core (8 states)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ram&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# 64KB synchronous RAM&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;uart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;# Simple UART for serial I/O&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;soc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# Top-level SoC integration&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tb&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;alu_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;# ALU unit tests&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;regfile_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Register file tests&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sampo_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="c1"&gt;# Full system testbench&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;programs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hex&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# Test program in Verilog hex format&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Makefile&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="c1"&gt;# Build automation&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;bin2hex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;# Convert sasm output to Verilog $readmemh format&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
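&lt;p&gt;The &lt;code&gt;bin2hex.py&lt;/code&gt; step listed in the tree above turns assembler output into text that &lt;code&gt;$readmemh&lt;/code&gt; can load. The script itself isn't reproduced in this post, so what follows is only a minimal sketch of the idea, assuming a byte-wide memory array with one two-digit hex value per line. The function name &lt;code&gt;bytes_to_readmemh&lt;/code&gt; and the sample bytes are hypothetical; the real script may use a different word width or layout.&lt;/p&gt;

```python
def bytes_to_readmemh(data, origin=0):
    """Render raw bytes as text loadable by Verilog's $readmemh.

    Assumes a byte-wide memory array (an assumption, not taken from
    the Sampo sources). `origin` is an optional load address emitted
    as an @-prefixed hex address line.
    """
    lines = []
    if origin != 0:
        lines.append("@{:04X}".format(origin))
    for value in bytes(data):
        lines.append("{:02X}".format(value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two arbitrary bytes placed at 0x0100, the start PC reported by the
    # testbench; the byte values themselves are illustrative only.
    print(bytes_to_readmemh(b"\x12\xab", origin=0x0100), end="")
```

&lt;p&gt;Keeping the conversion in a small standalone script means the Verilog testbench only ever sees plain text, whatever the assembler's binary format looks like.&lt;/p&gt;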

&lt;p&gt;The Verilog implementation uses an 8-state FSM for the CPU: RESET → FETCH → FETCH_EXT → DECODE → EXECUTE → MEMORY → WRITEBACK → HALTED. This makes timing predictable and debugging straightforward.&lt;/p&gt;
&lt;h4&gt;Simulation with Icarus Verilog&lt;/h4&gt;
&lt;p&gt;The Verilog implementation includes a complete Makefile for testing:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;verilog/

&lt;span class="c1"&gt;# Run the main simulation (hello world)&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;

&lt;span class="c1"&gt;# Run ALU unit tests&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;test-alu

&lt;span class="c1"&gt;# Run register file tests&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;test-regfile

&lt;span class="c1"&gt;# Build with Verilator (faster simulation)&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;verilate

&lt;span class="c1"&gt;# View waveforms in GTKWave&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;wave
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Sample output from &lt;code&gt;make test&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;=== Sampo CPU Testbench ===&lt;/span&gt;
&lt;span class="c"&gt;RAM init file: &lt;/span&gt;&lt;span class="nt"&gt;..&lt;/span&gt;&lt;span class="c"&gt;/programs/hello&lt;/span&gt;&lt;span class="nt"&gt;.&lt;/span&gt;&lt;span class="c"&gt;hex&lt;/span&gt;

&lt;span class="c"&gt;CPU started at PC=0x0100&lt;/span&gt;
&lt;span class="c"&gt;UART output:&lt;/span&gt;
&lt;span class="nb"&gt;----------------------------------------&lt;/span&gt;
&lt;span class="c"&gt;Hello&lt;/span&gt;&lt;span class="nt"&gt;,&lt;/span&gt;&lt;span class="c"&gt; Sampo!&lt;/span&gt;
&lt;span class="nb"&gt;----------------------------------------&lt;/span&gt;

&lt;span class="c"&gt;Simulation complete:&lt;/span&gt;
&lt;span class="c"&gt;  Final PC:    0x011E&lt;/span&gt;
&lt;span class="c"&gt;  Cycles:      847&lt;/span&gt;
&lt;span class="c"&gt;  UART chars:  14&lt;/span&gt;
&lt;span class="c"&gt;  Status:      HALTED&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Verilog version is portable to any FPGA toolchain (Xilinx, Intel, Lattice, Gowin) without requiring Amaranth or Python in the build chain.&lt;/p&gt;
&lt;h3&gt;Implementation Roadmap&lt;/h3&gt;
&lt;p&gt;With both Amaranth and Verilog implementations complete and tested in simulation, the roadmap is now about bringing them up on hardware.&lt;/p&gt;
&lt;h4&gt;Phase 1: Core Bring-up ✓ (Complete)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;✓ Instruction fetch and decode&lt;/li&gt;
&lt;li&gt;✓ ALU operations (all 16 operations)&lt;/li&gt;
&lt;li&gt;✓ Barrel shifter (1/4/8-bit shifts, rotates, RCL/RCR)&lt;/li&gt;
&lt;li&gt;✓ Register file with alternate set (EXX)&lt;/li&gt;
&lt;li&gt;✓ FSM-based CPU core (8 states)&lt;/li&gt;
&lt;li&gt;✓ RAM interface (64KB)&lt;/li&gt;
&lt;li&gt;✓ UART for serial I/O&lt;/li&gt;
&lt;li&gt;✓ SoC integration&lt;/li&gt;
&lt;li&gt;✓ Testbenches passing (ALU, regfile, full system)&lt;/li&gt;
&lt;li&gt;✓ Hello World runs in simulation&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 1.5: FPGA Bring-up (Current)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;○ ULX3S pin constraints (.lpf file)&lt;/li&gt;
&lt;li&gt;○ Clock setup (PLL from 25MHz)&lt;/li&gt;
&lt;li&gt;○ Map UART to FTDI&lt;/li&gt;
&lt;li&gt;○ LED heartbeat / debug outputs&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 2: Memory System&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;SDRAM controller for 32MB RAM&lt;/li&gt;
&lt;li&gt;Instruction cache (optional but helps timing)&lt;/li&gt;
&lt;li&gt;Basic interrupt handling&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 3: Peripherals&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;SPI controller for SD card boot&lt;/li&gt;
&lt;li&gt;GPIO controller (buttons, LEDs)&lt;/li&gt;
&lt;li&gt;Timer/counter module&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 4: Advanced Features&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Data cache&lt;/li&gt;
&lt;li&gt;MMU for memory protection&lt;/li&gt;
&lt;li&gt;HDMI text console (VGA timing → GPDI)&lt;/li&gt;
&lt;li&gt;ESP32 WiFi integration for wireless debugging&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Recommended Tools &amp;amp; Books&lt;/h3&gt;
&lt;h4&gt;Hardware&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/HBq3zf"&gt;Tang Nano 9K FPGA&lt;/a&gt; - Budget-friendly FPGA board (~$25 on Amazon, ~$15 on AliExpress)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/BYIR58"&gt;USB Logic Analyzer&lt;/a&gt; - Essential for debugging signals (24MHz, 8 channels)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Books&lt;/h4&gt;
&lt;p&gt;If you're new to Verilog or FPGA development, these are excellent starting points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/RGjpAj"&gt;&lt;em&gt;Getting Started with FPGAs&lt;/em&gt;&lt;/a&gt; by Russell Merrick - Beginner-friendly with Verilog and VHDL examples&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/tEyX95"&gt;&lt;em&gt;Programming FPGAs: Getting Started with Verilog&lt;/em&gt;&lt;/a&gt; by Simon Monk - Practical hands-on guide&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/6qfzvC"&gt;&lt;em&gt;Verilog by Example&lt;/em&gt;&lt;/a&gt; by Blaine Readler - Concise reference for working engineers&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/VQxLTd"&gt;Sampo on GitHub&lt;/a&gt; - Full source including assembler, emulator, and RTL&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/JUjA8C"&gt;ULX3S GitHub&lt;/a&gt; - Schematics, examples, documentation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/JLKZBr"&gt;Project Trellis&lt;/a&gt; - ECP5 bitstream documentation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/0QCVAC"&gt;Amaranth HDL&lt;/a&gt; - Python-based hardware description&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/xlX31y"&gt;nextpnr&lt;/a&gt; - Place and route tool&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/LZdP4F"&gt;Yosys&lt;/a&gt; - Verilog synthesis&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Where to Buy&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ULX3S:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/NClAGd"&gt;AliExpress&lt;/a&gt; - ~$100-150 depending on variant&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/AQB0Xg"&gt;Mouser&lt;/a&gt; - Official distribution&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/0gTuW6"&gt;CrowdSupply&lt;/a&gt; - Original campaign page&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Tang Nano 9K (budget alternative):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/HBq3zf"&gt;Amazon&lt;/a&gt; - ~$25, faster shipping&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/9G7KR0"&gt;AliExpress&lt;/a&gt; - ~$15, slower shipping&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;p&gt;Next up: Getting our first instructions executing on real hardware. Both the Amaranth and Verilog implementations are ready and tested; Hello World runs in simulation and the testbenches pass. Now it's a matter of pin constraints, clock domains, and debugging the inevitable timing issues.&lt;/p&gt;</description><category>amaranth</category><category>cpu design</category><category>ecp5</category><category>fpga</category><category>hardware</category><category>lattice</category><category>open-source</category><category>risc</category><category>sampo</category><category>ulx3s</category><category>verilog</category><guid>https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html</guid><pubDate>Mon, 02 Feb 2026 18:00:00 GMT</pubDate></item><item><title>My Experience Using Fiverr for Custom PCB Design: A $468 Arduino Giga Shield</title><link>https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;figure&gt;&lt;img src="https://tinycomputers.io/files/arduino-giga-shield-3d-top.png"&gt;&lt;/figure&gt; &lt;p&gt;When I decided to run vintage Z80 code on a modern &lt;a href="https://baud.rs/poSQeo"&gt;Arduino Giga R1&lt;/a&gt;, I hit an immediate roadblock: voltage incompatibility. The &lt;a href="https://baud.rs/87wbBL"&gt;RetroShield Z80&lt;/a&gt; by 8-Bit Force is a fantastic piece of hardware that lets you run a real Zilog Z80 processor on an Arduino, but it's designed for the 5V-tolerant &lt;a href="https://baud.rs/CWPoOM"&gt;Arduino Mega 2560&lt;/a&gt;. The Arduino Giga R1, with its powerful STM32H747 dual-core processor and 76 GPIO pins, operates at 3.3V logic levels and can be permanently damaged by 5V signals.&lt;/p&gt;
&lt;p&gt;The solution? A custom shield with level shifters that could translate between the Giga's 3.3V world and the Z80's 5V domain. Rather than spend weeks learning PCB design software and risking amateur mistakes, I decided to outsource the work to a professional on Fiverr. Here's what that experience was like, including the full cost breakdown and everything I received.&lt;/p&gt;
&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/fiverr-pcb-design-arduino-giga-shield_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;30 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;The Project Requirements&lt;/h3&gt;
&lt;p&gt;My requirements were relatively straightforward on the surface:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Design an Arduino Giga R1-compatible shield (matching the Giga's unique form factor)&lt;/li&gt;
&lt;li&gt;Include bidirectional level shifting between 3.3V and 5V on all relevant GPIO pins&lt;/li&gt;
&lt;li&gt;Provide pass-through headers so the RetroShield Z80 could plug in on top&lt;/li&gt;
&lt;li&gt;Use KiCad for the design (my preferred EDA tool for future modifications)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The Arduino Giga R1 is essentially a larger, more powerful cousin of the Arduino Mega 2560. It shares some pin compatibility but has a different physical layout with additional headers. The shield needed to accommodate all of this while providing level shifting for approximately 70+ digital I/O lines.&lt;/p&gt;
&lt;h3&gt;Why the Arduino Giga R1?&lt;/h3&gt;
&lt;p&gt;You might wonder why I chose the Arduino Giga R1 over other options. The &lt;a href="https://baud.rs/4gVIFO"&gt;Arduino Due&lt;/a&gt;, which I mentioned in my initial message to the designer, was my original consideration. It's also 3.3V logic and has a powerful ARM processor. However, the Giga R1 offers several compelling advantages:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Processing Power&lt;/strong&gt;: The Giga R1's STM32H747 is a dual-core Cortex-M7/M4 running at 480MHz and 240MHz respectively. This dwarfs the Due's 84MHz Cortex-M3. For running Z80 code, this extra headroom means I could potentially implement cycle-accurate emulation or run multiple virtual Z80s simultaneously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Memory&lt;/strong&gt;: The Giga R1 has 2MB of internal flash and 1MB of RAM, plus it supports external memory. The Due has 512KB flash and 96KB RAM. More memory means I can load larger Z80 programs and implement more sophisticated peripherals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Connectivity&lt;/strong&gt;: The Giga R1 includes WiFi and Bluetooth out of the box. Imagine running a Z80 BBS that's actually accessible over the internet, or wireless file transfers to a CP/M system. The possibilities are intriguing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Camera Support&lt;/strong&gt;: The Giga R1 has a camera connector. While seemingly unrelated to Z80 computing, it opens doors for interesting projects like OCR input devices or barcode reading peripherals.&lt;/p&gt;
&lt;p&gt;The trade-off is that the Giga R1's 3.3V logic requires level shifting for any 5V hardware, hence this project. The Mega 2560's native 5V logic made the RetroShield plug-and-play, but I felt the Giga's advantages were worth the additional complexity.&lt;/p&gt;
&lt;h3&gt;Finding a Designer on Fiverr&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://baud.rs/dbDCgR"&gt;Fiverr's&lt;/a&gt; PCB design category has hundreds of sellers ranging from hobbyists charging \$20 to professional engineers charging \$500+. After reviewing portfolios and reading reviews, I found a designer named &lt;a href="https://baud.rs/tkQg41"&gt;Elijah&lt;/a&gt; (username: ekeziah) whose work looked professional and who specifically mentioned KiCad experience. He does PCB design, CAD, and firmware work.&lt;/p&gt;
&lt;p&gt;His base gig was priced reasonably, but I knew custom work like this would require negotiation. I reached out with my requirements:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"I have a relatively straightforward project. I need an Arduino Due/Giga form factor shield that can use level shifters to go from 3.3V to 5V. I have this: [RetroShield link] which is 5V, and instead of using an Arduino Mega 2560, which is 5V tolerant, I want to use either an Arduino Due or Giga which is not tolerant of 5V."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The designer responded quickly and we began negotiating scope and pricing.&lt;/p&gt;
&lt;h3&gt;The Cost Reality Check&lt;/h3&gt;
&lt;p&gt;Let me be transparent about the costs because this is often glossed over in "I made a thing" posts:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Initial Order (January 4, 2026)&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Base price: \$75&lt;/li&gt;
&lt;li&gt;Custom extras negotiated: \$200&lt;/li&gt;
&lt;li&gt;Fiverr service fees: \$108.67&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Subtotal: \$383.67&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The designer initially quoted \$175, then corrected himself, saying it was a typo and that he meant \$275. We settled on \$275 total for the custom work with a 7-10 day timeline. Fiverr's fees added roughly 40% on top of that, or about 28% of the order total.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Revision Order (January 20, 2026)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;After receiving the initial files, I realized I wanted to add version numbering and my website URL to the silkscreen. Since the designer hadn't included the KiCad source files in the first delivery (only Gerber files), I needed him to make the changes and regenerate everything.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Revision price negotiated: \$57&lt;/li&gt;
&lt;li&gt;Fiverr service fees: \$27.96&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Subtotal: \$84.96&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Total Project Cost: \$468.63&lt;/strong&gt;&lt;/p&gt;
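&lt;p&gt;For anyone who wants to check the arithmetic or plug in their own quote, here is a small sketch that recomputes the subtotals and expresses the fees two ways: as a percentage on top of the designer's price and as a share of the order total. The figures come straight from the two orders above; the dictionary layout and the &lt;code&gt;summarize&lt;/code&gt; helper are just illustrative names.&lt;/p&gt;

```python
# Figures copied from the two Fiverr orders above (USD).
orders = {
    "initial": {"designer": 75.00 + 200.00, "fees": 108.67},
    "revision": {"designer": 57.00, "fees": 27.96},
}


def summarize(order):
    """Return (subtotal, fees as % of designer price, fees as % of subtotal)."""
    subtotal = order["designer"] + order["fees"]
    on_top = 100 * order["fees"] / order["designer"]
    of_total = 100 * order["fees"] / subtotal
    return round(subtotal, 2), round(on_top, 1), round(of_total, 1)


if __name__ == "__main__":
    total = 0.0
    for name, order in orders.items():
        subtotal, on_top, of_total = summarize(order)
        total += subtotal
        print(f"{name}: subtotal {subtotal:.2f}, fees {on_top}% on top, "
              f"{of_total}% of subtotal")
    print(f"project total: {total:.2f}")
```

&lt;p&gt;Run against these numbers, the fees work out to about 39.5% on top of the designer's price for the initial order (28.3% of its subtotal) and about 49.1% on top for the revision (32.9% of its subtotal).&lt;/p&gt;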
&lt;p&gt;Is this expensive? It depends on your perspective. A professional PCB design service might charge \$100-200 per hour, and this project involved creating a schematic from scratch, laying out a moderately complex board, and generating production files. At those rates, \$468 buys only two to five hours of professional time, while doing it myself would have taken 20-40 hours of learning and work. By either measure the \$468 represents reasonable value, but it's definitely not pocket change for a hobby project.&lt;/p&gt;
&lt;h3&gt;The Design Process&lt;/h3&gt;
&lt;p&gt;Communication happened entirely through Fiverr's messaging system. Here's how the project unfolded:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;January 4-8: Requirements Gathering&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The designer studied the Arduino Giga R1 documentation and the RetroShield Z80 pinout. He asked clarifying questions about whether I needed level shifting on pins 22-53 (the additional digital pins on the Giga's side headers). I confirmed that yes, all pins needed level shifting to ensure full compatibility.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;January 9: Schematic Complete&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Elijah sent the first schematic PDF showing the circuit design. The approach was clean: nine TXB0108PW 8-bit bidirectional level shifter ICs, providing 72 channels of voltage translation. Each level shifter had proper decoupling capacitors and pull-up resistors on the output enable pins.&lt;/p&gt;
&lt;p&gt;The TXB0108 is a popular choice for this application because it's bidirectional; you don't need to specify which direction each pin will operate, making it ideal for GPIO that might be configured as either input or output.&lt;/p&gt;
&lt;p&gt;This is a key design decision worth understanding. Alternative level shifter approaches include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;74LVC245 bus transceivers&lt;/strong&gt;: These require a direction control pin, which adds complexity when GPIO pins change direction dynamically&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simple resistor dividers&lt;/strong&gt;: Work for unidirectional high-to-low shifting, but not bidirectional&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MOSFETs with pull-ups&lt;/strong&gt;: The classic BSS138 approach works well but requires one MOSFET per channel and can be slow&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dedicated level shifter ICs&lt;/strong&gt;: The TXB0108 auto-detects direction and handles both directions at high speed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The designer's choice of TXB0108 was sound; it simplifies the design and ensures the shield will work regardless of how the software configures each GPIO pin.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;January 10: Layout and Routing&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The physical layout came together quickly. The board dimensions are 155mm x 90mm, matching the Arduino Giga R1's footprint with additional space for the level shifter circuitry. The routing was done on a two-layer board, keeping things manufacturable at low-cost PCB fabs.&lt;/p&gt;
&lt;p&gt;The designer sent progress images showing the component placement with the level shifter ICs arranged along the edges of the board, close to their respective pin headers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;January 10: First Delivery&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The initial delivery included:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Gerber files (ready for PCB manufacturing)&lt;/li&gt;
&lt;li&gt;BOM (Bill of Materials) in CSV and Excel formats&lt;/li&gt;
&lt;li&gt;3D rendered images of the board&lt;/li&gt;
&lt;li&gt;Schematic PDF&lt;/li&gt;
&lt;li&gt;Component placement (CPL) file for assembly&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What was missing from this first delivery: the KiCad source files. This became important later.&lt;/p&gt;
&lt;h3&gt;The Revision Request&lt;/h3&gt;
&lt;p&gt;After examining the delivered files, I noticed two things I wanted to change:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Add "v0.1" to the board silkscreen for version tracking&lt;/li&gt;
&lt;li&gt;Add my website URL (https://tinycomputers.io/) for attribution&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These are simple text changes, but without the KiCad source files, I couldn't make them myself. The Gerber files are essentially "compiled" output; you can view them and send them to a fab, but you can't easily edit them.&lt;/p&gt;
&lt;p&gt;I reached out to the designer:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"Is it possible for you to add something to the silkscreen? I would like to add 'v0.1' to the 'ARDUINO GIGA R1 SHIELD', so that line would be 'ARDUINO GIGA R1 SHIELD v0.1'. And then in a smaller font, directly to the right of that above text, I would like 'https://tinycomputers.io/'. I would add these things myself but I am not seeing any KiCAD source files, the only things that open in KiCAD are the Gerber files."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We negotiated \$57 for the revision, which also included finally receiving the full KiCad source files. The revision took about two days, with a quick back-and-forth to remove quotation marks from around the URL that the designer had initially added.&lt;/p&gt;
&lt;h3&gt;What I Received: The Complete Deliverables&lt;/h3&gt;
&lt;p&gt;The final delivery package was comprehensive. Here's everything included:&lt;/p&gt;
&lt;h4&gt;Source Files (SRC_FILES/)&lt;/h4&gt;
&lt;p&gt;The complete KiCad project including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AlexJ_bz_ArduinoGigaShield.kicad_pcb&lt;/code&gt; - PCB layout file&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AlexJ_bz_ArduinoGigaShield.kicad_sch&lt;/code&gt; - Schematic file&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AlexJ_bz_ArduinoGigaShield.kicad_pro&lt;/code&gt; - Project file&lt;/li&gt;
&lt;li&gt;Multiple backup ZIPs showing the design evolution&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Having the source files means I can make future modifications myself: adding features, fixing issues, or creating derivative designs.&lt;/p&gt;
&lt;h4&gt;Gerber Files (GERBER_FILES/)&lt;/h4&gt;
&lt;p&gt;Production-ready files for PCB manufacturing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Front and back copper layers (F_Cu.gbr, B_Cu.gbr)&lt;/li&gt;
&lt;li&gt;Solder mask layers (F_Mask.gbr, B_Mask.gbr)&lt;/li&gt;
&lt;li&gt;Silkscreen layers (F_Silkscreen.gbr, B_Silkscreen.gbr)&lt;/li&gt;
&lt;li&gt;Paste layers for SMD assembly (F_Paste.gbr, B_Paste.gbr)&lt;/li&gt;
&lt;li&gt;Board outline (Edge_Cuts.gbr)&lt;/li&gt;
&lt;li&gt;Drill files (PTH.drl, NPTH.drl)&lt;/li&gt;
&lt;li&gt;Gerber job file for fab house compatibility&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These files can be uploaded directly to JLCPCB, PCBWay, OSH Park, or any other PCB fabrication service.&lt;/p&gt;
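&lt;p&gt;Before uploading, it's worth confirming that nothing is missing from the package. The following is a rough sketch of such a check using the layer names listed above; it assumes the usual KiCad habit of prefixing each output with the project name, so it matches by suffix. The &lt;code&gt;missing_layers&lt;/code&gt; helper is a hypothetical name, not part of any tool.&lt;/p&gt;

```python
import os

# Layer and drill files named in the delivery list above.
EXPECTED_FILES = (
    "F_Cu.gbr", "B_Cu.gbr",
    "F_Mask.gbr", "B_Mask.gbr",
    "F_Silkscreen.gbr", "B_Silkscreen.gbr",
    "F_Paste.gbr", "B_Paste.gbr",
    "Edge_Cuts.gbr",
    "PTH.drl", "NPTH.drl",
)


def missing_layers(filenames):
    """Return the expected suffixes that no file in `filenames` ends with.

    Matching is by suffix because plotted files are assumed to carry a
    project-name prefix in front of the layer name.
    """
    names = list(filenames)
    return [suffix for suffix in EXPECTED_FILES
            if not any(name.endswith(suffix) for name in names)]


if __name__ == "__main__":
    folder = "GERBER_FILES"  # folder name from the delivery; adjust as needed
    if os.path.isdir(folder):
        print(missing_layers(os.listdir(folder)))
```

&lt;p&gt;An empty list means every layer in the list above is present. It says nothing about whether the contents are correct, so a visual pass in a Gerber viewer is still needed.&lt;/p&gt;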
&lt;h4&gt;Bill of Materials (BOM/)&lt;/h4&gt;
&lt;p&gt;Component lists in both CSV and Excel formats:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Reference&lt;/th&gt;
&lt;th&gt;Qty&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;th&gt;Part Number&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;C1-C27&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;0.1uF&lt;/td&gt;
&lt;td&gt;CC0603KRX7R9BB104&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;R1-R9&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;10K&lt;/td&gt;
&lt;td&gt;RC0603FR-0710KL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;U1-U9&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;TXB0108PW&lt;/td&gt;
&lt;td&gt;TXB0108PWR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;J1-J10&lt;/td&gt;
&lt;td&gt;Various&lt;/td&gt;
&lt;td&gt;Pin Headers&lt;/td&gt;
&lt;td&gt;DNP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The "DNP" (Do Not Populate) entries for connectors indicate these would typically be hand-soldered rather than machine-placed, or sourced separately.&lt;/p&gt;
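&lt;p&gt;A quick consistency check on a BOM like this is to expand each reference range and compare it with the stated quantity. The sketch below does that for the three rows with numeric quantities; the helper names are hypothetical, and it assumes ranges are written as a simple first-last pair, as they are in the table above.&lt;/p&gt;

```python
import re

# Rows copied from the BOM table above. The connector row is left out
# because its quantity is listed as "Various".
BOM = [
    ("C1-C27", 27, "0.1uF"),
    ("R1-R9", 9, "10K"),
    ("U1-U9", 9, "TXB0108PW"),
]


def count_refs(ref_range):
    """Count designators in a range such as 'C1-C27' (or a single 'C1')."""
    numbers = [int(n) for n in re.findall(r"\d+", ref_range)]
    first, last = numbers[0], numbers[-1]
    return last - first + 1


def check_bom(rows):
    """Return the rows whose reference range disagrees with the quantity."""
    return [row for row in rows if count_refs(row[0]) != row[1]]


if __name__ == "__main__":
    print(check_bom(BOM))
```

&lt;p&gt;For the rows above, &lt;code&gt;check_bom&lt;/code&gt; returns an empty list, meaning every range matches its quantity. The same numbers also show 27 capacitors and 9 resistors spread across 9 ICs, or three capacitors and one resistor per level shifter.&lt;/p&gt;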
&lt;h4&gt;3D Renders (IMAGES/)&lt;/h4&gt;
&lt;p&gt;Professional-looking 3D renders showing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Top view with the Arduino Giga R1 mounted&lt;/li&gt;
&lt;li&gt;Bottom view showing the routing&lt;/li&gt;
&lt;li&gt;Angled perspective view&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are great for documentation and for visualizing how the final assembly will look.&lt;/p&gt;
&lt;div style="text-align: center; margin: 30px 0;"&gt;
&lt;img src="https://tinycomputers.io/arduino-giga-shield-3d-top.png" alt="Arduino Giga Shield 3D render - top view" style="max-width: 100%; border: 1px solid #ddd; border-radius: 8px;"&gt;
&lt;p style="color: #666; font-size: 12px; margin-top: 10px;"&gt;3D render showing the shield PCB with Arduino Giga R1 mounted (top view)&lt;/p&gt;
&lt;/div&gt;

&lt;div style="text-align: center; margin: 30px 0;"&gt;
&lt;img src="https://tinycomputers.io/arduino-giga-shield-3d-bottom.jpg" alt="Arduino Giga Shield 3D render - bottom view" style="max-width: 100%; border: 1px solid #ddd; border-radius: 8px;"&gt;
&lt;p style="color: #666; font-size: 12px; margin-top: 10px;"&gt;Bottom view showing the PCB routing and through-hole connections&lt;/p&gt;
&lt;/div&gt;

&lt;h4&gt;Schematic PDF (SCH_PDF/)&lt;/h4&gt;
&lt;p&gt;A beautifully laid out schematic showing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;All nine TXB0108PW level shifters with their connections&lt;/li&gt;
&lt;li&gt;Pin mapping from Arduino Giga headers to 5V output headers&lt;/li&gt;
&lt;li&gt;Power distribution (3.3V and 5V rails)&lt;/li&gt;
&lt;li&gt;Decoupling capacitor placement&lt;/li&gt;
&lt;li&gt;Four mounting holes for secure attachment&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can &lt;a href="https://tinycomputers.io/arduino-giga-shield-schematic.pdf"&gt;download the full schematic PDF here&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;Reference Assets (ASSETS/)&lt;/h4&gt;
&lt;p&gt;The designer included reference materials used during the design:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Arduino Giga R1 datasheet (2MB PDF)&lt;/li&gt;
&lt;li&gt;CAD files for the Arduino Giga R1 (ABX00063)&lt;/li&gt;
&lt;li&gt;STEP files for 3D modeling&lt;/li&gt;
&lt;li&gt;DXF file of the Giga R1 outline&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are helpful for understanding the design decisions and for future reference.&lt;/p&gt;
&lt;h4&gt;Component Placement File (CPL_FILE/)&lt;/h4&gt;
&lt;p&gt;A CSV file with X/Y coordinates and rotation for each component, useful if you're having the boards assembled by a fab house rather than hand-soldering.&lt;/p&gt;
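&lt;p&gt;To give a feel for what such a file contains, here is a minimal sketch that parses placement data into a dictionary. The column names (&lt;code&gt;Designator&lt;/code&gt;, &lt;code&gt;Mid X&lt;/code&gt;, &lt;code&gt;Mid Y&lt;/code&gt;, &lt;code&gt;Layer&lt;/code&gt;, &lt;code&gt;Rotation&lt;/code&gt;) are an assumption based on a layout commonly requested by assembly services, and the sample rows are made up; the delivered file's header and coordinates may differ.&lt;/p&gt;

```python
import csv
import io

# Hypothetical sample rows. Both the header and the coordinates are
# assumptions for illustration, not values from the delivered file.
SAMPLE_CPL = """Designator,Mid X,Mid Y,Layer,Rotation
U1,12.70,45.20,top,90
C1,10.10,44.00,top,0
"""


def load_placements(text):
    """Parse CPL text into {designator: (x_mm, y_mm, rotation_deg)}."""
    placements = {}
    for row in csv.DictReader(io.StringIO(text)):
        placements[row["Designator"]] = (
            float(row["Mid X"]),
            float(row["Mid Y"]),
            float(row["Rotation"]),
        )
    return placements


if __name__ == "__main__":
    for ref, position in load_placements(SAMPLE_CPL).items():
        print(ref, position)
```

&lt;p&gt;Loading the file this way makes it easy to cross-check that every designator in the BOM also has a placement entry before sending both to an assembler.&lt;/p&gt;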
&lt;h3&gt;The Circuit Design&lt;/h3&gt;
&lt;p&gt;Looking at the schematic, the design is elegant in its simplicity. Each TXB0108PW provides 8 channels of bidirectional level shifting. With nine of these ICs, the design provides 72 channels, enough to cover the 70 or so digital I/O lines that needed translation.&lt;/p&gt;
&lt;p&gt;Key design elements:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Level Shifters&lt;/strong&gt;: The TXB0108PW is an 8-bit bidirectional voltage-level translator. It automatically detects the signal direction, making it perfect for GPIO that might be configured as either input or output at runtime. The A-side connects to the 3.3V Arduino Giga pins, and the B-side connects to the 5V RetroShield pins.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Decoupling Capacitors&lt;/strong&gt;: Each level shifter has 0.1µF ceramic capacitors on both the 3.3V (VCCA) and 5V (VCCB) power pins. This is standard practice to filter high-frequency noise and ensure stable operation. The BOM lists 27 capacitors for the nine ICs, an average of three per level shifter, so the power rails should be rock-solid.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Output Enable Pull-ups&lt;/strong&gt;: Each TXB0108 has a 10K pull-up resistor on the OE (Output Enable) pin, tying it to 3.3V. This ensures the level shifters are always active when the Arduino is powered.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Pass-Through Headers&lt;/strong&gt;: The design includes matching pin headers on both the 3.3V and 5V sides. The Arduino Giga plugs into female headers on the bottom, while the RetroShield (or other 5V shields) can plug into male headers on top.&lt;/p&gt;
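&lt;p&gt;The channel arithmetic behind the nine-IC choice is easy to sketch. The snippet below computes how many 8-channel translators a given number of signal lines needs, and maps a signal index to an IC and channel. The sequential assignment in &lt;code&gt;locate&lt;/code&gt; is purely an assumption for illustration; the actual schematic may group pins differently.&lt;/p&gt;

```python
import math

CHANNELS_PER_IC = 8  # each TXB0108 translates eight lines


def shifters_needed(signal_lines):
    """Number of 8-channel translators needed for `signal_lines` signals."""
    return math.ceil(signal_lines / CHANNELS_PER_IC)


def locate(line_index):
    """Map a 0-based signal index to (IC number from 1, channel from 1).

    Assumes signals are assigned to ICs in simple sequential order,
    which may not match the real schematic.
    """
    ic, channel = divmod(line_index, CHANNELS_PER_IC)
    return ic + 1, channel + 1


if __name__ == "__main__":
    print(shifters_needed(70), shifters_needed(70) * CHANNELS_PER_IC)
    print(locate(0), locate(71))
```

&lt;p&gt;With 70 lines this gives nine ICs and 72 channels, which matches the design. Anything beyond 72 lines would require a tenth part.&lt;/p&gt;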
&lt;h3&gt;Lessons Learned&lt;/h3&gt;
&lt;p&gt;After going through this process, here's what I'd do differently or recommend to others:&lt;/p&gt;
&lt;h4&gt;1. Specify Source Files Upfront&lt;/h4&gt;
&lt;p&gt;Make it explicit in your initial requirements that you need the original design files (KiCad, Altium, Eagle, etc.), not just Gerber output files. This saves the cost and hassle of a revision later. Many designers consider source files an "extra" unless you ask for them.&lt;/p&gt;
&lt;h4&gt;2. Include Silkscreen Details Early&lt;/h4&gt;
&lt;p&gt;Think about what text you want on the board before the design starts. Version numbers, URLs, logos, regulatory markings: all of these are easy to add during initial design but require regenerating all files if added later.&lt;/p&gt;
&lt;h4&gt;3. Budget for Fiverr's Fees&lt;/h4&gt;
&lt;p&gt;On my two orders, Fiverr's fees added roughly 40-50% on top of the designer's price, or about 28-33% of each order total. When negotiating with a designer, account for this in your mental budget. My \$275 job became \$383.67 after fees.&lt;/p&gt;
&lt;h4&gt;4. Communicate Frequently&lt;/h4&gt;
&lt;p&gt;Don't disappear for days at a time. Quick responses keep the project moving and help catch misunderstandings early. The designer asked good clarifying questions; make sure you answer them thoroughly.&lt;/p&gt;
&lt;h4&gt;5. Review Carefully Before Approving&lt;/h4&gt;
&lt;p&gt;Take time to review delivered files carefully. Open the Gerbers in a viewer (KiCad has a built-in Gerber viewer, or use an online tool), check the schematic for obvious errors, verify the BOM has the right components. It's much cheaper to catch issues before ordering PCBs.&lt;/p&gt;
&lt;h3&gt;Alternatives to Fiverr&lt;/h3&gt;
&lt;p&gt;Before deciding on Fiverr, I considered several alternatives:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;DIY with KiCad&lt;/strong&gt;: The open-source route. KiCad is free, powerful, and has excellent documentation. However, PCB design has a steep learning curve. Understanding design rules, proper trace widths, via sizes, clearances, and manufacturing constraints takes time. For a one-off project, the learning investment didn't seem justified.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Upwork or Other Freelance Platforms&lt;/strong&gt;: Similar to Fiverr but often with higher prices and a more traditional freelancer relationship. Upwork tends to attract more experienced (and expensive) engineers. For a small project like this, Fiverr's fixed-price gig format seemed more appropriate.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Local EE Students/Engineers&lt;/strong&gt;: Universities often have engineering students looking for small projects. This can be cheaper, but finding someone and managing the relationship takes effort. You also don't have the platform protections that Fiverr offers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PCB Design Services&lt;/strong&gt;: Companies like PCBWay and JLCPCB offer design services alongside manufacturing. These can be convenient but pricing varies widely and communication can be challenging across language barriers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Open Source Existing Designs&lt;/strong&gt;: Sometimes you can find an existing design that's close to what you need. I looked for Arduino Giga shields with level shifting but found nothing. The Giga R1 is relatively new and its unique form factor means fewer compatible shields exist.&lt;/p&gt;
&lt;p&gt;Fiverr won because of its accessibility, fixed pricing model, and the portfolio/review system that let me evaluate designers before committing.&lt;/p&gt;
&lt;h3&gt;Was It Worth It?&lt;/h3&gt;
&lt;p&gt;For my situation, absolutely. I have a professional-quality PCB design that I can manufacture, modify, and iterate on. The alternative was spending 20-40 hours learning PCB design from scratch and likely making beginner mistakes that could damage expensive hardware.&lt;/p&gt;
&lt;p&gt;The \$468 total is significant for a hobby project, but context matters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The Arduino Giga R1 costs \$90&lt;/li&gt;
&lt;li&gt;The RetroShield Z80 costs \$65&lt;/li&gt;
&lt;li&gt;PCB manufacturing will add another \$20-50 depending on quantity&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The total investment for this Z80-on-Giga project will be around \$650-700 including the shield design. That's real money, but for a unique piece of hardware that lets me run authentic Z80 code on a modern microcontroller with WiFi, Bluetooth, and a camera interface, it feels worthwhile.&lt;/p&gt;
&lt;h3&gt;A Note on Maker Culture&lt;/h3&gt;
&lt;p&gt;I'm aware that hiring someone to design a circuit board runs counter to the ethos of Maker Culture. There's something deeply satisfying about designing, building, and debugging your own hardware, learning from mistakes, understanding every trace and component choice, and earning that sense of accomplishment that comes from true DIY.&lt;/p&gt;
&lt;p&gt;Outsourcing the design felt like a shortcut, and in some ways it was. I traded the learning experience for speed and convenience.&lt;/p&gt;
&lt;p&gt;That said, I didn't stop there. After receiving the Fiverr design, I also created an alternative version of the shield myself. I used &lt;a href="https://baud.rs/Z6Oq4k"&gt;Claude Code&lt;/a&gt; to help work through the component connections and pin mappings, and &lt;a href="https://baud.rs/XRtos4"&gt;Quilter.ai&lt;/a&gt;, an AI-powered tool that automates trace layout while respecting design rules, to handle the PCB routing. The result is a second design that I understand more intimately, having been involved in every decision.&lt;/p&gt;
&lt;p&gt;Once &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt; manufactures both versions, I'll post a comparison of the two approaches: the professionally designed Fiverr board versus the AI-assisted DIY version. It should be an interesting look at how modern AI tools are changing what's possible for makers who want to learn by doing but also want a safety net of intelligent assistance.&lt;/p&gt;
&lt;h3&gt;Next Steps&lt;/h3&gt;
&lt;p&gt;With the design files in hand, my next steps are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Order prototype PCBs&lt;/strong&gt; from JLCPCB or PCBWay&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Source components&lt;/strong&gt; from LCSC or DigiKey (the BOM helps here)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Assemble and test&lt;/strong&gt; the prototype&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Document any issues&lt;/strong&gt; and potentially order a v0.2 revision&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Write about running Z80 code&lt;/strong&gt; on the Arduino Giga R1&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The beauty of having the KiCad source files is that if I find issues during testing, I can fix them myself and generate new Gerber files. The \$57 revision cost to get those source files has already paid for itself in peace of mind.&lt;/p&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;&lt;a href="https://baud.rs/dbDCgR"&gt;Fiverr&lt;/a&gt; can be a viable option for custom PCB design, especially for projects that are well-defined and don't require extensive back-and-forth iteration. The key is finding a competent designer, communicating clearly, and budgeting realistically for both the designer's fee and Fiverr's platform fees.&lt;/p&gt;
&lt;p&gt;My Arduino Giga R1 shield project cost \$468.63 total across two orders, more than I initially hoped to spend, but less than I would have paid for my own time learning PCB design. The deliverables were comprehensive, professional, and gave me everything I need to manufacture, modify, and document the design.&lt;/p&gt;
&lt;p&gt;If you're considering using &lt;a href="https://baud.rs/dbDCgR"&gt;Fiverr&lt;/a&gt; for PCB design, go in with realistic expectations about cost and timeline, and make sure to specify exactly what deliverables you need upfront. It might not be the cheapest option, but for a one-off custom project, it can be a reasonable trade-off between time and money.&lt;/p&gt;
&lt;p&gt;Now, if you'll excuse me, I have some prototype PCBs to order and a Z80 to make talk to an STM32.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Coming Soon&lt;/strong&gt;: Thanks to an upcoming sponsorship from &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt;, I'll be able to bring this design from KiCad files and Gerber renders into the physical world. Stay tuned for a follow-up post where I'll document the manufacturing process, assembly, and first power-on of the Arduino Giga R1 Level Shifter Shield. Will it work on the first try? Will the Z80 finally talk to the STM32? Check back to find out!&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;The files discussed in this post, including the schematic and 3D renders, are from the actual delivered project. The Arduino Giga R1 is a product of Arduino. The RetroShield Z80 is designed by 8-Bit Force and available on Tindie.&lt;/em&gt;&lt;/p&gt;</description><category>arduino</category><category>arduino giga</category><category>fiverr</category><category>hardware</category><category>kicad</category><category>level shifter</category><category>pcb design</category><category>retroshield</category><category>z80</category><guid>https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html</guid><pubDate>Sat, 24 Jan 2026 20:00:00 GMT</pubDate></item><item><title>Building a Browser-Based Z80 Emulator for the RetroShield</title><link>https://tinycomputers.io/posts/browser-based-z80-emulator-retroshield.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;p&gt;There's something deeply satisfying about running code on vintage hardware. The blinking cursor, the deliberate pace of execution, the direct connection between your keystrokes and the machine's response. The &lt;a href="https://baud.rs/zSCNiC"&gt;RetroShield&lt;/a&gt; by Erturk Kocalar brings this experience to modern makers by allowing real vintage CPUs like the Zilog Z80 to run on Arduino boards. But what if you could experience that same feeling directly in your web browser?&lt;/p&gt;
&lt;p&gt;That's exactly what I set out to build: a Z80 emulator that runs RetroShield firmware in WebAssembly, complete with authentic CRT visual effects and support for multiple programming language interpreters.&lt;/p&gt;
&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/browser-based-z80-emulator-retroshield_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;32 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;Try It Now&lt;/h3&gt;
&lt;p&gt;Select a ROM below and click "Load ROM" to start. Click on the terminal to focus it, then type to interact with the interpreter.&lt;/p&gt;
&lt;style&gt;
@font-face {
    font-family: 'Glass TTY VT220';
    src: url('https://tinycomputers.io/z80-emulator/Glass_TTY_VT220.woff') format('woff'),
         url('https://tinycomputers.io/z80-emulator/Glass_TTY_VT220.ttf') format('truetype');
    font-weight: normal;
    font-style: normal;
}

.emulator-container {
    max-width: 900px;
    margin: 0 auto;
    font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
}

.control-panel {
    display: flex;
    flex-wrap: wrap;
    gap: 10px;
    align-items: center;
    margin-bottom: 15px;
    padding: 15px;
    background: #f5f5f5;
    border-radius: 8px;
}

.control-panel label {
    font-weight: 600;
    margin-right: 5px;
}

.control-panel select {
    padding: 8px 12px;
    font-size: 14px;
    border: 1px solid #ccc;
    border-radius: 4px;
    background: white;
    min-width: 200px;
}

.control-panel button {
    padding: 8px 16px;
    font-size: 14px;
    border: none;
    border-radius: 4px;
    cursor: pointer;
    transition: background 0.2s;
}

.btn-primary {
    background: #007bff;
    color: white;
}

.btn-primary:hover {
    background: #0056b3;
}

.btn-secondary {
    background: #6c757d;
    color: white;
}

.btn-secondary:hover {
    background: #545b62;
}

.crt-container {
    position: relative;
    border-radius: 20px;
    overflow: hidden;
    box-shadow: 0 0 40px rgba(0, 255, 0, 0.15), inset 0 0 20px rgba(0, 0, 0, 0.5);
}

#terminal {
    background: #0a0a0a;
    color: #00ff00;
    font-family: 'Glass TTY VT220', "Courier New", Courier, monospace;
    font-size: 18px;
    line-height: 1.4;
    padding: 20px;
    border-radius: 20px;
    height: 400px;
    overflow-y: auto;
    white-space: pre-wrap;
    word-wrap: break-word;
    border: none;
    text-shadow: 0 0 5px rgba(0, 255, 0, 0.5), 0 0 10px rgba(0, 255, 0, 0.3);
    animation: textShadow 1.6s infinite;
    position: relative;
}

#terminal:focus {
    outline: none;
}

.crt-container::before {
    content: " ";
    display: block;
    position: absolute;
    top: 0;
    left: 0;
    bottom: 0;
    right: 0;
    background: linear-gradient(rgba(18, 16, 16, 0) 50%, rgba(0, 0, 0, 0.25) 50%),
                linear-gradient(90deg, rgba(255, 0, 0, 0.03), rgba(0, 255, 0, 0.01), rgba(0, 0, 255, 0.03));
    z-index: 2;
    background-size: 100% 4px, 3px 100%;
    pointer-events: none;
    border-radius: 20px;
}

.crt-container::after {
    content: " ";
    display: block;
    position: absolute;
    top: 0;
    left: 0;
    bottom: 0;
    right: 0;
    background: rgba(18, 16, 16, 0.1);
    opacity: 0;
    z-index: 2;
    pointer-events: none;
    animation: flicker 0.15s infinite;
    border-radius: 20px;
}

.vignette {
    position: absolute;
    top: 0;
    left: 0;
    bottom: 0;
    right: 0;
    background: radial-gradient(ellipse at center, transparent 0%, transparent 60%, rgba(0, 0, 0, 0.6) 100%);
    pointer-events: none;
    z-index: 3;
    border-radius: 20px;
}

@keyframes flicker {
    0% { opacity: 0.27861; }
    5% { opacity: 0.34769; }
    10% { opacity: 0.23604; }
    15% { opacity: 0.90626; }
    20% { opacity: 0.18128; }
    25% { opacity: 0.83891; }
    30% { opacity: 0.65583; }
    35% { opacity: 0.67807; }
    40% { opacity: 0.26559; }
    45% { opacity: 0.84693; }
    50% { opacity: 0.96019; }
    55% { opacity: 0.08594; }
    60% { opacity: 0.20313; }
    65% { opacity: 0.71988; }
    70% { opacity: 0.53455; }
    75% { opacity: 0.37288; }
    80% { opacity: 0.71428; }
    85% { opacity: 0.70419; }
    90% { opacity: 0.7003; }
    95% { opacity: 0.36108; }
    100% { opacity: 0.24387; }
}

@keyframes textShadow {
    0% { text-shadow: 0.4px 0 1px rgba(0,30,255,0.5), -0.4px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    5% { text-shadow: 2.8px 0 1px rgba(0,30,255,0.5), -2.8px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    10% { text-shadow: 0.1px 0 1px rgba(0,30,255,0.5), -0.1px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    15% { text-shadow: 0.4px 0 1px rgba(0,30,255,0.5), -0.4px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    20% { text-shadow: 3.5px 0 1px rgba(0,30,255,0.5), -3.5px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    25% { text-shadow: 1.6px 0 1px rgba(0,30,255,0.5), -1.6px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    30% { text-shadow: 0.7px 0 1px rgba(0,30,255,0.5), -0.7px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    35% { text-shadow: 3.9px 0 1px rgba(0,30,255,0.5), -3.9px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    40% { text-shadow: 3.9px 0 1px rgba(0,30,255,0.5), -3.9px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    45% { text-shadow: 2.2px 0 1px rgba(0,30,255,0.5), -2.2px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    50% { text-shadow: 0.1px 0 1px rgba(0,30,255,0.5), -0.1px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    55% { text-shadow: 2.4px 0 1px rgba(0,30,255,0.5), -2.4px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    60% { text-shadow: 2.2px 0 1px rgba(0,30,255,0.5), -2.2px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    65% { text-shadow: 2.9px 0 1px rgba(0,30,255,0.5), -2.9px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    70% { text-shadow: 0.5px 0 1px rgba(0,30,255,0.5), -0.5px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    75% { text-shadow: 1.9px 0 1px rgba(0,30,255,0.5), -1.9px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    80% { text-shadow: 0.1px 0 1px rgba(0,30,255,0.5), -0.1px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    85% { text-shadow: 0.1px 0 1px rgba(0,30,255,0.5), -0.1px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    90% { text-shadow: 3.4px 0 1px rgba(0,30,255,0.5), -3.4px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    95% { text-shadow: 2.2px 0 1px rgba(0,30,255,0.5), -2.2px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
    100% { text-shadow: 2.6px 0 1px rgba(0,30,255,0.5), -2.6px 0 1px rgba(255,0,80,0.3), 0 0 5px rgba(0,255,0,0.5); }
}

.scanline {
    position: absolute;
    width: 100%;
    height: 2px;
    background: rgba(0, 255, 0, 0.1);
    z-index: 4;
    pointer-events: none;
    animation: scanlineMove 8s linear infinite;
}

@keyframes scanlineMove {
    0% { top: 0%; }
    100% { top: 100%; }
}

.status-bar {
    display: flex;
    justify-content: space-between;
    align-items: center;
    padding: 10px 15px;
    background: #333;
    color: #fff;
    border-radius: 0 0 8px 8px;
    font-size: 13px;
    font-family: monospace;
    margin-top: -8px;
}

.status-left, .status-right {
    display: flex;
    gap: 20px;
}

.status-item {
    display: flex;
    align-items: center;
    gap: 5px;
}

.status-indicator {
    width: 10px;
    height: 10px;
    border-radius: 50%;
    background: #666;
}

.status-indicator.running {
    background: #00ff00;
    box-shadow: 0 0 5px #00ff00;
}

.status-indicator.halted {
    background: #ff0000;
}

.rom-info {
    margin-top: 20px;
    padding: 15px;
    background: #e9ecef;
    border-radius: 8px;
}

.rom-info h5 {
    margin-top: 0;
    margin-bottom: 10px;
}

.rom-info p {
    margin: 5px 0;
    font-size: 14px;
}

.keyboard-hint {
    margin-top: 15px;
    padding: 10px 15px;
    background: #fff3cd;
    border-radius: 4px;
    font-size: 13px;
    color: #856404;
}

#rom-name { font-weight: bold; }
#rom-tips { font-style: italic; margin-top: 10px; }

#cursor {
    display: inline;
    background-color: #00ff00;
    animation: blink 1s step-end infinite;
}

@keyframes blink {
    0%, 100% { opacity: 1; }
    50% { opacity: 0; }
}
&lt;/style&gt;

&lt;div class="emulator-container"&gt;
    &lt;div class="control-panel"&gt;
        &lt;label for="rom-select"&gt;ROM:&lt;/label&gt;
        &lt;select id="rom-select"&gt;
            &lt;option value=""&gt;-- Select a ROM --&lt;/option&gt;
            &lt;option value="fortran77.bin" selected&gt;Fortran 77 Interpreter&lt;/option&gt;
            &lt;option value="mint.z80.bin"&gt;MINT (Minimal Interpreter)&lt;/option&gt;
            &lt;option value="firth.z80.bin"&gt;Firth Forth&lt;/option&gt;
            &lt;option value="monty.z80.bin"&gt;Monty&lt;/option&gt;
            &lt;option value="pascal.bin"&gt;Retro Pascal&lt;/option&gt;
            &lt;option value="basic_gs47b.bin"&gt;Grant Searle BASIC&lt;/option&gt;
            &lt;option value="efex.bin"&gt;EFEX Monitor&lt;/option&gt;
        &lt;/select&gt;
        &lt;button id="btn-load" class="btn-primary"&gt;Load ROM&lt;/button&gt;
        &lt;button id="btn-reset" class="btn-secondary"&gt;Reset&lt;/button&gt;
        &lt;button id="btn-clear" class="btn-secondary"&gt;Clear&lt;/button&gt;
    &lt;/div&gt;
    &lt;div class="crt-container"&gt;
        &lt;div id="terminal" tabindex="0"&gt;&lt;span id="terminal-content"&gt;&lt;/span&gt;&lt;span id="cursor"&gt; &lt;/span&gt;&lt;/div&gt;
        &lt;div class="vignette"&gt;&lt;/div&gt;
        &lt;div class="scanline"&gt;&lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="status-bar"&gt;
        &lt;div class="status-left"&gt;
            &lt;div class="status-item"&gt;
                &lt;span class="status-indicator" id="status-indicator"&gt;&lt;/span&gt;
                &lt;span id="status-text"&gt;Idle&lt;/span&gt;
            &lt;/div&gt;
            &lt;div class="status-item"&gt;
                &lt;span&gt;PC:&lt;/span&gt;
                &lt;span id="status-pc"&gt;0000&lt;/span&gt;
            &lt;/div&gt;
        &lt;/div&gt;
        &lt;div class="status-right"&gt;
            &lt;div class="status-item"&gt;
                &lt;span&gt;Cycles:&lt;/span&gt;
                &lt;span id="status-cycles"&gt;0&lt;/span&gt;
            &lt;/div&gt;
            &lt;div class="status-item"&gt;
                &lt;span&gt;Speed:&lt;/span&gt;
                &lt;span id="status-speed"&gt;0 MHz&lt;/span&gt;
            &lt;/div&gt;
        &lt;/div&gt;
    &lt;/div&gt;
    &lt;div class="keyboard-hint"&gt;
        &lt;strong&gt;Tip:&lt;/strong&gt; Click on the terminal to focus it, then type to send input. Try loading Fortran 77 and entering: &lt;code&gt;INTEGER X&lt;/code&gt; then &lt;code&gt;X = 42&lt;/code&gt; then &lt;code&gt;WRITE(*,*) X&lt;/code&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;div class="rom-info" id="rom-info"&gt;
    &lt;h5&gt;ROM Information&lt;/h5&gt;
    &lt;p id="rom-name"&gt;&lt;/p&gt;
    &lt;p id="rom-description"&gt;Select a ROM above to load it into the emulator.&lt;/p&gt;
    &lt;p id="rom-tips"&gt;&lt;/p&gt;
&lt;/div&gt;

&lt;script type="module"&gt;
import init, { Z80Emulator } from '/z80-emulator/retro_z80_emulator.js';

const ROM_INFO = {
    'fortran77.bin': {
        name: 'Fortran 77 Interpreter',
        description: 'A Fortran 77 subset interpreter with BCD floating point, DO loops, IF/THEN/ELSE, arrays, and SUBROUTINE/FUNCTION support.',
        serial: 'acia',
        tips: 'Type PROGRAM to start a program, END to finish, then RUN to execute. Try: INTEGER X followed by X = 42 then WRITE(*,*) X'
    },
    'mint.z80.bin': {
        name: 'MINT (Minimal Interpreter)',
        description: 'A minimal stack-based interpreter similar to Forth. Very compact and efficient.',
        serial: 'acia',
        tips: 'MINT uses single-character commands. Try: 1 2 + . (adds 1+2 and prints result)'
    },
    'firth.z80.bin': {
        name: 'Firth Forth',
        description: 'A Forth interpreter by John Hardy. Full Forth implementation with standard words.',
        serial: 'acia',
        tips: 'Standard Forth commands. Try: 2 3 * . (multiplies 2*3 and prints 6)'
    },
    'monty.z80.bin': {
        name: 'Monty',
        description: 'An interpreted language by John Hardy inspired by Python and Forth.',
        serial: 'acia',
        tips: 'Try: print "Hello World" or basic arithmetic expressions.'
    },
    'pascal.bin': {
        name: 'Retro Pascal',
        description: 'A minimal Pascal interpreter in expression evaluation mode.',
        serial: 'acia',
        tips: 'Type arithmetic expressions: 2 + 3 or 10 * 4 + 2. Boolean: TRUE AND FALSE'
    },
    'basic_gs47b.bin': {
        name: 'Grant Searle BASIC',
        description: 'Microsoft BASIC adapted by Grant Searle.',
        serial: '8251',
        tips: 'Standard BASIC. Try: PRINT 2+2 or write a program with line numbers.'
    },
    'efex.bin': {
        name: 'EFEX Monitor',
        description: 'A simple machine code monitor for examining and modifying memory.',
        serial: '8251',
        tips: 'Type ? for help. Use D to dump memory, E to examine/edit.'
    }
};

let emulator = null;
let running = false;
let animationId = null;
let lastCycles = 0;
let lastTime = 0;

const terminal = document.getElementById('terminal');
const terminalContent = document.getElementById('terminal-content');
const cursor = document.getElementById('cursor');
const romSelect = document.getElementById('rom-select');
const btnLoad = document.getElementById('btn-load');
const btnReset = document.getElementById('btn-reset');
const btnClear = document.getElementById('btn-clear');
const statusIndicator = document.getElementById('status-indicator');
const statusText = document.getElementById('status-text');
const statusPC = document.getElementById('status-pc');
const statusCycles = document.getElementById('status-cycles');
const statusSpeed = document.getElementById('status-speed');
const romName = document.getElementById('rom-name');
const romDescription = document.getElementById('rom-description');
const romTips = document.getElementById('rom-tips');

function appendOutput(text) {
    let output = '';
    for (const char of text) {
        const code = char.charCodeAt(0);
        if (code === 13) {
            // Carriage return is ignored; the line feed alone starts a new line
        } else if (code === 10) {
            output += '\n';
        } else if (code === 8) {
            // Backspace: erase from the pending batch first, then from the screen
            if (output.length &gt; 0) {
                output = output.slice(0, -1);
            } else if (terminalContent.textContent.length &gt; 0) {
                terminalContent.textContent = terminalContent.textContent.slice(0, -1);
            }
        } else if (code &gt;= 32 &amp;&amp; code &lt; 127) {
            output += char;
        }
    }
    terminalContent.textContent += output;
    terminal.scrollTop = terminal.scrollHeight;
}

function updateStatus() {
    if (!emulator) return;
    const pc = emulator.get_pc();
    const cycles = emulator.get_cycles();
    const halted = emulator.is_halted();
    statusPC.textContent = pc.toString(16).toUpperCase().padStart(4, '0');
    statusCycles.textContent = cycles.toLocaleString();
    if (halted) {
        statusIndicator.className = 'status-indicator halted';
        statusText.textContent = 'Halted';
        running = false;
    } else if (running) {
        statusIndicator.className = 'status-indicator running';
        statusText.textContent = 'Running';
    } else {
        statusIndicator.className = 'status-indicator';
        statusText.textContent = 'Idle';
    }
    const now = performance.now();
    if (lastTime &gt; 0) {
        const elapsed = (now - lastTime) / 1000;
        const cyclesDelta = Number(cycles) - Number(lastCycles);
        if (elapsed &gt; 0) {
            const mhz = (cyclesDelta / elapsed / 1000000).toFixed(2);
            statusSpeed.textContent = mhz + ' MHz';
        }
    }
    lastCycles = cycles;
    lastTime = now;
}

function runLoop() {
    if (!running || !emulator) return;
    emulator.run(50000);
    const output = emulator.get_output_string();
    if (output.length &gt; 0) {
        appendOutput(output);
    }
    updateStatus();
    if (!emulator.is_halted()) {
        animationId = requestAnimationFrame(runLoop);
    }
}

async function loadRom(filename) {
    if (!emulator) return;
    try {
        terminalContent.textContent = 'Loading ' + filename + '...\n';
        const response = await fetch('/z80-emulator/roms/' + filename);
        if (!response.ok) {
            throw new Error('Failed to load ROM: ' + response.statusText);
        }
        const buffer = await response.arrayBuffer();
        const data = new Uint8Array(buffer);
        emulator.load_rom(data);
        const info = ROM_INFO[filename];
        if (info &amp;&amp; info.serial === '8251') {
            emulator.set_8251_mode(true);
        } else {
            emulator.set_8251_mode(false);
        }
        terminalContent.textContent = '';
        running = true;
        lastCycles = 0;
        lastTime = 0;
        runLoop();
        if (info) {
            romName.textContent = info.name;
            romDescription.textContent = info.description;
            romTips.textContent = 'Tips: ' + info.tips;
        }
    } catch (err) {
        terminalContent.textContent += 'Error: ' + err.message + '\n';
        console.error(err);
    }
}

btnLoad.addEventListener('click', function() {
    const rom = romSelect.value;
    if (rom) {
        if (animationId) {
            cancelAnimationFrame(animationId);
        }
        running = false;
        loadRom(rom);
    }
});

btnReset.addEventListener('click', function() {
    if (emulator) {
        if (animationId) {
            cancelAnimationFrame(animationId);
        }
        emulator.reset();
        terminalContent.textContent = '';
        running = true;
        lastCycles = 0;
        lastTime = 0;
        runLoop();
    }
});

btnClear.addEventListener('click', function() {
    terminalContent.textContent = '';
});

terminal.addEventListener('keydown', function(e) {
    if (!emulator || !running) return;
    // Leave Ctrl/Cmd shortcuts (such as paste) to the browser; AltGr reports Ctrl+Alt, so let it through
    if ((e.ctrlKey || e.metaKey) &amp;&amp; !e.altKey) return;
    if (e.key === 'Enter') {
        emulator.send_char(13);
        e.preventDefault();
    } else if (e.key === 'Backspace') {
        emulator.send_char(8);
        e.preventDefault();
    } else if (e.key.length === 1) {
        emulator.send_char(e.key.charCodeAt(0));
        e.preventDefault();
    }
});

terminal.addEventListener('paste', function(e) {
    if (!emulator || !running) return;
    const text = e.clipboardData.getData('text');
    emulator.send_string(text);
    e.preventDefault();
});

async function main() {
    try {
        await init();
        emulator = new Z80Emulator();
        terminalContent.textContent = 'Z80 Emulator ready. Select a ROM and click "Load ROM" to begin.\n';
        updateStatus();
    } catch (err) {
        terminalContent.textContent = 'Failed to initialize emulator: ' + err.message + '\n';
        console.error(err);
    }
}

main();
&lt;/script&gt;

&lt;hr&gt;
&lt;h3&gt;The RetroShield Platform&lt;/h3&gt;
&lt;p&gt;Before diving into the emulator, it's worth understanding what makes the RetroShield special. Unlike software emulators that simulate a CPU in code, the RetroShield uses a &lt;em&gt;real&lt;/em&gt; vintage microprocessor. The Z80 variant features an actual Zilog Z80 chip running at its native speed, connected to an Arduino Mega or Teensy that provides:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Memory emulation&lt;/strong&gt;: The Arduino's SRAM serves as the Z80's RAM, while program code is stored in the Arduino's flash memory&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;I/O peripherals&lt;/strong&gt;: Serial communication, typically through an emulated MC6850 ACIA or Intel 8251 USART&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Clock generation&lt;/strong&gt;: The Arduino provides the clock signal to the Z80&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This hybrid approach means you get authentic Z80 behavior - every timing quirk, every undocumented opcode - while still having the convenience of USB connectivity and easy program loading.&lt;/p&gt;
&lt;p&gt;The RetroShield is &lt;a href="https://baud.rs/dqAGfn"&gt;open source hardware&lt;/a&gt; and available on &lt;a href="https://baud.rs/7HAxS8"&gt;Tindie&lt;/a&gt;. For larger programs, the &lt;a href="https://baud.rs/ywxYHT"&gt;Teensy adapter&lt;/a&gt; expands available RAM from about 4KB to 256KB.&lt;/p&gt;
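&lt;p&gt;To make the memory and serial emulation described above more concrete, here is a minimal sketch of the kind of bus a host has to present to the Z80. It is written in JavaScript to match the page script in this post, and it is not the RetroShield firmware. The port numbers (&lt;code&gt;0x80&lt;/code&gt; and &lt;code&gt;0x81&lt;/code&gt;), the 8 KB ROM size, and every function name are assumptions chosen for illustration; only the two MC6850 status bits come from the ACIA itself.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```javascript
// Hypothetical sketch: port numbers, ROM size, and all names are assumptions.
const ROM_SIZE = 0x2000;    // assumed: the first 8 KB holds read-only program code
const ACIA_STATUS = 0x80;   // assumed I/O port for the MC6850 status register
const ACIA_DATA = 0x81;     // assumed I/O port for the MC6850 data register
const RDRF = 0x01;          // MC6850 status bit 0: receive data register full
const TDRE = 0x02;          // MC6850 status bit 1: transmit data register empty

function createBus(romBytes) {
    const memory = new Uint8Array(0x10000);    // the Z80's 64 KB address space
    const writable = new Uint8Array(0x10000);  // 1 = RAM, 0 = write-protected ROM
    memory.set(romBytes, 0);
    writable.fill(1, ROM_SIZE);
    const rxQueue = [];   // bytes waiting to be read by the Z80
    const txBytes = [];   // bytes the Z80 has written to the serial port

    return {
        memRead(addr) {
            return memory[addr % 0x10000];
        },
        memWrite(addr, value) {
            const a = addr % 0x10000;
            if (writable[a] === 1) memory[a] = value;   // ROM writes are dropped
        },
        ioRead(port) {
            if (port === ACIA_STATUS) {
                // The transmitter is always ready; RDRF is set only when input is queued.
                return TDRE | (rxQueue.length === 0 ? 0 : RDRF);
            }
            if (port === ACIA_DATA) {
                return rxQueue.length === 0 ? 0 : rxQueue.shift();
            }
            return 0xFF;   // unmapped ports read as all ones in this sketch
        },
        ioWrite(port, value) {
            if (port === ACIA_DATA) txBytes.push(value);
        },
        sendKey(code) {
            rxQueue.push(code);
        },
        output() {
            return String.fromCharCode(...txBytes);
        }
    };
}

// Usage: a three-byte ROM (JP 0000h), one RAM write, and one keystroke.
const bus = createBus(new Uint8Array([0xC3, 0x00, 0x00]));
bus.memWrite(0x0000, 0x55);   // ignored: the address is in ROM
bus.memWrite(0x2000, 0x55);   // stored: the address is in RAM
bus.sendKey(65);              // queue the letter A for the Z80 to read
bus.ioWrite(ACIA_DATA, 72);   // the Z80 prints the letter H
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A per-address write-protect table keeps the ROM check to a single lookup, which is one simple way to model code that lives in flash while data lives in SRAM. A program on the Z80 side would poll the status port until the receive bit is set, then read the data port to fetch the queued key.&lt;/p&gt;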
&lt;h4&gt;The Hardware Up Close&lt;/h4&gt;
&lt;p&gt;Here's my RetroShield Z80 setup with the Teensy adapter:&lt;/p&gt;
&lt;p&gt;&lt;img alt="RetroShield Z80 with Teensy adapter - overhead view" src="https://tinycomputers.io/images/retroshield2/IMG_4141.jpg" title="RetroShield Z80 mounted on Teensy adapter board"&gt;&lt;/p&gt;
&lt;p&gt;The Zilog Z80 CPU sits in the 40-pin DIP socket, with the Teensy 4.1 providing memory emulation and I/O handling beneath.&lt;/p&gt;
&lt;p&gt;&lt;img alt="RetroShield Z80 - angled view showing the Z80 chip" src="https://tinycomputers.io/images/retroshield2/IMG_4142.jpg" title="Close-up of the Z80 CPU in its socket"&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="RetroShield Z80 - side profile" src="https://tinycomputers.io/images/retroshield2/IMG_4143.jpg" title="Side view of the RetroShield stack"&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="RetroShield Z80 - full assembly" src="https://tinycomputers.io/images/retroshield2/IMG_4144.jpg" title="Complete RetroShield Z80 assembly"&gt;&lt;/p&gt;
&lt;p&gt;The physical hardware runs identically to the browser emulator above - the same ROMs, the same interpreters, the same authentic Z80 execution.&lt;/p&gt;
&lt;h3&gt;Why Build a Browser Emulator?&lt;/h3&gt;
&lt;p&gt;Having built several interpreters and tools for the RetroShield, I found myself constantly cycling through the development loop: edit code, compile, flash to Arduino, test, repeat. A software emulator would speed this up significantly, but I also wanted something I could share with others who might not have the hardware.&lt;/p&gt;
&lt;p&gt;WebAssembly seemed like the perfect solution. It runs at near-native speed in any modern browser, requires no installation, and can be embedded directly in a web page. Someone curious about retro computing could try out a Fortran 77 interpreter or Forth environment without buying any hardware.&lt;/p&gt;
&lt;h3&gt;Building the Emulator in Rust&lt;/h3&gt;
&lt;p&gt;I chose Rust for the emulator implementation for several reasons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Excellent WASM support&lt;/strong&gt;: Rust's &lt;code&gt;wasm-bindgen&lt;/code&gt; and &lt;code&gt;wasm-pack&lt;/code&gt; tools make compiling to WebAssembly straightforward&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: Rust compiles to efficient code, important for cycle-accurate emulation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The rz80 crate&lt;/strong&gt;: Andre Weissflog's &lt;a href="https://baud.rs/ST34BV"&gt;rz80&lt;/a&gt; provides a battle-tested Z80 core&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The emulator architecture is straightforward:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="err"&gt;┌─────────────────────────────────────────────────┐&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;                 &lt;/span&gt;&lt;span class="n"&gt;Web&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Browser&lt;/span&gt;&lt;span class="w"&gt;                      &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;┌───────────────────────────────────────────┐&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;JavaScript&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;HTML&lt;/span&gt;&lt;span class="w"&gt;                 &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Terminal&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;display&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CRT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;effects&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Keyboard&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;handling&lt;/span&gt;&lt;span class="w"&gt;                &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;loading&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="ow"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;selection&lt;/span&gt;&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└─────────────────────┬─────────────────────┘&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;                        &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;wasm&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;bindgen&lt;/span&gt;&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;┌─────────────────────▼─────────────────────┐&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="n"&gt;Rust&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;WebAssembly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Core&lt;/span&gt;&lt;span class="w"&gt;             &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;┌─────────────┐&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;┌─────────────────────┐&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;rz80&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CPU&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;Memory&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="n"&gt;KB&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;Emulation&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;ROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;RAM&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└─────────────┘&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└─────────────────────┘&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;┌─────────────────────────────────────┐&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;I&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;O&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Emulation&lt;/span&gt;&lt;span class="w"&gt;                       &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MC6850&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ACIA&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ports&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="o"&gt;/$&lt;/span&gt;&lt;span class="mi"&gt;81&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Intel&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8251&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;USART&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ports&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="o"&gt;/$&lt;/span&gt;&lt;span class="mi"&gt;01&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└─────────────────────────────────────┘&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;└───────────────────────────────────────────┘&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;
&lt;span class="err"&gt;└─────────────────────────────────────────────────┘&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Dual Serial Chip Support&lt;/h4&gt;
&lt;p&gt;One challenge was supporting ROMs that use different serial chips. The RetroShield ecosystem has two common configurations:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;MC6850 ACIA&lt;/strong&gt; (ports $80/$81): Used by many homebrew projects including MINT, Firth Forth, and my own Fortran and Pascal interpreters. The ACIA has four registers (control, status, transmit data, receive data) mapped to two ports, with separate read/write functions per port.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Intel 8251 USART&lt;/strong&gt; (ports $00/$01): Used by Grant Searle's popular BASIC port and the EFEX monitor. The 8251 also occupies two ports: one for data and one for control/status.&lt;/p&gt;
&lt;p&gt;The emulator detects which chip to use based on ROM metadata and configures the I/O handlers accordingly.&lt;/p&gt;
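&lt;p&gt;As a rough illustration of what chip-specific port decoding can look like, here is a minimal Rust sketch. The names &lt;code&gt;SerialChip&lt;/code&gt;, &lt;code&gt;SerialRegister&lt;/code&gt;, and &lt;code&gt;decode_port&lt;/code&gt; are hypothetical and not taken from the emulator source, and the assignment of control/status versus data to each port number is an assumption based on the usual wiring of these chips.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// Which serial chip a ROM expects (hypothetical names for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SerialChip {
    Mc6850,    // ACIA at ports $80/$81
    Intel8251, // USART at ports $00/$01
}

/// The register that a given port access lands on.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SerialRegister {
    Control,      // write side of the control/status port
    Status,       // read side of the control/status port
    TransmitData, // write side of the data port
    ReceiveData,  // read side of the data port
    Unmapped,     // port does not belong to the selected chip
}

/// Map a Z80 I/O access to a serial register.
/// Assumed wiring: 6850 control/status at $80 and data at $81;
/// 8251 data at $00 and control/status at $01.
pub fn decode_port(chip: SerialChip, port: u8, is_write: bool) -> SerialRegister {
    let (control_port, data_port): (u8, u8) = match chip {
        SerialChip::Mc6850 => (0x80, 0x81),
        SerialChip::Intel8251 => (0x01, 0x00),
    };
    if port == control_port {
        if is_write { SerialRegister::Control } else { SerialRegister::Status }
    } else if port == data_port {
        if is_write { SerialRegister::TransmitData } else { SerialRegister::ReceiveData }
    } else {
        SerialRegister::Unmapped
    }
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Keeping the decision in one function means the rest of the I/O path only has to handle four logical registers, whichever chip the ROM asked for.&lt;/p&gt;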
&lt;h4&gt;Memory Layout&lt;/h4&gt;
&lt;p&gt;The standard RetroShield memory map looks like this:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Address Range&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;$0000-$7FFF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;32KB&lt;/td&gt;
&lt;td&gt;ROM/RAM (program dependent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;$8000-$FFFF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;32KB&lt;/td&gt;
&lt;td&gt;Extended RAM (Teensy adapter)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Most of my interpreters use a layout where code occupies the lower addresses and data/stack occupy higher memory. The Fortran interpreter, for example, places its program text storage at $6700 and variable storage at $7200, with the stack growing down from $8000.&lt;/p&gt;
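&lt;p&gt;The arithmetic behind that layout is easy to check. The following Rust sketch encodes the memory map and the Fortran interpreter's addresses quoted above; the type, constant, and function names are hypothetical, and it assumes the program text area runs up to the start of variable storage.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// The two halves of the RetroShield memory map.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Region {
    RomOrRam,    // $0000-$7FFF, program dependent
    ExtendedRam, // $8000-$FFFF, Teensy adapter
}

/// Classify a 16-bit address by region.
pub fn region_of(addr: u16) -> Region {
    match addr {
        0x0000..=0x7FFF => Region::RomOrRam,
        0x8000..=0xFFFF => Region::ExtendedRam,
    }
}

// Addresses quoted for the Fortran interpreter.
pub const PROGRAM_TEXT: u16 = 0x6700;
pub const VARIABLES: u16 = 0x7200;
pub const STACK_TOP: u16 = 0x8000;

/// Bytes available for program text, assuming it ends where variables begin.
pub fn program_text_capacity() -> u16 {
    VARIABLES - PROGRAM_TEXT
}

/// Bytes shared by variable storage (growing up) and the stack (growing down).
pub fn variable_and_stack_space() -> u16 {
    STACK_TOP - VARIABLES
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Under those assumptions the program text area holds 2,816 bytes ($0B00), and variables and stack share 3,584 bytes ($0E00). Because the Z80 decrements SP before writing, a stack that starts at $8000 stores its first byte at $7FFF, inside the lower 32KB.&lt;/p&gt;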
&lt;h3&gt;The CRT Effect&lt;/h3&gt;
&lt;p&gt;No retro computing experience would be complete without the warm glow of a CRT monitor. I implemented several visual effects using pure CSS:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Scanlines&lt;/strong&gt;: A repeating gradient overlay creates the horizontal line pattern characteristic of CRT displays:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;crt-container&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nd"&gt;before&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;linear-gradient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nb"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="kt"&gt;%&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nb"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="kt"&gt;%&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;background-size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="kt"&gt;%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="kt"&gt;px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Chromatic aberration&lt;/strong&gt;: CRT displays have slight color fringing due to the electron beam hitting phosphors at angles. I simulate this with animated text shadows that shift red and blue components:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="k"&gt;keyframes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;textShadow&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;0&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;text-shadow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="kt"&gt;px&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="kt"&gt;px&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="w"&gt;                    &lt;/span&gt;&lt;span class="mf"&gt;-0.4&lt;/span&gt;&lt;span class="kt"&gt;px&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="kt"&gt;px&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;rgba&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c"&gt;/* ... animation continues */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Flicker&lt;/strong&gt;: Real CRTs had subtle brightness variations. A randomized opacity animation creates this effect without being distracting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Vignette&lt;/strong&gt;: The edges of CRT screens were typically darker than the center, simulated with a radial gradient.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The font&lt;/strong&gt;: I'm using the &lt;a href="https://baud.rs/vWwOWL"&gt;Glass TTY VT220&lt;/a&gt; font, a faithful recreation of the DEC VT220 terminal font from the 1980s. It's public domain and adds significant authenticity to the experience.&lt;/p&gt;
&lt;h3&gt;The Language Interpreters&lt;/h3&gt;
&lt;p&gt;The emulator comes pre-loaded with several language interpreters, each running as native Z80 code:&lt;/p&gt;
&lt;h4&gt;Fortran 77 Interpreter&lt;/h4&gt;
&lt;p&gt;This is my most ambitious RetroShield project: a subset of Fortran 77 running interpretively on an 8-bit CPU. It supports:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;REAL numbers&lt;/strong&gt; via BCD (Binary Coded Decimal) floating point with 8 significant digits&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;INTEGER and REAL variables&lt;/strong&gt; with implicit typing (I-N are integers)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Arrays&lt;/strong&gt; up to 3 dimensions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DO loops&lt;/strong&gt; with an optional increment (step)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Block IF/THEN/ELSE/ENDIF&lt;/strong&gt; statements&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SUBROUTINE and FUNCTION&lt;/strong&gt; subprograms&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intrinsic functions&lt;/strong&gt;: ABS, MOD, INT, REAL, SQRT&lt;/li&gt;
&lt;/ul&gt;
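&lt;p&gt;The implicit typing rule is simple enough to state in a few lines. This Rust sketch shows the standard Fortran 77 rule rather than the interpreter's actual C code; &lt;code&gt;VarType&lt;/code&gt; and &lt;code&gt;implicit_type&lt;/code&gt; are hypothetical names.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// The two variable types the interpreter supports.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum VarType {
    Integer,
    Real,
}

/// Fortran 77 implicit typing: a name whose first letter is I through N
/// is INTEGER, and every other name is REAL.
pub fn implicit_type(first_letter: char) -> VarType {
    match first_letter.to_ascii_uppercase() {
        'I'..='N' => VarType::Integer,
        _ => VarType::Real,
    }
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Under this rule a name like &lt;code&gt;F&lt;/code&gt; defaults to REAL, so it needs an explicit &lt;code&gt;INTEGER&lt;/code&gt; declaration if it is meant to hold an integer.&lt;/p&gt;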
&lt;p&gt;Here's a sample session:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;FORTRAN&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;77&lt;/span&gt; &lt;span class="n"&gt;Interpreter&lt;/span&gt; &lt;span class="n"&gt;v0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;RetroShield&lt;/span&gt; &lt;span class="n"&gt;Z80&lt;/span&gt;
&lt;span class="n"&gt;Ready&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;PROGRAM&lt;/span&gt; &lt;span class="n"&gt;FACTORIAL&lt;/span&gt;
  &lt;span class="kt"&gt;INTEGER&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;
  &lt;span class="n"&gt;N&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
  &lt;span class="n"&gt;F&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="kr"&gt;DO&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;
  &lt;span class="n"&gt;F&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;I&lt;/span&gt;
  &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="n"&gt;CONTINUE&lt;/span&gt;
  &lt;span class="n"&gt;WRITE&lt;/span&gt;&lt;span class="cm"&gt;(*,*)&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'! ='&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;F&lt;/span&gt;
  &lt;span class="kr"&gt;END&lt;/span&gt;
&lt;span class="n"&gt;Program&lt;/span&gt; &lt;span class="n"&gt;entered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="n"&gt;RUN&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;RUN&lt;/span&gt;
&lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="err"&gt;!&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5040&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The interpreter is written in C and cross-compiled with SDCC. At roughly 21KB of code, it pushes the limits of what's practical on the base RetroShield, which is why it requires the Teensy adapter.&lt;/p&gt;
&lt;h4&gt;MINT (Minimal Interpreter)&lt;/h4&gt;
&lt;p&gt;MINT is a wonderfully compact stack-based language. Each command is a single character, making it incredibly memory-efficient:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;1 2 + .&lt;/span&gt;
3
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;: SQ D * ;&lt;/span&gt;
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;5 SQ .&lt;/span&gt;
25
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Firth Forth&lt;/h4&gt;
&lt;p&gt;A full Forth implementation by John Hardy. Forth's stack-based paradigm and extensibility made it popular on memory-constrained systems:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;: FACTORIAL ( n -- n! ) 1 SWAP 1+ 1 DO I * LOOP ;&lt;/span&gt;
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;7 FACTORIAL .&lt;/span&gt;
5040
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Grant Searle's BASIC&lt;/h4&gt;
&lt;p&gt;A port of Microsoft BASIC that provides the classic BASIC experience:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;Z80 BASIC Ver 4.7b
Ok
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;10 FOR I = 1 TO 10&lt;/span&gt;
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;20 PRINT I * I&lt;/span&gt;
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;30 NEXT I&lt;/span&gt;
&lt;span class="k"&gt;&amp;gt; &lt;/span&gt;&lt;span class="ge"&gt;RUN&lt;/span&gt;
1
4
9
...
&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;Technical Challenges&lt;/h3&gt;
&lt;p&gt;Building this project involved solving several interesting problems:&lt;/p&gt;
&lt;h4&gt;Memory Layout Debugging&lt;/h4&gt;
&lt;p&gt;The Fortran interpreter crashed mysteriously when entering lines with statement labels. After much investigation, I discovered the CODE section had grown to overlap with the DATA section. The linker was told to place data at $5000, but code had grown past that point. The fix was updating the memory layout to give code more room:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;# Before: code overlapped data&lt;/span&gt;
&lt;span class="nv"&gt;LDFLAGS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--data-loc&lt;span class="w"&gt; &lt;/span&gt;0x5000

&lt;span class="c"&gt;# After: proper separation&lt;/span&gt;
&lt;span class="nv"&gt;LDFLAGS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--data-loc&lt;span class="w"&gt; &lt;/span&gt;0x5500
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This kind of bug is particularly insidious because it works fine until the code grows past a certain threshold.&lt;/p&gt;
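&lt;p&gt;A small check run against the linker map can catch this kind of overlap before it reaches the hardware. The Rust sketch below is only an illustration: the function names are hypothetical, and the example figures assume code starts at $0000 and is exactly 21KB, which is a stand-in for the "roughly 21KB" size rather than a measured value.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// Number of bytes by which the code segment runs past the start of the
/// data segment. Returns 0 when the code fits below `data_loc`.
pub fn code_overflow(code_start: u16, code_size: u16, data_loc: u16) -> u32 {
    // One past the last code byte, widened so the sum cannot wrap.
    let code_end = u32::from(code_start) + u32::from(code_size);
    code_end.saturating_sub(u32::from(data_loc))
}

/// True when the code segment ends at or before the data segment begins.
pub fn layout_is_safe(code_start: u16, code_size: u16, data_loc: u16) -> bool {
    code_overflow(code_start, code_size, data_loc) == 0
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;With those assumed figures, 21KB of code ends at $5400: it overruns a data segment at $5000 by 1,024 bytes but fits below $5500.&lt;/p&gt;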
&lt;h4&gt;BCD Floating Point&lt;/h4&gt;
&lt;p&gt;Implementing floating-point math on a Z80 without hardware support is challenging. I chose BCD (Binary Coded Decimal) representation because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Exact decimal representation&lt;/strong&gt;: No binary floating-point surprises like 0.1 + 0.2 != 0.3&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Simpler conversion&lt;/strong&gt;: Reading and printing decimal numbers is straightforward&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reasonable precision&lt;/strong&gt;: 8 BCD digits give adequate precision for an educational interpreter&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Each BCD number uses 6 bytes: 1 for sign, 1 for exponent, and 4 bytes holding 8 packed decimal digits.&lt;/p&gt;
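&lt;p&gt;To make the 6-byte layout concrete, here is a Rust sketch of packing eight decimal digits into four bytes, two digits per byte. The interpreter itself is written in C, so this is not its code; the struct and function names are hypothetical, and the sign and exponent encodings are assumptions because the layout description does not specify them.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// A 6-byte BCD number: sign, exponent, and 8 packed decimal digits.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Bcd {
    pub sign: u8,          // assumed: 0 = positive, 1 = negative
    pub exponent: u8,      // assumed: encoding not specified in the text
    pub mantissa: [u8; 4], // two digits per byte, high nibble first
}

/// Pack 8 decimal digits (each 0 to 9) into 4 bytes.
/// Callers must pass valid digits; larger values would corrupt the nibbles.
pub fn pack_digits(digits: [u8; 8]) -> [u8; 4] {
    let mut packed = [0u8; 4];
    for i in 0..4 {
        // High nibble holds the first digit of the pair, low nibble the second.
        packed[i] = digits[2 * i] * 16 + digits[2 * i + 1];
    }
    packed
}

/// Unpack 4 BCD bytes back into 8 decimal digits.
pub fn unpack_digits(packed: [u8; 4]) -> [u8; 8] {
    let mut digits = [0u8; 8];
    for i in 0..4 {
        digits[2 * i] = packed[i] / 16;
        digits[2 * i + 1] = packed[i] % 16;
    }
    digits
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Packing the digits 1 through 8 yields the bytes $12, $34, $56, $78, so a hex dump of the mantissa reads exactly like the decimal number it stores.&lt;/p&gt;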
&lt;h4&gt;Cross-Compilation with SDCC&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://baud.rs/XOHX1N"&gt;Small Device C Compiler&lt;/a&gt; (SDCC) targets Z80 and other 8-bit processors. While it's an impressive project, there are quirks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No standard library functions that assume an OS&lt;/li&gt;
&lt;li&gt;Limited optimization compared to modern compilers&lt;/li&gt;
&lt;li&gt;Memory model constraints require careful attention to data placement&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I wrote a custom &lt;code&gt;crt0.s&lt;/code&gt; startup file that initializes the stack, sets up the serial port, and calls &lt;code&gt;main()&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Running the Emulator&lt;/h3&gt;
&lt;p&gt;The emulator runs at roughly 3-4 MHz equivalent speed, depending on your browser and hardware. That is at or slightly below the 4 MHz a Z80 was typically clocked at, but the difference isn't noticeable for interactive use.&lt;/p&gt;
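&lt;p&gt;One common way for a browser emulator to hold a target clock rate is to execute a fixed budget of CPU cycles per rendered frame and then measure what it actually achieved. The Rust sketch below shows only that arithmetic; the function names and the 60 frames-per-second figure are assumptions for illustration, not details from the emulator source.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;
```rust
/// Z80 clock cycles to emulate during one display frame.
/// `frames_per_second` must be non-zero.
pub fn cycles_per_frame(clock_hz: u32, frames_per_second: u32) -> u32 {
    clock_hz / frames_per_second
}

/// Effective emulated clock in MHz, given the cycles executed and the
/// wall-clock time they took in milliseconds.
pub fn effective_mhz(cycles_executed: u64, elapsed_ms: f64) -> f64 {
    // Cycles per millisecond, divided by 1000, is millions of cycles per second.
    cycles_executed as f64 / elapsed_ms / 1000.0
}
```
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;At an assumed 60 frames per second, a 4 MHz target works out to a budget of 66,666 cycles per frame.&lt;/p&gt;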
&lt;p&gt;To try it:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Visit the &lt;a href="https://tinycomputers.io/pages/z80-emulator.html"&gt;Z80 Emulator page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Select a ROM from the dropdown (try Fortran 77)&lt;/li&gt;
&lt;li&gt;Click "Load ROM"&lt;/li&gt;
&lt;li&gt;Click on the terminal to focus it&lt;/li&gt;
&lt;li&gt;Start typing!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For Fortran, try entering a simple program:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;PROGRAM&lt;/span&gt; &lt;span class="n"&gt;HELLO&lt;/span&gt;
&lt;span class="n"&gt;WRITE&lt;/span&gt;&lt;span class="cm"&gt;(*,*)&lt;/span&gt; &lt;span class="s"&gt;'HELLO, WORLD!'&lt;/span&gt;
&lt;span class="kr"&gt;END&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then type &lt;code&gt;RUN&lt;/code&gt; to execute it.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;There's always more to do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;More ROM support&lt;/strong&gt;: Expanding to additional retro languages like LISP, Logo, or Pilot&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Debugger integration&lt;/strong&gt;: Showing registers, memory, and allowing single-stepping&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Save/restore state&lt;/strong&gt;: Persisting the emulator state to browser storage&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mobile support&lt;/strong&gt;: Touch-friendly keyboard for tablets and phones&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Source Code and Links&lt;/h3&gt;
&lt;p&gt;Everything is open source:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/3QBPQL"&gt;Emulator source (Rust/WASM)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/Vbb4MU"&gt;Fortran interpreter source&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/dqAGfn"&gt;RetroShield hardware designs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/pX8HqT"&gt;RetroLang compiler&lt;/a&gt; - my custom language for Z80&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The RetroShield hardware is available from &lt;a href="https://baud.rs/7HAxS8"&gt;8bitforce on Tindie&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Acknowledgments&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/zSCNiC"&gt;Erturk Kocalar&lt;/a&gt;&lt;/strong&gt; - Creator of the RetroShield platform&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/a1YDGs"&gt;Andre Weissflog&lt;/a&gt;&lt;/strong&gt; - Author of the rz80 Z80 emulator core&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/NgAY2i"&gt;Grant Searle&lt;/a&gt;&lt;/strong&gt; - Z80 BASIC and reference designs&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/XhhL5j"&gt;John Hardy&lt;/a&gt;&lt;/strong&gt; - Author of Firth Forth, MINT, and Monty&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There's something magical about running 49-year-old CPU architectures in a modern web browser. The Z80 powered countless home computers, embedded systems, and arcade games. With this emulator, that legacy is just a click away.&lt;/p&gt;</description><category>emulator</category><category>forth</category><category>fortran</category><category>retro computing</category><category>retroshield</category><category>rust</category><category>wasm</category><category>webassembly</category><category>z80</category><guid>https://tinycomputers.io/posts/browser-based-z80-emulator-retroshield.html</guid><pubDate>Tue, 09 Dec 2025 03:00:00 GMT</pubDate></item></channel></rss>