<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>TinyComputers.io (Posts about open-source)</title><link>https://tinycomputers.io/</link><description></description><atom:link href="https://tinycomputers.io/categories/open-source.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 A.C. Jokela 
&lt;!-- div style="width: 100%" --&gt;
&lt;a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"&gt;&lt;img alt="" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /&gt; Creative Commons Attribution-ShareAlike&lt;/a&gt;&amp;nbsp;|&amp;nbsp;
&lt;!-- /div --&gt;
</copyright><lastBuildDate>Mon, 06 Apr 2026 22:13:01 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Distilled Reasoning on Strix Halo: Running a Claude-Trained Thinking Model Locally</title><link>https://tinycomputers.io/posts/distilled-reasoning-on-strix-halo-qwen35-claude-thinking.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/distilled-reasoning-on-strix-halo-qwen35-claude-thinking_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;27 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;There is a specific moment in the open-source LLM ecosystem that keeps recurring: someone takes a frontier model's outputs, uses them as training data for a smaller model, and publishes the result. The technique is called distillation, and it has been applied to coding ability, instruction following, and general knowledge. What is newer is distilling &lt;em&gt;reasoning&lt;/em&gt;—the step-by-step chain-of-thought process that models like Claude use internally when working through complex problems.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF"&gt;Jackrong's Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled&lt;/a&gt; is one of the more interesting examples. It takes the Qwen3.5-27B base model and fine-tunes it on thousands of reasoning trajectories extracted from Claude 4.6 Opus. The result is a model that exposes its thinking process through &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; tags before delivering a final answer, mimicking the extended thinking behavior that Anthropic built into Claude natively. In 4-bit quantization, the entire model fits in about sixteen gigabytes.&lt;/p&gt;
&lt;p&gt;I wanted to know two things. First, whether this kind of distilled reasoning actually works—whether a 27B model can meaningfully replicate the structured thinking of a model orders of magnitude larger. Second, whether the AMD Strix Halo APU, with its unified memory architecture and integrated RDNA 3.5 GPU, could run it at useful speeds. The answer to both turned out to be more nuanced than a simple yes or no.&lt;/p&gt;
&lt;h3&gt;The Hardware&lt;/h3&gt;
&lt;p&gt;The machine is the same &lt;a href="https://tinycomputers.io/posts/amd-ai-max+-395-system-review-a-comprehensive-analysis.html"&gt;AMD Ryzen AI MAX+ 395&lt;/a&gt; that has appeared in several previous posts. It is an APU: CPU and GPU on the same die, sharing the same pool of LPDDR5X memory. There is no PCIe bus between the processor and the graphics engine. There is no dedicated VRAM to fill up. The GPU sees roughly 65GB of addressable memory out of the system's 122GB total, which means a 16GB quantized model loads without any of the memory pressure games you play on discrete GPU setups.&lt;/p&gt;
&lt;p&gt;This matters for local LLM inference because the bottleneck for most language models is memory bandwidth, not compute. Tokens are generated one at a time, each requiring a full pass through the model's weights. The faster you can stream those weights from memory to the processing units, the faster you generate tokens. The Strix Halo's 256-bit LPDDR5X-8000 interface provides roughly 256 GB/s of theoretical bandwidth to the unified memory pool. A discrete GPU like the RTX 4090 has about 1 TB/s of bandwidth to its dedicated VRAM, but the Strix Halo never has to copy weights across a PCIe bus. For models that fit entirely in the GPU's addressable space, the unified architecture eliminates an entire class of overhead.&lt;/p&gt;
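A back-of-envelope calculation makes the bandwidth argument concrete. If every generated token has to stream the full set of quantized weights once, then bandwidth divided by model size gives a hard ceiling on tokens per second (a simplification: real inference also reads the growing KV cache each step):

```python
# Bandwidth ceiling on token generation, assuming one full weight pass
# per token. Real engines also read the KV cache, so this is an upper bound.
model_bytes = 15.4e9            # Q4_K_M file size from the model card
peak_bw = 256e9                 # 256-bit LPDDR5X-8000, theoretical peak
ceiling = peak_bw / model_bytes            # tokens/second upper bound
measured = 10.3                            # observed on the Strix Halo
effective_bw = measured * model_bytes      # implied weight-streaming rate
print(f"ceiling {ceiling:.1f} tok/s; effective bandwidth {effective_bw / 1e9:.0f} GB/s")
```

The measured 10.3 tokens per second implies roughly 160 GB/s of effective weight streaming, a bit over 60% of the theoretical peak, which is a plausible real-world fraction.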
&lt;p&gt;The system runs Ollama 0.17.6, which wraps llama.cpp and provides model management and an HTTP inference API. ROCm 7.2 handles the GPU compute layer; llama.cpp offloads as many of the model's layers to the GPU as fit, running the remainder on the CPU. The &lt;code&gt;gfx1151&lt;/code&gt; GPU target is not yet in the mainline PyTorch or llama.cpp kernel prebuilds, so &lt;code&gt;HSA_OVERRIDE_GFX_VERSION=11.0.0&lt;/code&gt; remains necessary to map it to the closest supported target (gfx1100, Navi 31).&lt;/p&gt;
&lt;h3&gt;The Model&lt;/h3&gt;
&lt;p&gt;The model's architecture is straightforward: Qwen3.5-27B, a 27 billion parameter transformer, fine-tuned via supervised learning on structured reasoning data. What makes it interesting is the training data. The creator assembled three datasets:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning-3000x-filtered"&gt;Opus-4.6-Reasoning-3000x-filtered&lt;/a&gt;&lt;/strong&gt;: Three thousand reasoning trajectories extracted from Claude 4.6 Opus, filtered for quality.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x"&gt;claude-4.5-opus-high-reasoning-250x&lt;/a&gt;&lt;/strong&gt;: Two hundred and fifty examples of high-intensity structured reasoning from an earlier Claude version.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning-700x"&gt;Qwen3.5-reasoning-700x&lt;/a&gt;&lt;/strong&gt;: Seven hundred step-by-step problem-solving examples.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The combined training signal teaches the model to produce output in a specific format: a &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; block containing the reasoning process, followed by a clean final answer. This is architecturally similar to what Anthropic does with Claude's extended thinking, except that Claude's thinking is a native capability of the model's training and architecture, while this is a behavior pattern learned through supervised fine-tuning on examples of that behavior.&lt;/p&gt;
&lt;p&gt;The distinction matters, and I will come back to it.&lt;/p&gt;
&lt;p&gt;The model is distributed in GGUF format, which is the standard for llama.cpp and Ollama. I used the Q4_K_M quantization, which compresses the model's weights from 16-bit floats to 4-bit integers with a mixed precision scheme that preserves more information in attention layers. The file is 15.4GB on disk. The &lt;a href="https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF"&gt;model card&lt;/a&gt; reports 29-35 tokens per second on an RTX 3090; I was curious what the Strix Halo would deliver.&lt;/p&gt;
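The quantization arithmetic is easy to sanity-check: dividing the file size by the parameter count gives the effective bits per weight, which lands above 4 because Q4_K_M keeps some tensors at higher precision and stores per-block scale factors:

```python
# Effective bits per weight for the Q4_K_M file described above.
params = 27e9                  # Qwen3.5-27B parameter count
file_bytes = 15.4e9            # GGUF size on disk
bits_per_weight = file_bytes * 8 / params
print(f"{bits_per_weight:.2f} bits/weight")
```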
&lt;h3&gt;Setting It Up&lt;/h3&gt;
&lt;p&gt;Getting the model running took less than ten minutes. Download the GGUF file from HuggingFace:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;mkdir&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;~/models/qwen35-reasoning
curl&lt;span class="w"&gt; &lt;/span&gt;-L&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;~/models/qwen35-reasoning/model.gguf&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s1"&gt;'https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF/resolve/main/Qwen3.5-27B.Q4_K_M.gguf'&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Note the filename. The HuggingFace repo is named &lt;code&gt;Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF&lt;/code&gt;, but the actual GGUF files inside follow a simpler naming scheme: &lt;code&gt;Qwen3.5-27B.Q4_K_M.gguf&lt;/code&gt;. I wasted time trying to guess the full distilled name before checking the API.&lt;/p&gt;
&lt;p&gt;Create an Ollama Modelfile that imports the local GGUF and sets inference parameters:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;FROM&lt;span class="w"&gt; &lt;/span&gt;/home/alex/models/qwen35-reasoning/model.gguf

PARAMETER&lt;span class="w"&gt; &lt;/span&gt;temperature&lt;span class="w"&gt; &lt;/span&gt;0.6
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;top_p&lt;span class="w"&gt; &lt;/span&gt;0.95
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;num_ctx&lt;span class="w"&gt; &lt;/span&gt;8192
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;repeat_penalty&lt;span class="w"&gt; &lt;/span&gt;1.2
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;stop&lt;span class="w"&gt; &lt;/span&gt;"&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;|endoftext|&amp;gt;"
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;stop&lt;span class="w"&gt; &lt;/span&gt;"&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;|im_end|&amp;gt;"
PARAMETER&lt;span class="w"&gt; &lt;/span&gt;stop&lt;span class="w"&gt; &lt;/span&gt;"&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;|eot_id|&amp;gt;"

SYSTEM&lt;span class="w"&gt; &lt;/span&gt;"You&lt;span class="w"&gt; &lt;/span&gt;are&lt;span class="w"&gt; &lt;/span&gt;a&lt;span class="w"&gt; &lt;/span&gt;deep-thinking&lt;span class="w"&gt; &lt;/span&gt;AI&lt;span class="w"&gt; &lt;/span&gt;assistant.&lt;span class="w"&gt; &lt;/span&gt;For&lt;span class="w"&gt; &lt;/span&gt;complex&lt;span class="w"&gt; &lt;/span&gt;questions,
use&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;think&amp;gt;&lt;/span&gt;...&lt;span class="nt"&gt;&amp;lt;/think&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;tags&lt;span class="w"&gt; &lt;/span&gt;to&lt;span class="w"&gt; &lt;/span&gt;show&lt;span class="w"&gt; &lt;/span&gt;your&lt;span class="w"&gt; &lt;/span&gt;reasoning&lt;span class="w"&gt; &lt;/span&gt;process&lt;span class="w"&gt; &lt;/span&gt;before
providing&lt;span class="w"&gt; &lt;/span&gt;the&lt;span class="w"&gt; &lt;/span&gt;final&lt;span class="w"&gt; &lt;/span&gt;answer."
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;ollama&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;qwen35-reasoning&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;Modelfile
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Ollama copies the GGUF into its own blob store, parses the architecture metadata, and registers it as a runnable model. The whole process takes about a minute on local storage.&lt;/p&gt;
&lt;h3&gt;The Stop Token Problem&lt;/h3&gt;
&lt;p&gt;The first run produced correct output followed by infinite repetition. The model answered a calculus question perfectly, then appended "This gives us the final answer:" and repeated the entire solution, over and over, until it hit the context window limit. The &lt;a href="https://www.marktechpost.com/2026/03/26/a-coding-implementation-to-run-qwen3-5-reasoning-models-distilled-with-claude-style-thinking-using-gguf-and-4-bit-quantization/"&gt;MarkTechPost&lt;/a&gt; article that inspired this experiment did not mention this issue, likely because its test prompts were short enough that the repetition was not obvious.&lt;/p&gt;
&lt;p&gt;The fix is explicit stop tokens in the Modelfile. Without them, the model does not know when to stop generating. This is a common issue with GGUF models imported into Ollama without a proper chat template: the model's native end-of-sequence tokens are not being interpreted by the inference engine. Adding &lt;code&gt;&amp;lt;|endoftext|&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;|im_end|&amp;gt;&lt;/code&gt;, and &lt;code&gt;&amp;lt;|eot_id|&amp;gt;&lt;/code&gt; as stop parameters catches the three most common EOS tokens used by Qwen and Llama-family models.&lt;/p&gt;
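Mechanically, a stop parameter just means the engine scans the accumulated output for any configured stop string and truncates generation there. A minimal sketch of that logic (illustrative only, not Ollama's actual implementation):

```python
def apply_stops(stream, stop_sequences):
    """Accumulate streamed chunks, cutting output at the first stop sequence."""
    out = ""
    for chunk in stream:
        out += chunk
        for stop in stop_sequences:
            idx = out.find(stop)
            if idx != -1:
                return out[:idx]   # drop the marker and everything after it
    return out

# A model that never emits a stop marker runs until some outer limit,
# which is exactly the infinite-repetition failure described above.
tokens = ["The answer", " is 42.", "<|im_end|>", " The answer is 42."]
print(apply_stops(tokens, ["<|im_end|>", "<|endoftext|>"]))  # → The answer is 42.
```

Scanning the accumulated string rather than individual chunks matters: a stop sequence can arrive split across two streamed tokens.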
&lt;p&gt;The &lt;code&gt;repeat_penalty&lt;/code&gt; of 1.2 provides a second layer of defense by penalizing the model for reusing recent tokens. This helps but is not sufficient on its own. Without the stop tokens, the model can produce novel-but-meaningless text that avoids exact repetition while still degenerating into nonsense. More on this shortly.&lt;/p&gt;
&lt;h3&gt;Where It Works: Structured Problems&lt;/h3&gt;
&lt;p&gt;With the stop tokens in place, the model performs well on structured mathematical and analytical problems. I gave it a calculus question: find the derivative of x³sin(x) using the product rule.&lt;/p&gt;
&lt;p&gt;The response was genuinely good. The model opened a &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; block, identified the two component functions, recalled the product rule formula, computed each derivative, and applied the rule. Then it closed the think block and produced a clean, well-formatted answer with LaTeX notation, step-by-step derivation, and a factored final form. The thinking trace was coherent and tracked the actual reasoning process. It was not filler; each line in the trace corresponded to a meaningful step.&lt;/p&gt;
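The answer is easy to verify independently. The product rule gives d/dx[x³sin x] = 3x²sin x + x³cos x, which factors to x²(3sin x + x cos x); a central finite difference confirms it (my own check, not part of the model's output):

```python
import math

def f(x):
    return x**3 * math.sin(x)

def f_prime(x):
    # Product rule result, factored: x^2 (3 sin x + x cos x)
    return x**2 * (3 * math.sin(x) + x * math.cos(x))

# Compare against a central finite difference at a few sample points.
h = 1e-6
for x in (0.5, 1.3, 2.7):
    numeric = (f(x + h) - f(x - h)) / (2 * h)
    assert abs(numeric - f_prime(x)) < 1e-5
print("product-rule derivative checks out")
```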
&lt;p&gt;Generation speed on the Strix Halo: &lt;strong&gt;10.3 tokens per second&lt;/strong&gt;. Not fast by cloud standards, but responsive enough for interactive use. You see the thinking appear in real time, which is surprisingly useful: you can watch the model work through the problem and catch errors before it commits to a final answer.&lt;/p&gt;
&lt;p&gt;For structured problems—mathematics, code analysis, formal logic—the distilled reasoning is genuinely functional. The model identifies subproblems, works through them sequentially, and arrives at correct answers. The think tags provide transparency into the process that you do not get from a standard instruction-tuned model.&lt;/p&gt;
&lt;h3&gt;Where It Falls Apart: The River Crossing&lt;/h3&gt;
&lt;p&gt;I ran the classic &lt;a href="https://en.wikipedia.org/wiki/Wolf,_goat_and_cabbage_problem"&gt;wolf-goat-cabbage river crossing&lt;/a&gt; puzzle as a comparison test, the same prompt on both the distilled Qwen model and Claude Haiku 4.5 via the Anthropic API.&lt;/p&gt;
&lt;p&gt;Claude Haiku returned a perfect, concise seven-step solution in 2.9 seconds. Two hundred and twenty-three tokens. The answer identified the critical insight (bring the goat back on one return trip), laid out the sequence clearly, and stopped.&lt;/p&gt;
&lt;p&gt;The Qwen model started well. It correctly identified that the goat must go first, recognized the wolf-goat conflict at the destination, and identified the need to bring the goat back. Then, around step three of the solution, the model began editorializing. "Oh joy what fun times ahead us humans truly enjoy sometimes huh?!" it wrote, mid-solution. Within a few more sentences, the output had degenerated into an unbroken stream-of-consciousness rant that cascaded into a wall of increasingly disconnected words. Not repeated words—the repeat penalty prevented that—but a firehose of unique, semantically null text that continued until it filled the entire 8,192-token context window.&lt;/p&gt;
&lt;p&gt;The output was, to use a technical term, unhinged. The model went from a correct partial solution to word salad in about two hundred tokens, and there was no recovery. The stop tokens could not save it because the model was not producing any end-of-sequence markers. It had entered a mode where it was generating fluent English syntax with zero semantic content, which is exactly the kind of failure that stop tokens and repeat penalties cannot catch.&lt;/p&gt;
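For reference, the puzzle the model failed is mechanically trivial: the state space has only sixteen configurations (ten of them safe), and a breadth-first search finds the minimal seven-crossing plan instantly. A small solver, for anyone who wants the ground truth:

```python
from collections import deque

# A state is the bank (0 = start, 1 = far side) of (farmer, wolf, goat, cabbage).
def safe(state):
    farmer, wolf, goat, cabbage = state
    if wolf == goat and farmer != wolf:        # wolf eats goat unattended
        return False
    if goat == cabbage and farmer != goat:     # goat eats cabbage unattended
        return False
    return True

def solve():
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path                        # BFS guarantees a minimal plan
        farmer = state[0]
        # The farmer crosses alone (None) or with one same-bank passenger.
        for passenger in (None, 1, 2, 3):
            if passenger is not None and state[passenger] != farmer:
                continue
            nxt = list(state)
            nxt[0] = 1 - farmer
            if passenger is not None:
                nxt[passenger] = 1 - farmer
            nxt = tuple(nxt)
            if safe(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [passenger]))

print(len(solve()), "crossings")   # → 7 crossings
```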
&lt;h3&gt;What the Comparison Reveals&lt;/h3&gt;
&lt;p&gt;The numbers tell the story concisely:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Claude Haiku 4.5&lt;/th&gt;
&lt;th&gt;Qwen3.5-27B (Strix Halo)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.9 seconds&lt;/td&gt;
&lt;td&gt;Hit 8K context limit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;75.9 tok/s&lt;/td&gt;
&lt;td&gt;~10 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;223 tokens, correct&lt;/td&gt;
&lt;td&gt;Thousands of tokens, degenerated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.0009&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;But the comparison is not really about speed or cost. It is about the difference between native reasoning and distilled reasoning.&lt;/p&gt;
&lt;p&gt;Claude's extended thinking is a capability that emerges from the model's architecture and training at scale. The model has internalized what it means to reason through a problem, including knowing when to stop, when a line of reasoning is unproductive, and when to switch strategies. These are meta-cognitive skills that are extremely difficult to distill.&lt;/p&gt;
&lt;p&gt;The Qwen model learned the &lt;em&gt;format&lt;/em&gt; of reasoning—the think tags, the step-by-step structure, the pattern of stating subproblems and working through them—from three thousand examples. What it did not learn, and arguably cannot learn from supervised fine-tuning alone, is the judgment about when reasoning is going off the rails. A model that has truly internalized reasoning has implicit quality checks: it recognizes incoherence in its own output and corrects course. A model that has learned to &lt;em&gt;mimic&lt;/em&gt; reasoning produces the surface pattern without the underlying self-monitoring.&lt;/p&gt;
&lt;p&gt;This is visible in the failure mode. The model did not produce wrong reasoning. It produced &lt;em&gt;no&lt;/em&gt; reasoning. It exited the reasoning pattern entirely and entered a generation mode that had nothing to do with the problem. A model with genuine reasoning capability would have recognized the incoherence and either corrected or terminated. The distilled model had no such circuit breaker.&lt;/p&gt;
&lt;h3&gt;The Economics&lt;/h3&gt;
&lt;p&gt;The cost comparison deserves its own section because it is often cited as the primary motivation for running local models.&lt;/p&gt;
&lt;p&gt;The Claude Haiku API call cost nine-tenths of a cent. If you ran a thousand similar queries per day, you would spend about ninety cents a day, or roughly twenty-seven dollars a month. That is still several times the electricity cost of running the Strix Halo: the machine draws roughly 65 watts at idle and 150 watts under GPU inference load, and at Minnesota's residential electricity rate of around twelve cents per kilowatt-hour, running inference eight hours a day costs about fourteen cents. But the hardware itself cost north of two thousand dollars. At a savings of well under a dollar a day, you would need years of steady use to reach cost parity with the API, and that is only if you value your debugging time at zero.&lt;/p&gt;
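The break-even arithmetic, using the rates above (the hardware price and utility rate are the rough figures from the text):

```python
# Daily API spend vs. local electricity, plus hardware amortization.
api_cost_per_query = 0.0009          # USD, the Claude Haiku call above
queries_per_day = 1000
api_per_day = api_cost_per_query * queries_per_day        # $0.90/day

watts_under_load = 150
hours_per_day = 8
rate_per_kwh = 0.12                  # Minnesota residential, approximate
power_per_day = watts_under_load / 1000 * hours_per_day * rate_per_kwh

hardware = 2000                      # USD, rough system cost
days_to_parity = hardware / (api_per_day - power_per_day)
print(f"API ${api_per_day:.2f}/day vs power ${power_per_day:.2f}/day; "
      f"break-even ≈ {days_to_parity:.0f} days")
```

At these rates the break-even point is on the order of seven years of daily use, which is why the per-query cost argument for local inference does not hold up on its own.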
&lt;p&gt;The economic case for local inference is not about per-query cost. It is about use cases where you need unlimited queries without metering, where data cannot leave your network, or where you want to experiment with model behavior without worrying about a bill. If you are evaluating a model's failure modes by running hundreds of adversarial prompts—which is exactly what I was doing—the local model is the right tool because you are not optimizing for answer quality. You are optimizing for the freedom to explore.&lt;/p&gt;
&lt;h3&gt;The Strix Halo as an Inference Platform&lt;/h3&gt;
&lt;p&gt;Ten tokens per second for a 27B Q4 model is respectable for an APU. It is not competitive with a discrete GPU: an RTX 3090 delivers 29-35 tokens per second on the same model, roughly three times faster. But the Strix Halo was not designed to compete with discrete GPUs on raw throughput.&lt;/p&gt;
&lt;p&gt;What it offers instead is capacity. The unified memory pool means you can load models that would not fit on most consumer GPUs. A Q8_0 quantization of this same model would be 28.6GB, which exceeds the VRAM of an RTX 4090 (24GB) but fits comfortably in the Strix Halo's addressable space. You could load a 70B Q4 model (roughly 40GB) without any of the layer-splitting gymnastics required on multi-GPU setups. I have run Llama 3.1 70B Q4 on this machine, and while the generation speed drops to about 4-5 tokens per second, it runs without errors or memory pressure.&lt;/p&gt;
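The same bits-per-weight arithmetic predicts which quantizations fit where. A rough sizing check (using ~8.5 bits/weight for Q8_0's values plus block scales, and the ~4.6 bits/weight observed for the Q4_K_M file above):

```python
# Rough GGUF sizes from bits-per-weight, against two memory budgets.
def gguf_gb(params_billions, bits_per_weight):
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

q8_27b = gguf_gb(27, 8.5)    # Q8_0: 8-bit values plus per-block scales
q4_70b = gguf_gb(70, 4.56)   # Q4_K_M average from the 27B file
for name, size in [("27B Q8_0", q8_27b), ("70B Q4_K_M", q4_70b)]:
    print(f"{name}: {size:.1f} GB — "
          f"{'fits' if size < 65 else 'exceeds'} the Strix Halo's 65 GB, "
          f"{'fits' if size < 24 else 'exceeds'} a 24 GB RTX 4090")
```

Both estimates land within a gigabyte of the figures quoted above: neither model fits a 24 GB card, and both fit the unified pool.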
&lt;p&gt;For a machine that also serves as a daily desktop, development workstation, and &lt;a href="https://tinycomputers.io/posts/ltx-api.html"&gt;video generation server&lt;/a&gt; (it runs LTX-2.3 on the same hardware), the ability to casually load and test a 27B reasoning model without dedicated GPU infrastructure is the actual value proposition. You do not plan a session. You do not allocate resources. You type &lt;code&gt;ollama run qwen35-reasoning&lt;/code&gt; and it works.&lt;/p&gt;
&lt;h3&gt;Lessons for Replication&lt;/h3&gt;
&lt;p&gt;If you want to replicate this setup, here is what I would emphasize:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The stop tokens are non-negotiable.&lt;/strong&gt; Without explicit &lt;code&gt;&amp;lt;|endoftext|&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;|im_end|&amp;gt;&lt;/code&gt;, and &lt;code&gt;&amp;lt;|eot_id|&amp;gt;&lt;/code&gt; stop parameters in your Modelfile, the model will produce infinite output on many prompts. This is not documented in the model card and is not mentioned in the MarkTechPost article that covers this implementation. It is the single most important configuration detail.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The model is good at structured problems and bad at open-ended ones.&lt;/strong&gt; Mathematics, code analysis, formal logic—anything where the reasoning has a clear structure and a definitive endpoint—works well. Open-ended problems, creative tasks, or anything requiring sustained coherent narrative are risky. The model can degenerate catastrophically and without warning.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A repeat penalty helps but does not solve the fundamental issue.&lt;/strong&gt; Setting &lt;code&gt;repeat_penalty&lt;/code&gt; to 1.2 prevents exact repetition loops but does not prevent the semantic degeneration I observed on the river crossing problem. The model simply produces unique garbage instead of repeated garbage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Distillation captures form, not judgment.&lt;/strong&gt; The think tags are real and useful. The step-by-step reasoning format works. What is missing is the implicit self-monitoring that frontier models have: the ability to recognize when their own output has become incoherent and to course-correct. This is probably the hardest thing to distill, because it is not present in the training examples. The examples show successful reasoning. They do not show the model catching and recovering from failed reasoning, because Claude's failed reasoning attempts are filtered out before the training data is assembled.&lt;/p&gt;
&lt;h3&gt;Where This Goes&lt;/h3&gt;
&lt;p&gt;The distilled reasoning model is, despite its failure modes, genuinely interesting. The &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; tags provide a form of transparency that standard instruction-tuned models lack. When the model is working correctly—which is most of the time on appropriate tasks—you get a window into the reasoning process that helps you evaluate the answer's quality before you act on it.&lt;/p&gt;
&lt;p&gt;The failure mode is also instructive. It demonstrates, concretely, the gap between learning a behavior pattern and internalizing the capability that produces that pattern. Supervised fine-tuning on reasoning trajectories can teach a model to produce reasoning-shaped output, but it cannot, from three thousand examples, teach the model to actually reason in the way the source model does. That requires either far more training data, a different training methodology (reinforcement learning from reasoning feedback, perhaps), or simply a larger model with more capacity to internalize the underlying patterns.&lt;/p&gt;
&lt;p&gt;For now, the practical advice is: use these models for what they are good at, know their failure modes, and do not trust the output on open-ended problems without reading the thinking trace. The trace is the feature. If the trace is coherent, the answer is probably good. If the trace starts to wander, stop reading and retry.&lt;/p&gt;
&lt;p&gt;The model runs on my desk, generates ten tokens per second, costs nothing per query, and shows its work. For a sixteen-gigabyte download and ten minutes of setup time, that is a reasonable deal—as long as you know what you are buying.&lt;/p&gt;</description><category>amd</category><category>chain-of-thought</category><category>claude</category><category>distillation</category><category>gguf</category><category>inference</category><category>llm</category><category>ollama</category><category>open-source</category><category>quantization</category><category>qwen</category><category>reasoning</category><category>strix halo</category><guid>https://tinycomputers.io/posts/distilled-reasoning-on-strix-halo-qwen35-claude-thinking.html</guid><pubDate>Sun, 29 Mar 2026 14:00:00 GMT</pubDate></item><item><title>The Mathematics of PCB Trace Routing</title><link>https://tinycomputers.io/posts/the-mathematics-of-pcb-trace-routing.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/the-mathematics-of-pcb-trace-routing_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;24 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Every PCB design eventually arrives at the same moment. Components are placed. Nets are defined. The ratsnest of thin lines connecting pad to pad looks like a plate of spaghetti dropped on a cutting board. Now someone, or something, has to turn that mess into real copper traces that don't cross, don't short, and fit within the design rules. That's the routing problem.&lt;/p&gt;
&lt;p&gt;For hobbyists and professionals alike, autorouters do this work. You press a button, wait, and traces appear. But what actually happens during that wait? The answer turns out to involve some of the most elegant mathematics in computer science, and some surprisingly hard geometric constraints that no algorithm can finesse.&lt;/p&gt;
&lt;p&gt;I've been using &lt;a href="https://baud.rs/bdZw62"&gt;Freerouting&lt;/a&gt;, the open-source Specctra autorouter, for two PCB projects now: a &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;dual Z80 RetroShield&lt;/a&gt; and a &lt;a href="https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html"&gt;level-shifter shield for the Arduino Giga R1&lt;/a&gt;. The second project pushed Freerouting to its limits in ways that forced me to understand how it works internally. This is what I found when I read the source code.&lt;/p&gt;
&lt;h3&gt;Not a Grid, Not a Maze&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/giga-shield/giga_shield_freerouted_top.png" alt="Freerouting result on the Giga Shield: 2-layer board with 45-degree trace routing between TSSOP-24 ICs and pin headers, rendered in pcb-rnd photo mode" style="float: right; max-width: 420px; margin: 0 0 1em 1.5em; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15);"&gt;
&lt;em&gt;Giga Shield routed by Freerouting in 45-degree mode. Top layer, rendered in pcb-rnd.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Most descriptions of PCB autorouting start with &lt;a href="https://baud.rs/pblLmT"&gt;Lee's maze algorithm&lt;/a&gt; from 1961. Place the board on a grid. Flood-fill from the source pad. When the wave hits the destination, backtrack along the shortest path. It's intuitive, easy to implement, and used in introductory EDA courses everywhere.&lt;/p&gt;
&lt;p&gt;Freerouting doesn't do this.&lt;/p&gt;
&lt;p&gt;Instead of discretizing the board into a grid of cells, Freerouting operates on a continuous geometric plane. The routing space is partitioned into convex polygonal regions called expansion rooms. Each room is a chunk of free space on one layer of the board, bounded by the edges of existing obstacles (traces, vias, pads) plus their clearance halos. The rooms aren't precomputed. They're generated lazily during the search, grown on demand as the router explores new areas.&lt;/p&gt;
&lt;p&gt;This is a shape-based router, sometimes called a free-space router. The distinction matters. A grid-based router's resolution is fixed: if your grid is 0.1mm, you can't route a trace at 0.05mm offset from an obstacle, even if the design rules would allow it. A shape-based router has no such limitation. It works with exact geometry (integer-valued coordinates for precision), and the routing channels it discovers are as wide or narrow as the physical clearances actually allow.&lt;/p&gt;
&lt;p&gt;Three geometry modes control the shape of the rooms:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Room Shape&lt;/th&gt;
&lt;th&gt;Allowed Trace Angles&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;90-degree&lt;/td&gt;
&lt;td&gt;Axis-aligned rectangles&lt;/td&gt;
&lt;td&gt;Horizontal, vertical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;45-degree&lt;/td&gt;
&lt;td&gt;Octagons&lt;/td&gt;
&lt;td&gt;Plus 45-degree diagonals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Any-angle&lt;/td&gt;
&lt;td&gt;General convex polygons&lt;/td&gt;
&lt;td&gt;Unrestricted&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The choice affects both routing quality and performance. Axis-aligned rectangles are fastest to compute and intersect. Octagons allow the 45-degree traces common in modern PCBs. General polygons give the router maximum freedom but at a computational cost.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/giga-shield/giga_shield_freerouted_anyangle.png" alt="Freerouting any-angle mode: traces radiate from pads at arbitrary angles rather than snapping to a 45-degree grid, showing the difference between shape-based and grid-based routing" style="float: left; max-width: 420px; margin: 0 1.5em 1em 0; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15);"&gt;
&lt;em&gt;Same board in any-angle mode. Traces follow direct paths instead of 45-degree snapping.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The image to the left shows the same board routed in any-angle mode. Notice how traces leave pads at arbitrary angles, following straight-line paths toward their destinations rather than snapping to a 45-degree grid. Compare this with the image above, which used the standard 45-degree octagon mode. The any-angle result has shorter total trace length but can be harder to manufacture cleanly at tight tolerances.&lt;/p&gt;
&lt;h3&gt;The A* Core&lt;/h3&gt;
&lt;p&gt;At its heart, Freerouting's search algorithm is A*, the same algorithm that drives pathfinding in video games, robot navigation, GPS routing, and network packet delivery. A* was published by Peter Hart, Nils Nilsson, and Bertram Raphael at the &lt;a href="https://baud.rs/pqg9oG"&gt;Stanford Research Institute&lt;/a&gt; in 1968. Nearly sixty years later, it remains the standard algorithm for finding &lt;a href="https://baud.rs/XEVv2I"&gt;shortest paths in weighted graphs&lt;/a&gt; where a heuristic estimate of remaining distance is available.&lt;/p&gt;
&lt;p&gt;The mathematical foundation is straightforward. A* maintains a priority queue of candidate states, each with a cost value:&lt;/p&gt;
&lt;p&gt;$$f(n) = g(n) + h(n)$$&lt;/p&gt;
&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;g(n)&lt;/code&gt; is the actual accumulated cost from the start to state &lt;em&gt;n&lt;/em&gt;. In PCB routing, this includes trace length, layer changes (vias), preferred-direction penalties, and any ripped-up obstacle costs.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;h(n)&lt;/code&gt; is a heuristic estimate of the remaining cost from &lt;em&gt;n&lt;/em&gt; to the destination. This must be admissible: it must never overestimate the true remaining cost.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;f(n)&lt;/code&gt; is the total estimated cost of the best path through &lt;em&gt;n&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At each step, A* pops the state with the lowest f(n) from the queue, expands its neighbors, and adds them back with updated costs. When the destination is popped, the algorithm has found the optimal path (given an admissible heuristic).&lt;/p&gt;
&lt;p&gt;The key insight is the heuristic. Without it, A* degenerates into &lt;a href="https://baud.rs/XEVv2I"&gt;Dijkstra's algorithm&lt;/a&gt;, which explores in all directions equally. A good heuristic focuses the search toward the destination. In Freerouting's case, &lt;code&gt;DestinationDistance.calculate()&lt;/code&gt; estimates the minimum cost to reach the target, accounting for both planar distance and any required layer transitions. The sorting value in the priority queue is computed as:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sorting_value&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;expansion_value&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;destination_distance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;calculate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shape_entry_middle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;layer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Where &lt;code&gt;expansion_value&lt;/code&gt; is the g(n) accumulated cost, and the distance calculation is h(n). This is textbook A*.&lt;/p&gt;
&lt;h4&gt;Why A* Works So Well&lt;/h4&gt;
&lt;p&gt;A* has a remarkable optimality guarantee. If the heuristic h(n) is admissible (never overestimates), A* is guaranteed to find the shortest path. If h(n) is also consistent (satisfying the triangle inequality: h(n) &amp;lt;= cost(n, n') + h(n') for every neighbor n'), then A* never needs to re-expand a state it has already visited. This makes it both optimal and efficient.&lt;/p&gt;
&lt;p&gt;For PCB routing, the Euclidean distance between the current position and the destination pad is a natural admissible heuristic: a straight line is always shorter than any actual route that must navigate around obstacles. Freerouting's heuristic is somewhat more sophisticated, incorporating via costs for layer transitions, but the principle is the same.&lt;/p&gt;
&lt;p&gt;The efficiency gain over brute-force search is dramatic. &lt;a href="https://baud.rs/XEVv2I"&gt;Dijkstra's algorithm&lt;/a&gt; (A* with h(n) = 0) explores states in concentric rings outward from the source. On a board with N searchable regions, it visits O(N) states. A* with a good heuristic carves a narrow corridor from source to destination, visiting far fewer states. In practice, on a moderately complex board, this is the difference between milliseconds and minutes per connection.&lt;/p&gt;
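&lt;p&gt;The effect of the heuristic is easy to demonstrate on a toy grid. The sketch below (illustrative Python, not Freerouting's Java; the grid, the wall, and the unit costs are invented for the example) runs the same search twice, once with a Euclidean h(n) and once with h(n) = 0, and counts how many states each run expands:&lt;/p&gt;

```python
import heapq
import math

def astar(start, goal, blocked, w, h, heuristic):
    """Grid A*; returns (path_cost, states_expanded)."""
    def hfun(p):
        return heuristic(p, goal)
    dist = {start: 0.0}
    pq = [(hfun(start), start)]   # priority queue ordered by f = g + h
    expanded = 0
    while pq:
        f, node = heapq.heappop(pq)
        if f > dist[node] + hfun(node):   # stale queue entry, skip it
            continue
        expanded += 1
        if node == goal:
            return dist[node], expanded
        x, y = node
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < w and 0 <= nxt[1] < h and nxt not in blocked:
                g = dist[node] + 1        # g(n): accumulated cost so far
                if g < dist.get(nxt, math.inf):
                    dist[nxt] = g
                    heapq.heappush(pq, (g + hfun(nxt), nxt))
    return math.inf, expanded

euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
zero = lambda a, b: 0.0   # h(n) = 0 degenerates into Dijkstra

wall = {(10, y) for y in range(18)}   # vertical wall with a gap at the top
cost_astar, exp_astar = astar((0, 10), (19, 10), wall, 20, 20, euclid)
cost_dijk, exp_dijk = astar((0, 10), (19, 10), wall, 20, 20, zero)
assert cost_astar == cost_dijk == 35  # same optimal path length
assert exp_astar < exp_dijk           # but A* expands fewer states
```

&lt;p&gt;Both runs find the same 35-step path around the wall, but the zero-heuristic run expands every reachable cell closer than the goal, while the Euclidean run prunes the regions the detour never needs to enter.&lt;/p&gt;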
&lt;h4&gt;A* Is Everywhere&lt;/h4&gt;
&lt;p&gt;The same algorithm, with different cost functions and heuristics, solves an astonishing range of problems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Game pathfinding. Every real-time strategy game since the 1990s uses A* to move units around obstacles. The grid cells are the states, movement cost is g(n), and Manhattan or Euclidean distance to the target is h(n).&lt;/li&gt;
&lt;li&gt;GPS navigation. Road networks are weighted graphs. Edge weights are travel times. A* with geographic distance as the heuristic finds near-optimal routes across millions of road segments.&lt;/li&gt;
&lt;li&gt;Robot motion planning. A robot's configuration space (position, orientation, joint angles) is the state space. A* finds collision-free paths from one configuration to another.&lt;/li&gt;
&lt;li&gt;Natural language processing. Viterbi decoding, which finds the most likely sequence of hidden states in a Hidden Markov Model, is structurally similar to A* over a trellis graph.&lt;/li&gt;
&lt;li&gt;Puzzle solving. The 15-puzzle, &lt;a href="https://baud.rs/sKzgs4"&gt;Rubik's Cube&lt;/a&gt;, Sokoban. A* with an appropriate heuristic solves them all optimally, and the heuristic is what makes the search tractable rather than exponential.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What makes A* general is the abstraction. It doesn't care whether the "states" are grid squares, road intersections, robot poses, or polygonal rooms on a PCB layer. It only needs a cost function, a heuristic, and a neighbor-expansion rule. Freerouting provides all three, with the unusual twist that its states are dynamically-computed convex polygons rather than fixed graph nodes.&lt;/p&gt;
&lt;h3&gt;But A* Only Routes One Net&lt;/h3&gt;
&lt;p&gt;Here's the catch. A* finds the optimal path for a single source-destination pair. A PCB has hundreds of nets, all competing for the same physical space. Route net A first, and it might block the optimal path for net B. Route net B first, and net A suffers instead. The quality of the overall routing depends heavily on the order in which nets are processed.&lt;/p&gt;
&lt;p&gt;Freerouting handles this with rip-up-and-reroute, a strategy from the 1970s that remains the standard approach. The idea is simple: route all nets in some initial order. When a net fails (no path exists without violating design rules), rip up one or more blocking traces and add them to a retry queue. Then try again with different priorities.&lt;/p&gt;
&lt;p&gt;The implementation in &lt;code&gt;BatchAutorouter.java&lt;/code&gt; runs multiple passes over the board. On each pass, every unrouted connection is attempted. The critical detail is how ripup decisions are made. Each existing trace has a ripup cost, and the cost increases linearly with the pass number:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;ripup_cost&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;start_ripup_costs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;passNumber&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Early passes are conservative: the router avoids tearing up existing routes. Later passes become progressively more aggressive, willing to rip up more traces to find solutions. This is a controlled escalation that prevents the router from thrashing (endlessly ripping and re-routing the same nets) while still allowing it to escape local minima.&lt;/p&gt;
&lt;p&gt;The scheduler also implements a limited form of backtracking. Every few passes, the router checks whether the board score (total unrouted connections, via count, trace length) has improved. If not, it restores a previously saved board snapshot and continues from that earlier state. This is a coarse approximation of simulated annealing: occasionally accepting a worse intermediate state to explore a different region of the solution space.&lt;/p&gt;
&lt;h4&gt;Net Ordering: The Hidden Variable&lt;/h4&gt;
&lt;p&gt;The order in which nets are routed has an outsized effect on the result. By default, Freerouting routes nets in the order they appear in the DSN file, which is typically the order they were defined in the schematic. There's no sorting by airline length, fan-out degree, or criticality. The router's source code contains a commented-out sort-by-distance that was disabled in v2.3 because it "negatively impacts convergence."&lt;/p&gt;
&lt;p&gt;This means the same board can produce different routing results depending on how the DSN file was generated. I exploited this during the Giga Shield project by writing a script (&lt;code&gt;shuffle_dsn.py&lt;/code&gt;) that generates dozens of copies of the same DSN file with randomized net ordering:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n_copies&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;shuffled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;31337&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shuffle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shuffled&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# write shuffled DSN...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each copy routes nets in a different sequence, converging to a different local optimum. Running 128 parallel Freerouting instances across three machines (a local Mac, a 64-core server, and a 32-core workstation) explored 128 different regions of the solution space simultaneously. The best result was measurably better than any single run. This is an embarrassingly parallel optimization: each job is independent, and you keep the best answer.&lt;/p&gt;
&lt;p&gt;The takeaway: if your autorouter isn't finding a clean solution, the problem might not be the algorithm. It might be the ordering. Changing the input order is cheaper than changing the router.&lt;/p&gt;
&lt;h3&gt;The Optimization Phase&lt;/h3&gt;
&lt;p&gt;After the initial routing passes, Freerouting enters an optimization phase controlled by the &lt;code&gt;-mp&lt;/code&gt; flag. This phase iterates over every existing via and trace in the design, processing them in a left-to-right spatial scan.&lt;/p&gt;
&lt;p&gt;For each item, the optimizer:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Rips up the item's entire connection (all traces and vias for that net segment)&lt;/li&gt;
&lt;li&gt;Re-runs up to 6 passes of the A*-based autorouter on just that connection&lt;/li&gt;
&lt;li&gt;Accepts the result only if it reduces via count or total trace length&lt;/li&gt;
&lt;li&gt;Restores the previous state if the re-route was no better&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Vias are visited before traces, reflecting the priority of via reduction. Each unnecessary via adds manufacturing cost, signal integrity degradation, and parasitic capacitance. The optimizer also alternates between preferred and non-preferred trace directions on successive passes, preventing the solution from getting stuck in a directional rut.&lt;/p&gt;
&lt;p&gt;Via positions themselves are fine-tuned by a separate algorithm (&lt;code&gt;OptViaAlgo&lt;/code&gt;). For vias connecting exactly two traces, the optimizer searches for the position that minimizes the combined weighted trace length on both layers, iteratively nudging the via toward the geometric optimum.&lt;/p&gt;
&lt;p&gt;The result of the optimization phase is typically a 15-30% reduction in via count and a 10-20% reduction in total trace length compared to the initial routing. On the Giga Shield, 60 optimization passes ran for about 45 minutes and brought the via count from ~220 down to ~158.&lt;/p&gt;
&lt;h3&gt;Why Freerouting Can't Do Copper Pours&lt;/h3&gt;
&lt;p&gt;This is where the elegance of the algorithm runs headfirst into a hard architectural limit.&lt;/p&gt;
&lt;p&gt;Every non-trivial PCB has a ground net that connects to dozens or hundreds of pads. The standard solution in commercial EDA tools is a copper pour: a filled polygon that covers an entire layer (or most of it), with clearance cutouts around non-ground features and thermal relief connections to ground pads. You don't route GND with traces. You flood-fill it.&lt;/p&gt;
&lt;p&gt;Freerouting cannot do this.&lt;/p&gt;
&lt;p&gt;The limitation isn't a missing feature that could be added with a few hundred lines of code. It's structural. Freerouting's entire architecture is built around point-to-point trace routing. The maze search, the rip-up scheduler, the optimizer: they all operate on individual connections between pairs of pads. A copper pour is a fundamentally different object. It's not a path from A to B. It's a region that grows to fill available space, adapting its shape around every obstacle on the layer.&lt;/p&gt;
&lt;p&gt;In the source code, copper pours are represented as &lt;code&gt;ConductionArea&lt;/code&gt; objects with a fixed shape set at import time. When the autorouter encounters a net that already has a &lt;code&gt;ConductionArea&lt;/code&gt;, it simply returns &lt;code&gt;CONNECTED_TO_PLANE&lt;/code&gt; and considers the job done. There's no flood-fill algorithm. There's no thermal relief generation. The router expects that the EDA tool (KiCad, pcb-rnd, etc.) has already computed the pour geometry before the DSN file was exported.&lt;/p&gt;
&lt;p&gt;For foreign nets (anything that isn't the pour's net), the &lt;code&gt;ConductionArea&lt;/code&gt; is treated as a hard obstacle. Traces can't cross it. Vias can't be placed inside it. The router routes around it as if it were a solid wall. This is exactly right from a clearance perspective, but it means the router has no ability to create, modify, or extend a pour during the routing process.&lt;/p&gt;
&lt;p&gt;The practical impact is severe for boards with fine-pitch surface-mount parts. On the Giga Shield, each &lt;a href="https://baud.rs/zQqo34"&gt;SN74LVC8T245PW&lt;/a&gt; (TSSOP-24) has three GND pins at 0.65mm pitch. The gap between adjacent pads is roughly 0.25mm. A via needs approximately 0.9mm of space (drill diameter plus annular ring plus clearance). There is physically no room to place a via next to a TSSOP-24 GND pad and connect it to a GND trace on another layer. The router can see the GND pad, it can see that it needs to be connected to other GND pads, but it cannot find a valid path because there is no valid path using its vocabulary of traces and vias.&lt;/p&gt;
&lt;p&gt;A copper pour solves this trivially. The pad sits directly on (or thermally connects to) the pour polygon. No via needed. No trace routing needed. The connectivity is implied by physical overlap. But this is a concept that simply doesn't exist in Freerouting's model of the world.&lt;/p&gt;
&lt;p&gt;On the Giga Shield project, this limitation manifested as a hard floor of 5-6 unrouted GND connections that no amount of optimization could resolve. I threw 128 parallel instances at the problem across three machines. I tried 2-layer, 4-layer, and 6-layer board configurations. I wrote custom post-processing scripts to add GND vias and MST-based bottom-layer routing. None of it worked within DRC constraints. The geometry was simply too tight. We ended up solving it with a different tool entirely, which is a story for the &lt;a href="https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html"&gt;next article in this series&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;The Shove Machine&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/giga-shield/giga_shield_freerouted_bottom.png" alt="Bottom layer of the Freerouting result: dense trace routing showing how the shove algorithm packs traces tightly between through-hole pin rows" style="float: right; max-width: 420px; margin: 0 0 1em 1.5em; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15);"&gt;
&lt;em&gt;Bottom layer. The shove algorithm packs traces tightly between through-hole pin rows.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;One of Freerouting's more sophisticated subsystems is its forced insertion with shove mechanism. When the A* search finds that the optimal path for a new trace passes through space occupied by an existing trace, the router doesn't immediately give up or rip up the obstacle. Instead, it tries to push the obstacle aside.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;ForcedPadAlgo&lt;/code&gt; and &lt;code&gt;ShoveTraceAlgo&lt;/code&gt; classes implement this recursively. When a new trace needs to go where an existing trace is, the existing trace is nudged perpendicular to the new trace's path. If that nudge collides with a third trace, the third trace is nudged too, and so on, up to a configurable recursion depth (default: 20 levels for traces, 5 for vias). Only if the shove cascade exceeds this depth does the router fall back to ripping up the blocking item.&lt;/p&gt;
&lt;p&gt;This is the routing equivalent of parallel parking in a tight spot. Instead of abandoning the space, you bump the neighboring cars just enough to fit. It produces much denser routing than a pure rip-up approach, especially on boards with tight clearances and many competing nets.&lt;/p&gt;
&lt;p&gt;After every trace insertion, a pull-tight pass (&lt;code&gt;PullTightAlgo&lt;/code&gt;) smooths and shortens all traces in the affected area. This is a local optimization that removes unnecessary corners, straightens diagonal segments, and reduces total trace length. The combination of global A* search, local shove, and pull-tight smoothing produces routing quality that is competitive with commercial autorouters.&lt;/p&gt;
&lt;h3&gt;Clearance Compensation: A Geometry Trick&lt;/h3&gt;
&lt;p&gt;One implementation detail worth highlighting is how Freerouting handles clearance checking. Rather than testing "does this trace violate clearance with that via?" as a separate geometric predicate, Freerouting inflates every item's shape by its clearance value when storing it in the search tree. A trace with 0.254mm clearance is stored as a shape 0.254mm wider on each side. A via with 0.127mm clearance is stored as a circle 0.127mm larger in radius.&lt;/p&gt;
&lt;p&gt;This transforms all clearance checks into simple overlap tests. If two inflated shapes overlap in the search tree, there's a clearance violation. If they don't, there isn't. No separate clearance computation is needed during routing. The free-space rooms computed by the maze search are automatically clearance-legal by construction, because they're defined as the gaps between pre-inflated obstacles.&lt;/p&gt;
&lt;p&gt;This is an instance of the &lt;a href="https://en.wikipedia.org/wiki/Minkowski_addition"&gt;Minkowski sum&lt;/a&gt; from &lt;a href="https://baud.rs/pOehEY"&gt;computational geometry&lt;/a&gt;. The inflated obstacle shape is the Minkowski sum of the original shape and a disc of radius equal to the clearance. The free space is the complement of the union of all inflated obstacles. It's mathematically clean and computationally efficient.&lt;/p&gt;
&lt;h3&gt;Strengths and Weaknesses&lt;/h3&gt;
&lt;p&gt;After reading through the source and pushing the router to its limits, here's my honest assessment.&lt;/p&gt;
&lt;p&gt;Strengths:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Gridless geometry. The shape-based approach produces routing that uses space optimally, without the artifacts of grid snapping. Traces can be placed at any position and any angle (in the selected mode), not just on grid points.&lt;/li&gt;
&lt;li&gt;Mathematically sound core. The A* search with admissible heuristic guarantees optimal single-net routing. The rip-up-and-reroute scheduler provides a practical framework for multi-net optimization. These are well-understood algorithms with decades of theoretical backing.&lt;/li&gt;
&lt;li&gt;Shove + pull-tight. The forced insertion mechanism and post-routing optimization produce dense, clean routing that competes with commercial tools for signal traces.&lt;/li&gt;
&lt;li&gt;Reproducibility. Deterministic algorithm, text-based input/output, command-line interface. Same input always produces the same output. You can script it, parallelize it, and integrate it into CI pipelines.&lt;/li&gt;
&lt;li&gt;Open source. You can read the code, modify the cost functions, change the heuristics, rebuild for different Java versions, and understand exactly what the tool is doing. That's rare in EDA.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Weaknesses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No copper pour support. The most significant limitation. Any board with a meaningful ground net requires manual post-processing or a different tool for GND connectivity. This eliminates Freerouting from the running for most production boards with fine-pitch ICs.&lt;/li&gt;
&lt;li&gt;Single-threaded core. The maze search is inherently sequential. Multi-threading exists in the codebase but only at the item level (different connections routed by different threads), not within the search itself. On modern multi-core machines, this leaves most of the CPU idle.&lt;/li&gt;
&lt;li&gt;Net ordering sensitivity. The same board produces meaningfully different results depending on input order, with no built-in intelligence about which order is likely to be best. The disabled sort-by-distance suggests the developers tried and found it counterproductive.&lt;/li&gt;
&lt;li&gt;GUI initialization in batch mode. Freerouting's Swing UI code initializes even when running headless with &lt;code&gt;-de&lt;/code&gt;/&lt;code&gt;-do&lt;/code&gt; flags. On servers without X11, this requires Xvfb or another virtual framebuffer, adding deployment complexity to what should be a pure command-line tool.&lt;/li&gt;
&lt;li&gt;Version regression. Freerouting v2.1.0 produced dramatically worse results than v1.9.0 on the same board (152 unrouted vs 6). The newer version isn't always better.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The board you see in the images above was v0.2: nine SN74LVC8T245PW shifters, 72 channels, fully routed by Freerouting on two layers. I was ready to submit it to &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt; for fabrication. Then I counted the GPIO pins one more time.&lt;/p&gt;
&lt;p&gt;The Arduino Giga R1 has 76 digital I/O pins that need level shifting, plus a handful of analog and control lines. Nine 8-channel shifters give you 72 channels. That's not enough. I was four signals short. The board needed a tenth IC, which meant reworking the layout, adding more decoupling caps, and re-routing everything. The v0.2 design that Freerouting had spent hours optimizing was going in the bin.&lt;/p&gt;
&lt;p&gt;With ten shifters instead of nine, the board got denser and the GND problem got worse. The copper pour limitation that had already imposed a hard floor of 5-6 unrouted connections on the 9-IC board became completely impassable on the 10-IC version. Every tactic described earlier failed here too: 128 parallel Freerouting instances across three machines, 2-layer, 4-layer, and 6-layer configurations, custom post-processing scripts for MST-based ground routing and copper pour stitching. None of it produced a clean board within DRC constraints.&lt;/p&gt;
&lt;p&gt;The solution came from an unexpected direction: &lt;a href="https://quilter.ai"&gt;Quilter.ai&lt;/a&gt;, an AI-powered PCB router that understands copper zones. It routed the 10-IC, 6-layer board with zero unrouted nets on the first attempt. The full story of that journey, from massively parallel Freerouting across a home lab cluster to the moment Quilter solved it in one shot, is coming in Part 2 of the &lt;a href="https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html"&gt;Giga Shield redesign series&lt;/a&gt;. If the mathematics of A* is the beauty of PCB routing, the GND problem is where theory meets the physical constraints of 0.65mm-pitch IC packages, and the theory blinks first.&lt;/p&gt;
&lt;p&gt;The source code for all of this, including the board generator, the net shuffler, the parallel routing scripts, and the post-processing tools, is available in the &lt;a href="https://github.com/ajokela/giga-shield"&gt;giga-shield repository&lt;/a&gt;.&lt;/p&gt;</description><category>a-star</category><category>algorithms</category><category>autorouting</category><category>eda</category><category>freerouting</category><category>hardware</category><category>mathematics</category><category>open-source</category><category>pcb design</category><guid>https://tinycomputers.io/posts/the-mathematics-of-pcb-trace-routing.html</guid><pubDate>Sun, 15 Mar 2026 16:00:00 GMT</pubDate></item><item><title>Redesigning a PCB with Claude Code and Open-Source EDA Tools (Part 1)</title><link>https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;20 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;div class="sponsor-widget"&gt;
&lt;div class="sponsor-widget-header"&gt;&lt;a href="https://baud.rs/youwpy"&gt;&lt;img src="https://tinycomputers.io/images/pcbway-logo.png" alt="PCBWay" style="height: 22px; vertical-align: middle; margin-right: 8px;"&gt;&lt;/a&gt; Sponsored Hardware&lt;/div&gt;
&lt;p&gt;This project was made possible by &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt;, who sponsored the fabrication of the redesigned GigaShield v0.2 level converter board. PCBWay offers PCB prototyping, assembly, CNC machining, and 3D printing services, from one-off prototypes to production runs. If you have a PCB design ready to go, check them out at &lt;a href="https://baud.rs/youwpy"&gt;pcbway.com&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img id="pcb-top-img" src="https://tinycomputers.io/images/giga-shield/giga-shield-v02-top.png" alt="GigaShield v0.2 PCB top view: routed two-layer board with 9 SN74LVC8T245PW level shifters, generated with Python and autorouted with Freerouting" style="float: right; max-width: 420px; margin: 0 0 1em 1.5em; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); cursor: zoom-in;"&gt;&lt;/p&gt;
&lt;div id="img-modal" class="modal" onclick="this.style.display='none'"&gt;
&lt;span class="close" onclick="document.getElementById('img-modal').style.display='none'"&gt;×&lt;/span&gt;
&lt;img class="modal-content" id="modal-img"&gt;
&lt;div id="caption"&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;script&gt;
(function() {
    var img = document.getElementById('pcb-top-img');
    var modal = document.getElementById('img-modal');
    var modalImg = document.getElementById('modal-img');
    var caption = document.getElementById('caption');
    img.onclick = function() {
        modal.style.display = 'block';
        modalImg.src = this.src;
        caption.textContent = this.alt;
    };
    document.addEventListener('keydown', function(e) {
        if (e.key === 'Escape' &amp;&amp; modal.style.display === 'block') {
            modal.style.display = 'none';
        }
    });
})();
&lt;/script&gt;

&lt;p&gt;In January, I &lt;a href="https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html"&gt;spent $468 on Fiverr&lt;/a&gt; to have a professional design an &lt;a href="https://baud.rs/poSQeo"&gt;Arduino Giga R1&lt;/a&gt; shield with level shifters. It was a good design. Nine &lt;a href="https://baud.rs/y9JJt9"&gt;TXB0108PW&lt;/a&gt; bidirectional level translators, 72 channels of 3.3V-to-5V shifting, a clean two-layer board ready for fabrication. And then I started testing it with the &lt;a href="https://baud.rs/87wbBL"&gt;RetroShield Z80&lt;/a&gt;, and the auto-sensing level shifters fell apart.&lt;/p&gt;
&lt;p&gt;The TXB0108 is a clever chip. It detects signal direction automatically, so you don't need to tell it whether a pin is input or output. For most applications, that's a feature. For a Z80 bus interface, it's a fatal flaw. During bus cycles, the Z80 tri-states its address and data lines. The outputs go high-impedance. They're not driving high or low, they're floating. The TXB0108 can't determine drive direction from a floating signal. It guesses wrong, or it doesn't drive at all, and the Arduino on the other side sees garbage. The board was blind to half of what the Z80 was doing.&lt;/p&gt;
&lt;p&gt;The fix was clear: replace the TXB0108s with &lt;a href="https://baud.rs/zQqo34"&gt;SN74LVC8T245PW&lt;/a&gt; driven level shifters. The SN74LVC8T245 has an explicit DIR pin: you tell it which direction to translate, and it does exactly that, regardless of whether the signals are being actively driven. No guessing, no ambiguity, deterministic behavior during tri-state periods. The trade-off is that you need a direction control signal for each shifter IC, but that's a small price for reliability.&lt;/p&gt;
&lt;p&gt;What wasn't clear was how to execute the redesign. I could go back to Fiverr for another $400-500. I could spend weeks learning KiCad properly. Or I could try something that had worked surprisingly well on a &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;previous project&lt;/a&gt;: use AI and open-source command-line EDA tools to design the board from a terminal, without ever opening a graphical PCB editor.&lt;/p&gt;
&lt;p&gt;This is part one of a two-part series. This piece covers the design and toolchain: how I used &lt;a href="https://baud.rs/Z6Oq4k"&gt;Claude Code&lt;/a&gt;, the gEDA ecosystem, pcb-rnd, and &lt;a href="https://baud.rs/bdZw62"&gt;Freerouting&lt;/a&gt; to go from a failed design to production-ready Gerber files. Part two will cover the physical boards, assembly, and testing against the Z80.&lt;/p&gt;
&lt;h3&gt;The Toolchain Problem&lt;/h3&gt;
&lt;p&gt;The original Fiverr design was done in KiCad 9.0. My first instinct was to modify it directly: swap the TXB0108 footprints for SN74LVC8T245, update the pin mappings, add the DIR control header, and re-route. But there was a problem. My preferred command-line PCB tool, &lt;a href="https://baud.rs/1J64T5"&gt;pcb-rnd&lt;/a&gt;, is version 3.1.4 on Ubuntu. KiCad 9.0 uses a file format version (20241229) that pcb-rnd's &lt;code&gt;io_kicad&lt;/code&gt; plugin doesn't support. When I tried to open the KiCad PCB:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;unexpected layout version number (perhaps too new)
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Hard stop. No conversion path exists from KiCad 9.0 to pcb-rnd. The formats aren't just different versions. KiCad's S-expression format and pcb-rnd's text-based format are fundamentally different syntaxes.&lt;/p&gt;
&lt;p&gt;I could have started KiCad and used its GUI. But I'd already proven to myself with the &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;dual Z80 RetroShield project&lt;/a&gt; that text-based, AI-assisted PCB workflows are not only possible but sometimes preferable. The gEDA/pcb-rnd file format is human-readable. AI can parse it, reason about it, and generate it. A Python script can manipulate it. You can &lt;code&gt;diff&lt;/code&gt; two boards and see exactly what changed. None of that is true for a graphical-only workflow.&lt;/p&gt;
&lt;p&gt;So the plan became: extract everything useful from the KiCad source files, then rebuild the board from scratch in pcb-rnd's native format using Python. Sound insane? It kind of is. But it worked.&lt;/p&gt;
&lt;h3&gt;Extracting the DNA&lt;/h3&gt;
&lt;p&gt;Even though pcb-rnd couldn't read the KiCad files directly, the KiCad files contained all the design intelligence I needed. Component positions, net assignments, pin mappings, board dimensions. It was all there, just in a format I couldn't import.&lt;/p&gt;
&lt;p&gt;KiCad's CLI tools (&lt;code&gt;kicad-cli&lt;/code&gt;) could export what I needed:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Component positions (X, Y, rotation for each part)&lt;/span&gt;
kicad-cli&lt;span class="w"&gt; &lt;/span&gt;pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pos&lt;span class="w"&gt; &lt;/span&gt;AlexJ_bz_ArduinoGigaShield.kicad_pcb&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;giga_pos.csv

&lt;span class="c1"&gt;# Netlist connectivity&lt;/span&gt;
kicad-cli&lt;span class="w"&gt; &lt;/span&gt;pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;ipc2581&lt;span class="w"&gt; &lt;/span&gt;AlexJ_bz_ArduinoGigaShield.kicad_pcb&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;giga_netlist.d356
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The schematic file (&lt;code&gt;AlexJ_bz_ArduinoGigaShield.kicad_sch&lt;/code&gt;) was an S-expression text file I could parse to extract the signal mappings: which Giga pin connects to which 5V header pin through which level shifter channel. This was the most critical piece: getting the net assignments wrong would mean the board physically connects but logically doesn't work.&lt;/p&gt;
&lt;p&gt;This is where Claude Code earned its keep. I described the KiCad schematic structure and asked it to help me parse out the signal mappings. The KiCad schematic uses hierarchical sheets with positional net connections, which isn't the simplest format to work with manually, but straightforward for an AI that can read S-expressions and track net names across sheets. Within an hour, I had a complete mapping of all 72 signal channels across the 9 shifter ICs.&lt;/p&gt;
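&lt;p&gt;The actual extraction script isn't reproduced here, but the core move, turning KiCad's S-expressions into something Python can walk, fits in a dozen lines. A sketch (&lt;code&gt;parse_sexpr&lt;/code&gt; and the sample input are illustrative, not the project's code):&lt;/p&gt;

```python
import re

def parse_sexpr(text):
    """Parse a KiCad-style S-expression string into nested Python lists."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)

    def read(pos):
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == '(':
                node, pos = read(pos + 1)
                items.append(node)
            elif tok == ')':
                return items, pos + 1
            else:
                items.append(tok.strip('"'))
                pos += 1
        return items, pos

    return read(0)[0]

# the top-level form is the first (and only) item of the result
tree = parse_sexpr('(symbol (property "Reference" "U1") (pin "1"))')[0]
# tree == ['symbol', ['property', 'Reference', 'U1'], ['pin', '1']]
```

&lt;p&gt;With the file in nested-list form, tracking a net name across hierarchical sheets is a matter of recursive search rather than manual cross-referencing.&lt;/p&gt;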
&lt;h3&gt;Generating the Board with Python&lt;/h3&gt;
&lt;p&gt;With positions and nets extracted, I wrote &lt;code&gt;build_giga_shield.py&lt;/code&gt;, a single Python script that generates the entire pcb-rnd board from scratch. No GUI involved. Every component footprint, every pin, every net connection is defined programmatically.&lt;/p&gt;
&lt;p&gt;The script is structured around four generator functions:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;tssop24_element()&lt;/code&gt;&lt;/strong&gt; generates the SN74LVC8T245PW footprint. TSSOP-24 has a precise geometry: 0.65mm pin pitch, 6.4mm pad-to-pad span, 24 pins. The function calculates pad positions mathematically: 12 pins on the left, 12 on the right, with pin 1 marked as square per convention. Getting the pin numbering right was critical. The SN74LVC8T245's datasheet shows pins 1-12 on the left (DIR, A1-A4, GND, A5-A8, OE#, GND) and pins 13-24 on the right counting bottom-to-top (B8-B5, VCCB, B4-B1, VCCA, VCCA, VCCB).&lt;/p&gt;
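&lt;p&gt;The pad math is simple enough to sketch from the numbers above (0.65mm pitch, 12 pads per side; the 6.4mm span is treated here as pad-row center to center, and &lt;code&gt;tssop24_pads&lt;/code&gt; is an illustrative stand-in for the real generator):&lt;/p&gt;

```python
PITCH = 0.65        # mm, pin-to-pin spacing along each side
SPAN = 6.4          # mm, left pad row to right pad row (center to center)
PINS_PER_SIDE = 12

def tssop24_pads(cx, cy):
    """Return (pin, x, y) pad centers for a TSSOP-24 centered at (cx, cy)."""
    pads = []
    top = cy - PITCH * (PINS_PER_SIDE - 1) / 2
    for i in range(PINS_PER_SIDE):
        y = top + i * PITCH
        pads.append((i + 1, cx - SPAN / 2, y))   # pins 1-12, top to bottom, left
        pads.append((24 - i, cx + SPAN / 2, y))  # pins 24-13, top to bottom, right
    return sorted(pads)
```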
&lt;p&gt;&lt;strong&gt;&lt;code&gt;pin_header_element()&lt;/code&gt;&lt;/strong&gt; handles through-hole pin headers with rotation support. The Arduino Giga R1 has an unusual form factor: the long pin headers run along the board edges horizontally, not vertically. In the original KiCad design, these were placed with 90-degree or -90-degree rotation. Without matching that rotation, a 26-pin header at y=84mm would extend 63.5mm downward to y=148mm, well past the 90mm board edge. The rotation transform was simple once identified:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;rotate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rot&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;rot&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;smd_0603_element()&lt;/code&gt;&lt;/strong&gt; creates the 0603 footprint shared by all 27 decoupling capacitors and 9 pull-down resistors. Small SMD parts, simple geometry.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;mounting_hole_element()&lt;/code&gt;&lt;/strong&gt; places the four 3.2mm mounting holes that align with the Arduino Giga's standoff positions.&lt;/p&gt;
&lt;p&gt;The coordinate system was the trickiest part. KiCad uses an arbitrary origin (in this design, x=106mm, y=30.5mm), while pcb-rnd puts the board origin at (0,0). Every KiCad coordinate had to be translated:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;KX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;106.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;30.5&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;kpos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ky&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;kx&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;KX&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;mm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ky&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;KY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
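&lt;p&gt;One helper the snippet leans on is &lt;code&gt;mm()&lt;/code&gt;, which isn't shown. A plausible version, assuming pcb-rnd's unit-suffixed coordinate syntax:&lt;/p&gt;

```python
def mm(v):
    """Format a millimeter value as a unit-suffixed pcb-rnd coordinate string."""
    return f"{v:.4f}mm"
```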

&lt;p&gt;The &lt;code&gt;build_pcb()&lt;/code&gt; function ties everything together: place components, assign nets, build the symbol table, generate the layer stack, and write out a valid pcb-rnd &lt;code&gt;.pcb&lt;/code&gt; file. Running the script produces a complete, unrouted board: components placed, netlist defined, silkscreen text positioned, board outline drawn. Ready for routing.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;$&lt;span class="w"&gt; &lt;/span&gt;python3&lt;span class="w"&gt; &lt;/span&gt;build_giga_shield.py
Generated&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb
Board:&lt;span class="w"&gt; &lt;/span&gt;155mm&lt;span class="w"&gt; &lt;/span&gt;x&lt;span class="w"&gt; &lt;/span&gt;90mm
9x&lt;span class="w"&gt; &lt;/span&gt;SN74LVC8T245PW&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;TSSOP-24&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;level&lt;span class="w"&gt; &lt;/span&gt;shifters
DIR&lt;span class="w"&gt; &lt;/span&gt;control&lt;span class="w"&gt; &lt;/span&gt;via&lt;span class="w"&gt; &lt;/span&gt;J11&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;1x10&lt;span class="w"&gt; &lt;/span&gt;header&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;The Format Wars&lt;/h3&gt;
&lt;p&gt;Getting pcb-rnd to actually accept the generated file was its own adventure. pcb-rnd's parser is strict about things that look optional in the documentation, and its error messages are sometimes misleading. An error in an Element definition might be reported as a syntax error in the Layer section fifty lines later.&lt;/p&gt;
&lt;p&gt;Three format issues bit me hardest:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The &lt;code&gt;"smd"&lt;/code&gt; flag.&lt;/strong&gt; I initially generated elements with &lt;code&gt;Element["smd" "TSSOP24" "U1" ...]&lt;/code&gt;, which seemed logical for surface-mount parts. pcb-rnd rejected it with "Unknown flag: smd ignored," which cascaded into a complete parse failure. The fix: use an empty string &lt;code&gt;Element["" "TSSOP24" "U1" ...]&lt;/code&gt;. The SMD-ness is implicit from using &lt;code&gt;Pad[]&lt;/code&gt; entries instead of &lt;code&gt;Pin[]&lt;/code&gt; entries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Bare zeros.&lt;/strong&gt; pcb-rnd is inconsistent about whether &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;0nm&lt;/code&gt; are interchangeable. In some contexts, bare &lt;code&gt;0&lt;/code&gt; works fine. In others, it causes a silent parse error that manifests as a syntax error dozens of lines later. The defensive fix: always use &lt;code&gt;0nm&lt;/code&gt;, never bare &lt;code&gt;0&lt;/code&gt;, everywhere.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Missing flags on Layer lines.&lt;/strong&gt; The &lt;code&gt;Line[]&lt;/code&gt; entry inside Layer blocks needs 7 fields, not 6. The seventh is a flags string like &lt;code&gt;"clearline"&lt;/code&gt;. My generator omitted it, producing &lt;code&gt;Line[x1 y1 x2 y2 thickness clearance]&lt;/code&gt;. The parser's error message: &lt;code&gt;syntax error, unexpected ']', expecting INTEGER or STRING&lt;/code&gt;, reported at the layer definition, not at the malformed line.&lt;/p&gt;
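&lt;p&gt;Putting the three fixes together, a minimal fragment the parser accepts looks roughly like this (field layout per the gEDA pcb Element/Pad/Line syntax; the coordinate values are illustrative):&lt;/p&gt;

```text
Element["" "TSSOP24" "U1" "SN74LVC8T245PW" 25.0mm 40.0mm 0nm 0nm 0 100 ""]
(
    Pad[-3.2mm -3.575mm -3.2mm -3.575mm 0.4mm 0.2mm 0.6mm "DIR" "1" "square"]
)
Layer(1 "top")
(
    Line[10.0mm 10.0mm 20.0mm 10.0mm 0.254mm 0.254mm "clearline"]
)
```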
&lt;p&gt;I found these bugs using a binary search approach, truncating the file with &lt;code&gt;head -N&lt;/code&gt; and testing each truncation point until I isolated which section introduced the failure. It's crude but effective when error reporting is unhelpful. Claude Code helped enormously here. I'd paste the error and the surrounding file content, and it would spot the structural issue faster than I could.&lt;/p&gt;
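&lt;p&gt;The truncate-and-test loop is easy to automate as a proper binary search. A sketch, with the &lt;code&gt;check&lt;/code&gt; callable standing in for an actual pcb-rnd parse attempt:&lt;/p&gt;

```python
def bisect_failure(lines, check):
    """Find the index of the first line whose inclusion makes check() fail.

    check(prefix_lines) -> True if the truncated file still parses.
    Assumes failures are monotonic: once the bad line is included,
    every longer prefix also fails.
    """
    lo, hi = 0, len(lines)  # invariant: lines[:lo] passes, lines[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if check(lines[:mid]):
            lo = mid
        else:
            hi = mid
    return hi - 1  # index of the first offending line

# toy example: a "parser" that chokes on a bare-zero field
lines = ["Element[...]", "Pad[0nm 0nm]", "Pad[0 0]", "Pad[0nm 0nm]"]
bad = bisect_failure(lines, lambda ls: all("[0 " not in l for l in ls))
# bad == 2
```

&lt;p&gt;Each probe is one &lt;code&gt;head -N file | pcb-rnd ...&lt;/code&gt; invocation, so the whole hunt takes log2(N) parse attempts instead of N.&lt;/p&gt;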
&lt;h3&gt;The pcb-rnd Ecosystem&lt;/h3&gt;
&lt;p&gt;For anyone unfamiliar with the tools involved, a brief orientation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;gEDA&lt;/strong&gt; (GNU Electronic Design Automation) is a suite of open-source tools for electronic design. The original project dates to the late 1990s and includes &lt;code&gt;gschem&lt;/code&gt; (schematic capture), &lt;code&gt;pcb&lt;/code&gt; (PCB layout), and various utilities. The file formats are text-based and human-readable, a deliberate design choice that makes them scriptable and version-control-friendly. The original &lt;code&gt;pcb&lt;/code&gt; program is now deprecated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;pcb-rnd&lt;/strong&gt; is the actively maintained successor to gEDA's &lt;code&gt;pcb&lt;/code&gt; program. It reads and writes the same text-based PCB format, but adds modern features: more export formats, better plugin support, and critically for this project, command-line export of Gerber files, PNG renderings, and Specctra DSN files. It runs on Linux (packaged for Ubuntu) but not macOS, which is why I ran it over SSH on a remote machine throughout this project.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Freerouting&lt;/strong&gt; is a Java-based autorouter that speaks the Specctra DSN/SES interchange format. You feed it a board definition with components and nets but no traces, and it computes the copper routing, finding paths for every net while respecting design rules for trace width, clearance, and via placement. It's the open-source standard for PCB autorouting and has been used in production for decades.&lt;/p&gt;
&lt;p&gt;The workflow chains these tools together:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;build_giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;rnd&lt;/span&gt; &lt;span class="n"&gt;DSN&lt;/span&gt; &lt;span class="n"&gt;export&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                     &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dsn&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                   &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Freerouting&lt;/span&gt; &lt;span class="n"&gt;autorouter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                     &lt;span class="n"&gt;giga_shield&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ses&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
              &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pcb&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;rnd&lt;/span&gt; &lt;span class="n"&gt;SES&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;Gerber&lt;/span&gt; &lt;span class="n"&gt;export&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                            &lt;span class="err"&gt;↓&lt;/span&gt;
                    &lt;span class="n"&gt;Production&lt;/span&gt; &lt;span class="n"&gt;files&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Every step is a command-line operation. Every intermediate file is text. Every transformation is reproducible. Change a component position in the Python script, re-run the pipeline, get new Gerber files. This is the power of text-based EDA: the entire design is version-controlled, diffable, and automatable.&lt;/p&gt;
&lt;h3&gt;Autorouting: The Machine Does the Tedious Part&lt;/h3&gt;
&lt;p&gt;With the board generated and validated in pcb-rnd, the next step was routing: connecting all 308 nets with actual copper traces across a two-layer board. This is where Freerouting comes in.&lt;/p&gt;
&lt;p&gt;The pipeline starts with exporting the unrouted board to Specctra DSN format. pcb-rnd handles this in batch mode on the remote Linux machine:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;dsn&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The DSN file contains the board geometry, component placements, pad definitions, and netlist: everything the autorouter needs to compute a routing solution. One subtlety I learned the hard way: the DSN's &lt;code&gt;(structure)&lt;/code&gt; section needs explicit &lt;code&gt;(rule)&lt;/code&gt; and &lt;code&gt;(via)&lt;/code&gt; definitions. pcb-rnd's DSN exporter puts the design rules inside the net class section, but Freerouting also expects them in the structure section. Without them, the router can see the nets but can't figure out what trace widths and via sizes are legal, and it silently fails to route most connections. A short addition to the structure section fixed this:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;(via pstk_1)
(rule
  (width 0.254)
  (clearance 0.254)
)
&lt;/pre&gt;&lt;/div&gt;
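&lt;p&gt;Since the DSN export is regenerated on every iteration, the patch itself is worth scripting. A minimal sketch (the rule text is the addition above; the insertion point right after the &lt;code&gt;(structure&lt;/code&gt; line and the &lt;code&gt;pstk_1&lt;/code&gt; via name are assumptions about this particular export):&lt;/p&gt;

```python
RULE = "  (via pstk_1)\n  (rule\n    (width 0.254)\n    (clearance 0.254)\n  )\n"

def add_structure_rules(dsn_text):
    """Insert via/rule definitions right after '(structure' in a DSN file."""
    idx = dsn_text.index("(structure")
    eol = dsn_text.index("\n", idx) + 1  # end of the '(structure' line
    return dsn_text[:eol] + RULE + dsn_text[eol:]

patched = add_structure_rules("(pcb demo\n(structure\n  (layer F.Cu)\n)\n)\n")
```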

&lt;p&gt;Freerouting itself is a Java application with both GUI and command-line modes. On my machine, I'm running a custom build from source. The current &lt;code&gt;main&lt;/code&gt; branch had a few issues I had to fix (a missing &lt;code&gt;static&lt;/code&gt; on the main method, a null pointer on &lt;code&gt;maxThreads&lt;/code&gt; in the GUI initialization, and a Gradle build compatibility issue). The v1.9 codepath was more reliable for headless routing:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;java&lt;span class="w"&gt; &lt;/span&gt;-jar&lt;span class="w"&gt; &lt;/span&gt;freerouting-1.9.0-executable.jar&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-de&lt;span class="w"&gt; &lt;/span&gt;giga_shield.dsn&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-do&lt;span class="w"&gt; &lt;/span&gt;giga_shield.ses
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The autorouter loaded the 308-net board, ran through its passes, and produced a Specctra Session file containing 2911 wire segments and 172 vias. Every net connected. Every design rule satisfied. Routing took about 10 seconds for the initial pass, followed by optimization passes.&lt;/p&gt;
&lt;video controls autoplay loop muted playsinline style="max-width: 100%; border-radius: 4px; box-shadow: 0 4px 12px rgba(0,0,0,0.15); margin: 1em 0;"&gt;
  &lt;source src="https://tinycomputers.io/images/giga-shield/routing-traces.mp4" type="video/mp4"&gt;
&lt;/source&gt;&lt;/video&gt;

&lt;p&gt;Importing the routes back into pcb-rnd was the final step. pcb-rnd can import SES files through its batch mode:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;--gui&lt;span class="w"&gt; &lt;/span&gt;hid_batch&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="s"&gt;ImportSes(giga_shield.ses)&lt;/span&gt;
&lt;span class="s"&gt;SaveTo(LayoutAs, giga_shield_routed.pcb)&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The result: a fully routed PCB with 2911 traces and 172 vias, ready for Gerber export.&lt;/p&gt;
&lt;h3&gt;Running pcb-rnd Over SSH&lt;/h3&gt;
&lt;p&gt;One of the more unusual aspects of this project is that all pcb-rnd operations happened on a remote Ubuntu 24.04 machine accessed over SSH. pcb-rnd isn't available on macOS via Homebrew (I tried; there's a deprecated &lt;code&gt;pcb&lt;/code&gt; package but no &lt;code&gt;pcb-rnd&lt;/code&gt;), and building from source on macOS looked like a rabbit hole I didn't want to enter.&lt;/p&gt;
&lt;p&gt;The remote workflow was straightforward:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Upload the PCB&lt;/span&gt;
scp&lt;span class="w"&gt; &lt;/span&gt;giga_shield.pcb&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/

&lt;span class="c1"&gt;# Export DSN for routing&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x dsn /tmp/giga_shield.pcb"&lt;/span&gt;

&lt;span class="c1"&gt;# Import SES and export gerbers&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;'pcb-rnd --gui hid_batch /tmp/giga_shield.pcb &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="s1"&gt;ImportSes(/tmp/giga_shield.ses)&lt;/span&gt;
&lt;span class="s1"&gt;SaveTo(LayoutAs, /tmp/giga_shield_routed.pcb)&lt;/span&gt;
&lt;span class="s1"&gt;EOF'&lt;/span&gt;

&lt;span class="c1"&gt;# Export production files&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x gerber --gerberfile /tmp/giga_shield /tmp/giga_shield_routed.pcb"&lt;/span&gt;
ssh&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcb-rnd -x png --dpi 600 --photo-mode --outfile /tmp/top.png /tmp/giga_shield_routed.pcb"&lt;/span&gt;

&lt;span class="c1"&gt;# Download results&lt;/span&gt;
scp&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/giga_shield.*.gbr&lt;span class="w"&gt; &lt;/span&gt;.
scp&lt;span class="w"&gt; &lt;/span&gt;alex@10.1.1.27:/tmp/top.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_top.png
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;It's more keystrokes than clicking Export in a GUI. But it's scriptable, repeatable, and fits into the same terminal where Claude Code is running. When I needed to iterate (move a component, re-route, re-export) I could do it in a single pipeline without switching contexts.&lt;/p&gt;
&lt;h3&gt;Claude Code as a Hardware Design Partner&lt;/h3&gt;
&lt;p&gt;I should be explicit about what Claude Code did and didn't do in this project, because the AI angle is the part people will either find most interesting or most suspicious.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What Claude Code did:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Parsed the KiCad schematic to extract the 72-channel signal mapping across 9 level shifter ICs&lt;/li&gt;
&lt;li&gt;Wrote the initial &lt;code&gt;build_giga_shield.py&lt;/code&gt; generator script, including all four footprint generators and the net assignment logic&lt;/li&gt;
&lt;li&gt;Debugged pcb-rnd format issues by analyzing error messages and file structure&lt;/li&gt;
&lt;li&gt;Managed the remote SSH workflow: uploading files, running pcb-rnd commands, downloading results&lt;/li&gt;
&lt;li&gt;Fixed bugs in the Freerouting build (the &lt;code&gt;static main&lt;/code&gt; issue, the null &lt;code&gt;maxThreads&lt;/code&gt;, the Gradle &lt;code&gt;fileMode&lt;/code&gt; API change)&lt;/li&gt;
&lt;li&gt;Handled iterative changes: "move tinycomputers.io down by a millimeter" became an edit to the Python script, a regeneration, a re-import, and a re-export, all executed as a single flow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;What Claude Code didn't do:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Make architectural decisions. The choice to use SN74LVC8T245 over TXB0108, the DIR control header design, the decision to use pull-down resistors defaulting to A-to-B direction. Those were my decisions based on understanding the Z80 bus protocol; selecting the TXB0108 in the first place is also on me&lt;/li&gt;
&lt;li&gt;Verify electrical correctness. I checked the SN74LVC8T245 datasheet pin mapping myself. I verified that OE# tied to GND means always-enabled. I confirmed the 10K pull-down value was appropriate for the DIR pin&lt;/li&gt;
&lt;li&gt;Replace domain knowledge. I knew why the TXB0108 failed during tri-state periods because I understand Z80 bus cycles. Claude Code could have looked up the TXB0108 datasheet, but it couldn't have diagnosed the real-world failure mode from first principles&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The pattern that emerged was: I made design decisions, Claude Code implemented them. I said "the DIR pins need pull-down resistors to default A-to-B direction," Claude Code generated the pcb-rnd Element entries with the correct footprint, position, and net assignments. I said "export gerbers at 600 DPI with photo mode," Claude Code ran the right &lt;code&gt;pcb-rnd&lt;/code&gt; command on the remote machine.&lt;/p&gt;
&lt;p&gt;This is the same division of labor I described in the &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;dual Z80 post&lt;/a&gt;: I bring the domain knowledge, the AI handles the format translation. The text-based nature of gEDA files makes this work. If the design lived in a binary format or required mouse interactions, the AI would have been far less useful.&lt;/p&gt;
&lt;h3&gt;The New Design&lt;/h3&gt;
&lt;p&gt;Here's what the redesigned board looks like:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;v0.1 (Fiverr/TXB0108)&lt;/th&gt;
&lt;th&gt;v0.2 (Claude Code/SN74LVC8T245)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Level Shifter IC&lt;/td&gt;
&lt;td&gt;TXB0108PW (TSSOP-20)&lt;/td&gt;
&lt;td&gt;SN74LVC8T245PW (TSSOP-24)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Direction Control&lt;/td&gt;
&lt;td&gt;Auto-sensing&lt;/td&gt;
&lt;td&gt;Explicit DIR pin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Channels&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shifter ICs&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Decoupling Caps&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pull-down Resistors&lt;/td&gt;
&lt;td&gt;9 (OE)&lt;/td&gt;
&lt;td&gt;9 (DIR)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DIR Control Header&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;J11 (1x10)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Board Dimensions&lt;/td&gt;
&lt;td&gt;155mm x 90mm&lt;/td&gt;
&lt;td&gt;155mm x 90mm&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layers&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Tool&lt;/td&gt;
&lt;td&gt;KiCad 9.0 (GUI)&lt;/td&gt;
&lt;td&gt;Python + pcb-rnd (CLI)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Cost&lt;/td&gt;
&lt;td&gt;$468.63&lt;/td&gt;
&lt;td&gt;$0 (open source tools)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Time&lt;/td&gt;
&lt;td&gt;~10 days (outsourced)&lt;/td&gt;
&lt;td&gt;~2 days (with AI)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The J11 header is the key addition. It's a 1x10 pin header with 9 direction control pins (one per shifter IC) and a ground reference. Each DIR pin has a 10K pull-down resistor that defaults the direction to A-to-B (3.3V to 5V). To reverse a shifter's direction (for example, when the Arduino needs to read from the Z80's data bus), you drive the corresponding J11 pin high. The Arduino firmware manages this dynamically during bus cycles.&lt;/p&gt;
&lt;p&gt;The board carries "tinycomputers.io" and "v0.2" on the silkscreen, placed near the bottom edge. Version tracking on the physical board, a lesson learned from the Fiverr experience, where I had to pay $57 for a revision just to add version text to the silkscreen.&lt;/p&gt;
&lt;h3&gt;Generating Production Files&lt;/h3&gt;
&lt;p&gt;With the routed board in hand, the final step was generating files suitable for manufacturing. pcb-rnd handles this with command-line exporters:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Gerber files (9 layers: top/bottom copper, mask, silk, paste, outline, drill, fab)&lt;/span&gt;
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;gerber&lt;span class="w"&gt; &lt;/span&gt;--gerberfile&lt;span class="w"&gt; &lt;/span&gt;giga_shield&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb

&lt;span class="c1"&gt;# Photo-realistic renderings&lt;/span&gt;
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;png&lt;span class="w"&gt; &lt;/span&gt;--dpi&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--photo-mode&lt;span class="w"&gt; &lt;/span&gt;--outfile&lt;span class="w"&gt; &lt;/span&gt;top.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb
pcb-rnd&lt;span class="w"&gt; &lt;/span&gt;-x&lt;span class="w"&gt; &lt;/span&gt;png&lt;span class="w"&gt; &lt;/span&gt;--dpi&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--photo-mode&lt;span class="w"&gt; &lt;/span&gt;--photo-flip-x&lt;span class="w"&gt; &lt;/span&gt;--outfile&lt;span class="w"&gt; &lt;/span&gt;bottom.png&lt;span class="w"&gt; &lt;/span&gt;giga_shield_routed.pcb
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Gerber output includes everything a fab house needs: top and bottom copper, solder mask, silkscreen, paste stencil, board outline, and drill locations. The photo-realistic PNG renderings use pcb-rnd's built-in renderer: green solder mask, gold-plated pads, white silkscreen text. They're useful for documentation and for sanity-checking the layout before sending it to fabrication.&lt;/p&gt;
&lt;p&gt;The BOM and centroid files were generated separately from the Python script's component data. The centroid file lists every SMD component's X/Y position and rotation, which is essential if you're having the boards assembled by a service rather than hand-soldering.&lt;/p&gt;
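&lt;p&gt;A centroid (pick-and-place) file is just a CSV view of the same placement data the build script already holds. A sketch, with a couple of hypothetical component rows and an illustrative column layout:&lt;/p&gt;

```python
import csv, io

# (refdes, value, package, x_mm, y_mm, rotation, side) - illustrative rows
components = [
    ("U1", "SN74LVC8T245PW", "TSSOP-24", 25.00, 40.00, 0, "top"),
    ("C1", "0.1uF", "0603", 21.50, 44.00, 90, "top"),
]

def centroid_csv(parts):
    """Render placement tuples as a pick-and-place CSV string."""
    buf = io.StringIO()
    w = csv.writer(buf)
    w.writerow(["Designator", "Val", "Package", "Mid X", "Mid Y", "Rotation", "Layer"])
    for refdes, val, pkg, x, y, rot, side in parts:
        w.writerow([refdes, val, pkg, f"{x:.2f}mm", f"{y:.2f}mm", rot, side])
    return buf.getvalue()

print(centroid_csv(components))
```

&lt;p&gt;Because placement lives in one Python data structure, the BOM, the centroid file, and the board itself can never drift out of sync.&lt;/p&gt;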
&lt;h3&gt;What's Different About This Approach&lt;/h3&gt;
&lt;p&gt;The standard way to design a PCB in 2026 is: open KiCad or Altium, draw a schematic, assign footprints, lay out the board, route traces (manually or with the built-in autorouter), and export Gerbers. It's a visual, interactive process that works well for most people and most projects.&lt;/p&gt;
&lt;p&gt;What I did is different in a few ways that I think are worth noting, even if they're not universally applicable:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The entire design is a Python script.&lt;/strong&gt; &lt;code&gt;build_giga_shield.py&lt;/code&gt; is the single source of truth. Want to move a component? Change a coordinate in the script. Want to add a net? Add it to the dictionary. Want to change every decoupling cap from 0.1uF to 0.22uF? Change a string. Then re-run the pipeline. There's no "did I save the layout?" ambiguity, no undo history to worry about, no risk of accidentally moving something with a stray mouse click.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Every intermediate file is text.&lt;/strong&gt; The &lt;code&gt;.pcb&lt;/code&gt; file, the &lt;code&gt;.dsn&lt;/code&gt; file, the &lt;code&gt;.ses&lt;/code&gt; file. All text, all diffable, all version-controllable. When I moved a component and re-routed, I could &lt;code&gt;git diff&lt;/code&gt; the PCB file and see exactly what changed. Try that with a binary PCB format.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;AI can participate meaningfully.&lt;/strong&gt; Because the files are text, Claude Code could read them, modify them, and verify them. It could grep for a component reference in the PCB file, find its coordinates, suggest a new position, and make the edit. It could read the Freerouting log and diagnose why routing failed. This level of AI participation simply isn't possible with graphical-only workflows.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The workflow is reproducible.&lt;/strong&gt; I can hand someone the Python script and the Freerouting JAR and they can regenerate the entire board from scratch, on any machine with Python and Java. No KiCad version compatibility issues, no plugin dependencies, no "works on my machine" problems.&lt;/p&gt;
&lt;p&gt;The trade-off is obvious: this approach requires understanding file formats at a level that graphical tools abstract away. If pcb-rnd's parser rejects your file with a misleading error message, you need to debug the file format, not just re-click a button. It's a power-user workflow. But for someone comfortable with text editors and command lines (which describes most of the audience reading a blog called tinycomputers.io), it's a viable alternative.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The Gerber files are ready for fabrication. In part two, I'll cover ordering the boards from &lt;a href="https://baud.rs/youwpy"&gt;PCBWay&lt;/a&gt;, sourcing the SN74LVC8T245PW and passive components, and the moment of truth: plugging the RetroShield Z80 into the new shield and seeing if the Arduino can finally see the Z80's bus cycles clearly.&lt;/p&gt;
&lt;p&gt;I'll also compare the v0.2 board side-by-side with the original Fiverr v0.1 board: the TXB0108 auto-sensing design versus the SN74LVC8T245 driven design. Same board dimensions, same connector layout, fundamentally different level-shifting approach. The comparison should be instructive for anyone choosing between auto-sensing and driven level translators for bus interfaces.&lt;/p&gt;
&lt;p&gt;The Python build script, pcb-rnd source files, Gerber outputs, and all helper scripts are open source:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://baud.rs/pOawfA"&gt;giga-shield&lt;/a&gt;&lt;/strong&gt;: Complete design files, build pipeline, and production outputs&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;This is part one of a two-part series. Part two will cover fabrication, assembly, and testing.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Previous posts in this series: &lt;a href="https://tinycomputers.io/posts/fiverr-pcb-design-arduino-giga-shield.html"&gt;Fiverr PCB Design ($468)&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/designing-a-dual-z80-retroshield-part-1.html"&gt;Dual Z80 RetroShield&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/cpm-on-arduino-giga-r1-wifi.html"&gt;CP/M on the Giga R1&lt;/a&gt; · &lt;a href="https://tinycomputers.io/posts/zork-on-retroshield-z80-arduino-giga.html"&gt;Zork on the Giga&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;</description><category>ai</category><category>arduino</category><category>arduino giga</category><category>claude code</category><category>freerouting</category><category>geda</category><category>hardware</category><category>level shifter</category><category>open-source</category><category>pcb design</category><category>pcb-rnd</category><category>retroshield</category><category>z80</category><guid>https://tinycomputers.io/posts/redesigning-a-pcb-with-claude-code-and-open-source-eda-part-1.html</guid><pubDate>Fri, 13 Mar 2026 16:00:00 GMT</pubDate></item><item><title>The Cathedral and the Bazaar, Nearly 30 Years Later</title><link>https://tinycomputers.io/posts/the-cathedral-and-the-bazaar-nearly-30-years-later.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/the-cathedral-and-the-bazaar-nearly-30-years-later_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;20 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src="https://tinycomputers.io/images/cathedral-bazaar/eric-raymond.jpg" alt="Eric S. Raymond" style="float: right; margin: 0 0 15px 20px; max-width: 240px; border-radius: 6px;" title="Eric S. Raymond, author of 'The Cathedral and the Bazaar.' Photo by jerone2, CC BY-SA 2.0, via Wikimedia Commons."&gt;&lt;/p&gt;
&lt;p&gt;In 1997, Eric S. Raymond presented a paper at the Linux Kongress in Bavaria that would reshape how an entire industry thought about building software. "The Cathedral and the Bazaar" drew a sharp line between two models of development. The cathedral: careful, centralized, release-when-ready. The bazaar: open, decentralized, release-early-and-often. Raymond argued, with considerable evidence from the Linux kernel and his own fetchmail project, that the bazaar would win.&lt;/p&gt;
&lt;p&gt;Nearly three decades later, we can evaluate the claim. And the answer is more interesting than a simple yes or no.&lt;/p&gt;
&lt;h3&gt;What Raymond Actually Argued&lt;/h3&gt;
&lt;p&gt;The essay's core thesis was that certain software problems (particularly large, complex ones) were better solved by decentralized communities than by centralized teams. Raymond distilled this into several principles, the most famous being Linus's Law: "Given enough eyeballs, all bugs are shallow." With enough contributors examining source code, every bug would be obvious to someone.&lt;/p&gt;
&lt;p&gt;He identified several supporting dynamics. Release early and often. Treat your users as co-developers. If you treat beta testers as your most valuable resource, they'll respond by becoming your most valuable resource. Keep the architecture modular enough that contributors can work on pieces independently.&lt;/p&gt;
&lt;p&gt;The implicit assumption was ideological as much as technical. Open-source development would succeed because it aligned individual motivation (scratching a personal itch, building reputation, the intellectual satisfaction of solving problems) with collective benefit. No corporate hierarchy required. No cathedral architects directing the work from above.&lt;/p&gt;
&lt;p&gt;It was, in its way, a profoundly optimistic vision of human coordination.&lt;/p&gt;
&lt;p&gt;Here's what makes the timing remarkable: when Raymond presented his paper in 1997, the term "open source" didn't exist. He was writing about "free software" and the Linux development model. The phrase was coined months later, in February 1998, at a strategy session in Palo Alto, partly catalyzed by the essay's success and Netscape's decision to release the Navigator source code (a decision Raymond's essay directly influenced). The Open Source Initiative followed weeks after that, co-founded by Raymond and Bruce Perens.&lt;/p&gt;
&lt;p&gt;The movement Raymond was describing was young. The GNU Project was fourteen years old. The Free Software Foundation was twelve. The GPL was eight. Linux itself was only six. BSD had been circulating since the late 1970s, but the &lt;a href="https://tinycomputers.io/posts/how-bsds-licensing-issues-paved-the-way-for-linuxs-rise-to-prominence.html"&gt;legal battles&lt;/a&gt; that nearly killed it were barely resolved. There was no GitHub, no SourceForge, no standardized workflow for distributed contribution. The bazaar Raymond championed was a handful of mailing lists, FTP servers, and the sheer force of Linus Torvalds's integrative judgment.&lt;/p&gt;
&lt;p&gt;The essay didn't just describe a revolution. It named one that hadn't named itself yet.&lt;/p&gt;
&lt;h3&gt;The Bazaar Won&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/cathedral-bazaar/grand-bazaar-istanbul.jpg" alt="Interior of the Grand Bazaar, Istanbul" style="max-width: 100%; border-radius: 6px; margin-bottom: 15px;" title="The Grand Bazaar in Istanbul, one of the oldest and largest covered markets in the world. Photo by Slyronit, CC BY-SA 4.0, via Wikimedia Commons."&gt;&lt;/p&gt;
&lt;p&gt;By any quantitative measure, Raymond was right. Linux runs the cloud. Android runs the phone. Firefox and then Chromium reshaped the browser. Apache and then Nginx served the web. PostgreSQL and MySQL handled the data. Python, Ruby, Node.js, Rust, Go: the languages that define modern development are overwhelmingly open-source.&lt;/p&gt;
&lt;p&gt;The numbers are staggering. GitHub hosts over 400 million repositories. The Linux kernel has received contributions from over 20,000 individual developers. Every major cloud provider (Amazon, Microsoft, Google) runs on open-source infrastructure. Even Microsoft, which once called Linux a "cancer," now contributes to it, acquired GitHub, and ships a Linux kernel inside Windows.&lt;/p&gt;
&lt;p&gt;If you'd told someone in 1997 that the world's most valuable companies would run their businesses on software they didn't own and couldn't fully control, they would have questioned your judgment. Raymond's prediction wasn't just right. It was conservative.&lt;/p&gt;
&lt;h3&gt;The Cathedral Came Back Wearing Bazaar Clothes&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/cathedral-bazaar/toledo-cathedral.jpg" alt="Interior of the Gothic Cathedral of Toledo, Spain" style="float: left; margin: 0 20px 15px 0; max-width: 300px; border-radius: 6px;" title="Interior of the Cathedral of Toledo, Spain. Photo by Adam Jones, CC BY-SA 3.0, via Wikimedia Commons."&gt;&lt;/p&gt;
&lt;p&gt;Here's where Raymond's vision diverges from what actually happened. The bazaar won, but the cathedrals adapted.&lt;/p&gt;
&lt;p&gt;Meta open-sources PyTorch and Llama. Google open-sources TensorFlow, Kubernetes, Android, and Chromium. Microsoft open-sources VS Code, TypeScript, and .NET. Amazon builds its most profitable business on top of open-source databases, then offers them as managed services. These are not acts of ideological commitment. They are strategic decisions made by organizations with cathedral-scale resources and cathedral-scale ambitions.&lt;/p&gt;
&lt;p&gt;The pattern is consistent: open-source the layer you want to commoditize, then capture value at the layer above. Google open-sources Android to commoditize mobile operating systems, then captures value through the Play Store and advertising. Meta open-sources PyTorch to commoditize the AI framework layer, then captures value through the models and services built on top. Amazon doesn't need to own the database; it needs to own the infrastructure the database runs on.&lt;/p&gt;
&lt;p&gt;This is what Raymond didn't anticipate. The bazaar model wasn't just adopted by idealists scratching personal itches. It was weaponized by the most powerful corporations in history as a competitive strategy. The &lt;a href="https://tinycomputers.io/posts/how-bsds-licensing-issues-paved-the-way-for-linuxs-rise-to-prominence.html"&gt;BSD licensing disputes&lt;/a&gt; that shaped early open-source history look almost quaint compared to the strategic licensing wars that followed.&lt;/p&gt;
&lt;p&gt;There's a personal irony here too. Raymond himself wasn't immune to the cathedral's gravitational pull. He received 150,000 pre-IPO shares of VA Linux, briefly making him worth approximately $36 million. He wrote an essay called "Surprised by Wealth" about the experience, pledging that the money wouldn't change him. By April 2002, the shares were &lt;a href="https://workbench.cadenhead.org/news/3149/eric-s-raymond-bazaar-financial-advisor"&gt;worth $195,000&lt;/a&gt;; he'd held through the entire collapse without selling. The bazaar's chief evangelist got rich, briefly, through the most cathedral-scale financial mechanism in capitalism: Wall Street pre-IPO allocations. The wealth came and went through institutions the bazaar model was supposed to make irrelevant.&lt;/p&gt;
&lt;p&gt;Joel Spolsky described this dynamic in 2002 as "commoditize your complements." Open-source your competitors' profit center, and your own products become more valuable. But even Spolsky didn't fully see how far it would go. In 2026, the bazaar is less a revolutionary alternative to the cathedral than a resource the cathedral harvests.&lt;/p&gt;
&lt;h3&gt;The Efficiency That Created More, Not Less&lt;/h3&gt;
&lt;p&gt;Raymond's essay focused on the development model: how code gets written, reviewed, and shipped. What he didn't explore was the economic consequence of making infrastructure-quality software free.&lt;/p&gt;
&lt;p&gt;When the bazaar model succeeded, it didn't just change how software was built. It changed how much software existed. By making operating systems, web servers, databases, programming languages, and frameworks available at zero marginal cost, the bazaar removed the floor from the cost of building new things. A startup in 2005 could do what a well-funded company in 1995 could not, because the entire stack was free.&lt;/p&gt;
&lt;p&gt;The result wasn't less total development effort. It was dramatically more. Linux didn't consolidate the operating system landscape into one efficient platform; it spawned hundreds of distributions, each with its own community, its own design philosophy, its own ecosystem. Free databases didn't mean fewer databases. It meant PostgreSQL, MySQL, MariaDB, SQLite, MongoDB, Redis, CockroachDB, and dozens more, each serving demand that wouldn't have existed if everyone had to pay Oracle prices.&lt;/p&gt;
&lt;p&gt;This pattern (efficiency gains leading to expanded consumption rather than reduced effort) &lt;a href="https://tinycomputers.io/posts/jevons-paradox.html"&gt;has a name in economics&lt;/a&gt;, and it shows up everywhere technology reduces the cost of a critical input. The bazaar made software infrastructure cheap, and the world responded by building more software than anyone in 1997 could have imagined.&lt;/p&gt;
&lt;p&gt;There's a second-order effect too. By making infrastructure free, the bazaar lowered the cost of building &lt;em&gt;on top of&lt;/em&gt; that infrastructure. Entire industries (SaaS, cloud computing, the modern startup ecosystem) simply wouldn't have been viable if everyone had to pay cathedral-model prices for their stack. The &lt;a href="https://tinycomputers.io/posts/what-visicalc-teaches-us-about-ai.html"&gt;VisiCalc pattern&lt;/a&gt; repeated itself: a tool that was supposed to eliminate work created new categories of work that dwarfed the original.&lt;/p&gt;
&lt;p&gt;And Raymond's own principle (treat users as co-developers) is itself a demand-expanding dynamic. Converting consumers of software into producers means the resource (developer attention) gets deployed more broadly, not more efficiently. More people write code because more people &lt;em&gt;can&lt;/em&gt; write code, because the tools are free, the examples are public, and the barrier to participation is a GitHub account.&lt;/p&gt;
&lt;h3&gt;What Raymond Got Wrong&lt;/h3&gt;
&lt;p&gt;The essay's blind spots have become painfully clear.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Maintainer burnout.&lt;/strong&gt; Raymond assumed that contributor motivation was self-sustaining, that people would keep showing up because the work was interesting. He didn't account for the dynamics that emerge when a hobby project becomes critical infrastructure. The OpenSSL library, maintained for years by a handful of volunteers, secured the majority of encrypted web traffic until the Heartbleed vulnerability in 2014 revealed how thin the maintenance layer really was. The left-pad incident, the core-js crisis, the Log4j vulnerability: each demonstrated that the bazaar's supply of labor is not inexhaustible. It concentrates on the exciting work and neglects the essential work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Free-riding at scale.&lt;/strong&gt; The essay assumed a rough symmetry between use and contribution. The reality is asymmetric: billions of dollars in commercial value extracted from projects maintained by unpaid or underpaid developers. Amazon took Elasticsearch and offered it as a managed service. When Elastic changed their license to prevent this, the open-source community split. MongoDB, Redis, and HashiCorp followed similar paths, companies that built open-source projects, watched cloud providers commoditize them, and responded by restricting their licenses.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Security supply chains.&lt;/strong&gt; A bazaar has no gatekeepers, which Raymond saw as a feature. It's also a vulnerability. The SolarWinds attack, dependency confusion attacks, typosquatting on npm: these exploit the trust model that makes the bazaar work. When anyone can contribute, anyone includes adversaries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Governance.&lt;/strong&gt; Raymond wrote about the bazaar as if the only governance question was technical: who decides which patches get merged? The real governance questions turned out to be social and economic: who funds maintenance? Who decides licensing changes? Who gets to use the work commercially? These questions have no bazaar-native answers. They require institutions (foundations, companies, legal frameworks) which is to say, they require cathedrals.&lt;/p&gt;
&lt;h3&gt;The Licensing Wars&lt;/h3&gt;
&lt;p&gt;The clearest evidence that Raymond's framework was incomplete is the licensing landscape of 2026.&lt;/p&gt;
&lt;p&gt;The GPL, which Richard Stallman designed to ensure that modified software remained free, worked well in a world where software was distributed as binaries. The cloud broke that model. If you run GPL software as a service, you never "distribute" it; users interact with the output, not the code. The software is free in theory and proprietary in practice.&lt;/p&gt;
&lt;p&gt;The response was a proliferation of new licenses. The AGPL closed the cloud loophole by requiring source availability for network services. The Business Source License (BSL) made code available to read but restricted production use until a time-delayed conversion to an open-source license. The Server Side Public License (SSPL) required that anyone offering the software as a service must open-source their entire stack.&lt;/p&gt;
&lt;p&gt;Each of these represents a partial retreat from the bazaar model. Not back to the cathedral (the code is still visible, forkable, auditable) but to something Raymond didn't envision: a commons with fences. The ideological purity of "free as in freedom" collided with the economic reality that freedom without reciprocity becomes exploitation.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://tinycomputers.io/posts/how-bsds-licensing-issues-paved-the-way-for-linuxs-rise-to-prominence.html"&gt;BSD licensing story&lt;/a&gt; foreshadowed this. The permissive BSD license allowed commercial forks without contribution back. This wasn't a problem when the commercial ecosystem was small. When the commercial ecosystem became the entire cloud computing industry, the lack of reciprocity became untenable for projects that couldn't attract cathedral-scale corporate sponsorship.&lt;/p&gt;
&lt;h3&gt;What Raymond Got Right&lt;/h3&gt;
&lt;p&gt;Despite these blind spots, the essay's core insight has proven durable: for certain classes of problems, decentralized coordination outperforms centralized planning.&lt;/p&gt;
&lt;p&gt;This isn't because decentralized systems are morally superior. It's because they solve the information problem differently. A cathedral architect must understand the entire system well enough to direct work from above. A bazaar participant only needs to understand their local patch well enough to improve it. As systems grow in complexity, the information burden on the cathedral architect grows faster than the burden on any individual bazaar participant.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/cathedral-bazaar/linus-torvalds.jpg" alt="Linus Torvalds at LinuxCon Europe 2014" style="float: right; margin: 0 0 15px 20px; max-width: 260px; border-radius: 6px;" title="Linus Torvalds at LinuxCon Europe, 2014. Photo by Krd, CC BY-SA 4.0, via Wikimedia Commons."&gt;&lt;/p&gt;
&lt;p&gt;The Linux kernel is the proof. No single person understands the entire Linux kernel. It's too large, too complex, spanning too many hardware architectures and subsystems. But the kernel works, and works remarkably well, because the development model doesn't require any single person to understand it all. Subsystem maintainers understand their domains. Linus Torvalds understands the integration points. Contributors understand the specific problems they're solving. The architecture of the development process mirrors the architecture of the software.&lt;/p&gt;
&lt;p&gt;This insight extends beyond software. Wikipedia works on bazaar principles. Citizen science projects like Galaxy Zoo and Foldit leverage distributed human attention. Even hardware design is slowly moving in this direction, though the marginal cost of atoms versus bits &lt;a href="https://tinycomputers.io/posts/why-some-chips-last-40-years.html"&gt;creates structural barriers&lt;/a&gt; that software doesn't face. The concept of &lt;a href="https://tinycomputers.io/posts/why-some-chips-last-40-years.html"&gt;second-sourcing&lt;/a&gt; (multiple manufacturers producing compatible chips) is, in a sense, the hardware world's version of the bazaar. The Z80 survived for nearly fifty years partly because Zilog couldn't monopolize it.&lt;/p&gt;
&lt;p&gt;Raymond also got the motivational model roughly right, even if the details were off. People do contribute to open-source projects for intrinsic reasons: intellectual satisfaction, reputation, the desire to solve problems that matter to them personally. The mistake was assuming these motivations were sufficient at industrial scale, without institutional support.&lt;/p&gt;
&lt;h3&gt;The Bazaar in 2026&lt;/h3&gt;
&lt;p&gt;The open-source landscape of 2026 bears little resemblance to what Raymond described in 1997, but the dynamics he identified are still operating.&lt;/p&gt;
&lt;p&gt;The bazaar model made software infrastructure so cheap that it created more demand for software than any cathedral could have supplied. It enabled the cloud, the startup ecosystem, the AI revolution, all built on free foundations. The efficiency didn't reduce consumption. It unlocked latent demand that dwarfed the original market.&lt;/p&gt;
&lt;p&gt;At the same time, the cathedral never disappeared. It adapted. The most sophisticated cathedrals now build bazaars strategically, open-sourcing frameworks and tools that make their own proprietary services more valuable. Meta's contribution to PyTorch isn't charity. Google's contribution to Kubernetes isn't ideology. They're infrastructure investments that make the entire ecosystem dependent on capabilities only cathedral-scale organizations can provide.&lt;/p&gt;
&lt;p&gt;The result is a layered system more nuanced than Raymond's binary. At the bottom: genuine bazaar-model projects maintained by communities (the Linux kernel, PostgreSQL, countless libraries). In the middle: corporate-sponsored projects that look like bazaars but serve cathedral strategies (Kubernetes, Chromium, Llama). At the top: proprietary services built on open foundations (AWS, Google Cloud, OpenAI's API).&lt;/p&gt;
&lt;p&gt;Each layer depends on the ones below it. Each layer captures value differently. And the whole structure is held together by a web of licenses, foundations, corporate agreements, and social norms that Raymond's 1997 essay couldn't have anticipated.&lt;/p&gt;
&lt;p&gt;What's strangest about this arrangement is its circularity. Corporations adopted open source because it was free and good. Volunteer maintainers couldn't scale to meet corporate demand; Heartbleed and Log4j proved that. So corporations began funding open-source projects to keep their own infrastructure stable. But funding brought governance influence. The top Linux kernel contributors aren't hobbyists scratching personal itches. They're engineers employed by Google, Microsoft, Red Hat, Intel, and Huawei, steering the roadmap toward their employers' needs. Kubernetes evolves in ways that benefit Google Cloud. PyTorch evolves in ways that benefit Meta's AI stack.&lt;/p&gt;
&lt;p&gt;The projects became dependent on corporate funding. But the corporations became equally dependent on the projects. If Google pulled out of Kubernetes, the project would struggle. If Kubernetes collapsed, Google Cloud would struggle. So Google funds it more, which deepens the entanglement, which makes withdrawal more costly, which demands more funding. The snake eats its own tail.&lt;/p&gt;
&lt;p&gt;Google and Amazon compete ferociously in cloud computing, but they cooperate on the same open-source infrastructure that both their businesses require. They're rivals building on shared foundations that neither can afford to let fail and neither fully controls. The commons isn't independent anymore, but neither are the corporations.&lt;/p&gt;
&lt;p&gt;Raymond imagined the bazaar as freedom from institutional dependency. What emerged is mutual capture. The cathedral could fire its architects. The bazaar's corporate sponsors can't walk away from the bazaar, and the bazaar can't survive without them. Independence became entanglement, and the entanglement, paradoxically, is what makes the system work.&lt;/p&gt;
&lt;h3&gt;The Essay Worth Rereading&lt;/h3&gt;
&lt;p&gt;Raymond saw something real about how coordination works in networks. He was right that the bazaar model could produce software of extraordinary quality and scale. He was right that decentralized development could solve problems that centralized approaches couldn't. He was right that open-source would reshape the industry.&lt;/p&gt;
&lt;p&gt;He was wrong about the institutional vacuum. The bazaar didn't eliminate the need for cathedrals; it changed what cathedrals do. They no longer build the infrastructure. They build on top of it, around it, and through it. The most powerful technology companies in the world are cathedral organizations that have learned to cultivate bazaars for strategic advantage.&lt;/p&gt;
&lt;p&gt;"The Cathedral and the Bazaar" is worth rereading in 2026 not because it predicted the future correctly (no essay could, across three decades) but because it identified dynamics that, once set in motion, produced outcomes no one predicted. The bazaar made software free, and free software made more software. The cathedrals adapted, and their adaptation made the bazaar more important, not less. Raymond's binary became a symbiosis that neither model, alone, could have produced.&lt;/p&gt;
&lt;p&gt;The essay ends with Raymond quoting Robert Browning: "A man's reach should exceed his grasp, or what's a heaven for?" The reach exceeded. The grasp caught something different than expected. That's not a failure of vision. That's how ideas work when they meet reality.&lt;/p&gt;</description><category>corporate strategy</category><category>economics</category><category>eric raymond</category><category>free software</category><category>gpl</category><category>history</category><category>licensing</category><category>linux</category><category>open-source</category><category>software</category><guid>https://tinycomputers.io/posts/the-cathedral-and-the-bazaar-nearly-30-years-later.html</guid><pubDate>Mon, 09 Mar 2026 16:00:00 GMT</pubDate></item><item><title>Part 2: Implementing Sampo on the ULX3S FPGA</title><link>https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;p&gt;After designing the &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Sampo RISC architecture&lt;/a&gt; on paper (complete with a working assembler and emulator) it's time to bring it to life in silicon. Or at least, in programmable logic. This post documents the hardware selection and implementation planning for synthesizing Sampo on an FPGA.&lt;/p&gt;
&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/sampo-fpga-implementation-ulx3s_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;7 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;The Story So Far&lt;/h3&gt;
&lt;p&gt;If you haven't read &lt;a href="https://tinycomputers.io/posts/sampo-16-bit-risc-cpu-part-1.html"&gt;Part 1 of this series&lt;/a&gt;, here's the quick version: Sampo is a 16-bit RISC CPU designed to bridge the gap between clean RISC design principles and Z80-friendly features. It has 16 general-purpose registers, ~66 instructions, port-based I/O, block operations (LDIR, LDDR), alternate registers for fast interrupt handling, and hardware multiply/divide.&lt;/p&gt;
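&lt;p&gt;To make the block operations concrete, here's a minimal Python sketch of LDIR-style semantics (illustrative only, not the actual &lt;code&gt;semu&lt;/code&gt; implementation; flag effects and interrupt behavior are omitted):&lt;/p&gt;

```python
def ldir(mem, hl, de, bc):
    """Z80-style LDIR block copy: move bc bytes from address hl to address de.

    Returns the post-instruction register values: hl and de advance past
    the copied block, and bc counts down to zero.
    """
    for i in range(bc):
        mem[de + i] = mem[hl + i]
    return hl + bc, de + bc, 0

# Copy a 4-byte block from address 0 to address 16.
mem = bytearray(32)
mem[0:4] = b"SAMP"
hl, de, bc = ldir(mem, 0, 16, 4)
# mem[16:20] now holds b"SAMP"; hl=4, de=20, bc=0
```

&lt;p&gt;Implementing this as a single instruction rather than a software loop is exactly the kind of Z80-friendly feature the ISA was designed around.&lt;/p&gt;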
&lt;p&gt;The project already includes working tools written in Rust:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;sasm&lt;/strong&gt; - A full assembler&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;semu&lt;/strong&gt; - An emulator with TUI debugger (step, breakpoints, memory inspection)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And for hardware implementation, we now have two complete RTL implementations:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Amaranth HDL&lt;/strong&gt; (&lt;code&gt;/rtl/&lt;/code&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cpu.py&lt;/code&gt;, &lt;code&gt;alu.py&lt;/code&gt;, &lt;code&gt;decode.py&lt;/code&gt;, &lt;code&gt;regfile.py&lt;/code&gt;, &lt;code&gt;soc.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Python-based, excellent for rapid iteration&lt;/li&gt;
&lt;li&gt;Generates Verilog for synthesis&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;AI-Assisted Hand-Written Verilog&lt;/strong&gt; (&lt;code&gt;/verilog/rtl/&lt;/code&gt;):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cpu.v&lt;/code&gt;, &lt;code&gt;alu.v&lt;/code&gt;, &lt;code&gt;decode.v&lt;/code&gt;, &lt;code&gt;regfile.v&lt;/code&gt;, &lt;code&gt;shifter.v&lt;/code&gt;, &lt;code&gt;uart.v&lt;/code&gt;, &lt;code&gt;ram.v&lt;/code&gt;, &lt;code&gt;soc.v&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Readable, portable, works with any toolchain&lt;/li&gt;
&lt;li&gt;Includes testbenches for Icarus Verilog and Verilator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Now it's time to synthesize it to real hardware.&lt;/p&gt;
&lt;h3&gt;Choosing an FPGA Platform&lt;/h3&gt;
&lt;p&gt;The FPGA world is split between proprietary toolchains (Xilinx Vivado, Intel Quartus) and the growing open source ecosystem. For a project like Sampo, where understanding every layer of the stack matters, open source tooling is the clear choice.&lt;/p&gt;
&lt;h4&gt;Open Source FPGA Options&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;FPGA Family&lt;/th&gt;
&lt;th&gt;Capacity&lt;/th&gt;
&lt;th&gt;Toolchain&lt;/th&gt;
&lt;th&gt;Maturity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gowin GW1N/GW2A&lt;/td&gt;
&lt;td&gt;1K-55K LUTs&lt;/td&gt;
&lt;td&gt;Project Apicula&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lattice iCE40&lt;/td&gt;
&lt;td&gt;1K-8K LUTs&lt;/td&gt;
&lt;td&gt;Project IceStorm&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lattice ECP5&lt;/td&gt;
&lt;td&gt;12K-85K LUTs&lt;/td&gt;
&lt;td&gt;Project Trellis&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Xilinx 7-series&lt;/td&gt;
&lt;td&gt;10K-200K+ LUTs&lt;/td&gt;
&lt;td&gt;Project X-Ray (partial)&lt;/td&gt;
&lt;td&gt;Experimental&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;For Sampo, whose basic CPU is estimated at &lt;strong&gt;~1,500-2,500 LUTs&lt;/strong&gt;, even the smaller FPGAs have more than enough capacity. But if we want room to grow (adding caches, more peripherals, maybe even multi-core experiments) a larger device makes sense.&lt;/p&gt;
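&lt;p&gt;A quick back-of-envelope check makes the headroom concrete. The LUT counts below are mine, not from the Sampo repo: the iCE40 figure is the HX8K part's 7,680 logic cells, and the ECP5 figures follow the device variants discussed in this post:&lt;/p&gt;

```python
# Worst-case Sampo estimate against candidate FPGA fabric sizes.
SAMPO_LUTS = 2500  # upper end of the ~1,500-2,500 LUT estimate

fabric_luts = {
    "Lattice iCE40HX8K": 7680,
    "ECP5 LFE5U-12F": 12000,
    "ECP5 LFE5U-45F": 44000,
    "ECP5 LFE5U-85F": 84000,
}

utilization = {name: SAMPO_LUTS / luts for name, luts in fabric_luts.items()}
for name, frac in utilization.items():
    print(f"{name}: {frac:.1%} of fabric")
```

&lt;p&gt;Even the worst case is under a third of the smallest part, and roughly 3% of the 85F.&lt;/p&gt;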
&lt;h3&gt;The ULX3S Board&lt;/h3&gt;
&lt;p&gt;The &lt;a href="https://baud.rs/Ij7oaR"&gt;ULX3S&lt;/a&gt; is an open hardware development board built around the ECP5 FPGA. It's designed by &lt;a href="https://baud.rs/v9aiPd"&gt;Radiona.org&lt;/a&gt; and has become the de facto standard for open source FPGA development.&lt;/p&gt;
&lt;h4&gt;Specifications&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Specification&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;FPGA&lt;/td&gt;
&lt;td&gt;Lattice ECP5 (LFE5U-85F/45F/12F-6BG381C)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LUTs&lt;/td&gt;
&lt;td&gt;12K / 44K / 84K (depending on variant)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;USB&lt;/td&gt;
&lt;td&gt;FTDI FT231XS (500 kbit JTAG, 3 Mbit serial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPIO&lt;/td&gt;
&lt;td&gt;56 pins (28 differential pairs), PMOD-compatible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAM&lt;/td&gt;
&lt;td&gt;32 MB SDRAM @ 166 MHz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flash&lt;/td&gt;
&lt;td&gt;4-16 MB Quad-SPI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;microSD slot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LEDs&lt;/td&gt;
&lt;td&gt;11 total (8 user, 2 USB, 1 WiFi)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Buttons&lt;/td&gt;
&lt;td&gt;7 (4 direction, 2 fire, 1 power)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audio&lt;/td&gt;
&lt;td&gt;3.5mm jack (stereo + digital/composite)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video&lt;/td&gt;
&lt;td&gt;GPDI (HDMI-compatible) with level shifter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display&lt;/td&gt;
&lt;td&gt;Header for 0.96" SPI OLED (SSD1331)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wireless&lt;/td&gt;
&lt;td&gt;ESP32-WROOM-32 (WiFi/Bluetooth, standalone JTAG)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ADC&lt;/td&gt;
&lt;td&gt;8 channels, 12-bit, 1 MS/s (MAX11125)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Clock&lt;/td&gt;
&lt;td&gt;25 MHz onboard, differential input available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Power&lt;/td&gt;
&lt;td&gt;3 switching regulators (1.1V, 2.5V, 3.3V)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sleep&lt;/td&gt;
&lt;td&gt;5 µA standby, RTC wake-up with battery backup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dimensions&lt;/td&gt;
&lt;td&gt;94mm × 51mm&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h4&gt;Why ULX3S for Sampo&lt;/h4&gt;
&lt;p&gt;The ULX3S isn't just an FPGA breakout board; it's a complete system:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;32MB SDRAM&lt;/strong&gt;: Real memory, not just block RAM. Essential for running actual programs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HDMI output&lt;/strong&gt;: Video terminal without external hardware.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;microSD slot&lt;/strong&gt;: Load programs, implement a filesystem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ESP32 co-processor&lt;/strong&gt;: WiFi-based JTAG debugging from any device.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Buttons and LEDs&lt;/strong&gt;: Instant I/O for testing without wiring anything.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Audio output&lt;/strong&gt;: Even supports composite video through the audio jack.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Budget Alternative: Tang Nano 9K&lt;/h3&gt;
&lt;p&gt;Before settling on the ULX3S, it's worth mentioning a much cheaper option. The &lt;strong&gt;Tang Nano 9K&lt;/strong&gt; (~$15 on AliExpress) uses a Gowin GW1NR-9 FPGA with 8,640 LUTs, more than enough for Sampo:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;64Mbit PSRAM (can serve as the full 64KB address space and then some)&lt;/li&gt;
&lt;li&gt;HDMI output for a video terminal&lt;/li&gt;
&lt;li&gt;USB-C programming&lt;/li&gt;
&lt;li&gt;Fully supported by open-source toolchain (Yosys + nextpnr-gowin)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For initial development and testing, the Tang Nano 9K is hard to beat on price. But the ULX3S offers more I/O, more RAM, and a richer peripheral set, making it the better choice for a more complete Sampo system.&lt;/p&gt;
&lt;h3&gt;LUT Budget Planning&lt;/h3&gt;
&lt;p&gt;The Sampo RTL implementation is designed to be compact. Here's the resource breakdown:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Estimated LUTs (FFs where noted)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;16 × 16-bit registers&lt;/td&gt;
&lt;td&gt;~256 FFs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ALU (16-bit)&lt;/td&gt;
&lt;td&gt;200 - 400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control logic&lt;/td&gt;
&lt;td&gt;500 - 1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instruction decode&lt;/td&gt;
&lt;td&gt;300 - 500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sampo CPU core&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1,500 - 2,500&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UART (115200 baud)&lt;/td&gt;
&lt;td&gt;200 - 300&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPI controller (SD card)&lt;/td&gt;
&lt;td&gt;300 - 500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPIO controller&lt;/td&gt;
&lt;td&gt;200 - 400&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Basic system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~2,500 - 4,000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDRAM controller&lt;/td&gt;
&lt;td&gt;500 - 1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instruction cache&lt;/td&gt;
&lt;td&gt;1,000 - 2,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data cache&lt;/td&gt;
&lt;td&gt;1,000 - 2,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Full system&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~6,000 - 10,000&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;These estimates are based on typical RISC CPU implementations. The actual numbers will depend on optimization choices and synthesis settings.&lt;/p&gt;
&lt;h4&gt;Variant Recommendations&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;12K LUTs&lt;/strong&gt; (ULX3S-12F): Plenty for basic Sampo + peripherals, tight for caches.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;45K LUTs&lt;/strong&gt; (ULX3S-45F): Comfortable. Full CPU with cache, room for experiments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;85K LUTs&lt;/strong&gt; (ULX3S-85F): Luxurious. Multi-core experiments, extensive peripherals.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Toolchain Setup&lt;/h3&gt;
&lt;p&gt;The ECP5 toolchain is fully open source:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# macOS (Homebrew)&lt;/span&gt;
brew&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;yosys&lt;span class="w"&gt; &lt;/span&gt;nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;ecpprog&lt;span class="w"&gt; &lt;/span&gt;fujprog

&lt;span class="c1"&gt;# Ubuntu/Debian&lt;/span&gt;
apt&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;yosys&lt;span class="w"&gt; &lt;/span&gt;nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;ecpprog

&lt;span class="c1"&gt;# Amaranth HDL (for our existing RTL)&lt;/span&gt;
pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;amaranth&lt;span class="w"&gt; &lt;/span&gt;amaranth-boards

&lt;span class="c1"&gt;# Or build FPGA tools from source for latest features&lt;/span&gt;
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/yosys
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/nextpnr
git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;https://github.com/YosysHQ/prjtrellis
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Tool Roles&lt;/h4&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Amaranth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Python-based HDL (generates Verilog)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Yosys&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Verilog synthesis (RTL → netlist)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;nextpnr-ecp5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Place and route (netlist → bitstream)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Project Trellis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ECP5 bitstream documentation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ecpprog/fujprog&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Upload bitstream to board&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h4&gt;Amaranth Build Flow&lt;/h4&gt;
&lt;p&gt;Since Sampo's RTL is written in Amaranth, the build flow starts with Python:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Generate Verilog from Amaranth&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;rtl/
python&lt;span class="w"&gt; &lt;/span&gt;soc.py&lt;span class="w"&gt; &lt;/span&gt;generate&lt;span class="w"&gt; &lt;/span&gt;-t&lt;span class="w"&gt; &lt;/span&gt;v&lt;span class="w"&gt; &lt;/span&gt;&amp;gt;&lt;span class="w"&gt; &lt;/span&gt;sampo.v

&lt;span class="c1"&gt;# Then synthesize with standard tools&lt;/span&gt;
yosys&lt;span class="w"&gt; &lt;/span&gt;-p&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"synth_ecp5 -top sampo_soc -json sampo.json"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;sampo.v
nextpnr-ecp5&lt;span class="w"&gt; &lt;/span&gt;--85k&lt;span class="w"&gt; &lt;/span&gt;--package&lt;span class="w"&gt; &lt;/span&gt;CABGA381&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;--lpf&lt;span class="w"&gt; &lt;/span&gt;ulx3s.lpf&lt;span class="w"&gt; &lt;/span&gt;--json&lt;span class="w"&gt; &lt;/span&gt;sampo.json&lt;span class="w"&gt; &lt;/span&gt;--textcfg&lt;span class="w"&gt; &lt;/span&gt;sampo.config
ecppack&lt;span class="w"&gt; &lt;/span&gt;sampo.config&lt;span class="w"&gt; &lt;/span&gt;sampo.bit

&lt;span class="c1"&gt;# Program the board&lt;/span&gt;
fujprog&lt;span class="w"&gt; &lt;/span&gt;sampo.bit
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Hand-Written Verilog Implementation&lt;/h4&gt;
&lt;p&gt;In addition to the Amaranth RTL, we now have a complete, AI-assisted, hand-written Verilog implementation at &lt;code&gt;/verilog/&lt;/code&gt;. While Amaranth can generate Verilog, its auto-generated output isn't particularly readable. The hand-written version is designed for clarity and portability:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;verilog&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;rtl&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sampo_pkg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vh&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Opcodes, constants, state definitions&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;alu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# 16-bit ALU with all operations&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;shifter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# Barrel shifter (1/4/8-bit shifts, rotates)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;regfile&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# 16 registers + alternate set (EXX)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;decode&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;# Instruction decoder&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# FSM-based CPU core (8 states)&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ram&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# 64KB synchronous RAM&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;uart&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;# Simple UART for serial I/O&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;soc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;# Top-level SoC integration&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;tb&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;alu_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;# ALU unit tests&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;regfile_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;# Register file tests&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sampo_tb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="c1"&gt;# Full system testbench&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;programs&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;
&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;hello&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hex&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;# Test program in Verilog hex format&lt;/span&gt;
&lt;span class="err"&gt;├──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Makefile&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="c1"&gt;# Build automation&lt;/span&gt;
&lt;span class="err"&gt;└──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;bin2hex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;py&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;# Convert sasm output to Verilog $readmemh format&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
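&lt;p&gt;The core of a converter like &lt;code&gt;bin2hex.py&lt;/code&gt; is small. The sketch below is a hypothetical stand-in, not the repo's actual script, which may differ in word size and endianness:&lt;/p&gt;

```python
def bin_to_readmemh(data, word_bytes=2):
    """Render raw bytes as Verilog $readmemh lines, one word per line.

    Assumes little-endian 16-bit words to match Sampo's 16-bit bus;
    the real bin2hex.py in the repository may handle this differently.
    """
    words = []
    for i in range(0, len(data), word_bytes):
        # Pad a trailing partial word with zero bytes.
        chunk = data[i:i + word_bytes].ljust(word_bytes, b"\x00")
        value = int.from_bytes(chunk, "little")
        words.append(f"{value:0{word_bytes * 2}x}")
    return "\n".join(words)

print(bin_to_readmemh(b"\x00\x01\xfe\xff"))
```

The output loads directly with &lt;code&gt;$readmemh("hello.hex", ram)&lt;/code&gt; in a testbench.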

&lt;p&gt;The Verilog implementation uses an 8-state FSM for the CPU: RESET → FETCH → FETCH_EXT → DECODE → EXECUTE → MEMORY → WRITEBACK → HALTED. This makes timing predictable and debugging straightforward.&lt;/p&gt;
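&lt;p&gt;The state sequence can be sketched in Python as a sanity reference. The branch conditions &lt;code&gt;needs_ext_word&lt;/code&gt; and &lt;code&gt;is_halt&lt;/code&gt; are illustrative assumptions about when the FSM skips or leaves the straight-line path, not signals copied from the actual RTL:&lt;/p&gt;

```python
from enum import Enum, auto

class State(Enum):
    RESET = auto()
    FETCH = auto()
    FETCH_EXT = auto()
    DECODE = auto()
    EXECUTE = auto()
    MEMORY = auto()
    WRITEBACK = auto()
    HALTED = auto()

def next_state(state, needs_ext_word=False, is_halt=False):
    """Next-state logic mirroring the 8-state sequence described above.

    needs_ext_word: instruction carries a 16-bit extension word (assumed).
    is_halt: decoded instruction is a HALT (assumed).
    """
    if state is State.RESET:
        return State.FETCH
    if state is State.FETCH:
        return State.FETCH_EXT if needs_ext_word else State.DECODE
    if state is State.FETCH_EXT:
        return State.DECODE
    if state is State.DECODE:
        return State.HALTED if is_halt else State.EXECUTE
    if state is State.EXECUTE:
        return State.MEMORY
    if state is State.MEMORY:
        return State.WRITEBACK
    if state is State.WRITEBACK:
        return State.FETCH
    return State.HALTED  # HALTED is terminal
```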
&lt;h4&gt;Simulation with Icarus Verilog&lt;/h4&gt;
&lt;p&gt;The Verilog implementation includes a complete Makefile for testing:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;verilog/

&lt;span class="c1"&gt;# Run the main simulation (hello world)&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;

&lt;span class="c1"&gt;# Run ALU unit tests&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;test-alu

&lt;span class="c1"&gt;# Run register file tests&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;test-regfile

&lt;span class="c1"&gt;# Build with Verilator (faster simulation)&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;verilate

&lt;span class="c1"&gt;# View waveforms in GTKWave&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;wave
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Sample output from &lt;code&gt;make test&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c"&gt;=== Sampo CPU Testbench ===&lt;/span&gt;
&lt;span class="c"&gt;RAM init file: &lt;/span&gt;&lt;span class="nt"&gt;..&lt;/span&gt;&lt;span class="c"&gt;/programs/hello&lt;/span&gt;&lt;span class="nt"&gt;.&lt;/span&gt;&lt;span class="c"&gt;hex&lt;/span&gt;

&lt;span class="c"&gt;CPU started at PC=0x0100&lt;/span&gt;
&lt;span class="c"&gt;UART output:&lt;/span&gt;
&lt;span class="nb"&gt;----------------------------------------&lt;/span&gt;
&lt;span class="c"&gt;Hello&lt;/span&gt;&lt;span class="nt"&gt;,&lt;/span&gt;&lt;span class="c"&gt; Sampo!&lt;/span&gt;
&lt;span class="nb"&gt;----------------------------------------&lt;/span&gt;

&lt;span class="c"&gt;Simulation complete:&lt;/span&gt;
&lt;span class="c"&gt;  Final PC:    0x011E&lt;/span&gt;
&lt;span class="c"&gt;  Cycles:      847&lt;/span&gt;
&lt;span class="c"&gt;  UART chars:  14&lt;/span&gt;
&lt;span class="c"&gt;  Status:      HALTED&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The Verilog version is portable to any FPGA toolchain (Xilinx, Intel, Lattice, Gowin) without requiring Amaranth or Python in the build chain.&lt;/p&gt;
&lt;h3&gt;Implementation Roadmap&lt;/h3&gt;
&lt;p&gt;With both Amaranth and Verilog implementations complete and tested in simulation, the roadmap is now about bringing them up on hardware.&lt;/p&gt;
&lt;h4&gt;Phase 1: Core Bring-up ✓ (Complete)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;✓ Instruction fetch and decode&lt;/li&gt;
&lt;li&gt;✓ ALU operations (all 16 operations)&lt;/li&gt;
&lt;li&gt;✓ Barrel shifter (1/4/8-bit shifts, rotates, RCL/RCR)&lt;/li&gt;
&lt;li&gt;✓ Register file with alternate set (EXX)&lt;/li&gt;
&lt;li&gt;✓ FSM-based CPU core (8 states)&lt;/li&gt;
&lt;li&gt;✓ RAM interface (64KB)&lt;/li&gt;
&lt;li&gt;✓ UART for serial I/O&lt;/li&gt;
&lt;li&gt;✓ SoC integration&lt;/li&gt;
&lt;li&gt;✓ Testbenches passing (ALU, regfile, full system)&lt;/li&gt;
&lt;li&gt;✓ Hello World runs in simulation&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 1.5: FPGA Bring-up (Current)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;○ ULX3S pin constraints (.lpf file)&lt;/li&gt;
&lt;li&gt;○ Clock setup (PLL from 25MHz)&lt;/li&gt;
&lt;li&gt;○ Map UART to FTDI&lt;/li&gt;
&lt;li&gt;○ LED heartbeat / debug outputs&lt;/li&gt;
&lt;/ul&gt;
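&lt;p&gt;A minimal &lt;code&gt;ulx3s.lpf&lt;/code&gt; fragment for the first items might look like the sketch below. The pin sites follow the published ULX3S constraint files, but treat them as assumptions and verify against your board revision before building:&lt;/p&gt;

```
# Hypothetical minimal ulx3s.lpf fragment -- verify sites against the
# official ULX3S constraints for your board revision.
LOCATE COMP "clk_25mhz" SITE "G2";
IOBUF PORT "clk_25mhz" IO_TYPE=LVCMOS33;
FREQUENCY PORT "clk_25mhz" 25 MHZ;

# UART to the FTDI (FPGA TX on ftdi_rxd, FPGA RX on ftdi_txd)
LOCATE COMP "ftdi_rxd" SITE "L4";
IOBUF PORT "ftdi_rxd" IO_TYPE=LVCMOS33;
LOCATE COMP "ftdi_txd" SITE "M1";
IOBUF PORT "ftdi_txd" IO_TYPE=LVCMOS33;

# Heartbeat LED
LOCATE COMP "led[0]" SITE "B2";
IOBUF PORT "led[0]" IO_TYPE=LVCMOS33;
```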
&lt;h4&gt;Phase 2: Memory System&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;SDRAM controller for 32MB RAM&lt;/li&gt;
&lt;li&gt;Instruction cache (optional but helps timing)&lt;/li&gt;
&lt;li&gt;Basic interrupt handling&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 3: Peripherals&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;SPI controller for SD card boot&lt;/li&gt;
&lt;li&gt;GPIO controller (buttons, LEDs)&lt;/li&gt;
&lt;li&gt;Timer/counter module&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Phase 4: Advanced Features&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Data cache&lt;/li&gt;
&lt;li&gt;MMU for memory protection&lt;/li&gt;
&lt;li&gt;HDMI text console (VGA timing → GPDI)&lt;/li&gt;
&lt;li&gt;ESP32 WiFi integration for wireless debugging&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Recommended Tools &amp;amp; Books&lt;/h3&gt;
&lt;h4&gt;Hardware&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/HBq3zf"&gt;Tang Nano 9K FPGA&lt;/a&gt; - Budget-friendly FPGA board (~$25 on Amazon, ~$15 on AliExpress)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/BYIR58"&gt;USB Logic Analyzer&lt;/a&gt; - Essential for debugging signals (24MHz, 8 channels)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Books&lt;/h4&gt;
&lt;p&gt;If you're new to Verilog or FPGA development, these are excellent starting points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/RGjpAj"&gt;&lt;em&gt;Getting Started with FPGAs&lt;/em&gt;&lt;/a&gt; by Russell Merrick - Beginner-friendly with Verilog and VHDL examples&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/tEyX95"&gt;&lt;em&gt;Programming FPGAs: Getting Started with Verilog&lt;/em&gt;&lt;/a&gt; by Simon Monk - Practical hands-on guide&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/6qfzvC"&gt;&lt;em&gt;Verilog by Example&lt;/em&gt;&lt;/a&gt; by Blaine Readler - Concise reference for working engineers&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/VQxLTd"&gt;Sampo on GitHub&lt;/a&gt; - Full source including assembler, emulator, and RTL&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/JUjA8C"&gt;ULX3S GitHub&lt;/a&gt; - Schematics, examples, documentation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/JLKZBr"&gt;Project Trellis&lt;/a&gt; - ECP5 bitstream documentation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/0QCVAC"&gt;Amaranth HDL&lt;/a&gt; - Python-based hardware description&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/xlX31y"&gt;nextpnr&lt;/a&gt; - Place and route tool&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/LZdP4F"&gt;Yosys&lt;/a&gt; - Verilog synthesis&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Where to Buy&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;ULX3S:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/NClAGd"&gt;AliExpress&lt;/a&gt; - ~$100-150 depending on variant&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/AQB0Xg"&gt;Mouser&lt;/a&gt; - Official distribution&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/0gTuW6"&gt;CrowdSupply&lt;/a&gt; - Original campaign page&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Tang Nano 9K (budget alternative):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/HBq3zf"&gt;Amazon&lt;/a&gt; - ~$25, faster shipping&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/9G7KR0"&gt;AliExpress&lt;/a&gt; - ~$15, slower shipping&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;p&gt;Next up: Getting our first instructions executing on real hardware. Both the Amaranth and Verilog implementations are ready and tested; Hello World runs in simulation and the testbenches pass. Now it's a matter of pin constraints, clock domains, and debugging the inevitable timing issues.&lt;/p&gt;</description><category>amaranth</category><category>cpu design</category><category>ecp5</category><category>fpga</category><category>hardware</category><category>lattice</category><category>open-source</category><category>risc</category><category>sampo</category><category>ulx3s</category><category>verilog</category><guid>https://tinycomputers.io/posts/sampo-fpga-implementation-ulx3s.html</guid><pubDate>Mon, 02 Feb 2026 18:00:00 GMT</pubDate></item><item><title>Real World Validation: How User Feedback Improved the Ballistics Engine CLI</title><link>https://tinycomputers.io/posts/real-world-validation-how-user-feedback-improved-the-ballistics-engine-cli.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/ballistics-cli-validation_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;10 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;One of the most rewarding aspects of building open-source tools is when users push them beyond your test cases. This week, a helpful early adopter did exactly that with the &lt;a href="https://baud.rs/2RblwQ"&gt;ballistics-engine CLI&lt;/a&gt;, and the results were both humbling and validating.&lt;/p&gt;
&lt;h3&gt;The Setup&lt;/h3&gt;
&lt;p&gt;This particular user had built an impressive workflow: CSV files defining gun profiles and location data, shell scripts to iterate through combinations, and a pipeline that generates beautifully formatted drop charts sized for &lt;a href="https://baud.rs/S95JBV"&gt;e-ink readers&lt;/a&gt; (old Nooks, specifically - brilliant for outdoor use with their daylight-readable screens and long battery life).&lt;/p&gt;
&lt;p&gt;Their setup included multiple rifles and locations, with real recorded dope (shooter's slang for verified bullet drop data at specific distances) from actual range sessions at 300, 665, 765, 847, 1004, and 1095 yards. This is exactly the kind of real-world validation that lab testing can't replicate.&lt;/p&gt;
&lt;h3&gt;Validating the Physics Solver&lt;/h3&gt;
&lt;p&gt;The user had been comparing the ballistics-engine output against JBM Ballistics, &lt;a href="https://baud.rs/I4Cw0C"&gt;Kestrel&lt;/a&gt; AB, and Hornady 4DOF - industry standards with years of refinement. Like most experienced long-range shooters, they "trued" their ballistic coefficient to match real-world observations:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;BC&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.270&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;BC_ADJ&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="err"&gt;→&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;TRUED_BC&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.2295&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For those unfamiliar with practical long-range shooting, "trueing" is the process of adjusting your ballistic coefficient (BC) to match real-world observations. Published BCs are measured under specific conditions, and real-world performance varies based on barrel harmonics, actual muzzle velocity, atmospheric conditions, and a dozen other factors. Most shooters apply a correction factor - typically 0.85 to 0.95 of the published BC - to get their solver to match their actual dope.&lt;/p&gt;
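&lt;p&gt;The trueing arithmetic itself is a single multiplication; a quick check of the numbers above (the &lt;code&gt;true_bc&lt;/code&gt; helper is illustrative, not part of the CLI):&lt;/p&gt;

```python
def true_bc(published_bc, correction):
    """Apply a trueing correction factor to a published BC."""
    return round(published_bc * correction, 4)

print(true_bc(0.270, 0.85))  # 0.2295, matching the TRUED_BC above
```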
&lt;p&gt;With this trued BC, the physics solver was producing excellent results. Their verdict:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"So far, I've found this solver to be the most accurate (dealing with environmentals) based on what I've actually shot this past weekend and prior."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's gratifying validation for the core physics engine. But I had something experimental I wanted them to try.&lt;/p&gt;
&lt;h3&gt;Testing a New Feature: The Online Solver&lt;/h3&gt;
&lt;p&gt;I had recently added an &lt;code&gt;--online&lt;/code&gt; flag to the CLI - a largely untested feature that sends trajectory calculations to a cloud API where machine learning models can apply corrections to the physics-based results. The ML models were trained on Doppler radar data and Doppler-derived datasets, and in theory should account for the systematic biases that make published BCs imperfect predictors of real-world performance.&lt;/p&gt;
&lt;p&gt;But theory and practice are different things. I asked the user if they'd be willing to kick the tires on this new feature with their real-world data.&lt;/p&gt;
&lt;p&gt;They agreed, and that's when things got interesting - in both good and bad ways.&lt;/p&gt;
&lt;h3&gt;The Surprising Result&lt;/h3&gt;
&lt;p&gt;The user's first tests with &lt;code&gt;--online&lt;/code&gt; mode revealed something unexpected. Remember that trued BC? The 0.85 correction factor they'd been applying to match their real-world dope?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;They didn't need it anymore.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With the online solver, the raw published BC of 0.27 produced accurate results without any manual adjustment. The ML correction was doing exactly what trueing does manually - accounting for the gap between laboratory-measured BCs and real-world performance.&lt;/p&gt;
&lt;p&gt;This was the first real-world validation that the ML enhancement actually works as intended. Not in a lab, not against synthetic test data, but against actual recorded dope from someone who shoots at 300 to 1100 yards and knows exactly where their bullets land.&lt;/p&gt;
&lt;h3&gt;But There Was a Problem&lt;/h3&gt;
&lt;p&gt;While the accuracy was spot-on, something else was wrong. When the user started batch-processing trajectories through their full suite of gun profiles and locations, trajectories were being truncated at varying distances depending on the rifle configuration.&lt;/p&gt;
&lt;p&gt;The root cause? When running with &lt;code&gt;--online&lt;/code&gt; mode, the &lt;code&gt;--ignore-ground-impact&lt;/code&gt; flag wasn't being passed to the Flask API backend. The API has a default ground threshold of -100 meters, so when trajectories dropped below that level, they were terminated early. Steeper trajectories (lighter bullets, lower velocities) hit the threshold sooner, which is why different gun profiles showed truncation at different distances.&lt;/p&gt;
&lt;p&gt;Here's an example of the command that was affected:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;ballistics&lt;span class="w"&gt; &lt;/span&gt;trajectory&lt;span class="w"&gt; &lt;/span&gt;--ignore-ground-impact&lt;span class="w"&gt; &lt;/span&gt;--mass&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;140&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--diameter&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.264&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--wind-speed&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--wind-direction&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;90&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--humidity&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;45&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--altitude&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2506&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--sight-height&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3&lt;/span&gt;.14&lt;span class="w"&gt; &lt;/span&gt;--twist-rate&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;8&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--sample-trajectory&lt;span class="w"&gt; &lt;/span&gt;--sample-interval&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;9&lt;/span&gt;.1440&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--latitude&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;36&lt;/span&gt;.6&lt;span class="w"&gt; &lt;/span&gt;--auto-zero&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--max-range&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1530&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--velocity&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2875&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--drag-model&lt;span class="w"&gt; &lt;/span&gt;g7&lt;span class="w"&gt; &lt;/span&gt;--bc&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.27&lt;span class="w"&gt; &lt;/span&gt;--pressure&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;27&lt;/span&gt;.29&lt;span class="w"&gt; &lt;/span&gt;--temperature&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;31&lt;/span&gt;.99&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;csv&lt;span class="w"&gt; &lt;/span&gt;--full&lt;span class="w"&gt; &lt;/span&gt;--online
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is exactly why you need real users testing new features. The fix was straightforward: add the &lt;code&gt;ground_threshold&lt;/code&gt; parameter to the API client so &lt;code&gt;--ignore-ground-impact&lt;/code&gt; is properly respected in online mode.&lt;/p&gt;
&lt;h3&gt;The Cascade of Fixes&lt;/h3&gt;
&lt;p&gt;Once you start looking, you find more. The investigation uncovered several related issues with the online mode:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;v0.13.24&lt;/strong&gt;: Fixed the ground threshold parameter for online mode&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;v0.13.25-26&lt;/strong&gt;: Added weather control parameters for online mode:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;--enable-weather-zones&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--enable-3d-weather&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--wind-shear-model&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--longitude&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;--shot-direction&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;v0.13.27-28&lt;/strong&gt;: Fixed location CSV overrides for humidity and wind direction that were being silently ignored&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;v0.13.29&lt;/strong&gt;: Expanded test coverage from 156 to 192 tests to catch similar issues earlier&lt;/p&gt;
&lt;h3&gt;How the Online Solver Works&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;--online&lt;/code&gt; flag sends your trajectory parameters to the &lt;a href="https://baud.rs/JY6Kt4"&gt;ballistics API&lt;/a&gt;. Behind the scenes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The same Rust-based physics solver runs (via PyO3 bindings to Python)&lt;/li&gt;
&lt;li&gt;ML models analyze the trajectory and determine a correction factor&lt;/li&gt;
&lt;li&gt;The correction is applied and results are returned&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The ML models were trained on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Doppler radar measurements of actual bullet flight&lt;/li&gt;
&lt;li&gt;Doppler-derived drag coefficient data&lt;/li&gt;
&lt;li&gt;Environmental correlation data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The correction factors are typically small - often in the 0.95-1.05 range - but they account for the systematic biases that make published BCs imperfect predictors of real-world performance.&lt;/p&gt;
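&lt;p&gt;As a rough sketch of that last step, a client could apply the returned factor like this. This is purely illustrative; the function name, the clamping behavior, and the idea of scaling a drag value are assumptions for the sketch, not the actual ballistics-engine client API:&lt;/p&gt;

```python
# Hypothetical sketch: apply an ML-derived correction factor to a
# locally computed drag value. Names and clamping are illustrative,
# not the actual ballistics-engine client API.

def apply_correction(local_drag, correction_factor):
    """Scale a locally computed drag value by the factor returned
    from the online solver, clamped to the typical 0.95-1.05 range
    as a sanity check."""
    factor = min(max(correction_factor, 0.95), 1.05)
    return local_drag * factor

corrected = apply_correction(0.27, 1.02)
print(round(corrected, 4))  # prints 0.2754
```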
&lt;h3&gt;The Takeaway&lt;/h3&gt;
&lt;p&gt;Open-source software thrives on user feedback. This curious early adopter:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Validated the physics solver against established industry tools and real-world data&lt;/li&gt;
&lt;li&gt;Agreed to test a brand-new, experimental feature&lt;/li&gt;
&lt;li&gt;Found real bugs that only emerge under batch processing conditions&lt;/li&gt;
&lt;li&gt;Provided the first confirmation that the ML enhancement eliminates the need for manual BC trueing&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All of this from someone who described themselves as "a Linux admin, script kitty" who just "cobbles the genius' work together." That's exactly the kind of user who makes software better - someone who uses it in ways the developer didn't anticipate, with real requirements and real data to validate against.&lt;/p&gt;
&lt;h3&gt;Updating&lt;/h3&gt;
&lt;p&gt;If you're using the ballistics-engine CLI, update to get these fixes:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;cargo&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;ballistics-engine&lt;span class="w"&gt; &lt;/span&gt;--force
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;To enable the online solver (still experimental, but now actually tested):&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;cargo&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;ballistics-engine&lt;span class="w"&gt; &lt;/span&gt;--features&lt;span class="w"&gt; &lt;/span&gt;online&lt;span class="w"&gt; &lt;/span&gt;--force
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Then add &lt;code&gt;--online&lt;/code&gt; to your trajectory commands to use the ML-enhanced solver.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The user mentioned they'd been using webhooks to JBM Ballistics for years but experienced throttling issues. Having a local solver (with optional cloud enhancement) under their own control transforms the reliability of their workflow.&lt;/p&gt;
&lt;p&gt;There's also &lt;a href="https://baud.rs/H2gonn"&gt;BallisticsInsight.com&lt;/a&gt; for those who prefer a web interface, though the CLI remains the power-user choice for batch processing and integration into custom workflows.&lt;/p&gt;
&lt;p&gt;If you're using the ballistics-engine and find issues - or better yet, find that it matches your real-world dope - I'd love to hear about it. Real-world validation is worth more than a thousand unit tests.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;The ballistics-engine is open source and available on &lt;a href="https://baud.rs/2RblwQ"&gt;crates.io&lt;/a&gt;. The API documentation is at &lt;a href="https://baud.rs/JY6Kt4"&gt;api.ballistics.7.62x51mm.sh/v1/docs&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;</description><category>ballistics</category><category>cli</category><category>machine learning</category><category>open-source</category><category>rust</category><guid>https://tinycomputers.io/posts/real-world-validation-how-user-feedback-improved-the-ballistics-engine-cli.html</guid><pubDate>Mon, 26 Jan 2026 20:23:34 GMT</pubDate></item><item><title>Open Sourcing a High Performance Rust-based Ballistics Engine</title><link>https://tinycomputers.io/posts/open-sourcing-a-high-performance-rust-based-ballistics-engine.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/open-sourcing-a-high-performance-rust-based-ballistics-engine_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;12 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h2&gt;From SaaS to Open Source: The Evolution of a Ballistics Engine&lt;/h2&gt;
&lt;p&gt;When I first built &lt;a href="https://baud.rs/H2gonn"&gt;Ballistics Insight&lt;/a&gt;, my ML-augmented ballistics calculation platform, I faced a classic engineering dilemma: how to balance performance, accuracy, and maintainability across multiple platforms. The solution came in the form of a high-performance Rust core that became the beating heart of the system. Today, I'm excited to share that journey and announce the open-sourcing of this engine as a standalone library with full FFI bindings for iOS and Android.&lt;/p&gt;
&lt;h3&gt;The Genesis: A Python Problem&lt;/h3&gt;
&lt;p&gt;The story begins with a Python Flask application serving ballistics calculations through a REST API. The initial implementation worked well enough for proof-of-concept, but as I added more sophisticated physics models (Magnus effect, Coriolis force, transonic drag corrections, gyroscopic precession), the performance limitations became apparent. A single trajectory calculation that should take milliseconds was stretching into seconds. Monte Carlo simulations with thousands of iterations were becoming impractical.&lt;/p&gt;
&lt;p&gt;The Python implementation had another challenge: code duplication. I maintained separate implementations for atmospheric calculations, drag computations, and trajectory integration. Each time I fixed a bug or improved an algorithm, I had to ensure consistency across multiple code paths. The maintenance burden was growing exponentially with the feature set.&lt;/p&gt;
&lt;h3&gt;The Rust Revolution&lt;/h3&gt;
&lt;p&gt;The decision to rewrite the core physics engine in Rust wasn't taken lightly. I evaluated several options: optimizing the Python code with NumPy vectorization, using Cython for critical paths, or even moving to C++. Rust won for several compelling reasons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Memory Safety Without Garbage Collection&lt;/strong&gt;: Ballistics calculations involve extensive numerical computation with predictable memory patterns. Rust's ownership system eliminated entire categories of bugs while maintaining deterministic performance.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Zero-Cost Abstractions&lt;/strong&gt;: I could write high-level, maintainable code that compiled down to assembly as efficient as hand-optimized C.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Excellent FFI Story&lt;/strong&gt;: Rust's ability to expose C-compatible interfaces meant I could integrate with any platform: Python, iOS, Android, or web via WebAssembly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Modern Tooling&lt;/strong&gt;: Cargo, Rust's build system and package manager, made dependency management and cross-compilation straightforward.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The results were dramatic. Atmospheric calculations went from 4.5ms in Python to 0.8ms in Rust, a 5.6x improvement. Complete trajectory calculations saw 15-20x performance gains. Monte Carlo simulations that previously took minutes now completed in seconds.&lt;/p&gt;
&lt;h3&gt;Architecture: From Monolith to Modular&lt;/h3&gt;
&lt;p&gt;The closed-source Ballistics Insight platform is a sophisticated system with ML augmentations, weather integration, and a comprehensive ammunition database. It includes features like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Neural network-based BC (Ballistic Coefficient) prediction&lt;/li&gt;
&lt;li&gt;Regional weather model integration with ERA5, OpenWeather, and NOAA data&lt;/li&gt;
&lt;li&gt;Magnus effect auto-calibration based on bullet classification&lt;/li&gt;
&lt;li&gt;Yaw damping prediction using gyroscopic stability factors&lt;/li&gt;
&lt;li&gt;A database of 2,000+ bullets with manufacturer specifications&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For the open-source release, I took a different approach. Rather than trying to extract everything, I focused on the core physics engine, the foundation that makes everything else possible. This meant:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Extracting Pure Physics&lt;/strong&gt;: I separated the deterministic physics calculations from the ML augmentations. The open-source engine provides the fundamental ballistics math, while the SaaS platform layers intelligent corrections on top.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Creating Clean Interfaces&lt;/strong&gt;: I designed a new FFI layer from scratch, ensuring that iOS and Android developers could easily integrate the engine without understanding Rust or ballistics physics.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Building Standalone Tools&lt;/strong&gt;: The engine includes a full-featured command-line interface, making it useful for researchers, enthusiasts, and developers who need quick calculations without writing code.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;The FFI Challenge: Making Rust Speak Every Language&lt;/h3&gt;
&lt;p&gt;One of my primary goals was to make the engine accessible from any platform. This meant creating robust Foreign Function Interface (FFI) bindings that could be consumed by Swift, Kotlin, Java, Python, or any language that can call C functions.&lt;/p&gt;
&lt;p&gt;The FFI layer presented unique challenges:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="cp"&gt;#[repr(C)]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;FFIBallisticInputs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;muzzle_velocity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// m/s&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ballistic_coefficient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;mass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;                   &lt;/span&gt;&lt;span class="c1"&gt;// kg&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;diameter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;               &lt;/span&gt;&lt;span class="c1"&gt;// meters&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;drag_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;                &lt;/span&gt;&lt;span class="c1"&gt;// 0=G1, 1=G7&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;pub&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sight_height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;c_double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;&lt;span class="c1"&gt;// meters&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... many more fields&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I had to ensure:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;C-compatible memory layouts&lt;/strong&gt; using &lt;code&gt;#[repr(C)]&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Safe memory management&lt;/strong&gt; across language boundaries&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Graceful error handling&lt;/strong&gt; without exceptions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Zero-copy data transfer&lt;/strong&gt; where possible&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The result is a library that can be dropped into an iOS app as a static library, integrated into Android via JNI, or called from Python using ctypes. Each platform sees a native interface while the Rust engine handles the heavy lifting.&lt;/p&gt;
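&lt;p&gt;To make the ctypes path concrete, here is a minimal sketch of mirroring the &lt;code&gt;#[repr(C)]&lt;/code&gt; struct from Python. The field list follows the excerpt above; actually loading the compiled library and calling into it is assumed and not shown:&lt;/p&gt;

```python
# Sketch of consuming a repr(C) struct from Python via ctypes.
# Field names mirror the FFIBallisticInputs excerpt above; loading
# the compiled ballistics library is assumed and not shown here.
from ctypes import Structure, c_double, c_int

class FFIBallisticInputs(Structure):
    _fields_ = [
        ("muzzle_velocity", c_double),        # m/s
        ("ballistic_coefficient", c_double),
        ("mass", c_double),                   # kg
        ("diameter", c_double),               # meters
        ("drag_model", c_int),                # 0=G1, 1=G7
        ("sight_height", c_double),           # meters
    ]

inputs = FFIBallisticInputs(
    muzzle_velocity=823.0,
    ballistic_coefficient=0.475,
    mass=0.0109,
    diameter=0.00782,
    drag_model=1,
    sight_height=0.05,
)
print(inputs.drag_model)
```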
&lt;h3&gt;The Mobile Story: Binary Libraries for iOS and Android&lt;/h3&gt;
&lt;p&gt;Creating mobile bindings required careful consideration of each platform's requirements:&lt;/p&gt;
&lt;h4&gt;iOS Integration&lt;/h4&gt;
&lt;p&gt;For iOS, I compile the Rust library to a universal static library supporting both ARM64 (devices) and x86_64 (simulator). Swift developers interact with the engine through a bridging header:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;inputs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;FFIBallisticInputs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;muzzle_velocity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;823.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;ballistic_coefficient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.475&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;mass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0109&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;diameter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.00782&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;let&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ballistics_calculate_trajectory&lt;/span&gt;&lt;span class="p"&gt;(&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1000.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ballistics_free_trajectory_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="bp"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Max range: &lt;/span&gt;&lt;span class="si"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pointee&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_range&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s"&gt; meters"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h4&gt;Android Integration&lt;/h4&gt;
&lt;p&gt;For Android, I provide pre-compiled libraries for multiple architectures (armeabi-v7a, arm64-v8a, x86, x86_64). The engine integrates seamlessly through JNI:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kd"&gt;class&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;BallisticsEngine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;external&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;fun&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;calculateTrajectory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;muzzleVelocity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;ballisticCoefficient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;mass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;diameter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Double&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;maxRange&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;Double&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;TrajectoryResult&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kd"&gt;companion&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kd"&gt;object&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;init&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="na"&gt;loadLibrary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ballistics_engine"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;Performance: The Numbers That Matter&lt;/h3&gt;
&lt;p&gt;The open-source engine achieves remarkable performance across all platforms:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Single Trajectory (1000m)&lt;/strong&gt;: ~5ms&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monte Carlo Simulation (1000 runs)&lt;/strong&gt;: ~500ms&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;BC Estimation&lt;/strong&gt;: ~50ms&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Zero Calculation&lt;/strong&gt;: ~10ms&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These numbers represent pure computation time on modern hardware. The engine uses RK4 (4th-order Runge-Kutta) integration by default for maximum accuracy, with an option to switch to Euler's method for even faster computation when precision requirements are relaxed.&lt;/p&gt;
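&lt;p&gt;The RK4-versus-Euler trade-off can be pictured with a toy one-dimensional velocity decay under quadratic drag. This is a deliberately simplified model for illustration, not the engine's full point-mass solver:&lt;/p&gt;

```python
# Minimal sketch contrasting RK4 and Euler on a 1-D drag-only
# velocity decay, dv/dt = -k * v * v. A toy model, not the
# engine's full point-mass solver.

def decel(v, k=0.0005):
    return -k * v * v

def euler_step(v, dt):
    return v + dt * decel(v)

def rk4_step(v, dt):
    k1 = decel(v)
    k2 = decel(v + 0.5 * dt * k1)
    k3 = decel(v + 0.5 * dt * k2)
    k4 = decel(v + dt * k3)
    return v + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

v_euler = v_rk4 = 823.0           # muzzle velocity, m/s
for _ in range(100):              # 1 s of flight at dt = 0.01
    v_euler = euler_step(v_euler, 0.01)
    v_rk4 = rk4_step(v_rk4, 0.01)

print(round(v_euler, 2), round(v_rk4, 2))
```

For this equation the exact answer is v0 / (1 + k*v0*t); RK4 lands essentially on top of it at this step size, while Euler carries a visibly larger error, which is why the default favors RK4 and reserves Euler for speed.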
&lt;h3&gt;Advanced Physics: More Than Just Parabolas&lt;/h3&gt;
&lt;p&gt;While the basic trajectory of a projectile follows a parabolic path in a vacuum, real-world ballistics is far more complex. The engine models:&lt;/p&gt;
&lt;h4&gt;Aerodynamic Effects&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Velocity-dependent drag&lt;/strong&gt; using standard drag functions (G1, G7) or custom curves&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Transonic drag rise&lt;/strong&gt; as projectiles approach the speed of sound&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reynolds number corrections&lt;/strong&gt; for viscous effects at low velocities&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Form factor adjustments&lt;/strong&gt; based on projectile shape&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Gyroscopic Phenomena&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Spin drift&lt;/strong&gt; and Magnus forces acting on spinning projectiles&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Precession and nutation&lt;/strong&gt; of the projectile's axis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spin decay&lt;/strong&gt; over the flight path&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Yaw of repose&lt;/strong&gt; in crosswinds&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Environmental Factors&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Coriolis effect&lt;/strong&gt; from Earth's rotation (critical for long-range shots)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Wind shear&lt;/strong&gt; modeling with altitude-dependent wind variations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Atmospheric stratification&lt;/strong&gt; using ICAO standard atmosphere&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Humidity effects&lt;/strong&gt; on air density&lt;/li&gt;
&lt;/ul&gt;
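&lt;p&gt;As a small, hedged illustration of the humidity effect, ideal-gas air density with a simple vapor-pressure correction looks like the following. The constants are standard; the engine's full ICAO stratification model is more involved:&lt;/p&gt;

```python
# Toy sketch of the air-density inputs the environmental model
# needs: ideal-gas density from station pressure and temperature,
# with a simple humidity correction. Constants are standard; the
# engine's full ICAO stratification is more involved.

R_DRY = 287.05    # J/(kg K), dry air
R_VAP = 461.5     # J/(kg K), water vapor

def air_density(pressure_pa, temp_k, rel_humidity=0.0):
    # Tetens-style saturation vapor pressure (Pa), an approximation
    t_c = temp_k - 273.15
    p_sat = 610.78 * 10 ** (7.5 * t_c / (t_c + 237.3))
    p_vap = rel_humidity * p_sat
    p_dry = pressure_pa - p_vap
    return p_dry / (R_DRY * temp_k) + p_vap / (R_VAP * temp_k)

# ICAO sea-level standard: 101325 Pa, 15 C, dry air
print(round(air_density(101325.0, 288.15), 4))  # prints 1.225
```

Note that humid air is slightly less dense than dry air at the same pressure and temperature, which is why humidity (modestly) flattens trajectories.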
&lt;h4&gt;Stability Analysis&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dynamic stability&lt;/strong&gt; calculations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pitch damping&lt;/strong&gt; coefficients through transonic regions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gyroscopic stability&lt;/strong&gt; factors&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Transonic instability&lt;/strong&gt; warnings&lt;/li&gt;
&lt;/ul&gt;
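&lt;p&gt;One widely published approximation for the gyroscopic stability factor is the Miller twist rule. The sketch below shows the shape of such a calculation in imperial units; it is an illustration of the formula, not necessarily the engine's exact implementation (which also applies velocity and atmosphere corrections):&lt;/p&gt;

```python
# Hedged sketch of the Miller twist-rule gyroscopic stability
# factor. Inputs in imperial units (grains, inches); a published
# approximation, not necessarily the engine's exact implementation.

def miller_stability(mass_gr, diameter_in, length_in, twist_in):
    t = twist_in / diameter_in          # twist in calibers
    l = length_in / diameter_in         # length in calibers
    return 30.0 * mass_gr / (t * t * diameter_in ** 3 * l * (1.0 + l * l))

# 168 gr .308 bullet, 1.215 in long, 1:12 twist
sg = miller_stability(168.0, 0.308, 1.215, 12.0)
print(round(sg, 2))   # factors above roughly 1.4 are considered stable
```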
&lt;h3&gt;The Command Line Interface: Power at Your Fingertips&lt;/h3&gt;
&lt;p&gt;The engine includes a comprehensive CLI that rivals commercial ballistics software:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Basic trajectory with auto-zeroing&lt;/span&gt;
./ballistics&lt;span class="w"&gt; &lt;/span&gt;trajectory&lt;span class="w"&gt; &lt;/span&gt;-v&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2700&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.475&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;168&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-d&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.308&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--auto-zero&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--max-range&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1000&lt;/span&gt;

&lt;span class="c1"&gt;# Monte Carlo simulation for load development&lt;/span&gt;
./ballistics&lt;span class="w"&gt; &lt;/span&gt;monte-carlo&lt;span class="w"&gt; &lt;/span&gt;-v&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2700&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-b&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.475&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;168&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-d&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.308&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;-n&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--velocity-std&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--bc-std&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.01&lt;span class="w"&gt; &lt;/span&gt;--target-distance&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;600&lt;/span&gt;

&lt;span class="c1"&gt;# Estimate BC from observed drops&lt;/span&gt;
./ballistics&lt;span class="w"&gt; &lt;/span&gt;estimate-bc&lt;span class="w"&gt; &lt;/span&gt;-v&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2700&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-m&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;168&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-d&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.308&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;--distance1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--drop1&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.0&lt;span class="w"&gt; &lt;/span&gt;--distance2&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;300&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--drop2&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;.075
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The CLI supports both imperial (default) and metric units, multiple output formats (table, JSON, CSV), and can enable individual physics models as needed.&lt;/p&gt;
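&lt;p&gt;Conceptually, the Monte Carlo run shown above samples each uncertain input from a normal distribution and aggregates the spread of the results. A toy Python sketch of that loop, using a placeholder drop model rather than the engine's solver:&lt;/p&gt;

```python
# Hedged sketch of what the monte-carlo subcommand conceptually
# does: sample velocity and BC from normal distributions and
# collect the spread of a drop estimate. The drop model here is
# a placeholder, not the engine's physics solver.
import random
import statistics

def toy_drop(velocity_fps, bc, distance_yd):
    # Placeholder: drop grows with distance, shrinks with velocity
    # and BC. Illustrative only.
    tof = 3.0 * distance_yd / velocity_fps
    return 0.5 * 386.1 * tof * tof / (bc * 40.0)

random.seed(42)
drops = []
for _ in range(1000):
    v = random.gauss(2700.0, 10.0)    # mirrors --velocity-std 10
    bc = random.gauss(0.475, 0.01)    # mirrors --bc-std 0.01
    drops.append(toy_drop(v, bc, 600.0))

print(round(statistics.mean(drops), 2), round(statistics.stdev(drops), 3))
```

The mean and standard deviation of the samples are what turn a single trajectory into a dispersion estimate for load development.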
&lt;h3&gt;Lessons Learned: The Open Source Journey&lt;/h3&gt;
&lt;p&gt;Extracting and open-sourcing a core component from a larger system taught me valuable lessons:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Clear Boundaries Matter&lt;/strong&gt;: Separating deterministic physics from ML augmentations made the extraction cleaner and the resulting library more focused.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Documentation is Code&lt;/strong&gt;: I invested heavily in documentation, from inline Rust docs to comprehensive README examples. Good documentation dramatically increases adoption.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Performance Benchmarks Build Trust&lt;/strong&gt;: Publishing concrete performance numbers helps users understand what they're getting and sets realistic expectations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;FFI Design is Critical&lt;/strong&gt;: A well-designed FFI layer makes the difference between a library that's theoretically cross-platform and one that's actually used across platforms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Community Feedback is Gold&lt;/strong&gt;: Early users found edge cases I never considered and suggested features that made the engine more valuable.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;The Website: ballistics.rs&lt;/h3&gt;
&lt;p&gt;To support the open-source project, I created &lt;a href="https://baud.rs/jliUH9"&gt;ballistics.rs&lt;/a&gt;, a dedicated website that serves as the central hub for documentation, downloads, and community engagement. Built as a static site hosted on Google Cloud Platform with global CDN distribution, it provides fast access to resources from anywhere in the world.&lt;/p&gt;
&lt;p&gt;The website showcases:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Comprehensive documentation and API references&lt;/li&gt;
&lt;li&gt;Platform-specific integration guides&lt;/li&gt;
&lt;li&gt;Performance benchmarks and comparisons&lt;/li&gt;
&lt;li&gt;Example code and use cases&lt;/li&gt;
&lt;li&gt;Links to the GitHub repository and issue tracker&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Looking Forward: The Future of Open Ballistics&lt;/h3&gt;
&lt;p&gt;Open-sourcing the ballistics engine is just the beginning. I'm excited about several upcoming developments:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;WebAssembly Support&lt;/strong&gt;: Bringing high-performance ballistics calculations directly to web browsers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GPU Acceleration&lt;/strong&gt;: For massive Monte Carlo simulations and trajectory optimization.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Extended Drag Models&lt;/strong&gt;: Supporting more specialized drag functions for specific projectile types.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Community Contributions&lt;/strong&gt;: I'm already seeing pull requests for new features and improvements.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Educational Resources&lt;/strong&gt;: Creating interactive visualizations and tutorials to help people understand ballistics physics.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;The Business Model: Open Core Done Right&lt;/h3&gt;
&lt;p&gt;My approach follows the "open core" model. The fundamental physics engine is open source and will always remain so. The value-added features in Ballistics Insight (ML augmentations, weather integration, ammunition databases, and the web API) constitute the commercial offering.&lt;/p&gt;
&lt;p&gt;This model benefits everyone:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Developers get a production-ready ballistics engine for their applications&lt;/li&gt;
&lt;li&gt;Researchers have a reference implementation for ballistics algorithms&lt;/li&gt;
&lt;li&gt;The community can contribute improvements that benefit all users&lt;/li&gt;
&lt;li&gt;I maintain a sustainable business while giving back to the open-source ecosystem&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Conclusion: Precision Through Open Collaboration&lt;/h3&gt;
&lt;p&gt;The journey from a closed-source SaaS platform to an open-source library with mobile bindings represents more than just a code release. It's a commitment to the principle that fundamental scientific calculations should be open, verifiable, and accessible to all.&lt;/p&gt;
&lt;p&gt;By open-sourcing the ballistics engine, I'm not just sharing code; I'm inviting collaboration from developers, researchers, and enthusiasts worldwide. Whether you're building a mobile app for hunters, creating educational software for physics students, or conducting research on projectile dynamics, you now have access to a battle-tested, high-performance engine that handles the complex mathematics of ballistics.&lt;/p&gt;
&lt;p&gt;The combination of Rust's performance and safety, comprehensive physics modeling, and carefully designed FFI bindings creates a unique resource in the ballistics software ecosystem. I'm excited to see what the community builds with it.&lt;/p&gt;
&lt;p&gt;Visit &lt;a href="https://baud.rs/jliUH9"&gt;ballistics.rs&lt;/a&gt; to get started, browse the documentation, or contribute to the project. The repository is available on &lt;a href="https://baud.rs/QckusG"&gt;GitHub&lt;/a&gt;, and I welcome issues, pull requests, and feedback.&lt;/p&gt;
&lt;p&gt;In the world of ballistics, precision is everything. With this open-source release, I'm putting that precision in your hands.&lt;/p&gt;</description><category>android</category><category>ballistics</category><category>ffi</category><category>ios</category><category>open-source</category><category>physics</category><category>rust</category><category>simulation</category><guid>https://tinycomputers.io/posts/open-sourcing-a-high-performance-rust-based-ballistics-engine.html</guid><pubDate>Sat, 16 Aug 2025 21:11:16 GMT</pubDate></item></channel></rss>