<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>TinyComputers.io (Posts about interpreters)</title><link>https://tinycomputers.io/</link><description></description><atom:link href="https://tinycomputers.io/categories/interpreters.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 A.C. Jokela 
&lt;!-- div style="width: 100%" --&gt;
&lt;a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"&gt;&lt;img alt="" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /&gt; Creative Commons Attribution-ShareAlike&lt;/a&gt;&amp;nbsp;|&amp;nbsp;
&lt;!-- /div --&gt;
</copyright><lastBuildDate>Mon, 06 Apr 2026 22:12:57 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>A Stack-Based Bytecode VM for Lattice: 100 Opcodes, Serialization, and a Self-Hosted Compiler</title><link>https://tinycomputers.io/posts/a-stack-based-bytecode-vm-for-lattice.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/a-stack-based-bytecode-vm-for-lattice_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;29 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;When I &lt;a href="https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html"&gt;first wrote about&lt;/a&gt; Lattice's move from a tree-walking interpreter to a bytecode VM, the instruction set had 62 opcodes, concurrency primitives still delegated to the tree-walker, and programs couldn't be serialized. The VM was a foundation, correct and complete enough to become the default, but clearly a starting point.&lt;/p&gt;
&lt;p&gt;That was ten versions ago. The bytecode VM now has 100 opcodes, compiles concurrency primitives into standalone sub-chunks with zero AST dependency at runtime, ships a binary serialization format for ahead-of-time compilation, includes an ephemeral bump arena for short-lived string temporaries, and (perhaps most satisfyingly) has a self-hosted compiler written entirely in Lattice that produces the same &lt;code&gt;.latc&lt;/code&gt; bytecode files as the C implementation.&lt;/p&gt;
&lt;p&gt;This post walks through what changed and why. The full technical treatment is available as a &lt;a href="https://tinycomputers.io/papers/lattice_vm.pdf"&gt;research paper&lt;/a&gt;; this is the practitioner's version.&lt;/p&gt;
&lt;h3&gt;Why Keep Going&lt;/h3&gt;
&lt;p&gt;The &lt;a href="https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html"&gt;original bytecode VM&lt;/a&gt; solved the immediate problems: it eliminated recursive AST dispatch overhead and gave Lattice a single execution path for file execution, the REPL, and the WASM playground. But three issues remained.&lt;/p&gt;
&lt;p&gt;First, &lt;code&gt;OP_SCOPE&lt;/code&gt; and &lt;code&gt;OP_SELECT&lt;/code&gt; (Lattice's structured concurrency opcodes) still stored AST node pointers in the constant pool and dropped into the tree-walking evaluator at runtime. This meant the AST had to stay alive during concurrent execution, which defeated one of the main motivations for having a bytecode VM in the first place.&lt;/p&gt;
&lt;p&gt;Second, the AST dependency made serialization impossible. You can serialize bytecode to a file, but you can't easily serialize an arbitrary C pointer to an AST node. Programs had to be parsed and compiled on every run.&lt;/p&gt;
&lt;p&gt;Third, the dispatch loop used a plain &lt;code&gt;switch&lt;/code&gt; statement. Not a crisis, but computed goto dispatch is a well-known improvement for bytecode interpreters, and leaving it on the table felt like a waste.&lt;/p&gt;
&lt;p&gt;All three problems are solved now. Let me start with the instruction set, since everything else builds on it.&lt;/p&gt;
&lt;h3&gt;100 Opcodes&lt;/h3&gt;
&lt;p&gt;The instruction set grew from 62 to 100 opcodes, organized into 17 functional categories:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Representative opcodes&lt;/th&gt;
&lt;th style="text-align: right;"&gt;Count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stack manipulation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CONSTANT&lt;/code&gt;, &lt;code&gt;NIL&lt;/code&gt;, &lt;code&gt;TRUE&lt;/code&gt;, &lt;code&gt;FALSE&lt;/code&gt;, &lt;code&gt;UNIT&lt;/code&gt;, &lt;code&gt;POP&lt;/code&gt;, &lt;code&gt;DUP&lt;/code&gt;, &lt;code&gt;SWAP&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Arithmetic/logical&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ADD&lt;/code&gt;, &lt;code&gt;SUB&lt;/code&gt;, &lt;code&gt;MUL&lt;/code&gt;, &lt;code&gt;DIV&lt;/code&gt;, &lt;code&gt;MOD&lt;/code&gt;, &lt;code&gt;NEG&lt;/code&gt;, &lt;code&gt;NOT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bitwise&lt;/td&gt;
&lt;td&gt;&lt;code&gt;BIT_AND&lt;/code&gt;, &lt;code&gt;BIT_OR&lt;/code&gt;, &lt;code&gt;BIT_XOR&lt;/code&gt;, &lt;code&gt;BIT_NOT&lt;/code&gt;, &lt;code&gt;LSHIFT&lt;/code&gt;, &lt;code&gt;RSHIFT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Comparison&lt;/td&gt;
&lt;td&gt;&lt;code&gt;EQ&lt;/code&gt;, &lt;code&gt;NEQ&lt;/code&gt;, &lt;code&gt;LT&lt;/code&gt;, &lt;code&gt;GT&lt;/code&gt;, &lt;code&gt;LTEQ&lt;/code&gt;, &lt;code&gt;GTEQ&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;String&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CONCAT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Variables&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GET/SET_LOCAL&lt;/code&gt;, &lt;code&gt;GET/SET/DEFINE_GLOBAL&lt;/code&gt;, &lt;code&gt;GET/SET_UPVALUE&lt;/code&gt;, &lt;code&gt;CLOSE_UPVALUE&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control flow&lt;/td&gt;
&lt;td&gt;&lt;code&gt;JUMP&lt;/code&gt;, &lt;code&gt;JUMP_IF_FALSE&lt;/code&gt;, &lt;code&gt;JUMP_IF_TRUE&lt;/code&gt;, &lt;code&gt;JUMP_IF_NOT_NIL&lt;/code&gt;, &lt;code&gt;LOOP&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Functions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CALL&lt;/code&gt;, &lt;code&gt;CLOSURE&lt;/code&gt;, &lt;code&gt;RETURN&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Iterators&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ITER_INIT&lt;/code&gt;, &lt;code&gt;ITER_NEXT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data structures&lt;/td&gt;
&lt;td&gt;&lt;code&gt;BUILD_ARRAY&lt;/code&gt;, &lt;code&gt;INDEX&lt;/code&gt;, &lt;code&gt;SET_INDEX&lt;/code&gt;, &lt;code&gt;GET_FIELD&lt;/code&gt;, &lt;code&gt;INVOKE&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td style="text-align: right;"&gt;15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exceptions/defer&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PUSH_EXCEPTION_HANDLER&lt;/code&gt;, &lt;code&gt;THROW&lt;/code&gt;, &lt;code&gt;DEFER_PUSH&lt;/code&gt;, &lt;code&gt;DEFER_RUN&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td style="text-align: right;"&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase system&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FREEZE&lt;/code&gt;, &lt;code&gt;THAW&lt;/code&gt;, &lt;code&gt;CLONE&lt;/code&gt;, &lt;code&gt;MARK_FLUID&lt;/code&gt;, &lt;code&gt;REACT&lt;/code&gt;, &lt;code&gt;BOND&lt;/code&gt;, &lt;code&gt;SEED&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td style="text-align: right;"&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Builtins/modules&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PRINT&lt;/code&gt;, &lt;code&gt;IMPORT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SCOPE&lt;/code&gt;, &lt;code&gt;SELECT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integer fast paths&lt;/td&gt;
&lt;td&gt;&lt;code&gt;INC_LOCAL&lt;/code&gt;, &lt;code&gt;DEC_LOCAL&lt;/code&gt;, &lt;code&gt;ADD_INT&lt;/code&gt;, &lt;code&gt;SUB_INT&lt;/code&gt;, &lt;code&gt;LOAD_INT8&lt;/code&gt;, etc.&lt;/td&gt;
&lt;td style="text-align: right;"&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wide variants&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CONSTANT_16&lt;/code&gt;, &lt;code&gt;GET_GLOBAL_16&lt;/code&gt;, &lt;code&gt;SET_GLOBAL_16&lt;/code&gt;, &lt;code&gt;DEFINE_GLOBAL_16&lt;/code&gt;, &lt;code&gt;CLOSURE_16&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Special&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RESET_EPHEMERAL&lt;/code&gt;, &lt;code&gt;HALT&lt;/code&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td style="text-align: right;"&gt;&lt;strong&gt;100&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The growth came from three directions: the integer fast-path opcodes (8 new), the wide constant variants (5 new), and the concurrency/arena opcodes. The first two get subsections below; the concurrency opcodes have a section of their own.&lt;/p&gt;
&lt;h4&gt;Integer Fast Paths&lt;/h4&gt;
&lt;p&gt;Tight loops like &lt;code&gt;for i in 0..1000&lt;/code&gt; spend most of their time incrementing a counter and comparing it to a bound. The generic &lt;code&gt;OP_ADD&lt;/code&gt; has to check whether its operands are integers, floats, or strings (for concatenation), which adds branching overhead on every iteration.&lt;/p&gt;
&lt;p&gt;The integer fast-path opcodes (&lt;code&gt;OP_ADD_INT&lt;/code&gt;, &lt;code&gt;OP_SUB_INT&lt;/code&gt;, &lt;code&gt;OP_MUL_INT&lt;/code&gt;, &lt;code&gt;OP_LT_INT&lt;/code&gt;, &lt;code&gt;OP_LTEQ_INT&lt;/code&gt;) skip the type check entirely and operate directly on &lt;code&gt;int64_t&lt;/code&gt; values. &lt;code&gt;OP_INC_LOCAL&lt;/code&gt; and &lt;code&gt;OP_DEC_LOCAL&lt;/code&gt; handle the &lt;code&gt;i += 1&lt;/code&gt; and &lt;code&gt;i -= 1&lt;/code&gt; patterns as single-byte instructions that modify the stack slot in place, no push or pop required.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;OP_LOAD_INT8&lt;/code&gt; encodes a signed byte directly in the instruction stream. The integer &lt;code&gt;42&lt;/code&gt; becomes two bytes (&lt;code&gt;OP_LOAD_INT8&lt;/code&gt;, &lt;code&gt;0x2A&lt;/code&gt;) instead of a two-byte &lt;code&gt;OP_CONSTANT&lt;/code&gt; plus an eight-byte constant pool entry. Any integer in [-128, 127] gets this treatment.&lt;/p&gt;
&lt;h4&gt;Wide Constant Variants&lt;/h4&gt;
&lt;p&gt;The original instruction set used a single byte for constant pool indices, limiting each chunk to 256 constants. This is fine for most functions, but the self-hosted compiler (a 2,000-line Lattice program compiled as a single top-level script) blows past that limit easily.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;OP_CONSTANT_16&lt;/code&gt;, &lt;code&gt;OP_GET_GLOBAL_16&lt;/code&gt;, &lt;code&gt;OP_SET_GLOBAL_16&lt;/code&gt;, &lt;code&gt;OP_DEFINE_GLOBAL_16&lt;/code&gt;, and &lt;code&gt;OP_CLOSURE_16&lt;/code&gt; use two-byte big-endian indices, supporting up to 65,536 constants per chunk. The compiler automatically switches to wide variants when an index exceeds 255.&lt;/p&gt;
&lt;h3&gt;The Compiler&lt;/h3&gt;
&lt;p&gt;The bytecode compiler performs a single-pass walk over the AST. It maintains a chain of &lt;code&gt;Compiler&lt;/code&gt; structs linked via &lt;code&gt;enclosing&lt;/code&gt; pointers, one per function being compiled. Variable references resolve through three tiers: local (scan the current compiler's locals array), upvalue (recursively check enclosing compilers), and global (fall through to &lt;code&gt;OP_GET_GLOBAL&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Three compilation modes handle different use cases. &lt;code&gt;compile()&lt;/code&gt; is the standard file mode: it compiles all declarations and emits an implicit call to &lt;code&gt;main()&lt;/code&gt; if one is defined. &lt;code&gt;compile_module()&lt;/code&gt; is for imports, identical to &lt;code&gt;compile()&lt;/code&gt; but skips the auto-call. &lt;code&gt;compile_repl()&lt;/code&gt; preserves the last expression on the stack as the iteration's return value (displayed with &lt;code&gt;=&amp;gt;&lt;/code&gt; prefix) and keeps the known-enum table alive across REPL iterations so enum declarations persist.&lt;/p&gt;
&lt;p&gt;The compiler implements several optimizations during code generation. Binary operations on literal operands are folded at compile time: &lt;code&gt;3 + 4&lt;/code&gt; emits a single &lt;code&gt;OP_LOAD_INT8 7&lt;/code&gt; rather than two loads and an &lt;code&gt;OP_ADD&lt;/code&gt;. The pattern &lt;code&gt;x += 1&lt;/code&gt; is detected and emitted as the single-byte &lt;code&gt;OP_INC_LOCAL&lt;/code&gt;, which modifies the stack slot in place. And every statement is wrapped by &lt;code&gt;compile_stmt_reset()&lt;/code&gt;, which appends &lt;code&gt;OP_RESET_EPHEMERAL&lt;/code&gt; to trigger the ephemeral arena cleanup.&lt;/p&gt;
&lt;h3&gt;Computed Goto Dispatch&lt;/h3&gt;
&lt;p&gt;The dispatch loop now uses GCC/Clang's labels-as-values extension for computed goto:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="cp"&gt;#ifdef VM_USE_COMPUTED_GOTO&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;static&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dispatch_table&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;OP_CONSTANT&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;lbl_OP_CONSTANT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;OP_NIL&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;lbl_OP_NIL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... all 100 entries&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="cp"&gt;#endif&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(;;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="cp"&gt;#ifdef VM_USE_COMPUTED_GOTO&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;goto&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dispatch_table&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="cp"&gt;#endif&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;switch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each opcode handler ends with a &lt;code&gt;goto *dispatch_table[READ_BYTE()]&lt;/code&gt; rather than breaking back to the top of the loop. This eliminates the switch statement's bounds check and branch table indirection, replacing it with a single indirect jump. The CPU's branch predictor sees different jump sites for different opcodes, which improves prediction accuracy compared to a single switch that all opcodes funnel through.&lt;/p&gt;
&lt;p&gt;On platforms without the extension, it falls back to a standard switch. The VM works correctly either way.&lt;/p&gt;
&lt;h3&gt;Pre-Compiled Concurrency&lt;/h3&gt;
&lt;p&gt;This is the change I'm most pleased with, because it solves the problem cleanly.&lt;/p&gt;
&lt;p&gt;Lattice has three concurrency primitives: &lt;code&gt;scope&lt;/code&gt; defines a concurrent region, &lt;code&gt;spawn&lt;/code&gt; launches a task within that region, and &lt;code&gt;select&lt;/code&gt; multiplexes over channels. In the tree-walker, these work by passing AST node pointers to spawned threads, which then evaluate the subtrees independently. The bytecode VM's original implementation did the same thing: &lt;code&gt;OP_SCOPE&lt;/code&gt; stored an &lt;code&gt;Expr*&lt;/code&gt; pointer in the constant pool and called the tree-walking evaluator at runtime.&lt;/p&gt;
&lt;p&gt;The solution is to compile each concurrent body into a standalone &lt;code&gt;Chunk&lt;/code&gt; at compile time. The compiler provides two helpers: &lt;code&gt;compile_sub_body()&lt;/code&gt; for statement blocks and &lt;code&gt;compile_sub_expr()&lt;/code&gt; for expressions. Each creates a fresh &lt;code&gt;Compiler&lt;/code&gt;, compiles the code into a new chunk, emits &lt;code&gt;OP_HALT&lt;/code&gt;, and stores the resulting chunk in the parent's constant pool as a &lt;code&gt;VAL_CLOSURE&lt;/code&gt; constant.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;OP_SCOPE&lt;/code&gt; uses variable-length encoding: a spawn count, a sync body chunk index, and one chunk index per spawn body. At runtime, the VM:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Exports locals&lt;/strong&gt; to the global environment using the &lt;code&gt;local_names&lt;/code&gt; debug table, so sub-chunks can access parent variables via &lt;code&gt;OP_GET_GLOBAL&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Runs the sync body&lt;/strong&gt; (if present) via a recursive &lt;code&gt;vm_run()&lt;/code&gt; call&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spawns threads&lt;/strong&gt; for each spawn body, each running on a cloned VM&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Joins&lt;/strong&gt; all threads and propagates errors&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;code&gt;OP_SELECT&lt;/code&gt; similarly encodes per-arm metadata: flags, channel expression chunk index, body chunk index, and binding name index. The VM evaluates channel expressions, polls for readiness, and executes the winning arm.&lt;/p&gt;
&lt;p&gt;The key insight is that sub-chunks run as &lt;code&gt;FUNC_SCRIPT&lt;/code&gt; without lexical access to the parent's locals. Since they can't use upvalues to reach into the parent frame, the VM exports the parent's live locals into the global environment before running any sub-chunk, using a pushed scope that gets popped after all sub-chunks complete. This is slightly more expensive than true lexical capture, but it keeps the sub-chunks completely self-contained: no AST, no parent frame dependency, fully serializable.&lt;/p&gt;
&lt;h3&gt;Bytecode Serialization&lt;/h3&gt;
&lt;p&gt;With AST dependency eliminated, serialization becomes straightforward. The &lt;code&gt;.latc&lt;/code&gt; binary format starts with an 8-byte header:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;[4C 41 54 43]  magic: "LATC"
[01 00]        format version: 1
[00 00]        reserved
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The rest is a recursive chunk encoding: code length + bytecode bytes, line numbers for source mapping, typed constants (with a one-byte type tag for each), and local name debug info. Constants use seven type tags:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: right;"&gt;Tag&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Encoding&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;0&lt;/td&gt;
&lt;td&gt;Int&lt;/td&gt;
&lt;td&gt;8-byte signed LE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;1&lt;/td&gt;
&lt;td&gt;Float&lt;/td&gt;
&lt;td&gt;8-byte IEEE 754&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;2&lt;/td&gt;
&lt;td&gt;Bool&lt;/td&gt;
&lt;td&gt;1 byte&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;3&lt;/td&gt;
&lt;td&gt;String&lt;/td&gt;
&lt;td&gt;length-prefixed (u32 + bytes)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;4&lt;/td&gt;
&lt;td&gt;Nil&lt;/td&gt;
&lt;td&gt;no payload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;5&lt;/td&gt;
&lt;td&gt;Unit&lt;/td&gt;
&lt;td&gt;no payload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: right;"&gt;6&lt;/td&gt;
&lt;td&gt;Closure&lt;/td&gt;
&lt;td&gt;param count + variadic flag + recursive sub-chunk&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The &lt;code&gt;Closure&lt;/code&gt; tag is what makes this recursive: a function constant contains its parameter metadata followed by a complete serialized sub-chunk. Nested functions serialize naturally to arbitrary depth.&lt;/p&gt;
&lt;p&gt;The CLI integrates this cleanly:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Compile to .latc&lt;/span&gt;
clat&lt;span class="w"&gt; &lt;/span&gt;compile&lt;span class="w"&gt; &lt;/span&gt;input.lat&lt;span class="w"&gt; &lt;/span&gt;-o&lt;span class="w"&gt; &lt;/span&gt;output.latc

&lt;span class="c1"&gt;# Run pre-compiled bytecode (auto-detects .latc suffix)&lt;/span&gt;
clat&lt;span class="w"&gt; &lt;/span&gt;output.latc

&lt;span class="c1"&gt;# Or compile and run in one step (the default)&lt;/span&gt;
clat&lt;span class="w"&gt; &lt;/span&gt;input.lat
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Loading validates magic bytes, checks the format version, and uses a bounds-checking &lt;code&gt;ByteReader&lt;/code&gt; that produces descriptive error messages for truncated or malformed inputs.&lt;/p&gt;
&lt;h3&gt;The Ephemeral Bump Arena&lt;/h3&gt;
&lt;p&gt;String concatenation is a common source of short-lived allocations. An expression like &lt;code&gt;"hello " + name + "!"&lt;/code&gt; creates intermediate strings that are immediately consumed and discarded. In a language with deep-clone-on-read semantics, these temporaries add up.&lt;/p&gt;
&lt;p&gt;The ephemeral bump arena is a simple optimization: string concatenation in &lt;code&gt;OP_ADD&lt;/code&gt; and &lt;code&gt;OP_CONCAT&lt;/code&gt; allocates into a bump arena (&lt;code&gt;vm-&amp;gt;ephemeral&lt;/code&gt;) instead of the general-purpose heap. These allocations are tagged with &lt;code&gt;REGION_EPHEMERAL&lt;/code&gt;, and &lt;code&gt;OP_RESET_EPHEMERAL&lt;/code&gt; (emitted by the compiler at every statement boundary) resets the arena in O(1), reclaiming all temporary strings at once.&lt;/p&gt;
&lt;p&gt;The tricky part is escape analysis. If a temporary string gets assigned to a global variable, stored in an array, or passed to a compiled closure, it needs to be promoted out of the ephemeral arena before the arena is reset. The VM handles this at specific escape points: &lt;code&gt;OP_DEFINE_GLOBAL&lt;/code&gt;, &lt;code&gt;OP_CALL&lt;/code&gt; (for compiled closures), &lt;code&gt;array.push&lt;/code&gt;, and &lt;code&gt;OP_SET_INDEX_LOCAL&lt;/code&gt;. Each of these calls &lt;code&gt;vm_promote_value()&lt;/code&gt;, which deep-clones the string to the regular heap if its region is ephemeral.&lt;/p&gt;
&lt;p&gt;The arena uses a page-based allocator with 4 KB pages. Resetting doesn't free pages; it just moves the bump pointer back to zero, so subsequent allocations reuse the same memory without any &lt;code&gt;malloc&lt;/code&gt;/&lt;code&gt;free&lt;/code&gt; overhead. The full design and safety proof are covered in a &lt;a href="https://tinycomputers.io/papers/lattice_arena_safety.pdf"&gt;companion paper&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;Closures and the Storage Hack&lt;/h3&gt;
&lt;p&gt;The upvalue system hasn't changed architecturally since the &lt;a href="https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html"&gt;first VM post&lt;/a&gt;; it's still the Lua-inspired open/closed model where &lt;code&gt;ObjUpvalue&lt;/code&gt; structs start pointing into the stack and get closed (deep-cloned to the heap) when variables go out of scope. But the encoding grew to accommodate the wider instruction set.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;OP_CLOSURE&lt;/code&gt; uses variable-length encoding: a constant pool index for the function's compiled chunk, an upvalue count, and then &lt;code&gt;[is_local, index]&lt;/code&gt; byte pairs for each captured variable. &lt;code&gt;OP_CLOSURE_16&lt;/code&gt; uses a two-byte big-endian function index for chunks with more than 256 constants.&lt;/p&gt;
&lt;p&gt;The storage hack remains unchanged: &lt;code&gt;closure.body&lt;/code&gt; is set to NULL, &lt;code&gt;closure.native_fn&lt;/code&gt; is repurposed to hold the &lt;code&gt;Chunk&lt;/code&gt; pointer, &lt;code&gt;closure.captured_env&lt;/code&gt; holds the upvalue array (an &lt;code&gt;ObjUpvalue**&lt;/code&gt; cast), and &lt;code&gt;region_id&lt;/code&gt; stores the upvalue count. A sentinel value &lt;code&gt;VM_NATIVE_MARKER&lt;/code&gt; distinguishes C-native functions from compiled closures:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="cp"&gt;#define VM_NATIVE_MARKER ((struct Expr **)(uintptr_t)0x1)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;A closure with &lt;code&gt;body == NULL&lt;/code&gt; and &lt;code&gt;native_fn != NULL&lt;/code&gt; is either a C native (if &lt;code&gt;default_values == VM_NATIVE_MARKER&lt;/code&gt;) or a compiled bytecode function (otherwise). This avoids adding VM-specific fields to the &lt;code&gt;LatValue&lt;/code&gt; union, which matters when values are deep-cloned frequently.&lt;/p&gt;
&lt;h3&gt;The Self-Hosted Compiler&lt;/h3&gt;
&lt;p&gt;The file &lt;code&gt;compiler/latc.lat&lt;/code&gt; is a bytecode compiler written entirely in Lattice, approximately 2,060 lines that read &lt;code&gt;.lat&lt;/code&gt; source, produce bytecode, and write &lt;code&gt;.latc&lt;/code&gt; files using the same binary format as the C implementation:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="c1"&gt;# Use the self-hosted compiler&lt;/span&gt;
clat&lt;span class="w"&gt; &lt;/span&gt;compiler/latc.lat&lt;span class="w"&gt; &lt;/span&gt;input.lat&lt;span class="w"&gt; &lt;/span&gt;output.latc

&lt;span class="c1"&gt;# Run the result&lt;/span&gt;
clat&lt;span class="w"&gt; &lt;/span&gt;output.latc
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The architecture mirrors the C compiler: lexing via the built-in &lt;code&gt;tokenize()&lt;/code&gt; function, a recursive-descent parser, single-pass code emission, and scope management with upvalue resolution. But Lattice's value semantics required some creative workarounds.&lt;/p&gt;
&lt;p&gt;The biggest constraint is that structs and maps are pass-by-value. In C, the compiler uses a &lt;code&gt;Compiler&lt;/code&gt; struct with mutable fields: local arrays, scope depth, a chunk pointer. In Lattice, passing a struct to a function creates a copy, so mutations in the callee don't propagate back. The self-hosted compiler works around this with parallel global arrays: &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;constants&lt;/code&gt;, &lt;code&gt;c_lines&lt;/code&gt;, &lt;code&gt;local_names&lt;/code&gt;, &lt;code&gt;local_depths&lt;/code&gt;, &lt;code&gt;local_captured&lt;/code&gt;. Since array mutations via &lt;code&gt;.push()&lt;/code&gt; and index assignment are in-place (via &lt;code&gt;resolve_lvalue&lt;/code&gt;), global arrays work where structs don't.&lt;/p&gt;
&lt;p&gt;Nested function compilation uses explicit &lt;code&gt;save_compiler()&lt;/code&gt; / &lt;code&gt;restore_compiler()&lt;/code&gt; functions that copy all global arrays to local temporaries and back. It's verbose but correct. The Buffer type (used for serialization output) is also pass-by-value, so a global &lt;code&gt;ser_buf&lt;/code&gt; accumulates serialized bytes across function calls.&lt;/p&gt;
&lt;p&gt;Other language constraints: no &lt;code&gt;else if&lt;/code&gt; (requires &lt;code&gt;else { if ... }&lt;/code&gt; or &lt;code&gt;match&lt;/code&gt;), mandatory type annotations on function parameters (&lt;code&gt;fn foo(a: any)&lt;/code&gt;), and &lt;code&gt;test&lt;/code&gt; is a keyword so you can't use it as an identifier.&lt;/p&gt;
&lt;p&gt;The self-hosted compiler currently handles expressions, variables, functions with closures, control flow (if/else, while, loop, for, break, continue, match), structs, enums, exceptions, defer, string interpolation, and imports. Not yet implemented: concurrency primitives and advanced phase operations (react, bond, seed). The bootstrapping chain is:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;latc.lat → [C VM interprets] → output.latc → [C VM executes]
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Full self-hosting (where &lt;code&gt;latc.lat&lt;/code&gt; compiles itself) requires adding concurrency support and closing the remaining feature gaps.&lt;/p&gt;
&lt;h3&gt;The VM Execution Engine&lt;/h3&gt;
&lt;p&gt;The VM maintains a 4,096-slot value stack, a 256-frame call stack, an exception handler stack (64 entries), a defer stack (256 entries), a global environment, the open upvalue linked list, the ephemeral arena, and a module cache. A pre-allocated &lt;code&gt;fast_args[16]&lt;/code&gt; buffer avoids heap allocation for most native function calls.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;OP_CALL&lt;/code&gt; instruction discriminates three callee types. Native C functions (marked with &lt;code&gt;VM_NATIVE_MARKER&lt;/code&gt;) get the fast path: arguments are popped into &lt;code&gt;fast_args&lt;/code&gt;, the C function pointer is invoked, and the return value is pushed. No call frame allocated. Compiled closures get the full treatment: the VM promotes ephemeral values in the current frame (so the callee's &lt;code&gt;OP_RESET_EPHEMERAL&lt;/code&gt; doesn't invalidate the caller's temporaries), then pushes a new &lt;code&gt;CallFrame&lt;/code&gt; with the instruction pointer at byte 0 of the callee's chunk. Callable structs look up a constructor-named field and dispatch accordingly.&lt;/p&gt;
&lt;p&gt;Exception handling uses a handler stack. &lt;code&gt;OP_PUSH_EXCEPTION_HANDLER&lt;/code&gt; records the current IP, chunk, call frame index, and stack top. When &lt;code&gt;OP_THROW&lt;/code&gt; executes, the nearest handler is popped, the call frame and value stacks are unwound, the error value is pushed, and execution resumes at the handler's saved IP. Deferred blocks interact correctly: &lt;code&gt;OP_DEFER_RUN&lt;/code&gt; executes all defer entries registered at or above the current frame before the frame is popped by &lt;code&gt;OP_RETURN&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Iterators avoid closure allocation entirely. &lt;code&gt;OP_ITER_INIT&lt;/code&gt; converts a range or array into an internal iterator occupying two stack slots (collection + cursor index). &lt;code&gt;OP_ITER_NEXT&lt;/code&gt; advances the cursor, pushes the next element, or jumps to a specified offset when exhausted. The tree-walker used closure-based iterators for &lt;code&gt;for&lt;/code&gt; loops; the bytecode version is simpler and avoids the allocation.&lt;/p&gt;
&lt;h3&gt;Ref&amp;lt;T&amp;gt;: The Escape Hatch from Value Semantics&lt;/h3&gt;
&lt;p&gt;Everything described so far operates in a world where values are deep-cloned on every read. Maps are pass-by-value. Structs are pass-by-value. Pass a collection to a function and the function gets its own copy; mutations don't propagate back. This is correct and eliminates aliasing bugs, but it creates a real problem: how do you share mutable state when you actually need to?&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Ref&amp;lt;T&amp;gt;&lt;/code&gt; is the answer. It's a reference-counted shared mutable wrapper, the one type in Lattice that deliberately breaks value semantics:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;LatRef&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;// the wrapped inner value&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;refcount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;// reference count&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When a &lt;code&gt;Ref&lt;/code&gt; is cloned (which happens on every variable read, like everything else), the VM bumps the refcount and copies the pointer. It does &lt;em&gt;not&lt;/em&gt; deep-clone the inner value. Multiple copies of a &lt;code&gt;Ref&lt;/code&gt; share the same underlying &lt;code&gt;LatRef&lt;/code&gt;, so mutations through one are visible through all others. This is the explicit opt-in to reference semantics that the rest of the language avoids.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;let r = Ref::new([1, 2, 3])
let r2 = r              // shallow copy — same LatRef
r.push(4)
print(r2.get())          // [1, 2, 3, 4] — shared state
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The VM provides transparent proxying: &lt;code&gt;OP_INDEX&lt;/code&gt;, &lt;code&gt;OP_SET_INDEX&lt;/code&gt;, and &lt;code&gt;OP_INVOKE&lt;/code&gt; all check for &lt;code&gt;VAL_REF&lt;/code&gt; and delegate to the inner value. Indexing into a &lt;code&gt;Ref&amp;lt;Array&amp;gt;&lt;/code&gt; indexes the inner array. Calling &lt;code&gt;.push()&lt;/code&gt; on a &lt;code&gt;Ref&amp;lt;Array&amp;gt;&lt;/code&gt; mutates the inner array directly. At the language level, a Ref mostly behaves like the value it wraps; you just get shared mutation instead of isolated copies.&lt;/p&gt;
&lt;p&gt;Ref has its own methods (&lt;code&gt;get()&lt;/code&gt;/&lt;code&gt;deref()&lt;/code&gt; to clone the inner value out, &lt;code&gt;set(v)&lt;/code&gt; to replace it, &lt;code&gt;inner_type()&lt;/code&gt; to inspect the wrapped type) plus proxied methods for whatever the inner value supports (map &lt;code&gt;set&lt;/code&gt;/&lt;code&gt;get&lt;/code&gt;/&lt;code&gt;keys&lt;/code&gt;, array &lt;code&gt;push&lt;/code&gt;/&lt;code&gt;pop&lt;/code&gt;, etc.).&lt;/p&gt;
&lt;p&gt;The phase system applies to Refs too. Freezing a Ref blocks all mutation: &lt;code&gt;set()&lt;/code&gt;, &lt;code&gt;push()&lt;/code&gt;, index assignment all check &lt;code&gt;obj-&amp;gt;phase == VTAG_CRYSTAL&lt;/code&gt; and error with "cannot set on a frozen Ref." This makes frozen Refs safe to share across concurrent boundaries; they're immutable handles to immutable data.&lt;/p&gt;
&lt;p&gt;This introduces a third memory management strategy alongside the dual-heap (mark-and-sweep for fluid values, arenas for crystal values) and the ephemeral bump arena. Refs use reference counting: &lt;code&gt;ref_retain()&lt;/code&gt; on clone, &lt;code&gt;ref_release()&lt;/code&gt; on free, with the inner value freed when the count hits zero. It's a deliberate trade-off: reference counting is simple and deterministic, and since Refs are the uncommon case (most Lattice code uses value semantics), the lack of cycle collection hasn't been an issue in practice.&lt;/p&gt;
&lt;h3&gt;Validation&lt;/h3&gt;
&lt;p&gt;The VM is validated by &lt;strong&gt;815 tests&lt;/strong&gt; covering every feature: arithmetic, closures, upvalues, phase transitions, exception handling, defer, iterators, data structures, concurrency, modules, bytecode serialization, and the self-hosted compiler.&lt;/p&gt;
&lt;p&gt;All 815 tests pass under both normal compilation and AddressSanitizer builds (&lt;code&gt;make asan&lt;/code&gt;), which dynamically checks for heap buffer overflows, use-after-free, stack buffer overflows, and memory leaks. For a VM with manual memory management, upvalue lifetime tracking, and an ephemeral arena that reclaims memory at statement boundaries, ASan validation is essential.&lt;/p&gt;
&lt;p&gt;Both execution modes, the bytecode VM (default) and the tree-walker (&lt;code&gt;--tree-walk&lt;/code&gt;), share the same test suite and produce identical results:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;make&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;&lt;span class="w"&gt;                &lt;/span&gt;&lt;span class="c1"&gt;# bytecode VM: 815 passed&lt;/span&gt;
make&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;TREE_WALK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;# tree-walker: 815 passed&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Feature parity is complete:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th style="text-align: center;"&gt;Tree-walker&lt;/th&gt;
&lt;th style="text-align: center;"&gt;Bytecode VM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Phase system (freeze/thaw/clone/forge)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Closures with upvalues&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exception handling (try/catch/throw)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Defer blocks&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pattern matching&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Structs with methods&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enums with payloads&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Arrays, maps, tuples, sets, buffers&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Iterators (for-in, ranges)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module imports&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Concurrency (scope/spawn/select)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Channels&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phase reactions/bonds/seeds&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Contracts (require/ensure)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Variable tracking (history)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bytecode serialization (.latc)&lt;/td&gt;
&lt;td style="text-align: center;"&gt;---&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Computed goto dispatch&lt;/td&gt;
&lt;td style="text-align: center;"&gt;---&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ephemeral bump arena&lt;/td&gt;
&lt;td style="text-align: center;"&gt;---&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specialized integer ops&lt;/td&gt;
&lt;td style="text-align: center;"&gt;---&lt;/td&gt;
&lt;td style="text-align: center;"&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The last four rows are VM-only features that have no tree-walker equivalent.&lt;/p&gt;
&lt;h3&gt;What's Next&lt;/h3&gt;
&lt;p&gt;The VM is feature-complete but not performance-optimized. The obvious next steps are register allocation to reduce stack traffic, type-specialized dispatch paths guided by runtime profiling, tail call optimization for recursive patterns, and constant pool deduplication across compilation units. Further out, the bytecode provides a natural intermediate representation for JIT compilation.&lt;/p&gt;
&lt;p&gt;On the self-hosting front, adding concurrency primitives to &lt;code&gt;latc.lat&lt;/code&gt; would close the gap to full self-compilation, where the Lattice compiler compiles itself, producing a &lt;code&gt;.latc&lt;/code&gt; file that can then compile other programs without the C implementation in the loop.&lt;/p&gt;
&lt;p&gt;The full technical details (including encoding diagrams, the complete opcode listing, compilation walkthroughs, and references to related work in Lua, CPython, YARV, and WebAssembly) are in the &lt;a href="https://tinycomputers.io/papers/lattice_vm.pdf"&gt;research paper&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The source code is at &lt;a href="https://baud.rs/fIe3gx"&gt;github.com/ajokela/lattice&lt;/a&gt;, and the project site is at &lt;a href="https://baud.rs/bwvnYT"&gt;lattice-lang.org&lt;/a&gt;.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;git clone https://github.com/ajokela/lattice.git
cd lattice &amp;amp;&amp;amp; make
./clat
&lt;/pre&gt;&lt;/div&gt;</description><category>bytecode</category><category>c</category><category>closures</category><category>compilers</category><category>concurrency</category><category>interpreters</category><category>language design</category><category>lattice</category><category>phase system</category><category>programming languages</category><category>self-hosting</category><category>serialization</category><category>upvalues</category><category>virtual machine</category><guid>https://tinycomputers.io/posts/a-stack-based-bytecode-vm-for-lattice.html</guid><pubDate>Fri, 20 Feb 2026 18:00:00 GMT</pubDate></item><item><title>Review of "Crafting Interpreters" by Robert Nystrom</title><link>https://tinycomputers.io/posts/review-of-crafting-interpreters-by-robert-nystrom.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/review-of-crafting-interpreters-by-robert-nystrom_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;25 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;h3&gt;Introduction&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/crafting-interpreters/cover-0001.png" alt="Crafting Interpreters by Robert Nystrom, cover art depicting a hand-drawn mountain with paths representing compilation stages from source code to machine code" class="book-cover-image" style="float: right; max-width: 300px; margin: 0 0 1em 1.5em;"&gt;&lt;/p&gt;
&lt;p&gt;There is a particular category of programming book that transcends its subject matter, becoming not just a reference but an experience. &lt;a href="https://baud.rs/crafting-interp"&gt;"Crafting Interpreters" by Robert Nystrom&lt;/a&gt; belongs firmly in this category. Originally published in 2021 after six years of development, the book tackles what many programmers consider one of the most intimidating topics in computer science (building a programming language from scratch) and makes it not just accessible but genuinely enjoyable.&lt;/p&gt;
&lt;p&gt;Nystrom is no stranger to technical writing that connects with practitioners. His earlier book &lt;a href="https://baud.rs/game-dev-patterns"&gt;"Game Programming Patterns"&lt;/a&gt; demonstrated a talent for explaining complex software concepts through clear prose and practical examples. With "Crafting Interpreters," he applies that same skill to language implementation, a domain traditionally guarded by &lt;a href="https://baud.rs/compilers-dragon"&gt;dense academic texts&lt;/a&gt; and &lt;a href="https://baud.rs/backus-naur-form"&gt;formal notation&lt;/a&gt; that sends most working programmers running.&lt;/p&gt;
&lt;p&gt;The book's central premise is both ambitious and elegant: build the same programming language twice. First as a tree-walk interpreter in Java (called &lt;code&gt;jlox&lt;/code&gt;), then as a bytecode virtual machine in C (called &lt;code&gt;clox&lt;/code&gt;). This dual implementation strategy isn't just a structural gimmick. It serves a deep pedagogical purpose, allowing readers to first grasp the conceptual architecture of language processing in a high-level language before rebuilding everything from raw memory and pointer arithmetic. The result is a book that manages to teach compiler theory, language design, software architecture, and low-level systems programming simultaneously.&lt;/p&gt;
&lt;p&gt;The entire text is freely available at &lt;a href="https://baud.rs/crafting-interpreters-site"&gt;craftinginterpreters.com&lt;/a&gt;, which speaks to Nystrom's commitment to making this knowledge widely accessible. The physical edition, published with care and featuring hand-drawn illustrations throughout, is worth owning for anyone who works through the material.&lt;/p&gt;
&lt;h3&gt;The Language: Lox&lt;/h3&gt;
&lt;p&gt;Before diving into implementation, Nystrom introduces Lox, the language both interpreters will execute. Lox is a dynamically typed, garbage-collected language with C-family syntax that supports first-class functions, closures, and class-based object orientation with single inheritance. It is deliberately modest in scope (no arrays, no module system, no standard library to speak of), but this restraint is precisely the point.&lt;/p&gt;
&lt;p&gt;Every feature in Lox exists because it teaches something important about language implementation. Dynamic typing means building a runtime type system. Garbage collection means understanding memory management at the deepest level. Closures require wrestling with variable capture and lifetime semantics. Classes and inheritance demand method resolution and the vtable-like dispatch mechanisms that underpin most object-oriented languages. Lox is small enough to implement in a book but complex enough that implementing it forces the reader to confront every major challenge in language design.&lt;/p&gt;
&lt;p&gt;The choice of a custom language rather than a subset of an existing one is significant. It frees Nystrom from having to explain why certain features are omitted or work differently than readers might expect. Lox is exactly what it needs to be, nothing more.&lt;/p&gt;
&lt;h3&gt;Part II: The Tree-Walk Interpreter (&lt;code&gt;jlox&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;The first implementation spans thirteen chapters and builds a complete interpreter in Java. Nystrom begins where every language implementation must: with scanning. The scanner chapter walks through converting raw source text into tokens, handling string literals, numbers, keywords, and the inevitable edge cases that make lexical analysis more interesting than it first appears.&lt;/p&gt;
&lt;p&gt;From there, the book moves into parsing, where Nystrom introduces recursive descent parsing with a clarity that makes the technique feel almost obvious in hindsight. Rather than reaching for parser generators like &lt;a href="https://baud.rs/yacc-parser"&gt;YACC&lt;/a&gt; or &lt;a href="https://baud.rs/antlr-parser"&gt;ANTLR&lt;/a&gt;, every line of the parser is written by hand. This decision is characteristic of the book's philosophy: no black boxes, no magic, no dependencies. The reader understands every piece because the reader built every piece.&lt;/p&gt;
&lt;p&gt;The chapters on expression evaluation and statement execution establish the runtime model, but the book truly hits its stride in the chapters on scope and environments. Nystrom's explanation of lexical scoping (using a chain of environment objects that form what he calls a "cactus stack") is one of the clearest treatments of this topic in any programming text. The hand-drawn illustration of nested environments, with their parent pointers threading back through enclosing scopes, communicates in a single image what paragraphs of formal specification struggle to convey.&lt;/p&gt;
&lt;p&gt;Functions and closures represent the first major conceptual challenge, and Nystrom handles them with characteristic patience. The problem of captured variables (where a closure must hold onto variables from an enclosing scope that may have already returned) is presented as a puzzle to be solved rather than a rule to be memorized. The resolver pass that performs static analysis to determine variable binding is introduced as a natural response to a concrete bug, not as an abstract compiler phase.&lt;/p&gt;
&lt;p&gt;The object-oriented chapters add classes, methods, constructors, inheritance, and super expressions. By the time &lt;code&gt;jlox&lt;/code&gt; is complete, the reader has built a language implementation capable of running recursive algorithms, managing object hierarchies, and handling the scoping rules that trip up even experienced programmers in production languages.&lt;/p&gt;
&lt;p&gt;What makes this section exceptional is Nystrom's willingness to show the design process, not just the final design. When a naive approach creates a bug or performance problem, the reader sees it happen and participates in fixing it. This iterative development style mirrors how real software is built and teaches debugging intuition alongside language implementation.&lt;/p&gt;
&lt;h3&gt;Part III: The Bytecode Virtual Machine (&lt;code&gt;clox&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;If Part II is the approachable on-ramp, Part III is where the book reveals its true ambition. Across seventeen chapters, Nystrom rebuilds everything in C, this time compiling Lox to bytecode and executing it on a stack-based virtual machine. The motivation is made concrete early: &lt;code&gt;jlox&lt;/code&gt; takes 72 seconds to compute the 40th Fibonacci number recursively, while C can do it in half a second. The bytecode VM will close that gap dramatically.&lt;/p&gt;
&lt;p&gt;The transition from Java to C is itself educational. Readers who have grown comfortable with Java's automatic memory management, dynamic arrays, and hash maps must now implement all of these from scratch. Nystrom builds a dynamic array type, a hash table, and ultimately a mark-sweep garbage collector, all in service of the language implementation. These data structures are not taught in isolation; they emerge because the VM needs them.&lt;/p&gt;
&lt;p&gt;The chunk and instruction design chapters teach the reader to think about data representation at the byte level. Each bytecode instruction is a single byte, followed by operands that encode constants, variable slots, or jump offsets. The disassembler that Nystrom builds alongside the VM is a thoughtful touch, providing a debugging tool that makes the otherwise invisible bytecode tangible.&lt;/p&gt;
&lt;p&gt;The single-pass compiler that replaces &lt;code&gt;jlox&lt;/code&gt;'s separate parsing and resolution phases is a masterclass in practical compiler construction. Nystrom uses &lt;a href="https://baud.rs/parser-techniques"&gt;Pratt parsing&lt;/a&gt; for expressions, a technique he explains with such clarity that this chapter alone has become a widely referenced resource for anyone implementing expression parsers. The Pratt parser's elegant handling of precedence and associativity through a simple table of parsing functions is one of those ideas that, once understood, feels like it should have been obvious all along.&lt;/p&gt;
&lt;p&gt;The chapters on closures in &lt;code&gt;clox&lt;/code&gt; deserve special mention. Where &lt;code&gt;jlox&lt;/code&gt; could lean on Java's garbage collector and object references to capture variables, &lt;code&gt;clox&lt;/code&gt; must solve the "upvalue" problem explicitly. Nystrom introduces the concept of upvalues (runtime objects that represent captured variables) and walks through the mechanism by which stack-allocated locals are "closed over" and moved to the heap when their enclosing function returns. The complexity of this implementation, managed through careful incremental development, demonstrates why closures are considered one of the hardest features to implement correctly in a bytecode VM.&lt;/p&gt;
&lt;p&gt;The garbage collection chapter is the book's peak of systems programming depth. Nystrom implements a mark-sweep collector, explaining reachability, root sets, and the tricolor abstraction. The treatment is practical rather than theoretical; the reader sees exactly when collection triggers, how objects are traced, and why the collector must handle the subtle case of the VM itself allocating memory during collection (which could invalidate pointers being traced). The self-adjusting heap threshold that balances collection frequency against memory usage is a detail that separates a textbook GC from one that works in practice.&lt;/p&gt;
&lt;h3&gt;Writing Style and Presentation&lt;/h3&gt;
&lt;p&gt;Nystrom's prose is the book's secret weapon. Technical writing about compilers tends toward one of two failure modes: impenetrable formalism or hand-waving oversimplification. Nystrom avoids both. His writing is conversational without being sloppy, precise without being dry. Footnotes contain genuine wit. Asides acknowledge the reader's likely confusion at exactly the moments when confusion is most natural.&lt;/p&gt;
&lt;p&gt;The hand-drawn illustrations scattered throughout the book serve a purpose beyond aesthetics. They signal that this is a personal, crafted work rather than a mass-produced textbook. The diagrams of memory layouts, parse trees, and stack states during execution are clearer than their machine-generated equivalents in most compiler texts, partly because they include exactly the detail needed and nothing more.&lt;/p&gt;
&lt;p&gt;The "Design Note" sections that appear between chapters are mini-essays on language design philosophy: why dynamic typing exists, what makes a feature "elegant," how language designers balance expressiveness against implementation complexity. These sections transform the book from a pure implementation guide into something closer to a meditation on programming language design as a creative discipline.&lt;/p&gt;
&lt;h3&gt;Strengths&lt;/h3&gt;
&lt;p&gt;The book's greatest achievement is making compiler construction feel like a natural extension of everyday programming rather than a specialized academic pursuit. By avoiding formal grammars, &lt;a href="https://baud.rs/automata-theory-book"&gt;automata theory&lt;/a&gt;, and the mathematical notation that dominates traditional compiler texts, Nystrom demonstrates that you don't need a PhD to build a working language implementation.&lt;/p&gt;
&lt;p&gt;The dual-implementation approach pays dividends throughout. Concepts that are murky in one implementation become clear in the other. The tree-walk interpreter makes the abstract concepts tangible; the bytecode VM reveals the performance and engineering considerations that production language implementations face. Together, they provide a stereoscopic view of language implementation that neither could achieve alone.&lt;/p&gt;
&lt;p&gt;The no-dependency philosophy deserves praise. There is no lexer generator, no parser generator, no framework, no library. Every line of code in both implementations is written in the book and understood by the reader. This means that upon completion, the reader owns their understanding completely; there is no mysterious tool doing critical work behind the scenes.&lt;/p&gt;
&lt;p&gt;The incremental development style produces a book that is remarkably difficult to get lost in. Each chapter begins with working code and ends with working code. The reader is never more than a few pages from being able to compile and run something. For a topic as complex as language implementation, this steady cadence of progress is essential for maintaining motivation.&lt;/p&gt;
&lt;h3&gt;Limitations&lt;/h3&gt;
&lt;p&gt;The book is not without its shortcomings. The choice of dynamic typing for Lox means that static type systems (one of the most active and important areas of modern language design) receive no coverage. Type inference, generics, algebraic data types, and pattern matching are absent. A reader completing both implementations still would not know how to add a type checker, which is arguably the most practically relevant compiler phase for working programmers today.&lt;/p&gt;
&lt;p&gt;Optimization is largely unexplored. The &lt;code&gt;clox&lt;/code&gt; VM is faster than &lt;code&gt;jlox&lt;/code&gt; by virtue of being a bytecode interpreter written in C, but Nystrom does not cover constant folding, dead code elimination, register allocation, or any of the optimization passes that distinguish a teaching compiler from a production one. JIT compilation, increasingly the standard for high-performance language runtimes, is mentioned only in passing.&lt;/p&gt;
&lt;p&gt;The error handling and recovery throughout both implementations is minimal. Production parsers need sophisticated error recovery to provide useful diagnostics. Nystrom acknowledges this gap but does not address it, leaving readers who want to build user-facing tools with significant work ahead of them.&lt;/p&gt;
&lt;p&gt;Lox's deliberate simplicity means that several common language features (arrays, iterators, modules, pattern matching, exception handling) are left as exercises. While this keeps the book focused, it means that readers must figure out on their own how to implement the features that most real languages require. The gap between Lox and a practical language is significant.&lt;/p&gt;
&lt;h3&gt;Who Should Read This Book&lt;/h3&gt;
&lt;p&gt;"Crafting Interpreters" is ideal for working programmers who have always been curious about how languages work but have been intimidated by the traditional compiler literature. Comfortable familiarity with Java and C is assumed; this is not a book for learning either language. But the reader need not have any prior knowledge of compilers, formal languages, or automata theory.&lt;/p&gt;
&lt;p&gt;Computer science students will find it an excellent companion to a formal compilers course, providing the practical intuition that textbooks like Aho's "Dragon Book" deliberately omit. Conversely, self-taught programmers who never took a compilers course will find this book fills a significant gap in their education.&lt;/p&gt;
&lt;p&gt;Language enthusiasts who have tinkered with toy interpreters but never built anything with closures, classes, or garbage collection will find exactly the guidance they need to level up. And anyone who simply enjoys beautifully crafted technical writing will find the book rewarding even as a pure reading experience.&lt;/p&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;"Crafting Interpreters" is one of the best programming books published in recent years. It takes a subject that most programmers consider forbiddingly complex and renders it not just comprehensible but engaging. Nystrom's combination of clear writing, thoughtful pedagogy, practical focus, and genuine craft produces a book that teaches far more than its nominal subject. Beyond scanning, parsing, and code generation, the reader learns how to approach complex software design, how to build systems incrementally, and how to think about the tools they use every day at a deeper level.&lt;/p&gt;
&lt;p&gt;The book will not make you a compiler engineer. It will not teach you how to build a production language runtime, optimize generated code, or implement a sophisticated type system. What it will do is demystify the machinery that powers every programming language you have ever used, and give you the confidence and foundation to explore further. For most programmers, that is more than enough. It is, in fact, exactly what was needed.&lt;/p&gt;</description><category>bytecode</category><category>c</category><category>compilers</category><category>garbage collection</category><category>interpreters</category><category>java</category><category>language design</category><category>parsing</category><category>programming languages</category><category>robert nystrom</category><category>virtual machines</category><guid>https://tinycomputers.io/posts/review-of-crafting-interpreters-by-robert-nystrom.html</guid><pubDate>Thu, 19 Feb 2026 16:30:00 GMT</pubDate></item><item><title>From Tree-Walker to Bytecode VM: Compiling Lattice</title><link>https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/from-tree-walker-to-bytecode-vm-compiling-lattice_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;16 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Lattice is a programming language built around a &lt;a href="https://tinycomputers.io/posts/introducing-lattice-a-crystallization-based-programming-language.html"&gt;crystallization-based phase system&lt;/a&gt;: values start as mutable "flux" and can be frozen into immutable "fix," with the runtime enforcing the transition and providing &lt;a href="https://tinycomputers.io/posts/mutability-as-a-first-class-concept-the-lattice-phase-system.html"&gt;reactions, bonds, contracts, and temporal tracking&lt;/a&gt; around it. It's implemented in C with no external dependencies.&lt;/p&gt;
&lt;p&gt;When I started building &lt;a href="https://baud.rs/q5yFwI"&gt;Lattice&lt;/a&gt;, a tree-walking interpreter was the obvious first move. You parse source into an AST, walk the nodes recursively, and evaluate as you go. It's straightforward, easy to debug, and lets you iterate on language semantics quickly without worrying about a second representation. &lt;a href="https://baud.rs/crafting-interpreters"&gt;&lt;em&gt;Crafting Interpreters&lt;/em&gt;&lt;/a&gt; calls this approach "the simplest way to build an interpreter," and it's right.&lt;/p&gt;
&lt;p&gt;But tree-walkers have well-known limitations. Every expression evaluation descends through function calls: &lt;code&gt;eval_expr&lt;/code&gt; calling &lt;code&gt;eval_binary&lt;/code&gt; calling &lt;code&gt;eval_expr&lt;/code&gt; twice more. The overhead compounds. You're chasing pointers through heap-allocated AST nodes with poor cache locality. And the call stack of the host language (C, in Lattice's case) becomes tangled with the call stack of the guest language, making it harder to implement features like error recovery and coroutines cleanly.&lt;/p&gt;
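&lt;p&gt;To see the shape of that overhead, here is a minimal tree-walker for addition alone; the node layout and names are illustrative, not Lattice's actual evaluator:&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

// A toy AST: every binary node is a heap-style struct the evaluator
// chases pointers through, and every evaluation is a host-language call.
typedef enum { EX_NUM, EX_ADD } ExprTag;

typedef struct Expr {
    ExprTag tag;
    double  num;             // valid when tag == EX_NUM
    struct Expr *lhs, *rhs;  // valid when tag == EX_ADD
} Expr;

static double eval_expr(const Expr *e) {
    switch (e->tag) {
    case EX_NUM: return e->num;
    case EX_ADD: return eval_expr(e->lhs) + eval_expr(e->rhs);  // two more calls
    }
    return 0.0;
}
```

&lt;p&gt;Evaluating &lt;code&gt;2 + 3&lt;/code&gt; already costs three function calls and two pointer dereferences; real programs compound that on every expression.&lt;/p&gt;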
&lt;p&gt;Lattice v0.3.0 shipped a bytecode compiler and stack-based virtual machine alongside the tree-walker. In v0.3.1, the bytecode VM became the default for file execution, the interactive REPL, and the browser-based playground. The tree-walker is still available via &lt;code&gt;--tree-walk&lt;/code&gt;, but the VM now handles everything. This post walks through the architecture of that VM, some design decisions that turned out to matter, and a mutation bug that only surfaces when you combine deep-clone-on-read semantics with in-place method dispatch.&lt;/p&gt;
&lt;h3&gt;Architecture Overview&lt;/h3&gt;
&lt;p&gt;The bytecode pipeline has three stages: lexing and parsing (shared with the tree-walker), compilation from AST to bytecode chunks, and execution on a stack-based VM. The compiler and VM together add about 8,200 lines of C to the codebase, bringing the total to around 33,000 lines.&lt;/p&gt;
&lt;p&gt;A &lt;code&gt;Chunk&lt;/code&gt; is the compilation unit: a dynamic array of bytecode instructions, a constant pool, and debug metadata mapping instructions back to source line numbers. The compiler walks the AST and emits bytes into a chunk. The VM reads bytes from the chunk and executes them against a value stack.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;typedef&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;// bytecode array&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;cap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="c1"&gt;// constant pool&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;const_len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;const_cap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;// source line per instruction&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;local_names&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;// slot → variable name (debug)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;local_name_cap&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Chunk&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The VM itself is a &lt;code&gt;for(;;)&lt;/code&gt; loop with a &lt;code&gt;switch&lt;/code&gt; on the current opcode byte, the textbook approach. No computed gotos, no threaded dispatch, no JIT. Just a switch. On modern hardware with branch prediction, a well-organized switch over 62 opcodes is fast enough that the overhead is negligible compared to the cost of actual operations (string allocation, hash table lookups, deep cloning).&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(;;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;switch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;op&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;OP_CONSTANT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;OP_ADD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;OP_CALL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;     &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// 59 more cases&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The value stack holds 4,096 slots. The call frame stack holds 256 frames. Each &lt;code&gt;CallFrame&lt;/code&gt; tracks its own instruction pointer, a base pointer into the value stack for its local variables, and an array of captured upvalues for closures. When you call a function, the VM pushes a new frame pointing at the callee's chunk. When the function returns, the frame pops and execution resumes in the caller.&lt;/p&gt;
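&lt;p&gt;A frame structure along those lines might look like this; the field names are my reconstruction from the description above, not the actual Lattice source:&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct LatValue   LatValue;    // opaque here
typedef struct ObjUpvalue ObjUpvalue;  // opaque here
typedef struct Chunk      Chunk;       // opaque here

#define STACK_MAX  4096   // value stack slots
#define FRAMES_MAX  256   // maximum call depth

// Hypothetical CallFrame: each frame carries its own instruction
// pointer, a base into the shared value stack, and captured upvalues.
typedef struct {
    Chunk       *chunk;     // code this frame is executing
    uint8_t     *ip;        // per-frame instruction pointer
    LatValue    *slots;     // base pointer for this frame's locals
    ObjUpvalue **upvalues;  // captures for the running closure
    size_t       upvalue_count;
} CallFrame;
```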
&lt;h3&gt;The Instruction Set&lt;/h3&gt;
&lt;p&gt;Lattice's instruction set has 62 opcodes. Some are standard (&lt;code&gt;OP_ADD&lt;/code&gt;, &lt;code&gt;OP_JUMP_IF_FALSE&lt;/code&gt;, &lt;code&gt;OP_RETURN&lt;/code&gt;). Others exist because of Lattice-specific semantics.&lt;/p&gt;
&lt;p&gt;The phase system needs dedicated opcodes. &lt;code&gt;OP_FREEZE&lt;/code&gt; pops a value, deep-clones it into a crystal region with &lt;code&gt;VTAG_CRYSTAL&lt;/code&gt; tags, and pushes the frozen result. &lt;code&gt;OP_THAW&lt;/code&gt; does the reverse. &lt;code&gt;OP_MARK_FLUID&lt;/code&gt; sets the phase tag to &lt;code&gt;VTAG_FLUID&lt;/code&gt;; this is what &lt;code&gt;flux&lt;/code&gt; bindings emit after their initializer. &lt;code&gt;OP_FREEZE_VAR&lt;/code&gt; and &lt;code&gt;OP_THAW_VAR&lt;/code&gt; handle the case where &lt;code&gt;freeze(x)&lt;/code&gt; targets a named variable and needs to write back the result, carrying extra operands to identify the variable's location (local slot, upvalue, or global name).&lt;/p&gt;
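&lt;p&gt;Conceptually, the freeze path is "clone, retag, push." A toy version, with a stand-in value layout rather than Lattice's real tagged union:&lt;/p&gt;

```c
#include <assert.h>

// Illustrative phase tags; the names follow the post, the layout is assumed.
typedef enum { VTAG_FLUID, VTAG_CRYSTAL } PhaseTag;

typedef struct {
    PhaseTag phase;
    int      payload;  // stand-in for the real value union
} Value;

// What an OP_FREEZE handler conceptually does with the popped value:
// clone it, retag the clone as crystal, and push the clone back.
static Value freeze_value(Value v) {
    Value frozen = v;             // the real VM deep-clones here
    frozen.phase = VTAG_CRYSTAL;
    return frozen;
}
```

&lt;p&gt;The original value keeps its phase; only the pushed result is crystal, which is what lets &lt;code&gt;OP_FREEZE_VAR&lt;/code&gt; exist separately to write the result back to a variable.&lt;/p&gt;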
&lt;p&gt;Phase reactions and bonds each have their own opcodes: &lt;code&gt;OP_REACT&lt;/code&gt;, &lt;code&gt;OP_UNREACT&lt;/code&gt;, &lt;code&gt;OP_BOND&lt;/code&gt;, &lt;code&gt;OP_UNBOND&lt;/code&gt;, &lt;code&gt;OP_SEED&lt;/code&gt;, &lt;code&gt;OP_UNSEED&lt;/code&gt;. These could theoretically be implemented as native function calls, but making them opcodes lets the compiler emit the variable name as a constant operand. The VM needs that name to look up the correct reaction/bond registration in its tracking tables, and encoding it in the bytecode avoids a runtime string lookup.&lt;/p&gt;
&lt;p&gt;Structured concurrency uses an interesting hybrid. &lt;code&gt;OP_SCOPE&lt;/code&gt; and &lt;code&gt;OP_SELECT&lt;/code&gt; each carry a constant-pool index that stores a pointer to the original AST &lt;code&gt;Expr*&lt;/code&gt; node. When the VM hits one of these opcodes, it invokes the tree-walking evaluator on that subtree. This is a deliberate design choice; the concurrency primitives involve spawning threads and managing channels, which requires the evaluator's full environment machinery. Rather than reimplement all of that in the VM, the bytecode compiler punts to the tree-walker for these specific constructs. The rest of the program runs on the VM; only &lt;code&gt;scope&lt;/code&gt; and &lt;code&gt;select&lt;/code&gt; blocks briefly drop into interpretation.&lt;/p&gt;
&lt;h3&gt;Closures and Upvalues&lt;/h3&gt;
&lt;p&gt;Closures are where bytecode VMs get interesting, and Lattice follows the upvalue model that Lua pioneered and Crafting Interpreters popularized.&lt;/p&gt;
&lt;p&gt;When a function is defined inside another function and references variables from the enclosing scope, those variables need to outlive their original stack frame. The solution is upvalues, indirection objects that start pointing into the stack and get "closed over" when the variable goes out of scope.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;typedef&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;ObjUpvalue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// points to stack slot or &amp;amp;closed&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;closed&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="c1"&gt;// holds value after scope exit&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;ObjUpvalue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;next&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;// linked list for open upvalues&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ObjUpvalue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;While the enclosing function is still executing, &lt;code&gt;location&lt;/code&gt; points directly at the stack slot. When the enclosing function returns, &lt;code&gt;OP_CLOSE_UPVALUE&lt;/code&gt; copies the stack value into the &lt;code&gt;closed&lt;/code&gt; field and repoints &lt;code&gt;location&lt;/code&gt; to &lt;code&gt;&amp;amp;closed&lt;/code&gt;. The closure doesn't know or care which state the upvalue is in; it always dereferences &lt;code&gt;location&lt;/code&gt;. This is why upvalues work: they're a level of indirection that transparently survives stack frame destruction.&lt;/p&gt;
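&lt;p&gt;The close operation can be sketched as a walk over the open-upvalue list, copying each stack value out and repointing the indirection. This is a self-contained approximation; the names and list layout are assumptions, not Lattice's actual code:&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

typedef int LatValue;  // stand-in for the real tagged value union

typedef struct ObjUpvalue {
    LatValue *location;       // stack slot while open, &closed after
    LatValue  closed;
    struct ObjUpvalue *next;  // list of open upvalues, sorted by slot
} ObjUpvalue;

// Close every open upvalue at or above `last` on the value stack:
// copy the value off the stack and repoint the indirection at it.
static void close_upvalues(ObjUpvalue **open_list, LatValue *last) {
    while (*open_list != NULL && (*open_list)->location >= last) {
        ObjUpvalue *up = *open_list;
        up->closed   = *up->location;  // value now lives in the upvalue
        up->location = &up->closed;    // dereferences keep working
        *open_list   = up->next;       // no longer "open"
    }
}
```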
&lt;p&gt;The compiler resolves variable references in three stages: first it checks local scope (&lt;code&gt;resolve_local&lt;/code&gt;), then upvalues (&lt;code&gt;resolve_upvalue&lt;/code&gt;, which walks the compiler chain recursively), then falls back to globals via &lt;code&gt;OP_GET_GLOBAL&lt;/code&gt;. The &lt;code&gt;OP_CLOSURE&lt;/code&gt; instruction is followed by a series of &lt;code&gt;(is_local, index)&lt;/code&gt; byte pairs, one per upvalue, telling the VM whether to capture from the current frame's stack or from the parent frame's upvalue array.&lt;/p&gt;
&lt;p&gt;A concrete example makes this clearer. Consider a counter factory:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;fn make_counter() {
    flux count = 0
    return |n| { count += n; count }
}

let c = make_counter()
print(c(5))   // 5
print(c(3))   // 8
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When &lt;code&gt;make_counter&lt;/code&gt; returns, its stack frame is destroyed, but &lt;code&gt;count&lt;/code&gt; needs to survive, because the returned closure references it. During compilation, the compiler sees that the closure's body references &lt;code&gt;count&lt;/code&gt;, which is local to the enclosing &lt;code&gt;make_counter&lt;/code&gt;. It emits an &lt;code&gt;(is_local=true, index=1)&lt;/code&gt; upvalue descriptor. At runtime, &lt;code&gt;OP_CLOSURE&lt;/code&gt; calls &lt;code&gt;capture_upvalue()&lt;/code&gt;, which either reuses an existing &lt;code&gt;ObjUpvalue&lt;/code&gt; pointing at that stack slot or creates a new one. When &lt;code&gt;make_counter&lt;/code&gt; returns, &lt;code&gt;OP_CLOSE_UPVALUE&lt;/code&gt; copies the stack value of &lt;code&gt;count&lt;/code&gt; into the upvalue's &lt;code&gt;closed&lt;/code&gt; field and repoints &lt;code&gt;location&lt;/code&gt;. The closure keeps working, oblivious to the frame being gone.&lt;/p&gt;
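&lt;p&gt;A plausible shape for &lt;code&gt;capture_upvalue()&lt;/code&gt;, sketched from the description above: walk the sorted open-upvalue list and reuse an entry if one already points at the slot, so two closures capturing the same variable share state. Details beyond that reuse behavior are assumptions:&lt;/p&gt;

```c
#include <assert.h>
#include <stdlib.h>

typedef int LatValue;  // stand-in for the real tagged value union

typedef struct ObjUpvalue {
    LatValue *location;
    LatValue  closed;
    struct ObjUpvalue *next;  // open-upvalue list, sorted by address
} ObjUpvalue;

// Reuse an open upvalue for `slot` if one exists, otherwise insert a
// new one, keeping the list sorted by stack address (highest first).
static ObjUpvalue *capture_upvalue(ObjUpvalue **open_list, LatValue *slot) {
    ObjUpvalue *prev = NULL, *up = *open_list;
    while (up != NULL && up->location > slot) {
        prev = up;
        up   = up->next;
    }
    if (up != NULL && up->location == slot)
        return up;  // two closures capturing the same variable share it

    ObjUpvalue *created = malloc(sizeof *created);
    created->location = slot;
    created->closed   = 0;
    created->next     = up;
    if (prev != NULL) prev->next = created;
    else              *open_list = created;
    return created;
}
```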
&lt;p&gt;One implementation detail worth noting: Lattice stores the upvalue array by repurposing the closure's &lt;code&gt;captured_env&lt;/code&gt; field (normally an &lt;code&gt;Env*&lt;/code&gt; in the tree-walker) and the upvalue count in the &lt;code&gt;region_id&lt;/code&gt; field. This avoids adding new fields to the &lt;code&gt;LatValue&lt;/code&gt; union, which matters when values are deep-cloned frequently, since every field adds to the clone cost.&lt;/p&gt;
&lt;h3&gt;Compiling for the REPL&lt;/h3&gt;
&lt;p&gt;A REPL that runs on a bytecode VM needs to compile input differently from a file run. The difference is small but important.&lt;/p&gt;
&lt;p&gt;In file mode, &lt;code&gt;compile_module()&lt;/code&gt; compiles a complete program and terminates with &lt;code&gt;OP_UNIT; OP_RETURN&lt;/code&gt;: the module returns unit, and any expression results along the way are discarded with &lt;code&gt;OP_POP&lt;/code&gt;. This is the right behavior for scripts: you don't want every intermediate expression to accumulate on the stack.&lt;/p&gt;
&lt;p&gt;In REPL mode, &lt;code&gt;compile_repl()&lt;/code&gt; needs the opposite behavior for the last expression. When you type &lt;code&gt;42&lt;/code&gt; at the REPL prompt, you want to see &lt;code&gt;=&amp;gt; 42&lt;/code&gt;. So if the last item in the compiled chunk is a bare expression statement, &lt;code&gt;compile_repl()&lt;/code&gt; compiles the expression but &lt;em&gt;skips the &lt;code&gt;OP_POP&lt;/code&gt;&lt;/em&gt;, leaving the value on the stack. Then it emits &lt;code&gt;OP_RETURN&lt;/code&gt;, and the VM receives the value as the chunk's return value.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;last_is_expr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;item_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;item_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ITEM_STMT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;prog&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;item_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;STMT_EXPR&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_is_expr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;emit_byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OP_RETURN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="c1"&gt;// value already on stack&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;emit_byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OP_UNIT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;// no expression — return unit&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;emit_byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OP_RETURN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For function definitions, struct declarations, and enum definitions, the result is unit, and the REPL silently suppresses the &lt;code&gt;=&amp;gt;&lt;/code&gt; output. This matches user expectations: defining a function shouldn't print anything. The effect in practice:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;
&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"hello"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;" world"&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"hello world"&lt;/span&gt;
&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;flux&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;
&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;square&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;lattice&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;square&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;49&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each line is independently compiled and executed on the persistent VM. Globals defined in one line (&lt;code&gt;flux x = 10&lt;/code&gt;) are visible in subsequent lines because they're stored in the VM's environment, which persists across iterations. The &lt;code&gt;Chunk&lt;/code&gt; for each line is freed after execution; constants that matter (like global variable values) have already been deep-cloned into the environment.&lt;/p&gt;
&lt;p&gt;The other critical difference is enum persistence. &lt;code&gt;compile_module()&lt;/code&gt; frees its known-enum registry after compilation, because the compiler is done. &lt;code&gt;compile_repl()&lt;/code&gt; must not, because enums defined in REPL iteration N need to be visible in iteration N+1. The REPL calls &lt;code&gt;compiler_free_known_enums()&lt;/code&gt; only on exit. The same lifetime concern applies to parsed programs; struct and function declarations store &lt;code&gt;Expr*&lt;/code&gt; pointers that compiled chunks reference at runtime. The REPL accumulates all parsed programs in a dynamic array and frees them only when the session ends.&lt;/p&gt;
&lt;h3&gt;The Global Mutation Bug&lt;/h3&gt;
&lt;p&gt;This is the story I find most instructive, because it reveals a subtle interaction between two independently reasonable design decisions.&lt;/p&gt;
&lt;p&gt;Lattice has &lt;strong&gt;deep-clone-on-read&lt;/strong&gt; semantics. When you access a variable, the environment doesn't hand you a reference to the stored value; it hands you a fresh deep clone. This eliminates aliasing entirely: two variables never share underlying memory, passing a map to a function gives the function its own copy, and there's no way to create spooky action at a distance through shared mutable state.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;env_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Env&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;lat_map_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;scopes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="mi"&gt;-1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;value_deep_clone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// always a fresh copy&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is expensive but correct. It gives Lattice pure value semantics without needing a borrow checker or persistent data structures.&lt;/p&gt;
&lt;p&gt;The tree-walking evaluator handles in-place mutation (like &lt;code&gt;array.push()&lt;/code&gt;) with a separate &lt;code&gt;resolve_lvalue()&lt;/code&gt; mechanism that obtains a direct mutable pointer into the environment's storage, bypassing the deep clone. Push, pop, index assignment: these all go through &lt;code&gt;resolve_lvalue&lt;/code&gt; and mutate the stored value directly.&lt;/p&gt;
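&lt;p&gt;The two access paths can be contrasted with a toy environment holding one array-valued variable. Everything here is illustrative rather than the real Lattice API, but it shows why reads can never alias while lvalue-style writes hit storage directly:&lt;/p&gt;

```c
#include <assert.h>
#include <stddef.h>

// Toy environment with a single stored array; struct assignment plays
// the role of value_deep_clone(). Names are hypothetical.
typedef struct { int items[8]; size_t len; } ToyArray;
typedef struct { ToyArray stored; } ToyEnv;

// Read path (env_get-style): the caller gets an independent copy.
static ToyArray toy_get(const ToyEnv *env) {
    return env->stored;  // copied out, so mutation can't leak back
}

// Lvalue path (resolve_lvalue-style): a pointer into storage itself.
static ToyArray *toy_resolve_lvalue(ToyEnv *env) {
    return &env->stored;
}
```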
&lt;p&gt;The bytecode VM needed the same distinction. For local variables, this is straightforward: locals live on the value stack, and the VM has a direct pointer to them via &lt;code&gt;frame-&amp;gt;slots[slot]&lt;/code&gt;. I added &lt;code&gt;OP_INVOKE_LOCAL&lt;/code&gt;, which takes a stack slot index as an operand and passes a pointer to &lt;code&gt;vm_invoke_builtin()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;OP_INVOKE_LOCAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;slot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;method_idx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;arg_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;method_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;slots&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;slot&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// direct pointer&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm_invoke_builtin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;arg_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;local_var_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// builtin mutated obj in-place — mutation persists&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... fall through to closure/method dispatch&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When &lt;code&gt;.push()&lt;/code&gt; grows the array by reallocating &lt;code&gt;obj-&amp;gt;as.array.elems&lt;/code&gt; and incrementing &lt;code&gt;obj-&amp;gt;as.array.len&lt;/code&gt;, it's directly modifying the stack slot. The mutation persists because &lt;code&gt;obj&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; the variable.&lt;/p&gt;
&lt;p&gt;For globals, the situation is different. Globals live in the environment (a scope-chain of hash maps), and &lt;code&gt;env_get()&lt;/code&gt; deep-clones. The generic &lt;code&gt;OP_INVOKE&lt;/code&gt; opcode works by evaluating the receiver expression onto the stack (which, for a global variable, means emitting &lt;code&gt;OP_GET_GLOBAL&lt;/code&gt;, which calls &lt;code&gt;env_get()&lt;/code&gt;, which deep-clones) and then dispatching the method on the cloned value. After the builtin mutates the clone, &lt;code&gt;OP_INVOKE&lt;/code&gt; pops and &lt;em&gt;frees&lt;/em&gt; it. The mutation vanishes.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;flux nums = [1, 2, 3]
nums.push(4)
print(nums)  // still [1, 2, 3] — the push mutated a clone
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This is the kind of bug that's obvious in retrospect but invisible when you're implementing things one piece at a time. &lt;code&gt;env_get()&lt;/code&gt; deep-cloning is correct. &lt;code&gt;OP_INVOKE&lt;/code&gt; popping the receiver after dispatch is correct. Each piece behaves correctly in isolation. The bug emerges from their composition.&lt;/p&gt;
&lt;p&gt;The fix is &lt;code&gt;OP_INVOKE_GLOBAL&lt;/code&gt;, a new opcode that knows the receiver is a global variable and writes back after mutation:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;OP_INVOKE_GLOBAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;name_idx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;method_idx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;uint8_t&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;arg_count&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;READ_BYTE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;global_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;name_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;frame&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;method_idx&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;LatValue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;obj_val&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;env_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;global_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;obj_val&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;VM_ERROR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"undefined variable '%s'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;global_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm_invoke_builtin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;obj_val&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;method_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;arg_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;global_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cm"&gt;/* handle error */&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;// Write back the mutated clone to the environment&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;env_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vm&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;global_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;obj_val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... fall through for non-builtin methods&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The compiler emits &lt;code&gt;OP_INVOKE_GLOBAL&lt;/code&gt; when it sees a method call on an identifier that isn't a local variable or an upvalue:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;EXPR_METHOD_CALL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;method_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;EXPR_IDENT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;slot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;resolve_local&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;method_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str_val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="c1"&gt;// ... emit OP_INVOKE_LOCAL&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;upvalue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;resolve_upvalue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;method_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;object&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;str_val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;upvalue&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="c1"&gt;// Not local, not upvalue — must be global&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="c1"&gt;// ... emit OP_INVOKE_GLOBAL&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;// ... fall through to generic OP_INVOKE&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This gives us three tiers of method dispatch: &lt;code&gt;OP_INVOKE_LOCAL&lt;/code&gt; for locals (direct pointer, no clone), &lt;code&gt;OP_INVOKE_GLOBAL&lt;/code&gt; for globals (clone + write-back), and &lt;code&gt;OP_INVOKE&lt;/code&gt; for everything else (computed receivers like &lt;code&gt;get_array().push(x)&lt;/code&gt;, where there's nothing to write back to). With the fix:&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;flux nums = [1, 2, 3]
nums.push(4)
nums.push(5)
print(nums)  // [1, 2, 3, 4, 5] — mutations persist
&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;All mutating builtins (&lt;code&gt;push&lt;/code&gt;, &lt;code&gt;pop&lt;/code&gt;, &lt;code&gt;set&lt;/code&gt;, &lt;code&gt;remove&lt;/code&gt;, &lt;code&gt;insert&lt;/code&gt;, &lt;code&gt;remove_at&lt;/code&gt;) now work correctly on global variables. The same pattern applies to maps, sets, and any other type with in-place methods.&lt;/p&gt;
&lt;p&gt;The broader lesson is that deep-clone-on-read semantics create an impedance mismatch with in-place mutation. In a reference-based language, &lt;code&gt;obj.push(x)&lt;/code&gt; just works; &lt;code&gt;obj&lt;/code&gt; is a reference, and the mutation happens wherever the reference points. In a value-based language, you need to explicitly handle the write-back for every level of variable storage. The tree-walker's &lt;code&gt;resolve_lvalue&lt;/code&gt; is one solution. The VM's tiered invoke opcodes are another. Both exist because of the same underlying tension.&lt;/p&gt;
&lt;h3&gt;The WASM Playground&lt;/h3&gt;
&lt;p&gt;Lattice's browser-based &lt;a href="https://baud.rs/odS816"&gt;playground&lt;/a&gt; compiles the entire VM to WebAssembly via Emscripten. The WASM API exposes four functions: &lt;code&gt;lat_init()&lt;/code&gt;, &lt;code&gt;lat_run_line()&lt;/code&gt;, &lt;code&gt;lat_is_complete()&lt;/code&gt;, and &lt;code&gt;lat_destroy()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The playground runs the same bytecode VM as the native binary. Each line of input goes through the same pipeline: lex, parse, &lt;code&gt;compile_repl()&lt;/code&gt;, &lt;code&gt;vm_run()&lt;/code&gt;. The &lt;code&gt;lat_is_complete()&lt;/code&gt; function checks bracket depth to determine whether the user is mid-expression, enabling multi-line input by waiting for balanced braces before compiling.&lt;/p&gt;
&lt;p&gt;Previously the playground used the tree-walking evaluator, which meant code could behave differently in the browser than on the command line. Switching the WASM build to the bytecode VM eliminates that inconsistency; the playground, the REPL, and file execution all use the same compilation and execution path.&lt;/p&gt;
&lt;h3&gt;What Didn't Change&lt;/h3&gt;
&lt;p&gt;It's worth noting what the bytecode VM &lt;em&gt;doesn't&lt;/em&gt; change about Lattice.&lt;/p&gt;
&lt;p&gt;The value representation is identical. A &lt;code&gt;LatValue&lt;/code&gt; is still a tagged union with a type tag, phase tag, and payload. Phase transitions still deep-clone data across heap regions. The dual-heap architecture (mark-and-sweep for fluid data, arena-based regions for crystal data) is unchanged. Global variables still live in a scope-chain environment.&lt;/p&gt;
&lt;p&gt;The parser and AST are completely shared. The compiler reads the same &lt;code&gt;Program&lt;/code&gt; structure that the tree-walker reads. A single set of test programs validates both execution paths, and all 771 tests pass on both.&lt;/p&gt;
&lt;p&gt;The phase system compiles one-to-one. &lt;code&gt;freeze()&lt;/code&gt; becomes &lt;code&gt;OP_FREEZE&lt;/code&gt;. &lt;code&gt;thaw()&lt;/code&gt; becomes &lt;code&gt;OP_THAW&lt;/code&gt;. Bonds, reactions, seeds, pressure constraints: each has a corresponding opcode that does exactly what the tree-walker's evaluator function did, just driven by bytecode dispatch instead of recursive AST traversal.&lt;/p&gt;
&lt;h3&gt;Performance&lt;/h3&gt;
&lt;p&gt;I haven't done rigorous benchmarking, and I'm deliberately not making performance claims. The motivation for the bytecode VM wasn't speed; it was consistency (one execution path everywhere) and architectural cleanliness (the VM is easier to extend than the tree-walker's deeply nested switch statements).&lt;/p&gt;
&lt;p&gt;That said, bytecode VMs are generally faster than tree-walkers for the structural reasons mentioned earlier: better cache locality (sequential byte array vs. pointer-chasing through AST nodes), less call overhead (one switch dispatch vs. recursive function calls), and a compact representation that fits more of the program in cache. Whether this matters for Lattice programs depends on the workload. For a language whose core runtime cost is dominated by deep cloning, the dispatch overhead is rarely the bottleneck.&lt;/p&gt;
&lt;h3&gt;Looking Forward&lt;/h3&gt;
&lt;p&gt;The VM is feature-complete but not optimized. There's no constant folding, no dead code elimination, no register allocation (it's a pure stack machine). The &lt;code&gt;OP_SCOPE&lt;/code&gt; and &lt;code&gt;OP_SELECT&lt;/code&gt; concurrency opcodes still delegate to the tree-walker. The dispatch loop is a plain switch rather than computed gotos.&lt;/p&gt;
&lt;p&gt;These are all well-understood optimizations with clear implementation paths. The point of v0.3.1 is that the bytecode VM is now the default, passes all tests, and handles the full language surface including the phase system. Optimization is a separate project.&lt;/p&gt;
&lt;p&gt;The source code is at &lt;a href="https://baud.rs/fIe3gx"&gt;github.com/ajokela/lattice&lt;/a&gt;, and you can try it in the browser at &lt;a href="https://baud.rs/bwvnYT"&gt;lattice-lang.web.app&lt;/a&gt;. The bytecode VM, compiler, REPL, and all 100 opcodes are in four files: &lt;code&gt;compiler.c&lt;/code&gt;, &lt;code&gt;vm.c&lt;/code&gt;, &lt;code&gt;chunk.c&lt;/code&gt;, and &lt;code&gt;opcode.c&lt;/code&gt;.&lt;/p&gt;
&lt;div class="code"&gt;&lt;pre class="code literal-block"&gt;git clone https://github.com/ajokela/lattice.git
cd lattice &amp;amp;&amp;amp; make
./clat
&lt;/pre&gt;&lt;/div&gt;

&lt;h3&gt;Recommended Resources&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://baud.rs/crafting-interpreters"&gt;&lt;em&gt;Crafting Interpreters&lt;/em&gt;&lt;/a&gt; by Robert Nystrom - The definitive guide to building interpreters and bytecode VMs, and a major influence on Lattice's upvalue implementation&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/P6ofTE"&gt;&lt;em&gt;Writing A Compiler In Go&lt;/em&gt;&lt;/a&gt; by Thorsten Ball - Practical companion covering bytecode compilation and stack-based VMs&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/BSTqlt"&gt;&lt;em&gt;Engineering a Compiler&lt;/em&gt;&lt;/a&gt; by Cooper &amp;amp; Torczon - Comprehensive treatment of compiler internals from front-end to optimization&lt;/li&gt;
&lt;li&gt;&lt;a href="https://baud.rs/JhMFPU"&gt;&lt;em&gt;Compilers: Principles, Techniques, and Tools&lt;/em&gt;&lt;/a&gt; by Aho, Lam, Sethi, Ullman - The classic &lt;em&gt;Dragon Book&lt;/em&gt; covering parsing, code generation, and optimization theory&lt;/li&gt;
&lt;/ul&gt;</description><category>bytecode</category><category>c</category><category>compilers</category><category>interpreters</category><category>language design</category><category>lattice</category><category>programming languages</category><category>virtual machine</category><guid>https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html</guid><pubDate>Tue, 17 Feb 2026 18:00:00 GMT</pubDate></item></channel></rss>