<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>TinyComputers.io (Posts about semiconductors)</title><link>https://tinycomputers.io/</link><description></description><atom:link href="https://tinycomputers.io/categories/semiconductors.xml" rel="self" type="application/rss+xml"></atom:link><language>en</language><copyright>Contents © 2026 A.C. Jokela 
&lt;a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"&gt;&lt;img alt="" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/80x15.png" /&gt; Creative Commons Attribution-ShareAlike&lt;/a&gt;&amp;nbsp;|&amp;nbsp;
</copyright><lastBuildDate>Mon, 06 Apr 2026 22:12:57 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Investing in the Jevons Expansion</title><link>https://tinycomputers.io/posts/investing-in-the-jevons-expansion.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/investing-in-the-jevons-expansion_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;16 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;This is the sixth piece in a series applying the Jevons Paradox framework to AI economics. The prior five built the theoretical case:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://tinycomputers.io/posts/the-paradox-of-cheap-compute.html"&gt;The Paradox of Cheap Compute&lt;/a&gt; established the historical pattern: every time the cost of compute fell by an order of magnitude, total consumption expanded far beyond the efficiency gain.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tinycomputers.io/posts/the-jevons-counter-thesis-why-ai-displacement-scenarios-underweight-demand-expansion.html"&gt;The Jevons Counter-Thesis&lt;/a&gt; argued that AI displacement models systematically undercount the demand expansion that follows when cognitive labor gets cheaper.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tinycomputers.io/posts/moores-law-for-intelligence-what-happens-when-thinking-gets-cheap.html"&gt;Moore's Law for Intelligence&lt;/a&gt; mapped the inference cost curve and showed it mirrors early Moore's Law.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tinycomputers.io/posts/something-big-is-happening-a-critique.html"&gt;Something Big Is Happening, And Something Big Is Missing&lt;/a&gt; applied the framework to a specific displacement scenario and showed where the analysis breaks down.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tinycomputers.io/posts/the-ai-vampire-is-jevons-paradox.html"&gt;The AI Vampire Is Jevons Paradox&lt;/a&gt; identified the binding constraint: human judgment doesn't scale the way compute does.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This piece asks the practical question: if you believe the framework, what follows?&lt;/p&gt;
&lt;p&gt;I should be clear about what this is and what it isn't. This is not financial advice. I'm not recommending specific trades, allocations, or timing. What I'm doing is mapping a structural argument (Jevons-style demand expansion in AI) onto the physical and economic layers that expansion must pass through. The goal is to identify where expansion creates bottlenecks, because bottlenecks are where pricing power concentrates.&lt;/p&gt;
&lt;p&gt;The key insight is that you don't need to pick which AI company wins. You don't need to know whether OpenAI, Anthropic, Google, or some company that doesn't exist yet captures the application layer. What you need to identify are the fixed-supply inputs that &lt;em&gt;every&lt;/em&gt; AI company needs regardless of who wins. The expansion has to flow through certain physical chokepoints, and those chokepoints are investable.&lt;/p&gt;
&lt;h3&gt;The Framework in One Paragraph&lt;/h3&gt;
&lt;p&gt;For readers coming to this series fresh: Jevons Paradox describes what happens when a critical input gets dramatically cheaper. The intuitive expectation is that total spending on that input falls. The historical reality is the opposite: demand expands beyond the efficiency gain, and total consumption increases. Coal in the 19th century (as Jevons himself documented in &lt;a href="https://baud.rs/xjxPfz"&gt;&lt;em&gt;The Coal Question&lt;/em&gt;&lt;/a&gt;), transistors in the 20th, bandwidth in the 21st. The prior pieces in this series argue that AI inference costs are following the same curve, with the same structural conditions that produced Jevons outcomes in every prior case. If that argument holds, then what matters isn't whether AI gets more efficient; it's where the resulting demand expansion hits physical constraints.&lt;/p&gt;
&lt;h3&gt;The Objection That Isn't&lt;/h3&gt;
&lt;p&gt;The most common pushback I get on this series is some version of: "GPUs are hitting diminishing returns, capex is already enormous, and there's a natural ceiling on how far the expansion can go." Variations appear in coverage from &lt;a href="https://baud.rs/B5ATWQ"&gt;Northeastern&lt;/a&gt; and &lt;a href="https://baud.rs/bcFAl5"&gt;illuminem&lt;/a&gt;, often framed as a correction to the Jevons thesis.&lt;/p&gt;
&lt;p&gt;It's a reasonable-sounding objection. It's also wrong, and understanding &lt;em&gt;why&lt;/em&gt; it's wrong actually strengthens the Jevons case.&lt;/p&gt;
&lt;p&gt;The objection treats a technology-specific constraint as an input-level constraint. GPUs hitting diminishing returns doesn't mean &lt;em&gt;inference&lt;/em&gt; is hitting diminishing returns. It means GPUs are reaching the end of their particular S-curve. But GPUs aren't the only way to run inference. Custom ASICs, TPUs, NPUs, and novel architectures are opening entirely new cost curves &lt;em&gt;below&lt;/em&gt; the GPU curve. The GPU plateau isn't a ceiling; it's a handoff.&lt;/p&gt;
&lt;p&gt;The numbers are already visible. Broadcom controls roughly 70% of the custom AI ASIC market, reporting $5.2 billion in AI semiconductor revenue in Q3 alone, with &lt;a href="https://baud.rs/zcsDXo"&gt;five major hyperscaler customers&lt;/a&gt; driving demand. &lt;a href="https://baud.rs/znj9ak"&gt;Marvell's custom XPU pipeline&lt;/a&gt; spans AWS, Google, Meta, and Microsoft, with AI revenue reaching $2.6 billion in FY2026. Google's TPU transition from v6 to v7 delivered a &lt;a href="https://baud.rs/4aoJ1v"&gt;roughly 70% cost-per-token reduction&lt;/a&gt;. Taalas, a startup building hardwired inference chips, &lt;a href="https://baud.rs/QxPpqN"&gt;claims 1000x performance per watt&lt;/a&gt; versus general-purpose GPUs. Custom ASICs handle an estimated 20% of inference workloads today and are &lt;a href="https://baud.rs/eIj2sQ"&gt;projected to reach 70–75% by 2028&lt;/a&gt;, with custom ASIC shipments growing at 44.6% annually versus 16.1% for GPUs.&lt;/p&gt;
&lt;p&gt;Every prior Jevons cycle worked exactly this way. Newcomen's engine didn't just get incrementally better; it was replaced by Watt's engine, then Corliss, then turbines. Each new technology started a fresh S-curve before the previous one fully flattened. Moore's Law didn't ride a single technology either. As Chris Miller chronicles in &lt;a href="https://baud.rs/8MdhcB"&gt;&lt;em&gt;Chip War&lt;/em&gt;&lt;/a&gt;, bipolar gave way to NMOS, then CMOS, then FinFET, now gate-all-around. The pattern is always multiple overlapping S-curves, each beginning before the last one peaks.&lt;/p&gt;
&lt;p&gt;The data supports the mechanism: &lt;a href="https://baud.rs/O6Q4Tc"&gt;every 50% reduction in inference cost has been associated with a 200–300% increase in deployment&lt;/a&gt;. That's textbook Jevons elasticity.&lt;/p&gt;
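&lt;p&gt;As a sanity check, that elasticity claim reduces to two lines of arithmetic. This sketch uses only the figures cited above (a 50% cost reduction paired with a 200–300% deployment increase); nothing in it is new data:&lt;/p&gt;

```python
# Jevons arithmetic: total spending on an input rises when demand
# expands faster than unit cost falls. The figures restate the cited
# claim: a 50% cost cut alongside a 200-300% deployment increase.

def total_spend_multiplier(cost_factor: float, demand_multiplier: float) -> float:
    """New total spend relative to old: new unit cost times new quantity."""
    return cost_factor * demand_multiplier

# Cost halves (factor 0.5); deployment rises 200% (3x) to 300% (4x).
print(total_spend_multiplier(0.5, 3.0))  # 1.5 -> total spend up 50%
print(total_spend_multiplier(0.5, 4.0))  # 2.0 -> total spend doubles
```

&lt;p&gt;Any demand multiplier above 2x at a 50% cost cut means total spending grows despite the efficiency gain, which is the Jevons signature.&lt;/p&gt;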
&lt;p&gt;"Diminishing returns on GPUs" isn't a ceiling on inference. It's the moment the next technology takes over. That's the &lt;em&gt;mechanism&lt;/em&gt; of Jevons Paradox, not a counterpoint to it.&lt;/p&gt;
&lt;h3&gt;The Investment Layers&lt;/h3&gt;
&lt;p&gt;If Jevons-style expansion is real, it has to flow through physical infrastructure. I think about this in four layers, ordered from deepest (most expansion-certain) to shallowest (most speculative).&lt;/p&gt;
&lt;h4&gt;Layer 1: Energy and Power&lt;/h4&gt;
&lt;p&gt;Energy is the binding constraint. If AI demand expands at anything close to Jevons rates, someone has to generate the electricity. Data center electricity demand is on track to double this year, with the sector's total consumption &lt;a href="https://baud.rs/8hWfJa"&gt;surpassing Canada's national usage&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The structural problem is deeper than just demand growth. As Vaclav Smil details in &lt;a href="https://baud.rs/OMSIzZ"&gt;&lt;em&gt;Energy and Civilization&lt;/em&gt;&lt;/a&gt;, energy transitions are slow precisely because the physical infrastructure is massive and long-lived. Roughly 70% of the U.S. electrical grid was built between the 1950s and 1970s. Much of it is approaching end-of-life at the exact moment AI is driving the largest incremental demand increase in decades. This isn't a problem that resolves quickly. Power plants take years to permit and build. Grid transmission upgrades take longer.&lt;/p&gt;
&lt;p&gt;Nuclear is where the smart money is moving. Constellation Energy's merger with Calpine creates a fleet of 21 nuclear reactors plus 50 natural gas plants, essentially a baseload power platform positioned for AI demand. Amazon signed a 1.92 GW power purchase agreement at Susquehanna and committed $500 million to small modular reactor development. These aren't speculative bets on future demand; they're capacity commitments predicated on demand that's already contractually visible.&lt;/p&gt;
&lt;p&gt;Hyperscaler capital expenditure tells the same story: $602 billion planned for 2026, roughly 75% tied to AI infrastructure. Goldman Sachs estimates cumulative AI infrastructure spending of $1.15 trillion between 2025 and 2027. That capital has to buy electricity, and the electricity has to come from somewhere.&lt;/p&gt;
&lt;h4&gt;Layer 2: Physical Infrastructure&lt;/h4&gt;
&lt;p&gt;Between the power plant and the GPU sits an enormous amount of physical equipment: transformers, switchgear, power distribution units, cooling systems, racks, cabling. This is the picks-and-shovels layer; it benefits regardless of which AI stack wins.&lt;/p&gt;
&lt;p&gt;Eaton reported data center orders up 70% year-over-year. Transformers have become a bottleneck, with lead times stretching to 18+ months for large power transformers. Vertiv, which makes power management and thermal systems, is sitting on a $9.5 billion backlog. Liquid cooling, once a niche technology, is becoming standard for high-density AI compute racks.&lt;/p&gt;
&lt;p&gt;Grid transmission and distribution may be the most underappreciated bottleneck. You can build a data center in 18 months. Getting grid interconnection can take three to five years. The physical infrastructure required to move power from generation to consumption is the constraint that's hardest to accelerate, and it benefits from AI expansion regardless of which models, chips, or cloud providers ultimately dominate.&lt;/p&gt;
&lt;h4&gt;Layer 3: Custom Silicon&lt;/h4&gt;
&lt;p&gt;The GPU-to-ASIC transition described above isn't just evidence that the Jevons expansion continues; it's itself a Jevons trigger. Each new silicon architecture that enters production at lower cost-per-token reopens the demand curve.&lt;/p&gt;
&lt;p&gt;Broadcom's AI semiconductor revenue is &lt;a href="https://baud.rs/9Hp791"&gt;doubling year-over-year to roughly $8.2 billion in Q1 FY2026&lt;/a&gt;. Marvell's custom XPU pipeline is expanding across all major hyperscalers. Both companies are positioned on the ASIC side of the GPU-to-ASIC transition, the side that's growing at 44.6% versus 16.1%.&lt;/p&gt;
&lt;p&gt;Nvidia still dominates training workloads, and Blackwell delivers a &lt;a href="https://baud.rs/5ns8n0"&gt;10x cost-per-token reduction for open-source inference models&lt;/a&gt;, which is itself a massive Jevons input. But inference is bifurcating. Training demands flexibility and programmability (Nvidia's strength). Inference at scale demands efficiency and cost optimization (where ASICs excel). The market is splitting, and both sides drive expansion.&lt;/p&gt;
&lt;h4&gt;Layer 4: The Application Tier&lt;/h4&gt;
&lt;p&gt;This is where it gets speculative. Cloud providers and hyperscalers function as toll booths; they collect revenue proportional to total compute consumed, making them natural beneficiaries of demand expansion. But the application tier above them is where you're picking winners, not betting on expansion itself.&lt;/p&gt;
&lt;p&gt;AI-native companies become viable only at cheaper inference price points. The legal tech startup that can offer document review at one-tenth the cost of a junior associate doesn't exist at $20 per million tokens. It might exist at $2. It definitely exists at $0.20. Each step down the cost curve unlocks a new tier of applications.&lt;/p&gt;
&lt;p&gt;The contrarian opportunity in this layer is latent demand: the markets that don't exist yet because the service was too expensive for most people. Roughly 80% of Americans who need a lawyer can't afford one. Most small businesses can't afford financial planning. Most students can't afford tutoring. If inference costs follow a Jevons trajectory, these aren't aspirational markets; they're inevitable markets. But investing in them means picking which company captures each one, which is a fundamentally different bet than investing in the infrastructure that serves all of them.&lt;/p&gt;
&lt;h3&gt;Who Else Is Making This Bet&lt;/h3&gt;
&lt;p&gt;This framework isn't contrarian anymore. &lt;a href="https://baud.rs/Wy7mZE"&gt;Satya Nadella tweeted&lt;/a&gt; "Jevons paradox strikes again!" when DeepSeek demonstrated cheaper inference without reducing demand. Microsoft's AI revenue hit $13 billion, up 175% year-over-year. &lt;a href="https://baud.rs/xdNj4l"&gt;Fortune noted&lt;/a&gt; that Nadella's optimism was explicitly grounded in the paradox: cheaper AI means more AI, not less.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://baud.rs/aRLPY8"&gt;Andreessen Horowitz made the economic case directly&lt;/a&gt;: cheaper tokens unlock more demand than efficiency saves. Their thesis is that foundation model economics follow the same curve as prior compute economics: falling costs expand the addressable market faster than they reduce per-unit revenue.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://baud.rs/Qcm7AN"&gt;NPR's Planet Money covered the thesis&lt;/a&gt; in mainstream terms, bringing Jevons Paradox from an obscure 19th-century economic observation to a household framework for understanding AI economics. &lt;a href="https://baud.rs/V6W8hJ"&gt;Nathan Witkin's analysis&lt;/a&gt; showed that employment in software development, translation, and radiology &lt;em&gt;increased&lt;/em&gt; after GPT-3, exactly the demand expansion the model predicts. &lt;a href="https://baud.rs/KUEJyl"&gt;Markman Capital&lt;/a&gt; called the "flawed consensus" of GPU diminishing returns "one of the most dangerous misreadings of the current market."&lt;/p&gt;
&lt;p&gt;&lt;a href="https://baud.rs/rD0Spu"&gt;Deloitte&lt;/a&gt;, McKinsey, and Bain are all projecting massive infrastructure buildout. &lt;a href="https://baud.rs/8hWfJa"&gt;McKinsey's $7 trillion estimate&lt;/a&gt; for data center scaling reflects the same underlying logic: if demand expands as costs fall, the physical infrastructure to support it is the bottleneck.&lt;/p&gt;
&lt;p&gt;Jevons went from an obscure economics reference to a mainstream investment framework in roughly twelve months. That's not because it's trendy; it's because the data keeps confirming the pattern.&lt;/p&gt;
&lt;h3&gt;Where the Thesis Could Be Wrong&lt;/h3&gt;
&lt;p&gt;Intellectual honesty requires mapping the failure modes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Demand elasticity might be lower than historical precedent.&lt;/strong&gt; Every prior Jevons cycle involved inputs with massive latent demand: coal for industrial heat, transistors for consumer electronics, bandwidth for media. AI inference might not have the same depth of latent demand. If the tasks AI performs well are narrower than the tasks coal or transistors enabled, the expansion could stall earlier than the model predicts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Regulatory intervention could cap the expansion.&lt;/strong&gt; Energy policy, AI regulation, data center permitting restrictions. Any of these could artificially constrain the physical infrastructure that the expansion requires. Jevons Paradox describes an economic dynamic, not a law of physics. It can be overridden by policy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The biological ceiling is real.&lt;/strong&gt; As I argued in &lt;a href="https://tinycomputers.io/posts/the-ai-vampire-is-jevons-paradox.html"&gt;The AI Vampire Is Jevons Paradox&lt;/a&gt;, human judgment is the input that doesn't scale. If every Jevons expansion in AI ultimately concentrates demand on human decision-making, and human decision-making has genuine cognitive limits, the expansion hits a different kind of constraint, one that can't be solved with more silicon or more power.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Timing risk is the most likely failure mode.&lt;/strong&gt; The direction of the thesis could be correct while the timeline is wrong. Infrastructure bottlenecks might resolve more slowly than demand builds, creating periods of overinvestment followed by correction. The historical base rate favors Jevons, but base rates describe probabilities, not certainties. Plenty of investors have been right about the direction and still lost money because they were wrong about the timing.&lt;/p&gt;
&lt;h3&gt;The Physical Footprint of Expansion&lt;/h3&gt;
&lt;p&gt;The deepest layers (energy and physical infrastructure) are the safest Jevons bets. They benefit from AI demand expansion regardless of which models, chips, or companies win. You don't need to know whether GPT-7 or Claude 6 is the better model to know that both of them will need electricity, transformers, cooling, and grid capacity.&lt;/p&gt;
&lt;p&gt;The further up the stack you go, the more you're picking winners rather than betting on expansion. Custom silicon is a strong middle ground: the GPU-to-ASIC transition is structural, and the companies positioned on the right side of it have visible demand. But the application tier is where the uncertainty concentrates, and that's where most retail investors focus their attention.&lt;/p&gt;
&lt;p&gt;The expansion has a physical footprint. Every token generated requires electricity. Every data center requires grid interconnection. Every custom ASIC requires a fab slot. Every cooling system requires water. The Jevons expansion, if it plays out as the framework predicts, will be visible not in stock prices or earnings calls but in the physical world: in power generation capacity, in transformer lead times, in grid interconnection queues, in cooling system orders.&lt;/p&gt;
&lt;p&gt;Jevons won't announce itself. It never does. It shows up in electricity bills, in transformer backorders, in cooling system lead times, in the quiet scramble to secure power purchase agreements years in advance. The signal isn't in what people say about AI. It's in what they're building to support it.&lt;/p&gt;</description><category>ai</category><category>asic</category><category>data centers</category><category>economics</category><category>energy</category><category>gpu</category><category>infrastructure</category><category>investing</category><category>jevons paradox</category><category>nuclear</category><category>semiconductors</category><category>utilities</category><guid>https://tinycomputers.io/posts/investing-in-the-jevons-expansion.html</guid><pubDate>Thu, 05 Mar 2026 14:00:00 GMT</pubDate></item><item><title>Moore's Law for Intelligence: What Happens When Thinking Gets Cheap</title><link>https://tinycomputers.io/posts/moores-law-for-intelligence-what-happens-when-thinking-gets-cheap.html?utm_source=feed&amp;utm_medium=rss&amp;utm_campaign=rss</link><dc:creator>A.C. Jokela</dc:creator><description>&lt;div class="audio-widget"&gt;
&lt;div class="audio-widget-header"&gt;
&lt;span class="audio-widget-icon"&gt;🎧&lt;/span&gt;
&lt;span class="audio-widget-label"&gt;Listen to this article&lt;/span&gt;
&lt;/div&gt;
&lt;audio controls preload="metadata"&gt;
&lt;source src="https://tinycomputers.io/moores-law-for-intelligence-what-happens-when-thinking-gets-cheap_tts.mp3" type="audio/mpeg"&gt;
&lt;/source&gt;&lt;/audio&gt;
&lt;div class="audio-widget-footer"&gt;24 min · AI-generated narration&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src="https://tinycomputers.io/images/moores-law-intelligence/silicon-wafer.jpg" alt="A silicon wafer with an array of integrated circuit dies, the physical foundation of Moore's Law" style="float: right; max-width: 40%; margin: 0 0 1em 1.5em; border-radius: 4px;"&gt;&lt;/p&gt;
&lt;p&gt;I have written about &lt;a href="https://tinycomputers.io/posts/jevons-paradox.html"&gt;Jevons Paradox&lt;/a&gt; twice now, once through the history of the semiconductor industry, and once as a &lt;a href="https://tinycomputers.io/posts/the-jevons-counter-thesis-why-ai-displacement-scenarios-underweight-demand-expansion.html"&gt;broader examination&lt;/a&gt; of what happens when the cost of a critical economic input collapses. The pattern is consistent: demand expands to overwhelm the savings. Coal. Transistors. Bandwidth. Lighting.&lt;/p&gt;
&lt;p&gt;Those pieces looked at the pattern itself. This one is different. I want to run a thought experiment forward, not backward.&lt;/p&gt;
&lt;p&gt;I've also spent a lot of time on this site looking backward at computing history, watching &lt;a href="https://tinycomputers.io/posts/stewart-cheifet-and-his-computer-chronicles.html"&gt;Stewart Cheifet walk viewers through the early personal computer revolution&lt;/a&gt; on &lt;em&gt;The Computer Chronicles&lt;/em&gt;, examining how &lt;a href="https://tinycomputers.io/posts/language-manipulators-what-a-1983-episode-of-the-computer-chronicles-got-right-and-wrong-about-word-processing.html"&gt;word processing went from a curiosity to a necessity&lt;/a&gt; in a single decade, tracing &lt;a href="https://tinycomputers.io/posts/george-morrow-pioneer-of-personal-computing.html"&gt;George Morrow's&lt;/a&gt; role in making personal computing real, and following &lt;a href="https://tinycomputers.io/posts/cpm-history-and-legacy.html"&gt;CP/M's arc&lt;/a&gt; from operating system of the future to historical footnote. I've &lt;a href="https://tinycomputers.io/posts/cpm-on-physical-retroshield-z80.html"&gt;run CP/M on physical RetroShield hardware&lt;/a&gt;, explored the &lt;a href="https://tinycomputers.io/posts/motorola-68000-processor-and-the-ti-89-graphing-calculator.html"&gt;Motorola 68000&lt;/a&gt; that powered a generation of machines, and dug into &lt;a href="https://tinycomputers.io/posts/infocom-zork-history.html"&gt;how Infocom turned text adventures into a business&lt;/a&gt; at a time when 64K of RAM was generous. That immersion in where computing came from is exactly what makes the forward question so vivid, because at every stage, the people living through the transition couldn't see what was coming next. The engineers building CP/M didn't anticipate DOS. The engineers building DOS didn't anticipate the web. The engineers building the web didn't anticipate the iPhone. The pattern is always the same: cheaper compute enables things that were unimaginable at the prior cost.&lt;/p&gt;
&lt;p&gt;The question isn't "will AI destroy jobs?" or "is the doom scenario wrong?" The question is: &lt;strong&gt;what becomes possible when thinking gets cheap?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Because AI compute is following a cost curve that looks remarkably like the early decades of Moore's Law. And if that continues (if the cost per unit of machine intelligence drops by an order of magnitude every few years), the consequences extend far beyond making today's chatbots cheaper to run.&lt;/p&gt;
&lt;h3&gt;The Cost Curve&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/moores-law-intelligence/moores-law-transistor-count.png" alt="Microprocessor transistor counts from 1971 to 2011 plotted on a logarithmic scale, showing Moore's Law doubling trend" style="max-width: 100%; margin: 0 0 1.5em 0; border-radius: 4px;"&gt;&lt;/p&gt;
&lt;p&gt;Moore's Law, in its original formulation, described the doubling of transistors per integrated circuit roughly every two years. But the economic consequence that mattered wasn't transistor density; it was cost per unit of compute. From the 1960s through the 2010s, the cost per FLOP declined at a compound rate that delivered roughly a 10x improvement every four to five years. A computation that cost $1 million in 1975 cost $1 by 2010. That decline didn't just make existing applications cheaper. It created entirely new categories of computing that were inconceivable at the prior cost structure.&lt;/p&gt;
&lt;p&gt;AI inference costs are now following a similar trajectory, but faster. OpenAI's text-davinci-003, released in late 2022, cost $20 per million tokens. GPT-4o mini, released in mid-2024, delivers substantially better performance at $0.15 per million input tokens, a 99% cost reduction in under two years. Claude, Gemini, and open-source models have followed similar curves. DeepSeek entered the market in early 2025 with pricing that undercut Western frontier models by roughly 90%, compressing the timeline further through competitive pressure.&lt;/p&gt;
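&lt;p&gt;For readers who want the implied rate, here is a rough sketch. The roughly 1.75-year gap between those two releases is my approximation from the dates above, not a precise figure, and a constant-rate fit is a simplification of a curve that actually moves in lumps with each model release:&lt;/p&gt;

```python
def annual_decline(p_start: float, p_end: float, years: float) -> float:
    """Fraction of price lost per year, assuming a constant exponential rate."""
    return 1.0 - (p_end / p_start) ** (1.0 / years)

# $20 per million tokens (late 2022) -> $0.15 (mid-2024), ~1.75 years apart.
rate = annual_decline(20.0, 0.15, 1.75)
print(f"{rate:.0%} per year")  # roughly 94% per year
```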
&lt;p&gt;The GPU hardware underneath these models is on its own Moore's Law trajectory. GPU price-performance in FLOP/s per dollar doubles approximately every 2.5 years for ML-class hardware. Architectural improvements in transformers, mixture-of-experts routing, quantization, speculative decoding, and distillation compound on top of the hardware gains. The result is a cost curve where the effective price of a unit of machine reasoning is falling faster than the price of a transistor did during the semiconductor industry's most explosive growth phase.&lt;/p&gt;
&lt;p&gt;This matters because we know, empirically, what happens when the cost of a foundational input follows an exponential decline. We have sixty years of data on it. The compute industry went from a few thousand mainframes serving governments and large corporations to billions of devices in every pocket, every appliance, every traffic light. Total spending on computing didn't shrink as costs fell; it expanded by orders of magnitude, because each 10x cost reduction unlocked categories of use that didn't exist at the prior price point.&lt;/p&gt;
&lt;p&gt;The thought experiment is straightforward: apply that pattern to intelligence itself.&lt;/p&gt;
&lt;h3&gt;Today's Price Points Create Today's Use Cases&lt;/h3&gt;
&lt;p&gt;At current pricing (roughly $3 per million input tokens for a frontier model like Claude Sonnet), AI is economically viable for a specific class of applications. Customer support automation. Code assistance. Document summarization. Marketing copy. Translation. These are the use cases where the value generated per token comfortably exceeds the cost per token, and where the interaction pattern involves relatively short exchanges.&lt;/p&gt;
&lt;p&gt;But there are vast categories of potential use where current pricing makes the math uncomfortable or impossible. Consider:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Continuous monitoring and analysis.&lt;/strong&gt; A financial analyst who wants an AI to continuously watch SEC filings, earnings calls, patent applications, and news feeds across 500 companies (analyzing each document in full, cross-referencing against historical patterns, and generating alerts) would consume billions of tokens per month. At current prices, this costs tens of thousands of dollars monthly. At 100x cheaper, it costs the price of a SaaS subscription.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Full-codebase reasoning.&lt;/strong&gt; This one is already arriving. Anthropic's Claude Opus 4.6, working through Claude Code, can operate at the repository level, reading files, understanding architecture, running tests, and making changes across an entire codebase in a single session. I've used it to build a &lt;a href="https://tinycomputers.io/posts/open-sourcing-a-high-performance-rust-based-ballistics-engine.html"&gt;high-performance Rust-based ballistics engine&lt;/a&gt; and to develop &lt;a href="https://tinycomputers.io/posts/introducing-lattice-a-crystallization-based-programming-language.html"&gt;Lattice, an entire programming language&lt;/a&gt; with a &lt;a href="https://tinycomputers.io/posts/from-tree-walker-to-bytecode-vm-compiling-lattice.html"&gt;bytecode VM compiler&lt;/a&gt;, projects where the AI wasn't autocompleting fragments but reasoning across thousands of lines of interconnected code, tracking type systems, managing compiler passes, and understanding how changes in one module ripple through the rest. The constraint today isn't capability; it's cost. These sessions consume large volumes of tokens, which means they're viable for serious engineering work but not yet cheap enough to run continuously on every commit, every pull request, every deployment. At 100x cheaper, that changes. At 1,000x cheaper, every codebase has an always-on collaborator that has read everything and forgets nothing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Personalized education at scale.&lt;/strong&gt; A truly personalized AI tutor that adapts to a student's learning style, tracks their understanding across subjects, reviews their homework in detail, explains mistakes with patience, and adjusts its teaching strategy over months, this requires sustained, high-volume token consumption per student. Multiply by millions of students and the current cost structure breaks. At 100x cheaper, it's viable for a school district. At 1,000x cheaper, it's viable for an individual family.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Preventive medicine.&lt;/strong&gt; Analyzing a patient's complete medical history, genetic data, lifestyle information, lab results, and the current research literature to generate genuinely personalized health recommendations (not the generic advice a five-minute doctor's visit produces, but the kind of comprehensive analysis that currently only concierge medicine patients paying $10,000+ per year receive). At current token prices, this is prohibitively expensive for routine use. At 100x cheaper, it could be embedded in every annual checkup.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ambient intelligence.&lt;/strong&gt; The concept of AI that runs continuously in the background of your life (understanding your calendar, email, documents, and goals, proactively surfacing relevant information, drafting responses, scheduling meetings, flagging conflicts) requires sustained inference at volumes that would cost hundreds of dollars per day at current prices. At 1,000x cheaper, it costs less than your phone bill.&lt;/p&gt;
&lt;p&gt;These aren't science fiction scenarios. They're applications of current model capabilities at price points that don't yet exist. The models can already do most of this work. The cost curve is the bottleneck.&lt;/p&gt;
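&lt;p&gt;To make the cost bottleneck concrete, here is a back-of-the-envelope sketch of the continuous-monitoring workload. The 10 billion tokens per month is an assumed round number consistent with "billions of tokens per month" above, and the $3 per million rate is the frontier-model input price cited earlier:&lt;/p&gt;

```python
# Monthly cost of a continuous-monitoring workload at today's price
# and at the 100x / 1000x thresholds. 10B tokens/month is an assumed
# round number; $3 per million tokens is the price cited in the text.
TOKENS_PER_MONTH = 10_000_000_000
PRICE_PER_MILLION = 3.00  # dollars per million input tokens

def monthly_cost(price_per_million: float) -> float:
    """Dollars per month at the given per-million-token price."""
    return TOKENS_PER_MONTH / 1_000_000 * price_per_million

for cheaper in (1, 100, 1000):
    cost = monthly_cost(PRICE_PER_MILLION / cheaper)
    print(f"{cheaper:>5}x cheaper: ${cost:,.0f}/month")
# 1x -> $30,000; 100x -> $300 (SaaS territory); 1000x -> $30
```

&lt;p&gt;The same workload moves from "enterprise budget line" to "subscription" to "rounding error" without any change in capability, only in price.&lt;/p&gt;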
&lt;h3&gt;The 10x / 100x / 1,000x Framework&lt;/h3&gt;
&lt;p&gt;Moore's Law didn't deliver its benefits in a smooth, continuous flow. It came in thresholds, price points at which qualitatively new applications became viable. The pattern with AI compute is likely to follow the same staircase function.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;At 10x cheaper&lt;/strong&gt; (plausible within 1-2 years): AI becomes viable for tasks that are currently "almost worth it." Small businesses that can't justify $500/month for AI tooling find it worthwhile at $50/month. Individual professionals (accountants, lawyers, doctors, engineers) integrate AI into their daily workflow not as an occasional tool but as a constant companion. The volume of AI-mediated work increases dramatically, but the character of work doesn't fundamentally change. This is the equivalent of the minicomputer era: the same kind of computing, available to more people.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;At 100x cheaper&lt;/strong&gt; (plausible within 3-5 years): The applications listed above become economically viable. Continuous analysis, full-codebase reasoning, personalized education, preventive medicine at scale. At this price point, AI stops being a tool you use and starts being infrastructure you run on. Every document you write gets reviewed. Every decision you make gets a second opinion. Every student gets a tutor. Every patient gets a diagnostician. The total volume of inference consumed per capita increases by far more than 100x, because new use cases emerge that weren't contemplated at the prior price. This is the personal computer moment: qualitatively new categories of use.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;At 1,000x cheaper&lt;/strong&gt; (plausible within 5-8 years): Intelligence becomes ambient and disposable. You don't think about whether to use AI for a task any more than you think about whether to use electricity for a task. Every appliance, every vehicle, every building, every piece of infrastructure has embedded reasoning running continuously. Your home understands your patterns and adapts. Your car negotiates traffic in real time not just with sensors but with models that predict the behavior of every other vehicle. Agricultural equipment analyzes soil conditions at the individual plant level. Supply chains optimize in real time across thousands of variables. This is the smartphone moment: computing so cheap and pervasive that it becomes invisible.&lt;/p&gt;
&lt;h3&gt;The Compounding Effect&lt;/h3&gt;
&lt;p&gt;There's a dynamic in AI cost reduction that didn't exist with traditional Moore's Law: cheaper inference enables better models, which enables even cheaper inference.&lt;/p&gt;
&lt;p&gt;When inference is expensive, researchers are constrained in how they can train and evaluate models. Each experiment costs real money. Each architecture search consumes significant compute budgets. When inference costs drop, researchers can run more experiments, evaluate more architectures, and discover more efficient approaches, which further reduces costs. Distillation (training a smaller model to mimic a larger one) becomes more practical when the larger model is cheaper to run at scale. Synthetic data generation (using AI to create training data for other AI) becomes more economical. The cost reduction compounds on itself.&lt;/p&gt;
&lt;p&gt;This is already happening. GPT-4 was used to generate synthetic training data for GPT-4o. Claude's training pipeline uses prior Claude models to evaluate and filter training examples. Google's Gemini models help design the next generation of TPU chips that will run future Gemini models. The AI equivalent of "using computers to design better computers" arrived in year three of the current wave, decades earlier in the relative timeline than it took the semiconductor industry to reach the same recursive dynamic.&lt;/p&gt;
&lt;p&gt;The implication is that the cost curve isn't just declining; it's declining at an accelerating rate because each improvement enables the next one. The semiconductor industry saw this acceleration plateau after about fifty years as it approached physical limits of silicon. AI has no equivalent physical constraint on the horizon. The limits are architectural and algorithmic, and those limits have been falling faster than hardware limits ever did.&lt;/p&gt;
&lt;h3&gt;What the Semiconductor Analogy Actually Predicts&lt;/h3&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/moores-law-intelligence/cray-1.jpg" alt="A Cray-1 supercomputer on display, showing its distinctive cylindrical tower design with bench seating and exposed cooling plumbing" style="float: right; max-width: 45%; margin: 0 0 1em 1.5em; border-radius: 4px;"&gt;&lt;/p&gt;
&lt;p&gt;In 1975, a Cray-1 supercomputer delivered about 160 MFLOPS and cost $8 million. In 2025, an iPhone delivers roughly 2 TFLOPS of neural engine performance and costs $800. That's a 12,500x performance increase at a 10,000x cost decrease, a net improvement of roughly 100 million times in price-performance over fifty years.&lt;/p&gt;
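&lt;p&gt;The arithmetic behind that "roughly 100 million times" figure is worth making explicit. A quick sketch, using only the approximate numbers quoted above:&lt;/p&gt;

```python
# Price-performance arithmetic: Cray-1 (1975) vs. an iPhone's neural engine (2025).
# All figures are the approximate values quoted in the text above.
cray_flops = 160e6       # ~160 MFLOPS
cray_cost = 8_000_000    # ~$8 million
iphone_flops = 2e12      # ~2 TFLOPS
iphone_cost = 800        # ~$800

perf_gain = iphone_flops / cray_flops   # 12,500x more performance
cost_drop = cray_cost / iphone_cost     # 10,000x lower cost
price_perf = perf_gain * cost_drop      # ~1.25e8: "roughly 100 million times"

print(f"{perf_gain:,.0f}x performance at {cost_drop:,.0f}x lower cost "
      f"= {price_perf:,.0f}x price-performance")
```

&lt;p&gt;The two factors multiply, which is why two modest-sounding four-to-five-order-of-magnitude shifts compound into eight orders of magnitude of price-performance over five decades.&lt;/p&gt;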
&lt;p&gt;Nobody in 1975 predicted Instagram, Uber, Google Maps, or Spotify. Not because these applications required fundamentally new physics; they just required compute that was cheap enough to run in a device that fit in your pocket. The applications were latent, waiting for the cost curve to reach them.&lt;/p&gt;
&lt;p&gt;The history is instructive at each threshold. When a capable computer crossed below $20,000 in the early 1980s, it unlocked small business accounting, the same work mainframes did, just for smaller organizations. When it crossed below $2,000 in the mid-1990s, it unlocked home computing, and with it the web browser, email, and e-commerce. When capable compute crossed below $200 in the smartphone era, it unlocked ride-sharing, mobile payments, and social media, none of which had any conceptual precursor at the $20,000 price point. Each 10x reduction didn't just expand the existing market. It created a market that was literally unimaginable at the prior price.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://tinycomputers.io/images/moores-law-intelligence/ibm-system-360.jpg" alt="An IBM System/360 Model 30 mainframe computer with its distinctive red cabinet and operator control panel" style="float: right; max-width: 45%; margin: 0 0 1em 1.5em; border-radius: 4px;"&gt;&lt;/p&gt;
&lt;p&gt;The same principle applies to intelligence. We are in the mainframe era of AI. The applications we see today (chatbots, code assistants, image generators) are the equivalent of payroll processing and scientific computation on 1960s mainframes. They are real and valuable, but they represent a tiny fraction of what becomes possible when the cost drops by five or six orders of magnitude.&lt;/p&gt;
&lt;p&gt;What are the Instagram and Uber equivalents of cheap intelligence? By definition, we can't fully predict them. But we can identify the structural conditions that will enable them:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When intelligence costs less than attention, delegation becomes default.&lt;/strong&gt; Today, the cognitive cost of formulating a good prompt, evaluating the output, and iterating often exceeds the cost of just doing the task yourself. As models get cheaper, faster, and better at understanding context, the threshold shifts. Eventually, not delegating a cognitive task to AI becomes the irrational choice, the way not using a calculator for arithmetic became irrational.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When intelligence costs less than data storage, everything gets analyzed.&lt;/strong&gt; Today, most data that organizations collect is never analyzed. It's stored, archived, and forgotten, because the cost of human analysis exceeds the expected value of the insights. When AI analysis is effectively free, every dataset gets examined. Every log file gets reviewed. Every customer interaction gets analyzed for patterns. The volume of insight generated from existing data increases by orders of magnitude.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When intelligence costs less than communication overhead, organizations restructure.&lt;/strong&gt; This is already starting. A significant fraction of white-collar work is coordination: meetings, emails, status updates, project management. These exist because humans need to synchronize their mental models of shared projects. AI tools are already compressing this layer: meeting summaries that eliminate the need for half the attendees, project dashboards that maintain themselves, codebases where an AI agent tracks the state of every open issue so developers don't have to sit through standup. When AI can maintain a comprehensive, always-current model of a project's state, much of the coordination overhead that justifies entire job categories (project managers, program managers, business analysts, internal consultants) begins to evaporate. An organization that currently needs 50 people to coordinate a complex project might need 10, with AI handling the information synthesis that previously required human intermediaries. That's a genuine productivity gain. It's also 40 people who need to find something else to do, and the honest answer is that we don't yet know how fast the demand side creates new roles to absorb them.&lt;/p&gt;
&lt;h3&gt;The Demand Expansion Is the Story&lt;/h3&gt;
&lt;p&gt;The instinct when hearing "AI gets 1,000x cheaper" is to think about cost savings. That's the substitution frame: doing the same things for less money. And yes, that will happen. But the semiconductor analogy tells us that cost savings are the boring part of the story.&lt;/p&gt;
&lt;p&gt;When compute got 1,000x cheaper between 1980 and 2000, the interesting story wasn't that scientific simulations got cheaper to run. It was that entirely new industries (PC software, internet services, mobile apps, social media, cloud computing) emerged to consume orders of magnitude more compute than the entire prior industry had used. The efficiency gain on existing applications was dwarfed by the demand expansion from new applications.&lt;/p&gt;
&lt;p&gt;The same will likely be true for intelligence. Consider bandwidth as a parallel case. In 1995, a 28.8 kbps modem made email and basic web pages viable. Nobody was streaming video; it was physically impossible at that bandwidth, not merely expensive. By 2005, broadband had made streaming music viable. By 2015, streaming 4K video was routine. By 2025, cloud gaming and real-time video conferencing were infrastructure-level assumptions. Total bandwidth consumption didn't decline as it got cheaper. It increased by roughly a million times, because each generation of cost reduction enabled applications that consumed orders of magnitude more bandwidth than the previous generation's entire output.&lt;/p&gt;
&lt;p&gt;The interesting story isn't that customer support gets cheaper. It's the applications that are currently impossible (not difficult, not expensive, but literally impossible at current price points) that become not just possible but routine.&lt;/p&gt;
&lt;p&gt;A world where every small business has a CFO-grade financial analyst. Where every patient has a diagnostician who has read every relevant paper published in the last decade. Where every student has a tutor who knows exactly where they're struggling and why. Where every local government has the analytical capacity currently reserved for federal agencies.&lt;/p&gt;
&lt;p&gt;And the nature of building software itself is changing in ways that go beyond "engineers with better tools." For most of computing history, writing code meant a human translating intent into syntax, line by line, function by function. AI assistance started as autocomplete: suggesting the next line, filling in boilerplate. But that phase is already ending. Today, with tools like Claude Code, the workflow has inverted. The human describes what they want (an architecture, a feature, a behavior) and the AI writes the implementation across files, runs the tests, and iterates on failures. The engineer's role shifts from writing code to directing and reviewing it, from syntax to judgment. At 10x cheaper, this is how professional developers work. At 100x cheaper, it's how small teams build products that previously required departments. At 1,000x cheaper, the barrier between "person with an idea" and "working software" functionally disappears. The entire concept of what it means to be a software engineer is being rewritten in real time, not by replacing engineers, but by redefining the skill from "can you write this code?" to "do you know what to build and why?"&lt;/p&gt;
&lt;p&gt;These aren't efficiency improvements on existing systems. They're new capabilities that create new categories of economic activity, new forms of organization, and new kinds of products and services that don't have current analogs, just as social media, ride-sharing, and cloud computing had no analogs in the mainframe era.&lt;/p&gt;
&lt;h3&gt;The Question That Matters&lt;/h3&gt;
&lt;p&gt;I should be honest about what I don't know. The displacement scenarios for white-collar labor are not fantasy. AI is already capable enough to handle work that was solidly middle-class professional territory two years ago: document review, financial analysis, code generation, customer support, content production. The scenarios where this accelerates faster than the economy can absorb are plausible, and anyone who dismisses them outright isn't paying attention. When a technology can replicate cognitive labor at a fraction of the cost, the transitional pain for the people whose livelihoods depend on that labor is real and potentially severe. The speed matters: prior technology transitions unfolded over decades, and AI's compression of that timeline into years is a genuine uncertainty that historical analogy doesn't fully resolve.&lt;/p&gt;
&lt;p&gt;But there is a question that displacement scenarios consistently underweight, and it's the one I explored in my &lt;a href="https://tinycomputers.io/posts/the-jevons-counter-thesis-why-ai-displacement-scenarios-underweight-demand-expansion.html"&gt;Jevons counter-thesis&lt;/a&gt;: what happens on the demand side? Every model that projects mass unemployment from cheap AI is implicitly assuming that the economy remains roughly the same size, with machines doing the work humans used to do. That's the substitution frame. And the substitution frame has been wrong at every prior technological inflection point, not slightly wrong, but wrong by orders of magnitude.&lt;/p&gt;
&lt;p&gt;The semiconductor industry's answer, delivered over sixty years of data, is unambiguous. Every order-of-magnitude cost reduction generated more economic activity, more employment, and more total compute consumption than the one before it. The economy didn't shrink as compute got cheaper. It restructured around cheap compute and grew. Roughly 80% of Americans who need legal help can't afford it today. Personalized tutoring is a luxury good. Custom software is out of reach for most small businesses. These aren't speculative markets; they're documented unmet demand suppressed by the cost of human intelligence. When that cost collapses, the demand doesn't stay static.&lt;/p&gt;
&lt;p&gt;The honest answer is that both things will happen simultaneously. Jobs will be displaced, some permanently. And new categories of economic activity will emerge that are currently inconceivable, just as social media and cloud computing were inconceivable in the mainframe era. The question is which force dominates, and how fast the transition occurs. I think the historical pattern favors demand expansion, but I hold that view with the humility of someone who knows the speed of this particular transition is unprecedented.&lt;/p&gt;
&lt;p&gt;AI inference costs are following the same curve as semiconductors, possibly faster. The tokens-per-dollar ratio will improve by orders of magnitude. And when it does, the applications that emerge will make today's AI use cases look as quaint as running payroll on a room-sized mainframe.&lt;/p&gt;
&lt;p&gt;The thought experiment ends where all Jevons stories end: with more consumption, not less. More intelligence deployed, not less. More economic activity built on cheap cognition, not less. The cost curve is the enabling condition. What gets built on top of it is the part we can't fully predict, and historically, that's always been the most interesting part.&lt;/p&gt;</description><category>ai</category><category>compute costs</category><category>demand expansion</category><category>economics</category><category>inference</category><category>jevons paradox</category><category>moore's law</category><category>semiconductors</category><category>technology</category><category>tokens</category><guid>https://tinycomputers.io/posts/moores-law-for-intelligence-what-happens-when-thinking-gets-cheap.html</guid><pubDate>Sat, 28 Feb 2026 14:00:00 GMT</pubDate></item></channel></rss>