This is the sixth piece in a series applying the Jevons Paradox framework to AI economics. The prior five built the theoretical case.

This piece asks the practical question: if you believe the framework, what follows?

I should be clear about what this is and what it isn't. This is not financial advice. I'm not recommending specific trades, allocations, or timing. What I'm doing is mapping a structural argument — Jevons-style demand expansion in AI — onto the physical and economic layers that expansion must pass through. The goal is to identify where expansion creates bottlenecks, because bottlenecks are where pricing power concentrates.

The key insight is that you don't need to pick which AI company wins. You don't need to know whether OpenAI, Anthropic, Google, or some company that doesn't exist yet captures the application layer. What you need to identify are the fixed-supply inputs that every AI company needs regardless of who wins. The expansion has to flow through certain physical chokepoints, and those chokepoints are investable.

The Framework in One Paragraph

For readers coming to this series fresh: Jevons Paradox describes what happens when a critical input gets dramatically cheaper. The intuitive expectation is that total spending on that input falls. The historical reality is the opposite — demand expands beyond the efficiency gain, and total consumption increases. Coal in the 19th century — as Jevons himself documented in The Coal Question — transistors in the 20th, bandwidth in the 21st. The prior pieces in this series argue that AI inference costs are following the same curve, with the same structural conditions that produced Jevons outcomes in every prior case. If that argument holds, then what matters isn't whether AI gets more efficient — it's where the resulting demand expansion hits physical constraints.

The Objection That Isn't

The most common pushback I get on this series is some version of: "GPUs are hitting diminishing returns, capex is already enormous, and there's a natural ceiling on how far the expansion can go." Variations appear in coverage from Northeastern and illuminem, often framed as a correction to the Jevons thesis.

It's a reasonable-sounding objection. It's also wrong — and understanding why it's wrong actually strengthens the Jevons case.

The objection treats a technology-specific constraint as an input-level constraint. GPUs hitting diminishing returns doesn't mean inference is hitting diminishing returns. It means GPUs are reaching the end of their particular S-curve. But GPUs aren't the only way to run inference. Custom ASICs, TPUs, NPUs, and novel architectures are opening entirely new cost curves below the GPU curve. The GPU plateau isn't a ceiling — it's a handoff.

The numbers are already visible. Broadcom controls roughly 70% of the custom AI ASIC market, reporting $5.2 billion in AI semiconductor revenue in Q3 alone, with five major hyperscaler customers driving demand. Marvell's custom XPU pipeline spans AWS, Google, Meta, and Microsoft, with AI revenue reaching $2.6 billion in FY2026. Google's TPU transition from v6 to v7 delivered a roughly 70% cost-per-token reduction. Taalas, a startup building hardwired inference chips, claims 1000x performance per watt versus general-purpose GPUs. Custom ASICs handle an estimated 20% of inference workloads today and are projected to reach 70–75% by 2028, with custom ASIC shipments growing at 44.6% annually versus 16.1% for GPUs.
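To see how those divergent shipment growth rates compound over time, here is a back-of-envelope sketch. The growth rates are the figures cited above; the 20% starting share is an assumed round number for illustration, not reported data.

```python
# Sketch: how divergent shipment growth rates compound into share shift.
# Growth rates (44.6% vs 16.1%) are the figures cited above; the 20%
# starting share is an assumption for illustration, not reported data.

def share_after(years: int, start_share: float, g_fast: float, g_slow: float) -> float:
    """Share of the faster-growing category after compounding both rates."""
    fast = start_share * (1 + g_fast) ** years
    slow = (1 - start_share) * (1 + g_slow) ** years
    return fast / (fast + slow)

for years in (0, 3, 5):
    share = share_after(years, 0.20, 0.446, 0.161)
    print(f"year {years}: ~{share:.1%} of shipments")
    # year 0: ~20.0%, year 3: ~32.6%, year 5: ~42.8%
```

Shipment compounding alone moves share gradually; a workload projection like the 70–75% figure also assumes inference workloads migrating to ASICs faster than raw shipment growth.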

Every prior Jevons cycle worked exactly this way. Newcomen's engine didn't just get incrementally better — it was replaced by Watt's engine, then Corliss, then turbines. Each new technology started a fresh S-curve before the previous one fully flattened. Moore's Law didn't ride a single technology either — as Chris Miller chronicles in Chip War, bipolar gave way to NMOS, then CMOS, then FinFET, now gate-all-around. The pattern is always multiple overlapping S-curves, each beginning before the last one peaks.

The data supports the mechanism: every 50% reduction in inference cost has been associated with a 200–300% increase in deployment. That's textbook Jevons elasticity.
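The arithmetic behind that claim is worth making explicit. A minimal sketch using only the percentages quoted above (the function names are mine, not a standard library):

```python
# Sanity check of the elasticity claim above: a 50% cost reduction
# paired with a 200-300% increase in deployment.

def point_elasticity(pct_change_qty: float, pct_change_price: float) -> float:
    """Simple percentage-change elasticity: %dQ / %dP."""
    return pct_change_qty / pct_change_price

def total_spend_multiplier(price_mult: float, qty_mult: float) -> float:
    """New total spend relative to old spend after both changes."""
    return price_mult * qty_mult

print(point_elasticity(2.0, -0.5))       # -4.0: 200% more use per 50% cost cut
print(point_elasticity(3.0, -0.5))       # -6.0: the 300% end of the range
print(total_spend_multiplier(0.5, 3.0))  # 1.5: total spend rises 50%
print(total_spend_multiplier(0.5, 4.0))  # 2.0: total spend doubles
```

Any elasticity magnitude above 1 means total spending rises as unit cost falls; these figures imply magnitudes of 4 to 6, which is why the text calls it textbook Jevons behavior.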

"Diminishing returns on GPUs" isn't a ceiling on inference. It's the moment the next technology takes over. That's the mechanism of Jevons Paradox, not a counterpoint to it.

The Investment Layers

If Jevons-style expansion is real, it has to flow through physical infrastructure. I think about this in four layers, ordered from deepest (most expansion-certain) to shallowest (most speculative).

Layer 1: Energy and Power

Energy is the binding constraint. If AI demand expands at anything close to Jevons rates, someone has to generate the electricity. Data center electricity demand is on track to double this year, with the sector's total consumption surpassing Canada's national usage.

The structural problem is deeper than just demand growth. As Vaclav Smil details in Energy and Civilization, energy transitions are slow precisely because the physical infrastructure is massive and long-lived. Roughly 70% of the U.S. electrical grid was built between the 1950s and 1970s. Much of it is approaching end-of-life at the exact moment AI is driving the largest incremental demand increase in decades. This isn't a problem that resolves quickly. Power plants take years to permit and build. Grid transmission upgrades take longer.

Nuclear is where the smart money is moving. Constellation Energy's merger with Calpine creates a fleet of 21 nuclear reactors plus 50 natural gas plants — essentially a baseload power platform positioned for AI demand. Amazon signed a 1.92 GW power purchase agreement at Susquehanna and committed $500 million to small modular reactor development. These aren't speculative bets on future demand — they're capacity commitments predicated on demand that's already contractually visible.

Hyperscaler capital expenditure tells the same story: $602 billion planned for 2026, roughly 75% tied to AI infrastructure. Goldman Sachs estimates cumulative AI infrastructure spending of $1.15 trillion between 2025 and 2027. That capital has to buy electricity, and the electricity has to come from somewhere.

Layer 2: Physical Infrastructure

Between the power plant and the GPU sits an enormous amount of physical equipment: transformers, switchgear, power distribution units, cooling systems, racks, cabling. This is the picks-and-shovels layer — it benefits regardless of which AI stack wins.

Eaton reported data center orders up 70% year-over-year. Transformers have become a bottleneck, with lead times stretching to 18+ months for large power transformers. Vertiv, which makes power management and thermal systems, is sitting on a $9.5 billion backlog. Liquid cooling, once a niche technology, is becoming standard for high-density AI compute racks.

Grid transmission and distribution may be the most underappreciated bottleneck. You can build a data center in 18 months. Getting grid interconnection can take three to five years. The physical infrastructure required to move power from generation to consumption is the constraint that's hardest to accelerate — and it benefits from AI expansion regardless of which models, chips, or cloud providers ultimately dominate.

Layer 3: Custom Silicon

The GPU-to-ASIC transition described above isn't just evidence that the Jevons expansion continues — it's itself a Jevons trigger. Each new silicon architecture that enters production at lower cost-per-token reopens the demand curve.

Broadcom's AI semiconductor revenue is doubling year-over-year to roughly $8.2 billion in Q1 FY2026. Marvell's custom XPU pipeline is expanding across all major hyperscalers. Both companies are positioned on the ASIC side of the GPU-to-ASIC transition — the side that's growing at 44.6% versus 16.1%.

Nvidia still dominates training workloads, and Blackwell delivers a 10x cost-per-token reduction for open-source inference models — which is itself a massive Jevons input. But inference is bifurcating. Training demands flexibility and programmability (Nvidia's strength). Inference at scale demands efficiency and cost optimization (where ASICs excel). The market is splitting, and both sides drive expansion.

Layer 4: The Application Tier

This is where it gets speculative. Cloud providers and hyperscalers function as toll booths — they collect revenue proportional to total compute consumed, making them natural beneficiaries of demand expansion. But the application tier above them is where you're picking winners, not betting on expansion itself.

AI-native companies become viable only at cheaper inference price points. The legal tech startup that can offer document review at one-tenth the cost of a junior associate doesn't exist at $20 per million tokens. It might exist at $2. It definitely exists at $0.20. Each step down the cost curve unlocks a new tier of applications.
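To make those thresholds concrete, here is a hypothetical back-of-envelope calculation. Every input (tokens per document, job size, the human benchmark) is invented for illustration; only the three token prices come from the paragraph above.

```python
# Hypothetical unit economics for the document-review example.
# TOKENS_PER_DOC, DOCS, and ASSOCIATE_COST are invented illustrative
# inputs; only the three token prices come from the text.

TOKENS_PER_DOC = 1_000_000   # assumed: long documents, multiple review passes
DOCS = 1_000                 # assumed size of one review job
ASSOCIATE_COST = 20_000.0    # assumed human cost for the same job, in dollars

def job_cost(price_per_m_tokens: float) -> float:
    """Total inference cost for the review job at a given token price."""
    return TOKENS_PER_DOC * DOCS / 1_000_000 * price_per_m_tokens

for price in (20.0, 2.0, 0.20):
    cost = job_cost(price)
    print(f"${price}/M tokens -> ${cost:,.0f} job "
          f"({cost / ASSOCIATE_COST:.0%} of the human benchmark)")
```

Under these assumed inputs, $20 per million tokens is cost parity with the human benchmark (no margin left for the startup), $2 leaves 90% headroom, and $0.20 makes the inference bill a rounding error, mirroring the viability tiers in the text.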

The contrarian opportunity in this layer is latent demand — the markets that don't exist yet because the service was too expensive for most people. Roughly 80% of Americans who need a lawyer can't afford one. Most small businesses can't afford financial planning. Most students can't afford tutoring. If inference costs follow a Jevons trajectory, these aren't aspirational markets — they're inevitable markets. But investing in them means picking which company captures each one, which is a fundamentally different bet than investing in the infrastructure that serves all of them.

Who Else Is Making This Bet

This framework isn't contrarian anymore. Satya Nadella tweeted "Jevons paradox strikes again!" when DeepSeek demonstrated cheaper inference without reducing demand. Microsoft's AI revenue hit $13 billion, up 175% year-over-year. Fortune noted that Nadella's optimism was explicitly grounded in the paradox — cheaper AI means more AI, not less.

Andreessen Horowitz made the economic case directly: cheaper tokens unlock more demand than efficiency saves. Their thesis is that foundation model economics follow the same curve as prior compute economics — falling costs expand the addressable market faster than they reduce per-unit revenue.

NPR's Planet Money covered the thesis in mainstream terms, bringing Jevons Paradox from an obscure 19th-century economic observation to a household framework for understanding AI economics. Nathan Witkin's analysis showed that employment in software development, translation, and radiology increased after GPT-3 — exactly the demand expansion the model predicts. Markman Capital called the "flawed consensus" of GPU diminishing returns "one of the most dangerous misreadings of the current market."

Deloitte, McKinsey, and Bain are all projecting massive infrastructure buildout. McKinsey's $7 trillion estimate for data center scaling reflects the same underlying logic: if demand expands as costs fall, the physical infrastructure to support it is the bottleneck.

Jevons went from an obscure economics reference to a mainstream investment framework in roughly twelve months. That's not because it's trendy — it's because the data keeps confirming the pattern.

Where the Thesis Could Be Wrong

Intellectual honesty requires mapping the failure modes.

Demand elasticity might be lower than historical precedent. Every prior Jevons cycle involved inputs with massive latent demand — coal for industrial heat, transistors for consumer electronics, bandwidth for media. AI inference might not have the same depth of latent demand. If the tasks AI performs well are narrower than the tasks coal or transistors enabled, the expansion could stall earlier than the model predicts.

Regulatory intervention could cap the expansion. Energy policy, AI regulation, data center permitting restrictions — any of these could artificially constrain the physical infrastructure that the expansion requires. Jevons Paradox describes an economic dynamic, not a law of physics. It can be overridden by policy.

The biological ceiling is real. As I argued in The AI Vampire Is Jevons Paradox, human judgment is the input that doesn't scale. If every Jevons expansion in AI ultimately concentrates demand on human decision-making, and human decision-making has genuine cognitive limits, the expansion hits a different kind of constraint — one that can't be solved with more silicon or more power.

Timing risk is the most likely failure mode. The direction of the thesis could be correct while the timeline is wrong. Infrastructure bottlenecks might resolve more slowly than demand builds, creating periods of overinvestment followed by correction. The historical base rate favors Jevons, but base rates describe probabilities, not certainties. Plenty of investors have been right about the direction and still lost money because they were wrong about the timing.

The Physical Footprint of Expansion

The deepest layers — energy and physical infrastructure — are the safest Jevons bets. They benefit from AI demand expansion regardless of which models, chips, or companies win. You don't need to know whether GPT-7 or Claude 6 is the better model to know that both of them will need electricity, transformers, cooling, and grid capacity.

The further up the stack you go, the more you're picking winners rather than betting on expansion. Custom silicon is a strong middle ground — the GPU-to-ASIC transition is structural, and the companies positioned on the right side of it have visible demand. But the application tier is where the uncertainty concentrates, and that's where most retail investors focus their attention.

The expansion has a physical footprint. Every token generated requires electricity. Every data center requires grid interconnection. Every custom ASIC requires a fab slot. Every cooling system requires water. The Jevons expansion, if it plays out as the framework predicts, will be visible not in stock prices or earnings calls but in the physical world — in power generation capacity, in transformer lead times, in grid interconnection queues, in cooling system orders.

Jevons won't announce itself. It never does. It shows up in electricity bills, in transformer backorders, in cooling system lead times, in the quiet scramble to secure power purchase agreements years in advance. The signal isn't in what people say about AI. It's in what they're building to support it.