Sovereign AI · Post 1 of 7 · 8 May 2026

Sovereign AI Is Not a Buzzword

The procurement category that will define a decade of AI semiconductor capital flows.

In October 2023 the U.S. Bureau of Industry and Security tightened its export controls on advanced GPUs. In January 2024 it refined them. Within months, multiple Gulf states — notably the UAE, where the G42 group restructured its China relationships to access Western chips through a Microsoft partnership — discovered that the silicon underlying their national AI plans was no longer reliably available on the original terms. Saudi Arabia’s HUMAIN initiative, Singapore’s national AI strategy, and Korea’s K-Cloud effort each accelerated diversification planning in response to the same regulatory uncertainty. The response was not policy papers. It was cheques.

This is the moment sovereign AI stopped being a slogan and became a procurement category.

The distinction matters more than it may appear. Slogans are about positioning; procurement categories are about budgets. Once a budget line exists, the questions change. You stop asking whether governments will diversify their compute and start asking who they buy from, on what terms, and for how long. The first set of questions is rhetorical. The second set is operational. Republic of Capital is interested in the operational set.

The obvious objection — “why not just buy NVIDIA?” — becomes, on inspection, a category error. Of course governments still buy NVIDIA. They buy more of it than ever. The point is that they no longer buy only NVIDIA. The architecture being built across sovereign data centres in Riyadh, Abu Dhabi, Singapore, and Seoul is heterogeneous: incumbent silicon for training, where deep workload integration matters and CUDA’s ecosystem is genuinely hard to displace; non-incumbent silicon for inference, where competition has shifted to cost and where open-source frameworks like vLLM, SGLang, and llama.cpp have made it possible to serve commodity-tier production workloads on multiple hardware backends without rewriting the stack.

Call this the diversification layer. Its existence reframes the competitive question for an emerging chip company.

The Diversification Layer Already Exists

A clarification worth making early: the diversification layer is not a sovereign-AI invention. AWS pairs NVIDIA with Trainium and Inferentia. Google pairs NVIDIA with TPU. Microsoft has Maia, Meta has MTIA. Hyperscalers built this architecture first, for the same fundamental reason — single-vendor risk and cost-per-token economics — and have been running it at scale for years.

What is new is the procurement adaptation. Sovereign buyers are now making explicit commitments to heterogeneous compute architectures, formalised through allocation budgets, multi-year contracts, and government counterparties. The architecture is borrowed from hyperscalers. The procurement model is sovereign-specific. The two together are what create the investible category.

A second clarification, equally important: “diversification” in sovereign procurement does not mean exclusion of US vendors. G42’s 2024 pivot was toward Microsoft and NVIDIA, not away from them — the China divestment was the price of access, not its substitute. HUMAIN partners with both NVIDIA and AMD. Singapore continues to deploy NVIDIA at scale. What sovereign buyers are pursuing is concentration management — adding credible non-incumbent suppliers alongside the incumbent — not removal of foreign suppliers. The diversification slot is additive. The investment thesis depends on the slot being large enough in absolute dollar terms to support several viable suppliers, not on the incumbent losing share.

The Competitive Slot Is More Crowded Than Headlines Suggest

Most AI semiconductor coverage assumes a single competitive contest: emerging chip companies versus NVIDIA. Read the analyst reports, scan the IPO filings, and you will find the same implicit framing. NVIDIA is the incumbent. The startups are the challengers. The question is whether the challengers can win.

Stated this way, the question has only one answer. NVIDIA’s ecosystem lead, software stack, and TSMC manufacturing partnership are formidable. The startups are not going to win that contest.

But it is the wrong contest. The actual contest, in the procurement category that emerged in 2023 and 2024, is who fills the diversification slot — and the field is broader than the startup-only framing implies.

AMD’s MI300 series has captured measurable share in inference workloads, particularly in deployments where memory capacity is the binding constraint. Intel’s Gaudi line continues to compete on cost despite well-documented adoption challenges. Hyperscaler custom silicon such as Trainium and TPU, increasingly available as merchant product, competes for the same commodity inference workloads. And then there are the roughly ten emerging chip companies (Cerebras, Groq, Tenstorrent, Etched, SambaNova, Rebellions, FuriosaAI, and others) competing for what remains.

A sovereign buyer evaluating the slot is choosing among all of these, not only the startups. The startup case has to be made against this fuller field. Most of the ten will not deliver at production scale. The handful that do will be the companies whose stories define the next decade of AI semiconductor capital flows. The framing matters because the implied competition shapes everything: the valuation methodology, the M&A surface, the bar for product-market fit.

Why Inference, and Not Training

The diversification layer sits specifically at inference. Training, for the foreseeable future, will run on the incumbent, for three reasons.

First, training is integration-heavy. A frontier-scale training run is not a workload one slots into a generic compute environment; it is a co-engineered system in which the chip, the interconnect, the memory hierarchy, and the model architecture are tuned together. CUDA’s depth here is not a marketing claim. The cost of porting a training pipeline to non-incumbent hardware, in research time and degraded performance, exceeds whatever savings the alternative offers.

Second, training capex is concentrated among the buyers with the deepest pockets, and those buyers favour the supplier with the highest performance ceiling. A frontier lab spending heavily on a training run does not optimise for cost-per-token. It optimises for performance and reliability. The economics there continue to point to NVIDIA.

Third — and this is where the analysis has to be precise — inference itself is bifurcating. A meaningful share of inference, particularly the workloads enterprises and governments deploy at scale, is commoditising on cost: chat, retrieval-augmented generation, structured outputs, batch inference, classification. For these workloads, throughput per dollar and per watt is the metric that matters, and a chip that delivers competitive throughput at lower cost is a competitive product.
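One way to make “throughput per dollar and per watt” concrete is to fold hardware cost and energy cost into a single cost-per-million-tokens figure. The sketch below is illustrative only: the function name, the straight-line amortisation, the default utilisation, and every number in it are assumptions made for the example, not vendor data.

```python
def cost_per_million_tokens(
    chip_price_usd: float,
    lifetime_years: float,
    power_watts: float,
    electricity_usd_per_kwh: float,
    tokens_per_second: float,
    utilisation: float = 0.6,
) -> float:
    """Amortised hardware plus energy cost per million tokens served.

    Assumes straight-line amortisation and ignores cooling, networking,
    staffing, and software costs, so it understates true total cost.
    """
    seconds_per_year = 365 * 24 * 3600
    tokens_per_year = tokens_per_second * utilisation * seconds_per_year
    hardware_per_year = chip_price_usd / lifetime_years
    # Power is assumed to be drawn continuously, even when the chip is idle.
    energy_per_year = (power_watts / 1000) * 24 * 365 * electricity_usd_per_kwh
    return (hardware_per_year + energy_per_year) / tokens_per_year * 1_000_000


# Hypothetical accelerators; none of these figures describe a real product.
incumbent = cost_per_million_tokens(30_000, 4, 700, 0.08, 2_500)
challenger = cost_per_million_tokens(12_000, 4, 350, 0.08, 1_800)
```

Under these made-up inputs the challenger serves fewer tokens per second yet comes out cheaper per token, which is the sense in which a chip with competitive rather than leading throughput can still be a competitive product.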

But another share of inference — long-context reasoning, multimodal generation, agentic loops with extended chain-of-thought — is moving in the opposite direction. These workloads consume far more compute per query than commodity inference, demand more memory bandwidth, and benefit from incumbent silicon advantages similar to training. As of 2026, frontier inference is becoming more specialised, not less.

The diversification layer addresses the commoditising tier: the bulk of inference compute by absolute volume, even if the frontier tier is the fastest-growing. This is the tier sovereign procurement is configuring for, and the tier where chip companies competing on cost and ecosystem can win allocation. The thesis depends on commodity inference remaining a commercially significant tier, not on all inference commoditising. So far, that condition holds.

The Competitive Surface

Within the diversification layer, what determines who fills the slot at production scale?

Three things, in roughly this order.

Manufacturability. The chip has to ship. Given current foundry capacity dynamics — TSMC heavily allocated to NVIDIA, Samsung Foundry working through yield challenges at advanced nodes, Intel Foundry early in its merchant journey — a chip company without a manufacturable path to volume is not in the contest. Foundry relationships are competitive assets in a way they were not five years ago.

Ecosystem alignment. Inference at scale depends on memory, interconnect, and software in addition to the accelerator. A company whose memory partner is also its strategic investor, whose CPU partner is a global IP licensor, and whose national government is a procurement customer has compounding advantages over a company built on arm’s-length relationships. The cap table is not just a financing document. It is a competitive position that takes years to assemble.

Deployment speed. I assume the sovereign window has a finite duration. The structural reason is that today’s procurement urgency reflects a specific period of supply tightness and regulatory uncertainty; as greenfield liquid-cooled capacity arrives at scale and the regulatory landscape stabilises, the urgency premium that favours fast deployers should compress. This is a working hypothesis, not a fact. While the window remains open, a chip that integrates into existing facilities (air-cooled, at existing rack densities, on standard interconnect) has a deployment-speed advantage measured in months. A chip that requires liquid-cooling retrofits or new facility construction often has a deployment timeline measured in years. The advantage is structural during the window. It diminishes after.

These three operate as filters above a threshold. Performance benchmarks remain a gating criterion: a chip below the workload-adequacy threshold does not enter the contest at all. Above that threshold, manufacturability, ecosystem alignment, and deployment speed matter more than incremental benchmark advantage.
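The gate-then-rank logic can be sketched in a few lines. Everything here is hypothetical: the `Candidate` fields, the 0-5 scores, and the threshold are invented for illustration, and the strict lexicographic ordering is a deliberately strong reading of “in roughly this order”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    benchmark: float         # workload-adequacy score, arbitrary units
    manufacturability: int   # 0-5, higher is better
    ecosystem: int           # 0-5, higher is better
    deployment_speed: int    # 0-5, higher is better


def rank_candidates(candidates, adequacy_threshold):
    # Gate: anything below the threshold is out of the contest entirely.
    eligible = [c for c in candidates if c.benchmark >= adequacy_threshold]
    # Above the gate, compare on the three attributes in the stated order;
    # the benchmark score is used only as a final tie-breaker.
    return sorted(
        eligible,
        key=lambda c: (
            c.manufacturability,
            c.ecosystem,
            c.deployment_speed,
            c.benchmark,
        ),
        reverse=True,
    )


# Invented chips: chip_a has the best benchmark, chip_c misses the gate.
chips = [
    Candidate("chip_a", benchmark=95, manufacturability=2, ecosystem=3, deployment_speed=2),
    Candidate("chip_b", benchmark=80, manufacturability=4, ecosystem=4, deployment_speed=5),
    Candidate("chip_c", benchmark=60, manufacturability=5, ecosystem=5, deployment_speed=5),
]
ranking = [c.name for c in rank_candidates(chips, adequacy_threshold=75)]
```

In this toy set, chip_c is excluded despite the strongest scores on all three attributes, and chip_b outranks chip_a despite the weaker benchmark. A weighted score would be a softer alternative to the lexicographic sort.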

The Sovereign Customer

The other piece of the picture worth being precise about is the customer. There is a tendency, particularly among investors who have not sat across the table from a sovereign buyer, to model these procurements as either heroic (the government as visionary national-strategy actor) or compromised (the government as patron writing welfare cheques). Neither is the full picture.

The sovereign buyer in this category is, in the cases I have observed, behaving as a constrained rational customer. The decision matrix involves performance benchmarks, total cost of ownership, deployment timelines, vendor financial credibility, and strategic resilience. The last criterion is what is new; it is what creates the diversification layer. But the first four still apply.

The constraint is industrial policy. In most jurisdictions, the diversification slot is reserved — explicitly or implicitly — for domestic or geopolitically aligned suppliers. Korea will favour Korean chip companies because the development of domestic semiconductor champions has been central to Korean industrial policy for decades; this is the policy logic underlying the recent move from grant funding to direct equity participation in companies like Rebellions and FuriosaAI. The UAE’s procurement reflects its broader geopolitical realignment. Saudi Arabia’s HUMAIN partnerships reflect specific bilateral relationships. The European Union’s nascent allocations favour European-supported designs.

Analytically, this means the merit contest happens within the candidate set eligible under industrial policy, not across the full global field. A chip company whose alignment is right faces a much narrower competitive set than the global candidate count suggests. A chip company whose alignment is wrong is not in the contest at all, regardless of technical merit. This is the asymmetry that makes the cap table — investors, government stakes, customer commitments — a competitive position rather than only a financing document.

The procurement mechanics also differ from typical enterprise sales. In the cases that have become publicly visible, sovereign data centre allocations have looked more like negotiated strategic partnerships than conventional RFP-based bids. The government has decided that the diversification layer will be a budget line; the question is which eligible company gets the slot. Once allocated, the contracts are typically multi-year, structured around hardware refresh cycles, with built-in upgrades and significant switching costs.

I will use the term quasi-annuity throughout this series, with one important qualification. These contracts are more durable than typical hardware sales but less durable than utility concessions. They remain subject to renegotiation clauses, scope reductions, political triggers, and reorganisation when administrations change. The closer credit analog is defence procurement — multi-year, government counterparty, sticky budgets, but periodically restructured. Post 5 will return to this distinction in the valuation discussion.

This framing also addresses the common “subsidised business” objection — which Post 5 takes up at length. The relationship between policy and contract is not as clean as “policy creates the category, merit awards the contract.” Industrial policy shapes both. The candidates are pre-filtered by alignment; merit decides among the filtered set. That is a more honest description of how sovereign chip allocations actually work.

What This Means

The strategic implication is the one that has not yet been fully priced.

Sovereign AI is the first AI market in which the structural advantage appears poised to belong not to the chip with the highest peak performance, but to the chip whose ecosystem alignment, manufacturability, and deployment speed allow it to ship into the window before the window closes. The diversification layer rewards different attributes than the training layer does.

I write “appears poised” rather than “belongs” deliberately. No sovereign deal has yet definitively validated this principle at scale. Cerebras’s UAE engagement reflected G42’s strategic positioning as much as it reflected ecosystem alignment. Korean allocations to Rebellions and FuriosaAI are not yet at the volume that would prove the thesis. The next twelve to twenty-four months of allocation decisions will determine whether the principle holds.

That reframing — from absolute performance to ecosystem-weighted performance, with industrial policy filtering the eligible set — is what the rest of this seven-part series is about.

Post 2 will work through the unit economics of a 10MW non-incumbent inference allocation and show how cumulative revenue compounds. Post 3 will turn to the historical precedent — the telecom market after Huawei — and ask what is true and what is not in the analogy. Post 4 will explain why Korea is uniquely positioned to compete in the diversification layer, and why the answer rests on memory, foundry, and government coordination rather than on chip design alone. Posts 5 and 6 will turn to the buyside view and the consolidation endgame. Post 7 will recap the argument in five minutes.

The sovereign window is open. The economics underneath it are sturdier than the political framing suggests. The competitive contest is more concentrated, and more constrained by industrial policy, than most investors have priced. The next twelve months will determine which companies position themselves to be in the consolidation that follows — and which spend their resources fighting a contest, against NVIDIA, that is not the contest that matters.

Common Objections

Isn’t this just temporary? Won’t the export controls relax?

Possibly. But the procurement category, once established, is harder to unwind than the policy that created it. Procurement budgets are politically sticky. Single-vendor dependency, having been demonstrated as a strategic risk, is unlikely to be re-embraced by sovereign buyers regardless of subsequent policy. The category will outlast the catalyst.

Won’t NVIDIA simply launch products tailored for the sovereign market?

NVIDIA already has product lines designed for export-permissible regions. From the buyer’s standpoint, the concern is not that NVIDIA is unwilling to sell. It is that the U.S. government can change the terms unilaterally. NVIDIA’s products do not address that concern. Diversification does.

Couldn’t the diversification slot end up small? In practice, “diversification” programmes may allocate 80%+ to the incumbent and only 10–20% to alternatives.

A real possibility, and the dollar implications matter. The architectural shift to heterogeneous compute is genuine, but the proportional split may continue to favour the incumbent for years. The thesis does not require parity. It requires that the alternative slot be large enough in absolute dollar terms to support several viable suppliers. Even a 15% slot inside a $50bn sovereign data-centre programme is $7.5bn — sufficient for three or four chip companies to build durable revenue. The shape of the slot, more than its share, is what matters for the analysis.
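The arithmetic is simple enough to check directly. A minimal sketch follows, using the $50bn programme and the 10–20% range from the text; the function name and the even split among suppliers are assumptions for illustration, since real allocations are unlikely to divide evenly.

```python
def diversification_slot(programme_usd: int, alternative_share_pct: int, suppliers: int):
    """Return (slot size, revenue per supplier) in USD.

    Assumes the slot is split evenly among suppliers, which is a
    simplification made only to keep the example short.
    """
    slot_usd = programme_usd * alternative_share_pct / 100
    return slot_usd, slot_usd / suppliers


PROGRAMME_USD = 50_000_000_000  # the $50bn programme used in the text

# The 15% case from the text, shared among four suppliers.
slot_usd, per_supplier_usd = diversification_slot(PROGRAMME_USD, 15, 4)

# Sensitivity across the 10-20% range raised in the objection.
scenarios = {
    pct: diversification_slot(PROGRAMME_USD, pct, 4)[0]
    for pct in (10, 15, 20)
}
```

At 15% the slot is $7.5bn, or $1.875bn each if four suppliers shared it evenly. The 10% and 20% cases give $5bn and $10bn.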

Aren’t the hyperscaler custom silicon programmes the bigger threat to sovereign AI startups?

A reasonable challenge. Trainium, TPU, MTIA, and Maia compete for the same workloads, are increasingly available as merchant product, and benefit from integration with hyperscaler-scale software stacks. Their disadvantage in the sovereign context is precisely that they are hyperscaler-aligned: the diversification mandate exists in part to avoid the same single-vendor risk that NVIDIA presents. A sovereign buyer choosing Trainium-as-a-service over NVIDIA has not actually diversified its strategic exposure. The startups offer something hyperscaler silicon cannot: ownership not aligned with U.S. hyperscaler strategy.

Aren’t governments terrible customers: slow, political, demanding?

In some respects, yes. Government procurement is slower than enterprise procurement, and the contracts come with regulatory and reporting overhead. But the contracts also offer characteristics enterprise customers do not provide: multi-year structure, low credit risk, large deployment scale, and the durability of budget allocations that have been politically committed.

Why not just sell to hyperscalers and skip this entirely?

Hyperscalers are the largest ultimate market, but they are not the right first market for an emerging chip company. Hyperscalers operate a department-store model — the chip company pre-invests capital, and the hyperscaler bills on utilisation. Hyperscalers also build their own custom silicon, making them suppliers and competitors simultaneously. Sovereign customers, by contrast, have an explicit diversification mandate and limited internal silicon capability. They are pragmatic first customers in a way hyperscalers are not.

Doesn’t all this depend on inference commoditisation continuing?

Partially. The diversification layer addresses the commodity-tier inference workloads, where commoditisation is real. Frontier inference — long-context, multimodal, agentic — is moving in the other direction, and that is the most important risk to the broader thesis. But commodity inference remains the bulk of total inference compute by volume, and that bulk is what sovereign procurement is configuring for. The thesis depends on commodity inference remaining a commercially significant tier, not on all inference commoditising.

Sources & Further Reading

  • U.S. Bureau of Industry and Security, Implementation of Additional Export Controls on Advanced Computing — October 2023; subsequent refinements through 2024
  • TrendForce, HBM and AI Server Market Analysis — quarterly
  • Counterpoint Research, AI Inference Accelerator Tracker
  • IEEE / ACM literature on inference framework portability — vLLM, SGLang, llama.cpp
  • Korea Investment Corporation, Korea Development Bank, MOTIE — public disclosures on semiconductor sector financing
  • AWS, Google Cloud, Microsoft Azure — public disclosures on Trainium, TPU, Maia merchant availability
