[00]   Canonical Labs · Data Centers & Neoclouds

The most important real estate on earth.

A data center is a building full of computers: power comes in, heat comes out, compute happens in between. Three years of AI demand turned that boring utility into the scarcest, most-fought-over asset in tech. This is the whole thing explained from the ground up — the stack, the players, the power constraint, and the economics. Companion to the Semiconductor Stack and Decentralized AI maps.

Coverage: 7 stack layers
Players mapped: 31 across the stack
Sources: BVP Atlas · PitchBook · Sacra · public filings
Updated: May 28, 2026
The AI data center buildout in four numbers.
$700B
Projected AI capex in 2026. Hyperscalers spent ~$130B in Q1 2026 alone. (Packy McCormick; Shanu Mathew.)
$7.6T
Cumulative through 2031. BVP, Packy McCormick, and Jamin Ball independently converge near the same number.
11%
xAI's GPU fleet utilization. Hyperscalers run 5–10%. The most expensive hardware ever built mostly sits idle.
1GW
A single hyperscale campus. That's a small city's worth of power — and power, not chips, is the real bottleneck.

A data center is a building full of computers.

Power comes in, heat comes out, compute happens in between. That's it. We went from corporate server rooms, to the cloud, to today's AI-driven explosion — but the physics never changed. The single most-quoted efficiency number is PUE (Power Usage Effectiveness): total facility power divided by the power that actually reaches the chips. A PUE of 1.2 means for every watt of compute, you burn 0.2 watts on cooling and overhead. Perfect is 1.0; nobody gets there.

Power in: grid / PPA; transformers, UPS, backup
Cooling: air → liquid → immersion
Compute racks: GPUs, networking, storage
Network out: tokens, training, inference
Heat out ↑
PUE = total facility power ÷ power that reaches the chips
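The PUE arithmetic can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical facility that draws 36 MW at the meter and delivers 30 MW to the racks; the function names are illustrative, not from any library.

```python
def pue(total_facility_kw: float, it_load_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / power reaching the chips."""
    if it_load_kw <= 0:
        raise ValueError("IT load must be positive")
    if total_facility_kw < it_load_kw:
        raise ValueError("facility power cannot be below the IT load")
    return total_facility_kw / it_load_kw


def overhead_per_compute_watt(pue_value: float) -> float:
    """Watts spent on cooling and other overhead for each watt of compute."""
    return pue_value - 1.0


# Hypothetical facility (assumption): 36 MW at the meter, 30 MW at the racks.
example_pue = pue(total_facility_kw=36_000, it_load_kw=30_000)
example_overhead = overhead_per_compute_watt(example_pue)
print(f"PUE = {example_pue:.2f}, overhead = {example_overhead:.2f} W per compute watt")
```

A PUE of 1.0 would mean zero overhead, which is why it is a floor rather than a target anyone reaches.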
Canonical POV

Everything else on this page is a consequence of that one diagram. The AI boom didn't change what a data center is — it changed the numbers. Racks went from 5–10 kW to 40–80 kW (and 1 MW is coming), which broke air cooling, which broke the grid connection, which broke the permitting timeline. Follow the heat and you'll find every bottleneck.

Why the last three years changed everything.

Training a frontier model needs orders of magnitude more compute than the cloud workloads data centers were built for. A single frontier training cluster draws double-digit megawatts, packs thousands of GPUs, and costs eight figures a year just to run. Multiply that across every lab and hyperscaler and you get the largest infrastructure capex cycle in history.
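To see how "double-digit megawatts" turns into "eight figures a year", here is a back-of-envelope sketch. The inputs are assumptions borrowed from elsewhere on this page: the ~50 MW illustrative frontier cluster and the 6.0 ¢/kWh calculator default. It counts electricity only, with the cluster at full draw all year.

```python
HOURS_PER_YEAR = 8_760  # 365 days * 24 hours


def annual_energy_cost_usd(load_mw: float, price_cents_per_kwh: float,
                           load_factor: float = 1.0) -> float:
    """Yearly electricity bill for a constant load (no cooling overhead, no staff)."""
    kwh = load_mw * 1_000 * HOURS_PER_YEAR * load_factor
    return kwh * price_cents_per_kwh / 100


# Assumed inputs: 50 MW cluster, 6.0 cents/kWh, running flat out.
cluster_cost = annual_energy_cost_usd(load_mw=50, price_cents_per_kwh=6.0)
print(f"${cluster_cost / 1e6:.1f}M per year")
```

Under those assumptions the bill is about $26M a year before cooling overhead, staff, or hardware depreciation.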

~$130B · hyperscaler capex, Q1 2026 alone
$700B · projected full-year 2026 capex
$1T+ · projected 2027 capex
$7.6T · cumulative through 2031 (convergent estimates)
Canonical POV

Three independent sources (Bessemer's Atlas, Packy McCormick, and Jamin Ball) land within rounding distance of $7.5–7.6T cumulative spend through 2031. When three models built from different starting points converge, the number is less a forecast and more a planning assumption the whole industry is already underwriting. The investable question is no longer "will it get built" but "which layer captures the margin."

What's inside, layer by layer.

From the GPU-as-a-service product at the top to the silicon at the bottom, every layer is its own market with its own incumbents and challengers.

Layer · Representative players
1 · Neoclouds (GPU-as-a-service, AI-native): CoreWeave, Crusoe, Lambda, Nebius
2 · Operations & managed services (who actually runs the cluster): Penguin Solutions, NVIDIA DGX Cloud
3 · Compute silicon (GPUs, custom ASICs, accelerators): NVIDIA, AMD, Google TPU, Cerebras, d-Matrix
4 · Networking & interconnect (spine-leaf, InfiniBand vs. Ethernet): Arista, Broadcom, Ayar Labs
5 · Cooling (why 40–80 kW racks need liquid): Motivair, CoolIT, GRC, JetCool
6 · Power & energy (the binding constraint): Constellation, Oklo, Nano Nuclear
7 · Facilities (REITs & operators; the buildings themselves): Equinix, Digital Realty, Vantage, QTS
Canonical POV

The interesting investing question is which layers convert from real-estate economics to software-like economics. The building is a commodity; the silicon is a near-monopoly (see our Semiconductor Stack map). The margin pools that look most under-served sit in the boring middle: power procurement, liquid cooling at 40–80 kW per rack, and the operations layer that keeps thousands of GPUs actually running.

The great unbundling.

The AWS / Azure / Google Cloud hyperscalers are vertically integrated, general-purpose clouds that bolted AI on. Neoclouds are GPU-first and AI-native, built specifically for training and inference. They exist because hyperscalers are slow to allocate GPU capacity, pricing is opaque, and an AI startup needs a fundamentally different product than an enterprise cloud customer. A parallel trend: sovereign clouds, as Saudi Arabia, the UAE, Japan, and France build national AI compute.

Neoclouds · Public (CRWV)

CoreWeave

The neocloud bellwether

The reference example of the category and the first to go public. GPU-first cloud purpose-built for AI, financed with an aggressive mix of equity and GPU-backed debt — roughly $28B raised across equity + debt in the 12 months through March 2026, including an $8.5B credit facility.

$5.1B · 2025 revenue (+170% YoY)
~$65B · market cap (Oct 2025)
Canonical POV

CoreWeave proved neoclouds can reach public-market scale — but its model is a leveraged bet that GPU demand and pricing hold up long enough to service GPU-collateralized debt. The bull and bear case are the same fact.

Stage: Nasdaq: CRWV (IPO March 2025).
Neoclouds · Series D

Crusoe Energy

Energy-first neocloud

Crusoe's wedge is power: it started by turning stranded and flared natural gas into compute, and now positions energy as the differentiator in a power-constrained market. $4.14B raised in total, including a $600M Series D (Founders Fund, Dec 2024) and a $750M Brookfield credit facility (June 2025).

$4.14B · total raised
$750M · Brookfield credit facility
Canonical POV

If power is the binding constraint (it is), the neocloud that owns its energy story has the most durable moat. Crusoe is the clearest expression of "the data center business is really an energy business."

Backers: Founders Fund, Brookfield.
Neoclouds · Series E

Lambda

GPU cloud for builders

One of the longest-running GPU clouds, popular with researchers and startups for on-demand and reserved capacity. ~$5.9B valuation post Series E (Nov 2025), $3B+ raised in total, with an early-2025 mix of a $480M equity raise plus a $500M GPU-backed loan — the same equity-plus-hardware-debt pattern as CoreWeave.

~$5.9B · valuation (Series E)
$3B+ · total raised
Canonical POV

Lambda's developer mindshare is the asset hyperscalers can't easily buy. The question for every neocloud is the same: do the brand and product stickiness outlast the GPU depreciation schedule?

Stage: Series E, Nov 2025.
More neoclouds
Voltage Park — launched managed Kubernetes (June 2025), moving up-stack from raw infrastructure to platform services.
Nebius — the ex-Yandex cloud unit, now a European neocloud play (NBIS).
Together AI — inference-focused, with a strong open-source model-hosting business.
Nscale — among the top AI-infra startups by total funding raised.

Every major player, mapped by layer.

Public funding and market data only; no proprietary deal-flow. Every outbound link carries ?ref=canonicalcc.

It's not the chips. It's the power.

You can buy GPUs with money. You cannot buy a grid connection with money — in some regions the interconnection queue is a 5–7 year wait. So AI builders are doing whatever it takes: signing 20-year Power Purchase Agreements, building generation behind the meter (gas, solar, nuclear), and restarting shuttered nuclear plants. A single frontier cluster is double-digit megawatts; a hyperscale campus is 500 MW to 1 GW+ — a small city.

Power draw, megawatts (illustrative)

Frontier cluster: ~50 MW
Large factory: ~100 MW
Mid-size city: ~300 MW
Hyperscale campus: 1,000 MW

Orders of magnitude, not precise figures — the point is that a single AI campus now rivals a city. That's why utilities, regulators, and grid operators are suddenly in the room.

Power & Energy · Public (CEG)

Constellation Energy

The nuclear renaissance, literally

Constellation's 20-year deal with Microsoft to restart Three Mile Island Unit 1 and supply AI compute is the single most symbolic moment of the buildout: AI demand is reviving capacity the US had written off. The economics of guaranteed, carbon-free, 20-year offtake are exactly what hyperscalers want.

Canonical POV

When AI companies sign 20-year nuclear contracts, they're telling you they believe demand is structural, not a bubble. Watch the offtake terms — they're a cleaner demand signal than any capex forecast.

Stage: Nasdaq: CEG.
Power & Energy · Public (OKLO)

Oklo

Small modular reactors

Oklo (Sam Altman-backed) is building small modular reactors (SMRs) — factory-built, smaller nuclear units that can be sited next to load. If SMRs work at cost, they are the clean behind-the-meter answer to the interconnection-queue problem.

Canonical POV

SMRs are a 2028+ story with real regulatory and cost risk, but the prize is enormous: dispatchable, carbon-free power you can build on your own timeline instead of the grid's.

Stage: NYSE: OKLO.
Also in the power layer
Nano Nuclear — micro-reactors aimed at edge and remote data centers, where grid access is the limiting factor.
Behind-the-meter & storage — on-site generation (gas, solar) plus large battery banks to smooth the spiky power profile of training runs is becoming standard kit. A wave of battery-storage startups is targeting DC power-smoothing and grid-edge backup — an under-mapped, fast-emerging layer.

The most expensive hardware ever built sits mostly idle.

xAI's fleet reportedly runs at 11% utilization; hyperscalers at 5–10% (per Cloudflare CEO Matthew Prince). Why? We spent 25 years inventing VMs, containers, and schedulers to share CPUs efficiently — almost none of that exists for GPUs yet. GPU multi-tenancy is genuinely hard: memory isolation, kernel scheduling, and security boundaries are unsolved at the level CPUs enjoy.

Utilization, % (illustrative)

CPU (cloud): ~65%
Server (pre-cloud): ~15%
xAI GPUs: 11%
Hyperscaler GPUs: 5–10%

CPUs took two decades of virtualization tooling to reach high utilization. GPUs are roughly where servers were before VMware — which is either a crisis or the biggest software opportunity in infrastructure.

Canonical POV

Current fixes (NVIDIA MIG, time-slicing, vGPU, and Kubernetes fractional-GPU schedulers such as Run:ai and ClearML) are early. There's also a distributed alternative: Exo-style pooling of consumer hardware for inference (an M4 Max can push ~3.7M tok/sec). Whoever builds "VMware for GPUs" turns a 10% utilization problem into as much as a 10x effective-supply unlock, arguably the highest-leverage software bet in the entire stack.
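The supply-unlock arithmetic is just a ratio of utilizations. A minimal sketch, using the utilization figures quoted in this section; the target levels are illustrative assumptions, not forecasts.

```python
def effective_supply_multiplier(current_util: float, target_util: float) -> float:
    """How much more useful compute the same fleet delivers at a higher utilization."""
    if not (0 < current_util <= 1 and 0 < target_util <= 1):
        raise ValueError("utilization must be in (0, 1]")
    return target_util / current_util


# Starting point: ~10% GPU utilization, as quoted above.
to_cpu_cloud_level = effective_supply_multiplier(0.10, 0.65)  # reach ~65%, the cloud-CPU figure
to_full_use = effective_supply_multiplier(0.10, 1.00)         # theoretical ceiling at 100%

print(f"10% -> 65%:  {to_cpu_cloud_level:.1f}x effective supply")
print(f"10% -> 100%: {to_full_use:.1f}x effective supply")
```

The 10x figure is the ceiling, reachable only at 100% utilization. Matching today's cloud-CPU utilization would give roughly 6.5x from the same hardware.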

How a data center actually makes money.

Three revenue models stack on top of each other: colocation (rent the space), managed hosting (rent the infrastructure), and cloud (rent the compute). Each step up the ladder trades real-estate economics for software-like margins. Revenue is locked in with take-or-pay contracts; many operators use a REIT structure. The core levers are listed below.

Inputs

Capacity: 30 MW
PUE: 1.20 (lower is better; liquid cooling pushes modern AI facilities toward ~1.1)
Power price: 6.0 ¢/kWh
Utilization: 70%
Revenue: 30 ¢/kWh (blended revenue per IT kWh delivered to customers)
Build capex: $10.0M / MW

Illustrative annual unit economics

Annual revenue
Annual power cost
Power-only gross margin
Build capex
Simple payback

Illustrative only. This models power-cost economics against blended compute revenue; it ignores staff, maintenance, financing, GPU depreciation, and contract structure. Real deals live or die on those omitted lines — this is a teaching tool, not a model.
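Here is a minimal sketch of how those levers might combine. It assumes the six inputs are IT capacity (30 MW), PUE (1.20), power price (6.0 ¢/kWh), utilization (70%), blended revenue (30 ¢ per IT kWh), and build capex ($10.0M per MW). The formulas are one plausible reading of the description above, not the page's actual calculator, and every name in the code is hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8_760


@dataclass
class Inputs:
    it_capacity_mw: float = 30.0            # assumed to be IT (rack) load, not meter load
    pue: float = 1.20
    power_price_cents_kwh: float = 6.0
    utilization: float = 0.70               # assumed: share of IT capacity sold and drawn
    revenue_cents_per_it_kwh: float = 30.0
    capex_musd_per_mw: float = 10.0


def unit_economics(i: Inputs) -> dict[str, float]:
    """Power-only annual economics; ignores staff, financing, and GPU depreciation."""
    it_kwh = i.it_capacity_mw * 1_000 * HOURS_PER_YEAR * i.utilization
    facility_kwh = it_kwh * i.pue           # the meter sees IT load plus overhead
    revenue = it_kwh * i.revenue_cents_per_it_kwh / 100
    power_cost = facility_kwh * i.power_price_cents_kwh / 100
    gross_margin = revenue - power_cost
    capex = i.it_capacity_mw * i.capex_musd_per_mw * 1e6
    payback = capex / gross_margin if gross_margin > 0 else float("inf")
    return {
        "annual_revenue": revenue,
        "annual_power_cost": power_cost,
        "power_only_gross_margin": gross_margin,
        "build_capex": capex,
        "simple_payback_years": payback,
    }


result = unit_economics(Inputs())
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

With these defaults the sketch gives roughly $55M of revenue against $13M of power cost, and a simple payback of about seven years. The omitted cost lines would lengthen that considerably.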

Where governments help, and where they block.

Data centers create jobs and tax revenue but consume enormous power and water, so local politics are messy. The map is splitting into three postures.

Blocking — moratoriums

Virginia (the largest US data-center market) is debating limits; Ireland has paused new builds; Singapore restricts capacity. NEPA and environmental review can add years to a US build. Grid operators increasingly worry data centers consume too much available power.

Incentivizing — the land grab

Texas, Ohio, and Indiana compete aggressively with tax breaks and fast permitting. For a build where time-to-power is the scarcest resource, "we'll permit you in months, not years" is worth more than the tax incentive itself.

Mandating — sovereign AI

A growing number of countries require domestic AI compute for national security and data sovereignty, creating guaranteed demand for in-country capacity (Saudi Arabia, UAE, Japan, France), often on favorable terms.

The water question

Cooling consumes water as well as power. As builds concentrate in a few favorable regions, water rights and community impact are becoming as politically charged as the megawatts — and are the most common trigger for local opposition.

Where this all goes.

A few things look close to certain, and one big question stays open.

Liquid cooling becomes standard

Air can't handle 40 kW+ per rack. With 1 MW racks arriving on the Rubin Ultra generation, facility design changes fundamentally — liquid (and immersion) move from optional to required, and that reshapes who wins the cooling layer.

The memory wars & Feynman

NVIDIA's Feynman chip (2028) pairs 3D-stacked SRAM with 16-Hi HBM — the play is to retain the training monopoly while closing the inference gap. The $20B Groq acquisition fits the same pattern: absorb potential disruption rather than compete with it.

Breaking the memory wall

Models grew ~10,000,000x in a decade; memory bandwidth didn't keep up, so accelerators spend most of their time waiting on data. Startups like Majestic Labs attack this with memory-centric servers — one shared pool giving processors ~1,000x more fast memory. Majestic claims a single rack matches the fast-memory capacity of 25 NVIDIA NVL72 Vera Rubin racks at a fraction of the power. If it holds, the unit of inference shifts from the GPU to the memory system.

The interconnect wall

As clusters pass 100k GPUs, the bottleneck moves from FLOPs to moving data between chips — and copper runs out of reach. Co-packaged optics and optical I/O (Ayar Labs) put the photons on the package; photonic interposers reimagine the wiring entirely. We map the photonic-compute pure-plays in the Semiconductor Stack.

Edge vs. centralized

Does inference migrate to the edge (consumer devices, on-prem) or stay in centralized mega-clusters? Reselling excess capacity — the AWS supply-chain-services playbook applied to compute — could turn idle GPUs into a liquid market.

The endgame question

Does inference become effectively infinite and nearly free as efficiency compounds — or does demand always outpace supply, keeping compute scarce and power the permanent constraint? The whole investment thesis hinges on which way that breaks.

The bear case, with equal rigor.

A buildout this large has a long list of ways to disappoint. The serious ones:

01 · The GPU-debt doom loop

Neoclouds finance GPUs with GPU-collateralized debt. If utilization or pricing softens, the collateral depreciates faster than the loans amortize. A demand wobble could cascade through the most leveraged operators first.
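The mechanism can be sketched as a loan-to-value path. Every number below is a hypothetical assumption chosen for illustration: a 75% loan against GPU cost, repaid in equal principal installments over five years, with GPU resale value falling by a fixed percentage each year. Nothing here describes any specific operator's debt.

```python
def loan_to_value_path(gpu_cost: float, loan_fraction: float, loan_years: int,
                       annual_value_decline: float, years: int) -> list[float]:
    """Year-end loan balance divided by collateral value, for each year."""
    loan = gpu_cost * loan_fraction
    path = []
    for year in range(1, years + 1):
        balance = max(loan * (1 - year / loan_years), 0.0)            # straight-line paydown
        collateral = gpu_cost * (1 - annual_value_decline) ** year    # geometric price decline
        path.append(balance / collateral)
    return path


# Hypothetical scenarios: GPU value falls 20% a year (base) vs 40% a year (stress).
base = loan_to_value_path(100.0, 0.75, 5, 0.20, 4)
stress = loan_to_value_path(100.0, 0.75, 5, 0.40, 4)

print("base  :", [round(x, 2) for x in base])
print("stress:", [round(x, 2) for x in stress])
```

In the base scenario the ratio falls every year, so the lender stays covered. In the stress scenario it climbs above 1.0, meaning the loan is worth more than the hardware securing it.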

02 · Capex outruns revenue

$7.6T cumulative spend assumes AI revenue scales to justify it. If monetization lags the buildout — a real risk — the industry is left with stranded, depreciating capacity and a brutal write-down cycle.

03 · Power doesn't arrive in time

5–7 year interconnection queues, SMRs that slip to the 2030s, and local opposition could mean the chips exist but can't be powered. Time-to-power, not capital, becomes the hard ceiling on growth.

04 · The utilization fix arrives

If someone builds "VMware for GPUs," effective supply could jump 5–10x without a single new building — great for buyers, but it would gut the scarcity premium underwriting today's neocloud and facility valuations.

05 · Regulatory & community backlash

Moratoriums, water-rights fights, and grid-reliability mandates are spreading. The biggest markets (Virginia, Ireland, Singapore) are already pushing back, and the politics get worse as power bills rise for everyone else.

06 · Hyperscalers re-bundle

Neoclouds exist because hyperscalers were slow. If AWS/Azure/GCP fix allocation, pricing, and product for AI workloads, the structural reason for an independent neocloud layer weakens, and the hyperscalers have far deeper balance sheets.

Canonical POV

Most of the bull case is priced into the obvious layers — silicon and the public neoclouds. We'd rather underwrite the layers where centralized alternatives are structurally inadequate and demand is non-negotiable: power procurement, liquid cooling, and the operations talent that keeps it all running. Those are the picks-and-shovels of the picks-and-shovels.

How this was built, and where the data is weakest.

Sources & approach

  • Framework: Bessemer's "Roadmap: the AI Data Center Stack" for the layer taxonomy.
  • Capex: Packy McCormick, Jamin Ball, and Shanu Mathew for the $700B/2026, $1T+/2027, and ~$7.6T cumulative figures; McKinsey and Goldman Sachs for buildout context.
  • Company data: Public funding rounds, market caps, and revenue via PitchBook, Sacra, and primary filings/press releases.
  • Refresh: Static snapshot, May 2026. Subscribe for updates.

Known limitations

  • Public only: This page deliberately excludes proprietary deal-flow, internal valuations, and any confidential Canonical portfolio information. Only publicly available data is shown.
  • Estimates: Capex trajectories and the power/utilization bars are order-of-magnitude illustrations from third-party analysts, not audited figures.
  • Calculator: The unit-economics tool is a teaching model: it ignores staff, maintenance, financing, and GPU depreciation. Don't underwrite a deal with it.
  • Moving fast: Funding, market caps, and roadmaps change weekly. Spot something stale or wrong? Tell us.

If you're building in the data center stack, we want to hear from you.

Canonical leads early-stage investments in technical teams building at the frontier of AI infrastructure — power, cooling, operations, and the software that makes compute efficient. See our portfolio and thesis.

Built by Anand Iyer at Canonical · v1.0 · Educational tool. Not investment advice. Public data only; point-in-time and not exhaustive.