Verse IV · Costs

Costs

Evidence totaled what training runs cost. This chapter goes underneath that, to the physical plant the whole industry runs on: what a server costs, what a rack costs, what a megawatt of datacenter costs to build, what it costs to keep that megawatt cool, and what it costs to store the data a frontier model trains on. None of these figures are leaks. They are list prices, industry capex surveys, and a research model built for exactly this purpose, current as of mid-2026.

The servers

A single Nvidia H200 GPU runs 28,000 to 45,000 dollars depending on the variant. An eight-GPU HGX H200 server, the way these chips actually ship, runs 370,000 to 400,000 dollars typical, once the NVLink fabric, network interface cards, and integration are counted alongside the GPUs themselves, near 46,000 dollars per GPU. The current rack-scale system, the GB200 NVL72, prices at 2.8 to 3.4 million dollars for the hardware alone, near 3.9 million dollars all in with networking and storage, for 72 GPUs in one rack. Divide that out and the fully integrated cost lands near 54,000 dollars per GPU, higher than the eight-GPU box, not lower. Scaling up bought more compute. It did not buy a cheaper GPU. The cost of tying that many chips together at that density is itself a line item, and it grows with the rack.

Fully integrated cost per GPU, mid-2026. Higher density did not mean a cheaper chip.

The chips that are not for sale

Google's TPUs are the other path to this kind of compute, and they are not purchasable hardware at all. TPU v6e, Trillium, prices at 2.70 dollars per chip-hour on demand, 1.89 dollars per chip-hour on a one-year commitment, through Google Cloud and nowhere else. TPU v7, Ironwood, is specified publicly, 4,614 FP8 teraflops, 192 gigabytes of HBM3E memory, 7.37 terabytes per second of memory bandwidth, but as of this writing Google has not published a per-chip price for it. There is no rack anyone can buy. Whatever efficiency Google's own silicon buys, and the architecture suggests it buys a real amount, stays inside Google Cloud, rented by the hour, never sold.

The lament of ARM64

Buy an x86 server today and the market works the way a market should. Dell, HPE, Supermicro, Lenovo, and a half dozen others compete to put a chassis around two competing CPU lines, AMD's EPYC and Intel's Xeon, and that competition shows up in the price. Try to buy a general-purpose ARM64 server on the same open market and the choice collapses to one CPU vendor, Ampere, whose AmpereOne and Altra lines are the only general-purpose ARM silicon most buyers can actually purchase. The OEMs building chassis around that one chip are Supermicro, Gigabyte, ASRock Rack, IEIT, and BYD. HPE, Wiwynn, and Dell, despite earlier involvement with Ampere's parts, are not currently on that list at all. Nvidia sells its own ARM64 Grace CPU, but only bundled into Grace-Hopper and Grace-Blackwell GPU systems, never as a standalone chassis choice the way an EPYC or a Xeon is. And the ARM silicon doing the most for efficiency right now, AWS's Graviton, Google's Axion, and Microsoft's Cobalt, reportedly up to 20 percent cheaper and 60 percent more energy-efficient than comparable x86 instances, is not for sale under any terms. It is proprietary to the cloud that built it, unavailable to anyone trying to run their own infrastructure outside the three hyperscalers that designed it for themselves. An industry that wants efficient, small-model-friendly hardware to actually proliferate is choosing between a competitive x86 market and an ARM64 market that is, in practice, one chip vendor, a handful of OEMs reselling that one chip, and the genuinely efficient silicon locked behind someone else's login.

This is the part of the lament that actually costs money. An IDC survey cited in late 2025 found 86 percent of CIOs planning to repatriate at least some workload from the public cloud back to infrastructure they own, AI spending named as a leading driver, enterprises looking at their own GPU and inference bills and asking whether owning the iron would cost less than renting someone else's. ARM64's well-documented energy advantage, up to 60 percent less power for comparable work, is exactly the kind of saving a repatriation decision should be able to capture. It mostly cannot. An enterprise repatriating onto x86 buys into a real, competitive retail market. An enterprise that wants to repatriate onto ARM64 instead finds one CPU vendor, a handful of OEMs, and a missing chassis from most of the names that would normally sell it to them, while the ARM silicon actually proven to save the most energy sits inside the very clouds that enterprise is trying to leave. The hardware gap this chapter laments is not an abstraction. It is enterprises paying x86's power bill, on their own premises, for want of a market that would sell them the cheaper alternative.

What it costs to build the room

Three figures describe the same kind of spending, measured three different ways. JLL's 2026 industry survey puts an ordinary datacenter build at 8 to 12 million dollars per megawatt, and an AI-optimized facility, built for the power density and cooling a GPU cluster needs, at 15 to 20 million dollars per megawatt or more, for the building, power infrastructure, and cooling alone, before a single server goes in a rack. Epoch AI's 2026 research model for a full one-gigawatt AI campus, sized around a fleet of GB200 NVL72 racks, puts the fully loaded figure, building, power, cooling, networking, and the servers themselves together, at 38 billion dollars up front, 38 million dollars per megawatt. The gap between that number and the construction-only figures above is the compute. A gigawatt-scale AI campus is not a building with computers in it afterward. The computers are most of the bill. Amortized over a five-year hardware lifespan and a fourteen-year facility lifespan, the same model puts the campus's annual cost of ownership, capital and operating combined, near 8.5 billion dollars a year, 8.5 million dollars per megawatt per year, of which servers and network infrastructure alone account for roughly 5 billion of that annual figure.

Cost per megawatt, mid-2026. Standard and AI-optimized figures cover building and power infrastructure only. The gigawatt-campus figure includes the servers themselves, which is why it is highest.

Power and cooling

How that megawatt gets cooled changes its running cost as much as its construction cost. Air-cooled facilities run a power usage effectiveness, the ratio of total facility power to the power the computers themselves use, of 1.4 to 1.6, meaning cooling and overhead burn 40 to 60 percent as much electricity as the compute it is cooling. Liquid cooling brings that down to 1.10 to 1.20. Immersion cooling, dunking the hardware directly in a dielectric fluid, gets a GPU-dense cluster to 1.03 to 1.08. The infrastructure to do that is not free, liquid cooling alone adds upward of 1.1 million dollars per megawatt to a build, but for a 100 megawatt facility, the efficiency gap between air and liquid cooling is worth 30 to 50 million dollars a year in electricity not spent.

Power usage effectiveness by cooling method, mid-2026. Lower is better. 1.0 would mean zero overhead.

Epoch AI's gigawatt model prices the energy itself, separate from the cooling that moves it, at 0.6 billion dollars a year. Treated as a full gigawatt running close to continuously across a year, that implies a price near seven cents per kilowatt-hour.

\frac{$0.6 B}{1 GW \times 8760 h} \approx $0.068/ kWh

That price is not fixed. It is already moving. PJM, the grid operator covering much of the mid-Atlantic and Midwest, saw its capacity price for the 2025-26 service period jump 9.3 times over, from 29 dollars per megawatt-day to 270 dollars per megawatt-day, with some locations closer to 450. PJM and the utilities involved point to forecasted datacenter load as the driver. Residents in PJM's territory are projected to see their electricity bills rise roughly 15 percent in 2026 against the pre-datacenter-boom baseline. The price per kilowatt-hour a frontier lab pays today is not the price it will pay once the grid it is straining finishes passing that strain through to everyone connected to it.

Storing the petabytes

Raw hard disk capacity, the cheapest medium available and still the right economic answer for nearline archive tiers, runs 20 to 25 dollars per terabyte in 2026, near 25,000 dollars for a full petabyte of raw drives. A single 42U rack can hold roughly a petabyte of that HDD capacity for about 25,000 dollars, or roughly four petabytes of faster QLC SSD capacity for about 1.2 million dollars, twelve times the cost per petabyte for the speed. Once redundancy, power, management, and a realistic multi-year operating window are counted rather than just the drives, a single petabyte's fully loaded cost on premises runs closer to 1.3 million dollars. Definitions already put GPT-4's training corpus near 13 trillion tokens and DeepSeek V3's near 14.8 trillion. Tokenized text itself is compact, a few bytes a token, so the cleaned corpus a model actually trains on can be a modest number of terabytes. The real storage footprint of a training run is larger than that figure alone, by a wide margin. It has to hold the raw, uncleaned data the corpus was filtered down from, the model checkpoints saved at intervals across a months-long run, each one a multi-terabyte snapshot of every parameter at trillion-parameter scale, and the optimizer state sitting alongside each checkpoint, often two to four times the size of the model itself for the optimizers this kind of training actually uses. A frontier run's storage bill is not the size of its finished dataset. It is that, plus the discarded majority of what was scraped to build it, plus a running history of the model's own intermediate states, and petabytes, plural, is the realistic unit for the whole pile.

What it costs to train one, bottom-up

Put the pieces in this chapter together and a frontier-class training cluster prices out two different ways, and the gap between those two ways is the lesson. Public discussion of a current frontier run puts the cluster size at 50,000 to 100,000 GPUs, a figure no lab has confirmed on the record. Multiply the high end of that range by this chapter's own fully-integrated GB200 NVL72 price, near 54,000 dollars per GPU, and the hardware alone prices near 5.4 billion dollars.

100, 000 \times $54, 000 \approx $5.4 B (hardware only)

That number almost certainly overstates what one training run actually costs, for the same reason DeepSeek's headline figure understated what its company actually spent. A fleet of 100,000 GPUs is not bought, used once, and retired. It trains many models across years of operation, and it serves inference traffic in between. Spreading its cost over a single run, rather than the years of work that fleet performs, inflates the figure the same way ignoring a fleet's full lifetime deflates DeepSeek's. Epoch AI's published cost model for named frontier models puts hardware at 47 to 67 percent of a training run's total cost, with staff and energy making up the rest, and finds frontier training budgets growing 2.4 times a year since 2016, crossing half a billion dollars in 2025 and heading toward one to three billion dollars by 2027. That trajectory, built independently of anything in this manifesto, lands in the same place this chapter's own numbers point to from underneath: somewhere in the high hundreds of millions to low billions of dollars for a single 2026 frontier run, and rising. Introduction called this a bill that arrives every quarter and never comes due. This chapter is what that bill is actually made of, megawatt by megawatt, rack by rack, petabyte by petabyte.

What I pay instead

Every figure above describes someone else's bill. Mine is easy to put a number on by comparison. The DGX Spark in my closet, the same one running the Qwen coding model mentioned in Evidence, draws a peak of 240 watts and, measured under real inference load on an identical unit, an average closer to 50. Left running for a year at the 2026 national average US residential rate, 18.7 cents a kilowatt-hour, that is somewhere between 82 dollars a year at realistic load and 393 dollars a year if it ran flat out, continuously, all year, which it does not. Add the 3,999 dollar purchase price as a one-time cost and the first year runs a little over 4,000 dollars. Every year after that runs under 400.

50 W \times 8760 h \times $0.187/ kWh \approx $82/ yr

Epoch AI's gigawatt campus, the same model cited earlier in this chapter, prices the energy alone, not the building, not the servers, just the electricity, at 0.6 billion dollars a year. That is roughly 1.5 million times what my closet costs to run at the worst-case, continuous-peak estimate, and more than 7 million times what it costs running the way it actually runs. The frontier and the small machine in my closet are not on the same cost curve, not by a factor that rounds to anything. They are not the same kind of expense at all.

Annual energy cost, logarithmic scale. A personal DGX Spark, realistic load, against Epoch AI's one-gigawatt AI campus model.

None of the figures in this chapter required a leak. They are what the industry itself publishes, sells, and surveys, added up in the open. The trillion-parameter frontier is not expensive because someone is hiding the bill. It is expensive because the bill is exactly this large, in public, for anyone willing to do the arithmetic.

Sources

Epoch AI. "Total cost of ownership of a one-gigawatt AI data center." May 2026. epoch.ai. The gigawatt-campus cost model: 38 billion dollars up front, 8.5 billion dollars a year amortized, assuming a GB200 NVL72 fleet.
Cottier, B., Rahman, R., Fattorini, L., Maslej, N., Besiroglu, T., and Owen, D. "The Rising Costs of Training Frontier AI Models." 2024, revised 2025. arXiv:2405.21015. The cost-composition model (hardware, staff, energy) and the 2.4x-per-year training cost growth trend cited above.
ServeTheHome. "Ampere AmpereOne Pricing and SKU List with Current OEM Partners." 2026. servethehome.com. The current OEM list for general-purpose ARM64 servers, and Dell, HPE, and Wiwynn's absence from it.
HBS. "Cloud Repatriation Trends: Cost, AI and the Push Towards Hybrid." November 2025. hbs.net. The IDC-sourced figure that 86 percent of CIOs planned to repatriate some workload from the public cloud in 2025, AI spending cited as a leading driver.
SemiAnalysis. "Are AI Datacenters Increasing Electric Bills for American Households?" 2026. newsletter.semianalysis.com. The PJM capacity price spike, 29 to 270 dollars per megawatt-day, and the projected household bill impact.
JLL Research. "2026 Market Outlook for Global Data Centers." 2026. jll.com. Standard and AI-optimized datacenter construction cost per megawatt.
NVIDIA. "Personal AI Supercomputer Powered by Blackwell: DGX Spark." 2026. nvidia.com. DGX Spark's price and 240-watt peak system power rating.
U.S. Energy Information Administration. "Electric Power Monthly." 2026. eia.gov. The 18.7 cents per kilowatt-hour average US residential electricity rate used above.