xAI, Elon Musk’s artificial intelligence company, generated approximately 72,000 metric tons of CO2 equivalent training Grok 4, according to data cited in the Stanford AI Index, published April 2026. That single training run matched the annual carbon output of roughly 15,000 US passenger vehicles. And those emissions cover only the upfront cost of building the model, before a single query was ever processed.
What 72,000 Tons of CO2 Actually Means
The EPA calculates that the average American passenger vehicle emits 4.6 metric tons of CO2 per year. At 72,000 metric tons, Grok 4’s training run matches the annual output of roughly 15,652 cars. Alternatively, it is equivalent to the annual energy-related emissions of approximately 9,600 average US homes, based on US Energy Information Administration residential figures. (The arithmetic behind all three equivalencies is sketched after the list below.)
- 15,652 cars driven for one year (EPA: 4.6 tCO2/vehicle/year)
- 9,600 US homes’ energy use for a full year (EIA residential figures)
- 8.1 million gallons of gasoline combusted (EPA: 8.887 kg CO2/gallon)
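For readers who want to check the conversions, here is a minimal sketch in Python. The EPA factors for vehicles and gasoline are the ones cited above; the per-home factor is inferred from the article’s own 72,000-ton / 9,600-home ratio rather than quoted directly from EIA.

```python
# Back-of-envelope check of the three equivalencies above.
# EPA factors: 4.6 tCO2e per vehicle-year; 8.887 kg CO2 per gallon of
# gasoline. The per-home factor (~7.5 tCO2e/year) is inferred from the
# article's own 72,000 t / 9,600 homes ratio, not an official EIA quote.

TRAINING_EMISSIONS_T = 72_000        # metric tons CO2e, Grok 4 training run
CAR_T_PER_YEAR = 4.6                 # EPA: average US passenger vehicle
GASOLINE_KG_PER_GALLON = 8.887       # EPA: CO2 per gallon of gasoline burned
HOME_T_PER_YEAR = 72_000 / 9_600     # ~7.5 t, implied by the figures above

cars = TRAINING_EMISSIONS_T / CAR_T_PER_YEAR
homes = TRAINING_EMISSIONS_T / HOME_T_PER_YEAR
gallons = TRAINING_EMISSIONS_T * 1_000 / GASOLINE_KG_PER_GALLON

print(f"~{cars:,.0f} car-years")          # ~15,652
print(f"~{homes:,.0f} home-years")        # 9,600 by construction
print(f"~{gallons / 1e6:.1f}M gallons")   # ~8.1 million
```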
That figure covers training only: the one-time capital cost of creating the model. Not inference. Not the embodied carbon in manufacturing 100,000 Nvidia H100 GPUs. Not ongoing data center cooling. Operational emissions accumulate on top of that baseline indefinitely.
How Grok 4 Compares to Other AI Models
Meta disclosed 539 metric tons of CO2 equivalent for pretraining the Llama 2 model family (7B through 70B parameters) in 2023, a figure considered substantial at the time. Grok 4’s emissions are roughly 133 times larger. The gap reflects the compounding compute intensity of frontier model training: doubling parameters typically requires far more than double the energy.
| Model | Developer | Training CO2 (tCO2eq) | Disclosed |
|---|---|---|---|
| Grok 4 | xAI | ~72,000 | No (via Stanford AI Index) |
| Llama 2 (7B–70B) | Meta | 539 | Yes (model card) |
| GPT-4 | OpenAI | Not disclosed | No |
| Gemini Ultra | Google | Not disclosed | No |
The Stanford AI Index reports that training compute has roughly doubled every nine months since 2020. If energy consumption tracks compute even loosely (hardware efficiency gains have historically offset only part of that growth), the industry’s aggregate training emissions are on a similar doubling schedule. 72,000 tons is not a ceiling. It’s a data point on an upward curve.
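A minimal projection under that assumption, in Python. Both the forward scaling of the 72,000-ton starting point and the 1:1 emissions-to-compute ratio are assumptions, not measurements.

```python
# Projection under the AI Index's observed schedule: training compute
# doubles roughly every nine months. Assumes emissions scale 1:1 with
# compute, which is an assumption; efficiency gains and grid mix shift
# the constant in practice.

DOUBLING_MONTHS = 9
BASE_EMISSIONS_T = 72_000  # Grok 4 training run, per the Stanford AI Index

for years in (1, 2, 3):
    doublings = years * 12 / DOUBLING_MONTHS
    projected = BASE_EMISSIONS_T * 2 ** doublings
    print(f"+{years}y: ~{projected:,.0f} tCO2e "
          f"(~{projected / 4.6:,.0f} car-years)")
# +1y: ~181,000 t; +2y: ~457,000 t; +3y: ~1,152,000 t
```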
Water: The Other Cost AI Labs Don’t Report
Training emissions are one-time and measurable. Inference carries a persistent environmental cost that accumulates with every query, and industry reporting on it is even thinner. GPT-4o’s inference operations consume enough water annually to meet the drinking-water needs of approximately 12 million people, according to the Stanford AI Index. That water cools data center servers — and unlike electricity, which can increasingly come from renewable sources, water evaporated in cooling towers does not return to the local watershed.
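To make the scale concrete, here is a rough inversion of the 12-million-person figure into an annual volume. The 2 liters per person per day drinking-water rate is an illustrative assumption, not a number from the AI Index.

```python
# Rough inversion of the "12 million people" figure into a volume.
# The 2 L/person/day drinking-water allotment is an illustrative
# assumption, not a figure from the Stanford AI Index.

PEOPLE = 12_000_000
LITERS_PER_PERSON_PER_DAY = 2.0   # assumed drinking-water allotment
LITERS_PER_US_GALLON = 3.785

annual_liters = PEOPLE * LITERS_PER_PERSON_PER_DAY * 365
print(f"~{annual_liters / 1e9:.1f} billion liters/year")             # ~8.8
print(f"~{annual_liters / LITERS_PER_US_GALLON / 1e9:.1f}B gallons")  # ~2.3
```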
The World Resources Institute estimates that 4 billion people already face severe water stress at least one month per year. AI data centers are not agriculture — which claims roughly 70% of global freshwater — but they are a fast-growing demand category in regions already under pressure. As AI integrations reach consumer products at scale, inference volumes accelerate and that 12-million-person figure grows with them.
Microsoft, Google, and Amazon have all acknowledged rising data center water consumption in recent sustainability filings. None have committed to binding per-model water reduction targets.
The Data Center Buildout Is Accelerating, Not Slowing
The IEA’s 2024 electricity report projected that global data center power demand could double by 2026 relative to 2022 levels, driven primarily by AI workloads. That projection now looks conservative. xAI’s “Colossus” facility in Memphis became operational in 2024 with approximately 100,000 Nvidia H100 GPUs and is already expanding. Meanwhile, Nebius has committed $10 billion to a new AI data center in Finland, a deliberate choice to access a grid dominated by carbon-free generation.
Grid mix determines actual carbon intensity. A training run powered by Norwegian hydroelectricity produces a fraction of the emissions of the same computation on a coal-heavy regional grid. xAI’s Memphis facility draws from the Tennessee Valley Authority, which as of 2024 operated a mix of roughly 40% nuclear, 25% natural gas, 14% coal, and 20% renewables. That is better than the US average but far from carbon-free, meaning different siting alone would have produced a number lower than 72,000 tons.
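A sketch of that siting arithmetic, with the assumptions stated plainly: the emission factors are approximate lifecycle values in the neighborhood of IPCC medians, the TVA mix is the one quoted above, and the training energy is back-calculated from the 72,000-ton figure rather than disclosed by xAI.

```python
# How grid mix drives the headline number. Emission factors are rough
# lifecycle values near IPCC medians (gCO2e/kWh), not xAI disclosures.

FACTORS = {"nuclear": 12, "gas": 490, "coal": 820, "renewables": 30}
TVA_MIX = {"nuclear": 0.40, "gas": 0.25, "coal": 0.14, "renewables": 0.20}

def intensity(mix):
    """Weighted grid carbon intensity in gCO2e/kWh."""
    return sum(share * FACTORS[fuel] for fuel, share in mix.items())

tva_g_per_kwh = intensity(TVA_MIX)            # ~248 gCO2e/kWh
# Training energy implied if 72,000 t were emitted at TVA-like intensity:
implied_kwh = 72_000 * 1e6 / tva_g_per_kwh    # ~290 GWh (back-calculated)
# The same run on an all-renewables grid (~30 gCO2e/kWh):
renewable_tons = implied_kwh * FACTORS["renewables"] / 1e6

print(f"TVA-like intensity: {tva_g_per_kwh:.0f} gCO2e/kWh")
print(f"Implied training energy: ~{implied_kwh / 1e9:.2f} TWh")
print(f"Same run, all-renewables grid: ~{renewable_tons:,.0f} tCO2e")
```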
The Disclosure Gap the Industry Has Chosen Not to Close
Meta’s Llama 2 model card remains an anomaly: a major AI lab voluntarily publishing per-model emissions data in a standardized public format. It has not been widely replicated. OpenAI, Google DeepMind, Anthropic, and xAI publish no equivalent. This is a strategic choice, not a technical constraint — the data exists in energy billing records and power purchase agreements those labs hold internally.
The EU AI Act, which began phased enforcement in 2025, includes provisions for environmental transparency in high-risk AI systems but stops short of mandatory per-model emissions disclosure. The US has no equivalent regulation pending. What reporting exists comes through cloud provider sustainability reports — aggregated across workloads, never attributed to individual models or customers. The Stanford AI Index’s ability to report Grok 4’s 72,000-ton figure demonstrates the methodology exists. Labs simply choose not to apply it publicly.
Why the Trajectory Matters More Than Any Single Number
The AI industry is competing on benchmark performance, and the fastest path to higher benchmarks has been more compute. That equation has no built-in environmental brake. Electricity and water are cheap relative to the revenue value of frontier model capabilities. As investment in frontier AI continues to accelerate, the pressure to train larger models on compressed timelines only intensifies.
If Grok 4’s successor uses 10 times the compute, a plausible figure given stated scaling roadmaps across multiple labs, training emissions reach approximately 720,000 metric tons: the annual output of more than 156,000 cars. The generation after that pushes into the millions of tons. This is not speculation; it is the directional logic of every major lab’s publicly stated development plan.
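The generational chain in that paragraph, made explicit. The 10x-per-generation multiplier is the paragraph’s stated assumption, not a disclosed roadmap figure.

```python
# The successor math above: each generation assumed to use 10x the
# compute, with emissions scaling 1:1 (an assumption, as before).

emissions_t = 72_000
for label in ("Grok 4", "successor (10x)", "generation after (100x)"):
    print(f"{label}: ~{emissions_t:,.0f} tCO2e "
          f"(~{emissions_t / 4.6:,.0f} car-years)")
    emissions_t *= 10
```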
The AI industry successfully deferred broad regulatory intervention in the US in 2024 and 2025, framing safety concerns as technically premature. The carbon and water debate is structurally harder to defer: the harms are measurable now, the accounting methodology is established, and the industrial emissions regulatory framework has decades of precedent. Voluntary disclosure has not historically been sufficient to curb industrial carbon trajectories. Mandatory per-model reporting — with standardized methodology and public filing requirements — is the mechanism that would produce accurate data. The industry’s current posture is that it prefers to operate without that data on the public record.
For enterprises with ESG commitments or operating under climate disclosure frameworks: inference from an AI provider that cannot produce per-model emissions data is an unquantified liability in your Scope 3 accounting. The ask is simple: request the number from your vendor. If they cannot provide it, that is itself material information for procurement decisions.