- Mustafa Suleyman, CEO of Microsoft AI, argues that AI compute has grown by a factor of one trillion since 2010, driven by faster chips, higher-bandwidth memory, and massive GPU cluster scaling.
- Training time for a benchmark language model fell from 167 minutes on eight GPUs in 2020 to under four minutes on equivalent modern hardware, a roughly 50x improvement versus the 5x Moore's Law would have predicted.
- Research from Epoch AI cited in the essay finds the compute required to reach a fixed AI performance benchmark halves approximately every eight months, roughly twice the pace of Moore’s Law.
- Suleyman projects approximately 1,000x in effective compute by the end of 2028 and potentially 200 gigawatts of new AI capacity added annually by 2030.
What Happened
Mustafa Suleyman, CEO of Microsoft AI and co-founder of DeepMind, published an essay in MIT Technology Review on April 8, 2026, arguing that AI development is far from plateauing. He contends that three converging hardware advances have driven compute growth far beyond conventional projections, countering recent industry concern that scaling gains are diminishing.
His central data point: from 2010 to 2026, the compute used to train frontier AI models has grown by a factor of approximately one trillion, scaling from roughly 10^14 floating-point operations (flops) for early systems to over 10^26 flops for the largest models in production today.
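The trillion-fold figure is simply the ratio of the two endpoints Suleyman cites; a quick illustrative check, using the values from the essay:

```python
# Growth in frontier-model training compute, using the essay's endpoints.
early_2010s = 1e14   # total training flops for early systems
today = 1e26         # total training flops for today's largest models

factor = today / early_2010s
print(f"growth factor: {factor:.0e}")  # about 1e+12, i.e. one trillion
```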
Why It Matters
Suleyman’s essay directly engages an active debate among researchers and investors about whether AI scaling is encountering diminishing returns. His position—backed by specific hardware benchmarks and third-party research—carries institutional weight: he oversees Microsoft’s AI product strategy and infrastructure investments, including the company’s partnership with OpenAI and its custom Maia silicon program.
The piece arrives as leading labs including OpenAI, Google DeepMind, and Anthropic are committing to significantly larger training runs for next-generation models, with hyperscaler AI infrastructure spending projected in the hundreds of billions of dollars through 2027.
Technical Details
Suleyman identifies three hardware drivers behind the compute explosion. First, Nvidia GPUs delivered a sevenfold increase in raw performance over six years—from 312 teraflops in 2020 to 2,250 teraflops on current hardware. Microsoft’s Maia 200 chip, which launched in January 2026, delivers 30% better performance per dollar than any other hardware in the company’s fleet, Suleyman states.
Second, HBM3 (high bandwidth memory, third generation) stacks chips vertically and triples the data bandwidth of its predecessor, keeping processors continuously fed with data. Third, interconnect technologies such as NVLink and InfiniBand now link hundreds of thousands of GPUs into training clusters Suleyman describes as “warehouse-size supercomputers that function as single cognitive entities.”
The combined effect: a training run that took 167 minutes on eight GPUs in 2020 now completes in under four minutes on comparable modern hardware—a 50x improvement versus the 5x Moore’s Law would have predicted. Separately, citing Epoch AI research, Suleyman notes that the compute required to reach a fixed AI performance level halves approximately every eight months. Deployment costs have also declined sharply: serving costs for some recent models have fallen by a factor of up to 900 on an annualized basis.
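As a rough illustration of the Epoch AI finding (assuming, for simplicity, that the eight-month halving rate holds constant, which the research does not guarantee), the compute needed to reach a fixed benchmark shrinks by a factor of 2^(12n/8) over n years:

```python
# Compute-cost reduction implied by a fixed halving time (illustrative).
def cost_reduction(years: float, halving_months: float = 8.0) -> float:
    """Factor by which compute for a fixed benchmark shrinks over `years`."""
    return 2 ** (years * 12 / halving_months)

print(cost_reduction(2))      # 8.0 -> 8x cheaper in two years at an 8-month halving
print(cost_reduction(2, 24))  # 2.0 -> a 24-month (Moore's-Law-like) pace, for comparison
```

The gap between the two curves compounds quickly, which is why a roughly 2x faster halving time produces far more than a 2x difference over multi-year horizons.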
Who’s Affected
Suleyman projects that global AI-relevant compute will reach 100 million H100-equivalents by 2027—a tenfold increase in three years—and that effective compute will grow by another 1,000x by the end of 2028. “By 2030 we’ll bring an additional 200 gigawatts of compute online every year,” he writes, comparing that figure to the combined peak energy use of the United Kingdom, France, Germany, and Italy.
Those projections are most immediately relevant to cloud infrastructure providers—Microsoft, Google, Amazon, and Meta—currently committing to large-scale AI data center build-outs. A single refrigerator-size AI compute rack already draws 120 kilowatts, roughly the continuous power consumption of 100 homes, and the energy demands implied by Suleyman's projections will require significant grid and generation investment.
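The per-rack comparison can be sanity-checked; the arithmetic below assumes "100 homes" refers to average continuous household draw (about 1.2 kW for a typical U.S. home), and scales the same rack figure against the 200-gigawatt annual build-out:

```python
# Rack-to-homes power comparison from the essay (illustrative arithmetic).
rack_kw = 120.0               # one refrigerator-size AI compute rack
homes = 100

per_home_kw = rack_kw / homes
print(per_home_kw)            # 1.2 kW, in line with average U.S. household draw

# 200 GW of new capacity per year, expressed in racks of this size:
racks_per_year = 200e9 / (rack_kw * 1e3)   # watts / watts-per-rack
print(f"{racks_per_year:.1e}")             # ~1.7 million racks per year
```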
What’s Next
Suleyman points to falling solar costs (down approximately 100x over 50 years) and battery prices (down 97% over three decades) as a supply-side offset to energy demand growth, and frames 2028 as a near-term compute inflection point. He argues the trajectory will drive a transition from today's conversational AI tools to semiautonomous agent systems capable of running projects autonomously over multi-week horizons, though he does not specify a commercial deployment timeline for such systems.
Microsoft is scheduled to hold its Build developer conference in May 2026, where AI infrastructure and agent-layer capabilities are expected to feature prominently.