ANALYSIS

AI Compute Crunch Forces Outages, Product Cuts, and 48% GPU Price Jump

Elena Volkov · Apr 14, 2026 · 3 min read
Engine Score 8/10 — Important
  • Anthropic’s Claude API recorded 98.95% uptime over the 90 days ending April 8, 2026, well below the 99.99% standard that established cloud providers maintain, resulting in enterprise customer losses to rivals.
  • OpenAI is shutting down its Sora video generation app on April 26, 2026, redirecting freed compute toward coding and enterprise products built on a model internally codenamed Spud.
  • Spot prices for Nvidia’s latest-generation Blackwell GPUs rose 48% in two months to $4.08 per hour as of April 2026, according to the Ornn Compute Price Index.
  • Bank of America analysts project AI compute demand will exceed supply through at least 2029.

What Happened

Surging demand for agentic AI has triggered a compute capacity crisis across the industry, according to a Wall Street Journal report summarized by The Decoder’s Maximilian Schreiner on April 13, 2026. Anthropic’s Claude API recorded just 98.95% uptime over the 90 days ending April 8—well below the 99.99% standard that established cloud providers typically maintain—and the company has already lost enterprise customers as a result. OpenAI, facing a 150% surge in API token consumption between October 2025 and March 2026, announced it will shut down its Sora video generation app to redirect compute toward higher-priority products.

Why It Matters

Agentic AI workloads—where models complete multi-step tasks autonomously—consume substantially more tokens per session than conventional chat interactions, amplifying demand far beyond what providers had provisioned. The explosive growth of these workloads caught the industry off guard: since January 2026, several providers have rolled out new usage limits specifically to manage agent-driven consumption, even before the supply crunch became publicly visible. CoreWeave, one of the largest publicly traded AI cloud infrastructure companies, raised prices more than 20% toward the end of 2025 and began requiring smaller customers to sign three-year contracts rather than one-year agreements. Bank of America analysts estimate the imbalance between demand and supply will persist through at least 2029.

Technical Details

Token usage across OpenAI’s API climbed from 6 billion per minute in October 2025 to 15 billion per minute by the end of March 2026—a 150% increase in five months—per the WSJ. Spot market prices for Nvidia’s latest-generation Blackwell GPUs reached $4.08 per hour as of April 2026, up 48% from $2.75 two months earlier, according to the Ornn Compute Price Index. Anthropic’s annualized revenue rate grew from $9 billion at the end of 2025 to $14 billion in February 2026, then crossed $30 billion by April 2026—a trajectory that has outpaced the company’s ability to provision reliable capacity. In parallel, OpenAI shifted its Codex enterprise billing from flat message-based pricing to token-based metering in early April and introduced a $100 Pro tier for compute-intensive coding sessions; Windsurf replaced its credit system with daily and weekly quotas in March.
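The growth figures above can be cross-checked with simple arithmetic. This minimal Python sketch restates only the numbers already reported in the article; it introduces no new data.

```python
# Sanity check of the growth figures reported in the article.
# All inputs come directly from the text above.

def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100

# OpenAI API token throughput: 6B/min (Oct 2025) -> 15B/min (Mar 2026)
token_growth = pct_change(6e9, 15e9)
print(f"Token growth: {token_growth:.0f}%")  # 150%, matching the reported figure

# Blackwell GPU spot price: $2.75/hr -> $4.08/hr over two months
gpu_jump = pct_change(2.75, 4.08)
print(f"GPU price jump: {gpu_jump:.1f}%")  # ~48.4%, reported as 48%
```

The 48% figure in the Ornn Compute Price Index is evidently rounded; the raw change from $2.75 to $4.08 is closer to 48.4%.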

Who’s Affected

Enterprise API customers dependent on Anthropic have been among the first to migrate to competing providers. David Hsu, founder of the software platform Retool, told the WSJ that he prefers Anthropic’s Claude Opus model but recently moved workloads to OpenAI because Anthropic’s service kept going down—an indication that reliability, not model capability, is now the decisive factor in vendor selection for some teams. Developers using GitHub Copilot, Windsurf, and OpenAI’s Codex now face new usage caps and token-based metering introduced in March and April 2026; GitHub cited “rapid growth, high concurrency, and intensive usage” in its announcement of new Copilot limits on April 10.

What’s Next

OpenAI will take Sora’s web and app versions offline on April 26, 2026, with the API endpoint following in September; freed compute will be directed toward products built on a model internally codenamed Spud. OpenAI CFO Sarah Friar told the WSJ that she spends a significant portion of her time identifying near-term compute capacity and that the company is actively shelving projects because resources are unavailable. Vultr CEO J.J. Kardwell told the WSJ that available power capacity through 2026 is already fully committed, pointing to long hardware lead times and slow data center buildouts as structural bottlenecks with no near-term resolution.
