Cursor Composer 2.5 Matches Opus 4.7 at 12x Lower Cost

Cursor shipped Composer 2.5, a major upgrade to its in-house AI coding model built on the open-source Kimi K2.5 checkpoint from Moonshot.
Composer 2.5 was trained on 25 times more synthetic tasks than its predecessor; 85% of compute went to extra training and reinforcement learning.
The model scores 79.8% on SWE-Bench Multilingual, in line with Claude Opus 4.7 and GPT-5.5, and 63.2% on CursorBench v3.1.
Pricing: $0.50 per million input tokens and $2.50 per million output tokens — a fraction of Anthropic and OpenAI per-token rates.

What Happened

Cursor shipped Composer 2.5, a major upgrade to its in-house AI coding model, The Decoder reported on Monday. The model builds on the open-source Kimi K2.5 checkpoint from Moonshot and was trained on 25 times more synthetic tasks than Composer 2. Cursor says 85% of the compute budget went toward extra training and reinforcement learning.

Why It Matters

Composer 2.5 is the clearest 2026 data point that frontier-tier coding capability has effectively commoditised at the model layer. A coding model trained by Cursor on an open-source Moonshot base now matches the SWE-Bench Multilingual performance of Anthropic’s flagship Opus 4.7 and OpenAI’s GPT-5.5 — at a fraction of the per-token cost. For AI coding tools, this changes the economic equation: the differentiation moves from raw model capability to product UX, integration depth, and customer-specific tuning.

The shift is consistent with the broader pattern Ramp’s AI Index showed last week — Anthropic narrowly leads B2B share but the cheap-inference layer is picking up volume at the bottom of the market. Cursor sits in a unique position: its in-house model gives it a structural cost advantage versus tools that wrap third-party APIs.

Technical Details

On SWE-Bench Multilingual, Composer 2.5 scored 79.8%, which puts it in the same band as Opus 4.7 and GPT-5.5. On CursorBench v3.1, the company’s own coding benchmark, it reached 63.2%. Pricing is $0.50 per million input tokens and $2.50 per million output tokens; a faster variant with the same performance costs $3.00 per million input tokens and $15.00 per million output tokens. For context, Anthropic’s Claude Opus 4.7 and OpenAI’s GPT-5.5 charge per-million-token output rates that are an order of magnitude higher than Composer 2.5’s standard output rate.
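To make the per-token rates concrete, here is a minimal Python sketch that prices a single coding task on both Composer 2.5 variants. The rates are the ones quoted above; the token counts (200,000 input, 20,000 output) and the names Pricing, STANDARD, and FAST are illustrative assumptions, not figures or identifiers published by Cursor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token rates in US dollars."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the dollar cost of one request with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Rates as quoted in the article.
STANDARD = Pricing(input_per_million=0.50, output_per_million=2.50)
FAST = Pricing(input_per_million=3.00, output_per_million=15.00)

# Hypothetical task size, chosen only for illustration.
INPUT_TOKENS = 200_000
OUTPUT_TOKENS = 20_000

standard_cost = STANDARD.cost(INPUT_TOKENS, OUTPUT_TOKENS)
fast_cost = FAST.cost(INPUT_TOKENS, OUTPUT_TOKENS)

print(f"standard variant: ${standard_cost:.2f}")  # $0.15
print(f"fast variant:     ${fast_cost:.2f}")      # $0.90
```

Under these assumed token counts the task costs $0.15 on the standard variant and $0.90 on the fast one. Because both fast rates are six times the standard rates, that ratio holds for any mix of input and output tokens.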

The model is live in Cursor now. Cursor disclosed that it is already training a much larger successor model “from scratch” with SpaceX and xAI, using ten times the compute on Colossus-2, xAI’s primary training cluster in Memphis, which has one million H100 equivalents. SpaceX had previously announced plans to acquire Cursor for $60 billion.

Who’s Affected

Cursor customers — millions of developers using Cursor as their primary IDE — benefit from materially lower per-task costs. Anthropic and OpenAI face direct pricing pressure on coding workloads, which have been their fastest-growing enterprise revenue category. Other AI coding tools — GitHub Copilot, Codeium, Replit, Cognition’s Devin, Continue.dev — face the question of whether to license Composer 2.5 alongside their existing model stacks. Moonshot’s open-source Kimi K2.5 checkpoint gains validation as a starting point for capability-class derivatives.

What’s Next

The Cursor-SpaceX-xAI joint successor model on Colossus-2 will be the next major capability milestone. SpaceX’s $60 billion acquisition plan for Cursor, if completed, would be one of the largest tech-sector M&A transactions of the year. Composer 2.5 is live now; pricing and SLA details are in the Cursor model documentation. Expect Anthropic and OpenAI to respond with price-tier adjustments or coding-specific model variants in the coming weeks.
