- GPT-5.5 scores 82.7% on Terminal-Bench 2.0, improving on GPT-5.4’s 75.1% and exceeding Claude Opus 4.7’s 69.4% and Gemini 3.1 Pro’s 68.5% on the same benchmark.
- The model matches GPT-5.4’s per-token serving latency while using fewer tokens to complete equivalent Codex tasks.
- GPT-5.5 is available today to ChatGPT Plus, Pro, Business, and Enterprise subscribers and in Codex; API access requires additional safety review and is not yet available.
- OpenAI reports that more than 85% of its staff uses Codex weekly, spanning engineering, finance, marketing, and other functions.
What Happened
OpenAI released GPT-5.5 on April 24, 2026, making it available to ChatGPT Plus, Pro, Business, and Enterprise subscribers and to users of Codex, the company’s agentic coding environment. A higher-capability variant, GPT-5.5 Pro, is rolling out simultaneously to Pro, Business, and Enterprise users in ChatGPT. API access is not yet available; OpenAI said it is working with partners on the safety and security requirements for serving the model at scale.
Why It Matters
GPT-5.5 enters a benchmark landscape where Anthropic’s Claude Opus 4.7 and Google’s Gemini 3.1 Pro have both staked positions in agentic and computer-use performance. On Terminal-Bench 2.0 — which tests complex command-line workflows requiring planning, iteration, and tool coordination — GPT-5.5 scores 82.7% against Claude Opus 4.7’s 69.4% and Gemini 3.1 Pro’s 68.5%. On OSWorld-Verified, a benchmark for computer-use capability, GPT-5.5 scores 78.7% to Claude Opus 4.7’s 78.0%.
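Benchmarks like Terminal-Bench exercise an agent loop: the model proposes a shell command, observes the output, and iterates until the task is done. A minimal sketch of that loop is below; the `propose_command` policy is a stand-in for a model call, not the actual benchmark harness, and the commands are illustrative.

```python
# Minimal sketch of the agentic command-line loop that benchmarks such as
# Terminal-Bench exercise: plan a command, run it, observe output, iterate.
# `propose_command` is a hypothetical stand-in -- a real harness queries a model.
import subprocess

def propose_command(history):
    """Stand-in policy: return the next shell command given prior steps."""
    return "echo done" if history else "ls"

def run_episode(max_steps: int = 4):
    history = []
    for _ in range(max_steps):
        cmd = propose_command(history)
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        history.append((cmd, result.stdout))
        if cmd == "echo done":   # the policy signals completion
            break
    return history

steps = run_episode()
```

The score a benchmark reports is essentially the fraction of such episodes that end in a verified task state; harder suites add tool coordination and long multi-step plans on top of this loop.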
OpenAI framed the release as part of a broader push to build infrastructure for agentic AI, citing software engineering as the domain where AI has already produced measurable productivity gains and pointing to scientific research as the next area of focus.
Technical Details
On SWE-Bench Pro, which evaluates real-world GitHub issue resolution in a single-pass end-to-end mode, GPT-5.5 reaches 58.6%. On Expert-SWE — an internal OpenAI benchmark for long-horizon coding tasks with a median estimated human completion time of 20 hours — GPT-5.5 scores 73.1% against GPT-5.4’s 68.5%. On FrontierMath Tier 4, which tests advanced mathematical reasoning, GPT-5.5 scores 35.4% against GPT-5.4’s 27.1% and Claude Opus 4.7’s 22.9%.
OpenAI states that GPT-5.5 matches GPT-5.4’s per-token latency in real-world serving despite operating at a higher capability level, and uses significantly fewer tokens to complete equivalent Codex tasks. On Artificial Analysis’s Coding Index, the company says, GPT-5.5 delivers state-of-the-art coding intelligence at half the cost of competing frontier coding models.
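The efficiency claim follows from simple arithmetic: if per-token latency is unchanged, end-to-end task time and generation cost both scale with tokens used. The numbers below are illustrative assumptions, not figures OpenAI reported.

```python
# Back-of-envelope arithmetic: equal per-token latency plus fewer tokens
# means proportionally lower wall-clock time per task.
def task_latency_s(tokens: int, per_token_ms: float) -> float:
    """Wall-clock generation time for a task, in seconds."""
    return tokens * per_token_ms / 1000.0

# Hypothetical numbers -- NOT from OpenAI's announcement.
PER_TOKEN_MS = 25.0                      # assumed identical serving latency
OLD_TOKENS, NEW_TOKENS = 12_000, 9_000   # assumed tokens per Codex task

old = task_latency_s(OLD_TOKENS, PER_TOKEN_MS)   # 300.0 s
new = task_latency_s(NEW_TOKENS, PER_TOKEN_MS)   # 225.0 s
print(f"{100 * (1 - new / old):.0f}% faster end-to-end")  # → 25% faster
```

The same proportionality drives the cost comparison: at a fixed per-token price, a task completed in fewer tokens is cheaper by the same ratio.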
On the GDPval benchmark, which scores wins or ties against human experts on economically valuable knowledge-work tasks, GPT-5.5 scores 84.9% against GPT-5.5 Pro’s 82.3%, Claude Opus 4.7’s 80.3%, and Gemini 3.1 Pro’s 67.3%. GPT-5.5 Pro leads on BrowseComp at 90.1%, above Gemini 3.1 Pro’s 85.9% and GPT-5.5’s 84.4%.
Who’s Affected
Software developers and engineering teams are the primary users. Michael Truell, Co-founder and CEO of Cursor, said: “GPT-5.5 is noticeably smarter and more persistent than GPT-5.4, with stronger coding performance and more reliable tool use. It stays on task for significantly longer without stopping early, which matters most for the complex, long-running work our users delegate to Cursor.”
Dan Shipper, Founder and CEO of Every, tested GPT-5.5 against a real post-launch debugging scenario: could the model replicate a codebase rewrite that had required days from a senior engineer? GPT-5.5 produced a matching solution; GPT-5.4 did not. Pietro Schirano, CEO of MagicPath, reported that GPT-5.5 merged a branch with hundreds of frontend and refactor changes into a substantially changed main branch in approximately 20 minutes without manual conflict resolution.
API customers do not yet have access. OpenAI said it is working with those partners on safety and security requirements before serving GPT-5.5 at API scale.
What’s Next
OpenAI said GPT-5.5 and GPT-5.5 Pro will come to the API “very soon,” without specifying a date. The company stated it evaluated GPT-5.5 across its full safety and preparedness frameworks, conducted targeted testing for advanced cybersecurity and biology capabilities, and gathered feedback from approximately 200 early-access partners before the public release.