Anthropic released Claude Opus 4.6 on February 5, 2026, introducing a 1-million token context window to the Opus model class for the first time, alongside new agentic capabilities and developer-facing controls for adjusting reasoning depth and task duration.
- First Opus-class model to offer a 1M token context window, currently in beta
- Outperforms OpenAI’s GPT-5.2 by approximately 144 Elo points on GDPval-AA, a benchmark measuring economically valuable knowledge work in finance and legal domains
- Achieves top scores on Terminal-Bench 2.0 (agentic coding) and Humanity’s Last Exam (multidisciplinary reasoning), and leads all frontier models on BrowseComp
- New
/effortparameter lets developers trade reasoning depth for speed and cost, defaulting to high
What Happened
Anthropic released Claude Opus 4.6 on February 5, 2026, with a focus on extended agentic task performance, enhanced coding reliability, and — for the first time in the Opus line — a 1-million token context window available in beta. The full announcement and model card are published on Anthropic’s website.
In the announcement, Anthropic stated the model “plans more carefully, sustains agentic tasks for longer, can operate more reliably in larger codebases, and has better code review and debugging skills to catch its own mistakes.” Author details for the announcement were not available at time of publication.
The model is immediately available on claude.ai, via the API using the identifier claude-opus-4-6, and on major cloud platforms. Pricing is unchanged at $5 per million input tokens and $25 per million output tokens.
Why It Matters
The 1M token context window expands what Opus 4.6 can process in a single session — entire codebases, lengthy contract documents, or multi-year financial records — without requiring developers to split and re-inject content. This capability was previously unavailable in the Opus model line.
The GDPval-AA benchmark evaluates performance on economically valuable knowledge work in finance, legal, and related professional domains. Opus 4.6’s 144-point Elo lead over GPT-5.2 on that benchmark is a directly comparable, task-grounded measure against a commercial competitor.
Anthropic also noted that Opus 4.6 leads all frontier models on BrowseComp, a benchmark measuring a model’s ability to locate hard-to-find information online — a capability directly relevant to research and due diligence workflows.
Technical Details
On Terminal-Bench 2.0, which evaluates multi-step agentic coding tasks in a terminal environment, Opus 4.6 achieved the highest score across all models evaluated. On Humanity’s Last Exam, a complex multidisciplinary reasoning benchmark, it also leads all other frontier models benchmarked.
On GDPval-AA, Opus 4.6 outperforms OpenAI’s GPT-5.2 by approximately 144 Elo points and its direct predecessor, Claude Opus 4.5, by 190 Elo points — both on the same set of finance, legal, and professional knowledge work tasks.
Adaptive thinking allows the model to gauge from contextual cues how much extended reasoning to apply, rather than applying maximum reasoning uniformly. The /effort parameter (defaulting to high) lets developers manually dial this down to medium to reduce latency and cost on simpler tasks. Context compaction, available via the API, enables the model to summarize its own context window mid-task, allowing longer runs without hitting hard token limits.
Anthropic’s internal team noted in the announcement: “the model brings more focus to the most challenging parts of a task without being told to, moves quickly through the more straightforward parts, handles ambiguous problems with better judgment, and stays productive over longer sessions.”
Who’s Affected
Developers building agentic workflows through the Claude API gain direct access to the 1M context window in beta, context compaction, and the /effort parameter. Enterprise users in finance and legal verticals are the primary target of GDPval-AA performance claims, given that benchmark’s direct focus on those professional domains.
In Claude Code, developers can now assemble teams of agents to work in parallel on a single task. Microsoft Office users receive a related set of updates: substantial changes to Claude in Excel and a research preview release of Claude in PowerPoint. Cowork users — those using Anthropic’s autonomous multitasking workspace — gain access to Opus 4.6’s full capability set for long-horizon, multi-step tasks.
What’s Next
The 1M token context window is currently in beta and Anthropic has not confirmed a general availability date. Claude in PowerPoint was released as a research preview, indicating it is not yet feature-complete or generally available.
Anthropic’s system card for Opus 4.6 covers safety evaluations in detail, with the company stating the model shows “an overall safety profile as good as, or better than, any other frontier model in the industry, with low rates of misaligned behavior across safety evaluations.” No third-party audit of those safety claims was referenced in the announcement.