Claude Opus 4.6 — What’s New

Released by Anthropic on 2026-02-05, focused on coding, enterprise agents, and professional workflows.

Key updates

Agent Teams

The headline feature. You can split work into multiple agents that run in parallel and coordinate directly—no more single-agent, fully-serial workflows.

1M token context (beta)

Opus now supports a 1M-token context window (beta) and up to 128K output, aimed at long-running tasks and large codebases.

Context Compaction

When context gets close to the limit, the system can automatically compress older content into a summary to keep long conversations usable.

Adaptive Thinking

Configurable “thinking intensity” with multiple levels (low/medium/high/max). Use lower levels for simple requests to reduce cost/latency; crank it up for hard problems.

Stronger coding performance

Reported Terminal-Bench 2.0 score: 65.4% (new high). Improvements across planning, review, debugging, and working inside large repos.

Zero-day vulnerability detection

Anthropic claims out-of-the-box discovery of 500+ open-source zero-days, all human-verified.

Long-context performance

MRCR v2 score reportedly 76% (vs Sonnet 4.5 at 18.5%), addressing the “gets fuzzy as the conversation grows” problem.

Office integrations

Excel can better interpret messy spreadsheets; PowerPoint generation is in preview and tries to match existing styles.

Benchmarks (as reported)

BenchmarkResult
GDPval-AA (finance/legal real work)1606 Elo, +144 over GPT-5.2
Finance agent benchmark#1
Terminal-Bench 2.065.4%

Safety

Anthropic positions Opus 4.6 as having strong safety performance and says it underwent the most rigorous evaluation process to date.


What this means for me

Agent Teams is the most exciting part. In practice, it’s the difference between “one agent does everything” and “one agent codes, one tests, one researches docs” — a big real-world speedup.

1M context + compaction matters for long projects. It reduces the painful loop of re-explaining requirements as the session grows.

Adaptive Thinking is also practical: not every request needs maximum reasoning. Being able to dial it down helps save tokens for the truly hard parts.

On pricing: I’m on the Max 5x plan ($100/month). It’s not cheap, but for heavy users it can be cost-effective—especially if Adaptive Thinking lets you stretch the Opus budget further.