Claude Opus 4.8 Is Out: Smarter, Faster, and It Actually Admits When It Is Wrong
Anthropic quietly dropped Claude Opus 4.8 on May 28, 2026 — and this time, the biggest upgrade is not just higher benchmark scores. The model now flags its own mistakes before you even notice them. Here is everything you need to know.
What is Claude Opus 4.8 and why does it matter?
Think of Opus 4.8 as Anthropic saying: “We kept the price the same, but handed you a noticeably better engineer.” This is not a radical redesign — it is a focused, precise upgrade to Opus 4.7, landing just 41 days after its predecessor. That speed is unusual even for Anthropic, and it signals something: the team is shipping fast and iterating with real purpose.
The model is available today across Claude.ai, the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. If you are already using Opus 4.7, switching is a one-line config change. No breaking API differences, same context window, same pricing structure.
Why should non-developers care?
Even if you never write a single line of code, Opus 4.8 affects you. The same model powers the Claude.ai chat interface you might use for writing, research, or analysis. It now handles long, complex tasks more reliably — and when it is not sure about something, it tells you instead of making something up. That alone is a meaningful shift.
The four features that actually change your workflow
Claude Code can now split one enormous task into hundreds of parallel sub-agents, each approaching the problem from a different angle, then verify and combine their findings automatically.
Run at 2.5x the standard speed. Now 3x cheaper than the previous generation’s fast tier. Activate it in Claude Code with the /fast command.
Users on Claude.ai can now tell the model how hard to think — from a quick response to maximum-depth reasoning. Effort levels range from default “high” up to “xhigh” and “max.”
A new developer capability on the Messages API. You can now send updated instructions to the model while it is already working — useful for long-running agent pipelines.
Dynamic Workflows in plain language
Imagine you need to migrate an entire codebase — hundreds of thousands of lines spread across dozens of files. Normally, that takes an engineering team days. With Dynamic Workflows (currently in research preview, available in Claude Code), Opus 4.8 acts as the project lead: it breaks the problem apart, assigns chunks to multiple parallel sub-agents, watches their outputs, finds contradictions, and consolidates everything into a final verified result. You hand off the task and focus on something else.
Dynamic Workflows is still a research preview. It is available inside Claude Code, not through the regular chat interface. GA availability has not been announced yet.
Benchmark numbers: what went up, what stayed flat
Anthropic positions this as a “modest but tangible improvement” — and the numbers back that up honestly. Not every metric jumped dramatically, but the gains in the areas that matter most for real work are real.
| Benchmark | Opus 4.7 | Opus 4.8 | Change |
|---|---|---|---|
| SWE-bench Pro (agentic coding) | 64.3% | 69.2% | +4.9 pts |
| SWE-bench Verified | 87.6% | 88.6% | +1.0 pt |
| USAMO 2026 (math) | 69.3% | 96.7% | +27.4 pts |
| GDPval-AA (knowledge work) | 1753 | 1890 | +137 Elo |
| GPQA Diamond (science) | 94.2% | 93.6% | -0.6 pts |
| OSWorld-Verified (computer use) | — | 83.4% | Lead |
The math score jump from 69.3% to 96.7% on USAMO 2026 is the largest single-cycle improvement seen from any Opus-class model. For anyone using Claude for quantitative research, financial modeling, or science-heavy tasks, this is a big deal.
The “honesty upgrade”: an AI that catches its own errors
Here is the part that does not show up clearly in any benchmark table, but is arguably the most important change in Opus 4.8.
AI models have always had a peculiar habit: confidently reporting success even when the underlying work is shaky. An agent might “finish” a task and tell you everything went smoothly — while silently glossing over a flaw it noticed but decided not to mention. Opus 4.8 is built to fight that tendency directly.
4x less likely to let code flaws pass silently
According to Anthropic’s internal testing, Opus 4.8 is roughly four times less likely than Opus 4.7 to allow a flaw in code it has written to go unremarked. The model raises a flag, explains the issue, and asks whether you want to proceed. That might sound like a small thing — but for teams running automated pipelines where no human is watching every step, it changes the risk profile entirely.
One early tester from Bridgewater described the change simply: the model no longer pretends to make progress when it has not. That quiet shift in behavior is more valuable than a two-point benchmark improvement will ever be.
Whether you are using Claude to write code, research a topic, or draft a report — a model that tells you when it is uncertain is one you can actually trust. Honesty is not just an alignment goal. It is a practical feature.
Pricing breakdown: standard vs. Fast Mode
One of the headline decisions from Anthropic: no price increase. Opus 4.8 costs exactly the same as Opus 4.7 at the standard tier. The only new pricing is for Fast Mode, which is significantly cheaper than the previous generation’s fast option.
$25 per 1M output tokens. Unchanged from Opus 4.7. Full reasoning depth.
$50 per 1M output. 2.5x the speed. 3x cheaper than the old fast tier. Use /fast in Claude Code.
To put Fast Mode in perspective: the previous generation’s fast pricing was roughly $30/$150 per million tokens. The new rate of $10/$50 — while still premium — represents a real cost reduction for teams running high-volume agent loops where speed matters more than maximum reasoning depth.
How it stacks up against GPT-5.5 and Gemini 3.1 Pro
Anthropic ran Opus 4.8 directly against OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro on the same benchmark suite. The results are mostly favorable for Claude — with one notable exception.
Where Opus 4.8 wins
Where GPT-5.5 still leads
That Terminal-Bench gap is the one area Anthropic does not claim to have overtaken OpenAI. For most practical software engineering work, the gap is narrow enough that it will not matter. But for highly terminal-intensive workflows, GPT-5.5 still has a measurable edge.
Opus 4.8 is the most expensive frontier model on the market by output rate — roughly 2.5x more expensive than GPT-5.5 on output. For teams where cost is the primary constraint, that difference is real. For teams where quality and reliability matter most, the investment case is stronger.
Who should upgrade right now?
Upgrade immediately if you are…
Running automated coding pipelines where silent errors are a real risk. The honesty improvements alone justify migration. Also upgrade if you are working on large-scale codebase operations — migrations, sweeps, refactors across hundreds of files. Dynamic Workflows changes the economics of what you can realistically delegate to an AI agent.
If you are doing quantitative or scientific work, the USAMO math jump from 69% to 97% is remarkable. Financial modeling, research synthesis, and data analysis all benefit here.
You can wait if you are…
Using Claude primarily for writing, brainstorming, or light research where Opus 4.7 already performs well. The improvements in this release are targeted heavily at agentic, code-heavy, and long-horizon tasks. For casual daily use, the difference will feel subtle.
Switching to claude-opus-4-8 from claude-opus-4-7 is a configuration-only change for API users. No breaking changes, same context window, same tool surface. There is essentially no reason not to make the switch if you are already on Opus 4.7.
What comes next: Mythos-class models are on the horizon
Anthropic was unusually forward about this: Opus 4.8 is not the ceiling. The company’s most advanced model, known internally as Mythos, has already been shared with a select group of companies but has not been released publicly. Anthropic said in the same breath as the Opus 4.8 announcement that Mythos-class models are expected “in the coming weeks.”
What is Mythos? Details are thin, but it is described as a tier above the entire Opus line — a separate class of model rather than a version increment. Wikipedia’s Claude page notes that Mythos was released to some companies in 2026 but remains off-limits to the general public for now.
Alongside Mythos hints, Anthropic also teased Sonnet 4.8 as a possible near-term release. The Sonnet line tends to hit the sweet spot between performance and cost for most everyday users, so a Sonnet upgrade would be meaningful for a broader audience than the Opus releases.
If the two-month release cadence Anthropic has maintained through 2026 holds, the next major model event could arrive as early as late July. Given the Mythos tease, the next announcement may be more significant than a point release.
Frequently asked questions
Yes. It went live on May 28, 2026 across Claude.ai, the Claude API (model ID: claude-opus-4-8), Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.
No. Standard pricing remains $5 per million input tokens and $25 per million output tokens. Fast Mode adds an optional $10/$50 tier at 2.5x speed.
Dynamic Workflows lets Claude Code plan a complex task and split it across hundreds of parallel sub-agents. It is currently a research preview available inside Claude Code, not the standard chat interface.
Inside Claude Code, type /fast to toggle it on. Fast Mode is also available via the Claude API for developers who request access from an account manager or join the waitlist.
Mostly, but not entirely. GPT-5.5 still leads on Terminal-Bench 2.1 (78.2% vs 74.6%). Opus 4.8 leads on agentic coding, knowledge work, math, and computer use.
Mythos is Anthropic’s most advanced model, described as a step above the Opus line. It has been shared with select enterprise partners but has not been released to the public. Anthropic says Mythos-class models are coming “in the coming weeks.”
The bottom line on Opus 4.8
Claude Opus 4.8 is a focused, honest upgrade. It does not reinvent the wheel — but it adds meaningful precision to the things developers and teams care about most: reliable long-running agents, a model that admits uncertainty, dramatically improved math, and a faster mode that finally costs less to run.
The unchanged pricing makes the upgrade decision easy for existing Opus 4.7 users. And with Mythos waiting just around the corner, Anthropic’s 2026 is shaping up to be its most consequential release year yet.
