Anthropic Says New Claude Opus 4.8 Catches 4 Times More Of Its Errors

Anthropic released Claude Opus 4.8 on Thursday, pitching the upgraded model as more honest and less prone to inventing facts than the version it replaces.

Key Points:

Anthropic shipped Claude Opus 4.8 on Thursday, calling honesty its standout gain.

The model is roughly four times less likely to let code flaws slip past, the company says.

Fast mode now runs 2.5 times faster and costs three times less than before.

Anthropic Pitches Opus 4.8 Honesty

The company unveiled the model on Thursday, framing it as a steady build on Opus 4.7 rather than a reinvention, with most benchmark scores creeping up only slightly. On the SWE-Bench Pro coding test, it scored 69.2%, up from 64.3% for the prior version and ahead of OpenAI's GPT-5.5, which managed 58.6%.

Honesty drew the spotlight. Anthropic says AI models often jump to conclusions, claiming progress on thin evidence, and that early testers found 4.8 quicker to admit doubt during long, unattended tasks. Its tests indicated the model is about four times less likely than 4.7 to let coding flaws slip past unremarked.

The upgrade shipped with fresh controls, including a setting that lets users dial how hard the model works on a task, now available on every plan. Anthropic also cut the price of fast mode, where the model runs at 2.5 times normal speed, to a third of what earlier models charged.

Also Read: Kalshi Wins CFTC Approval For First U.S. Bitcoin Perpetual Futures

Pritchard Backs Opus 4.8 Judgment

Tom Pritchard, a staff engineer at Shopify, told Anthropic the coding version shows far better judgment. He said the model "asks the right questions, catches its own mistakes," and pushes back when a plan looks weak. For teams burned by AI agents that wiped live production databases, that kind of promise may carry real weight.

Not everyone was convinced.

On Reddit, many users doubted the benchmark charts, summing up the mood as nobody trusting them, while others feared losing the older Opus 4.6 they still preferred for daily work.

Opus 4.8 Caps Anthropic Surge

The launch arrived at a heady moment for the lab. Anthropic's valuation has climbed past OpenAI's near $965 billion mark after a fresh round that ranked among the largest in tech. Investors widely expect the company to pursue a public listing later this year.

The release also capped a fast run of upgrades, with Opus 4.7 reaching users barely a month earlier under its own cloud of benchmark doubt. Anthropic has since teased Mythos, a far more powerful model it is holding back from the public over cybersecurity concerns.