Anthropic Says New Claude Opus 4.8 Catches 4 Times More Of Its Errors

Anthropic Says New Claude Opus 4.8 Catches 4 Times More Of Its Errors

Anthropic released Claude Opus 4.8 on Thursday, pitching the upgraded model as more honest and less prone to inventing facts than the version it replaces.

Key Points:

  • Anthropic shipped Claude Opus 4.8 on Thursday, calling honesty its standout gain.
  • The model is roughly four times less likely to let code flaws slip past, the company says.
  • Fast mode now runs 2.5 times faster and costs three times less than before.

Anthropic Pitches Opus 4.8 Honesty

The company unveiled the model on Thursday, framing it as a steady build on Opus 4.7 rather than a reinvention, with most benchmark scores creeping up only slightly. On the SWE-Bench Pro coding test, it scored 69.2%, up from 64.3% for the prior version and ahead of OpenAI's GPT-5.5, which managed 58.6%.

Honesty drew the spotlight. Anthropic says AI models often jump to conclusions, claiming progress on thin evidence, and that early testers found 4.8 quicker to admit doubt during long, unattended tasks. Its tests indicated the model is about four times less likely than 4.7 to let coding flaws slip past unremarked.

The upgrade shipped with fresh controls, including a setting that lets users dial how hard the model works on a task, now available on every plan. Anthropic also cut the price of fast mode, where the model runs at 2.5 times normal speed, to a third of what earlier models charged.

Also Read: Kalshi Wins CFTC Approval For First U.S. Bitcoin Perpetual Futures

Pritchard Backs Opus 4.8 Judgment

Tom Pritchard, a staff engineer at Shopify, told Anthropic the coding version shows far better judgment. He said the model "asks the right questions, catches its own mistakes," and pushes back when a plan looks weak. For teams burned by AI agents that wiped live production databases, that kind of promise may carry real weight.

Not everyone was convinced.

On Reddit, many users doubted the benchmark charts, summing up the mood as nobody trusting them, while others feared losing the older Opus 4.6 they still preferred for daily work.

Opus 4.8 Caps Anthropic Surge

The launch arrived at a heady moment for the lab. Anthropic's valuation has climbed past OpenAI's near $965 billion mark after a fresh round that ranked among the largest in tech. Investors widely expect the company to pursue a public listing later this year.

The release also capped a fast run of upgrades, with Opus 4.7 reaching users barely a month earlier under its own cloud of benchmark doubt. Anthropic has since teased Mythos, a far more powerful model it is holding back from the public over cybersecurity concerns.

Read Next: Dogecoin Reserves Edge Up To 28B As Whale Support Stays Weak

Disclaimer and Risk Warning: The information provided in this article is for educational and informational purposes only and is based on the author's opinion. It does not constitute financial, investment, legal, or tax advice. Cryptocurrency assets are highly volatile and subject to high risk, including the risk of losing all or a substantial amount of your investment. Trading or holding crypto assets may not be suitable for all investors. The views expressed in this article are solely those of the author(s) and do not represent the official policy or position of Yellow, its founders, or its executives. Always conduct your own thorough research (D.Y.O.R.) and consult a licensed financial professional before making any investment decision.
Latest News
Show All News