Claude Opus 4.8 Tops Gemini And GPT On Multiple Coding Tests

Claude Opus 4.8 Tops Gemini And GPT On Multiple Coding Tests

Anthropic released Claude Opus 4.8, claiming the upgraded model outperforms OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro on several coding benchmarks.

Key Points:

  • Anthropic launched Claude Opus 4.8 on May 28, pricing it level with the prior 4.7 release.
  • The company says it tops OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro on SWE-Bench Pro and other tests.
  • A revamped fast mode and dynamic workflows aim to cut the cost and time of agentic work.

Claude Opus 4.8 Tops Coding Benchmarks

The company unveiled the model Thursday, building on the Opus 4.7 version it shipped about six weeks earlier. Anthropic said Opus 4.8 scored 69.2% on the SWE-Bench Pro coding test, beating both rivals there and topping them on several other measures. It also reported gains in computer use, knowledge work and financial analysis, along with a 74.2% mark on the Terminal-Bench 2.1 benchmark.

Anthropic framed the release as a more honest model, saying testers found it flags its own uncertainty and stops short of unsupported claims. Internal reviews rate it about four times less likely than Opus 4.7 to let coding flaws slip past, and the firm says it scores higher on respecting user autonomy.

Also Read: Cardano Whales Seize 67.5% Of ADA Supply, A Six-Year High

Why Anthropic's Cost Controls Matter

Pricing held steady at $5 per million input tokens and $25 per million output tokens. A revamped fast mode now runs about 150% quicker and costs three times less than the earlier setting. Anthropic also opened a research preview of dynamic workflows, which spin up hundreds of parallel subagents for migrations spanning hundreds of thousands of lines of code.

Even so, the gains stay incremental.

GPT-5.5 still leads on one terminal-coding test, and Anthropic itself called the model a modest step rather than a breakthrough. Developers can now revise Claude's instructions mid-task through its Messages API. Buyers hunting cheaper AI may weigh those spending controls more heavily than the thin margins between top models.

Anthropic Valuation And Mythos Backdrop

The launch landed the same day Anthropic confirmed a $65 billion Series H round at a $965 billion valuation. That haul, led by Altimeter Capital, Dragoneer, Greenoaks and Sequoia Capital, pushed the five-year-old firm past OpenAI's reported $850 billion and lifted annual revenue near $47 billion.

The valuation nearly tripled from $380 billion in February, in what could prove Anthropic's last private raise before a stock listing. The company has held back its more powerful Mythos model, built for cybersecurity work, releasing it to only a handful of organizations over safety concerns. It now expects to widen access to Mythos-class systems for all customers in the coming weeks.

Read Next: Cisco Research Shows Frontier AI Models Failing Under Multi-Turn Attacks

Disclaimer and Risk Warning: The information provided in this article is for educational and informational purposes only and is based on the author's opinion. It does not constitute financial, investment, legal, or tax advice. Cryptocurrency assets are highly volatile and subject to high risk, including the risk of losing all or a substantial amount of your investment. Trading or holding crypto assets may not be suitable for all investors. The views expressed in this article are solely those of the author(s) and do not represent the official policy or position of Yellow, its founders, or its executives. Always conduct your own thorough research (D.Y.O.R.) and consult a licensed financial professional before making any investment decision.
Latest News
Show All News