Claude Mythos Solves 32-Step AISI Hack In 6 Of 10 Attempts

A new checkpoint of Anthropic's Claude Mythos Preview has become the first AI model to solve both UK government cyberattack simulations, raising fresh questions about autonomous hacking.

AISI Reports Mythos Breakthrough

The UK's AI Security Institute reported Wednesday that the newer Mythos checkpoint completed its 32-step corporate network attack range, "The Last Ones," in 6 of 10 attempts. The earlier version had managed just 3 of 10.

The updated model also cracked "Cooling Tower," an industrial control system range that no prior model had passed, in 3 of 10 tries.

Rival OpenAI's GPT-5.5 was tested on the same ranges. It solved "The Last Ones" in 3 of 10 attempts but did not complete "Cooling Tower."

AISI ran the ranges with a compute budget of 100 million tokens per attempt, and the agency noted that performance was still improving at that ceiling, suggesting higher budgets would push success rates further.

Doubling Time Keeps Shrinking

AISI tracks cyber progress through time horizon benchmarks, which measure how long a task a model can complete autonomously at 80% reliability. In November 2025, the agency estimated a doubling time of 8 months. By February 2026, that figure had compressed to 4.7 months, and both Mythos and GPT-5.5 have since exceeded the faster trend.

The agency acknowledged uncertainty about whether the latest results signal a new acceleration or a one-time leap.

Research nonprofit METR, which tracks AI on software tasks rather than cyber ranges, has produced a similar figure of roughly 4.2 months. AISI said the convergence strengthens the case that the trend reflects real capability gains rather than a quirk of one evaluation suite.
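To put the doubling times in perspective, the short Python sketch below converts each reported figure into an implied growth multiple over 12 months. It assumes steady exponential growth, which is an illustrative simplification and not AISI's or METR's actual methodology; the function name and the labels are hypothetical.

```python
def growth_multiple(doubling_time_months: float, elapsed_months: float = 12.0) -> float:
    """Factor by which the time horizon grows over elapsed_months,
    assuming steady exponential growth (an illustrative assumption)."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (elapsed_months / doubling_time_months)


# Doubling times in months, as quoted in the article.
reported = {
    "AISI, November 2025": 8.0,
    "AISI, February 2026": 4.7,
    "METR, software tasks": 4.2,
}

for label, months in reported.items():
    print(f"{label}: roughly {growth_multiple(months):.1f}x per year")
```

Under that assumption, an 8-month doubling time implies a bit less than a threefold increase per year, while 4.7 months implies close to sixfold.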

The institute stressed that its ranges lack active defenders, so the results show what models can do against weakly protected networks rather than hardened enterprise systems.

Why Capability Jumps Matter

The newer Mythos checkpoint did not arrive with a fresh model release. Instead, AISI received an updated build of the same model that Anthropic deployed last month through Project Glasswing, its security partnership program.

"Notable capability jumps do not always require new model releases," the institute wrote. That cuts against the assumption that defenders can pace themselves to launch cycles.

Anthropic introduced Mythos Preview on Apr. 7, framing the model as a turning point for the security industry after it identified zero-day flaws across major operating systems and browsers in internal tests. The company said it had withheld broader release because of those capabilities, and AISI's earlier April evaluation flagged Mythos as a clear step up from previous frontier systems.