Anthropic's Mythos AI model leads rival systems at finding software vulnerabilities, but new independent benchmarks expose weaker judgment and steep running costs.
Mythos Preview Tops Source Code Audits
Offensive security firm XBOW confirmed the headline claim. The firm assembled a 10-expert team to evaluate the model across benchmarks, workflows, and integrations.
XBOW said Mythos Preview "presents a significant step up over all existing models, regardless of provider." Testers ran the model against frozen open-source applications with known vulnerabilities.
Mythos cut false negatives by 42% against Opus 4.6, with the reduction reaching 55% once the model received source code access, The Decoder reported. The model excelled at live-plus-source testing. It performed less reliably when given source code alone.
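For readers who want to see what a figure such as "42% fewer false negatives" means in practice, the short sketch below computes a relative reduction in missed vulnerabilities. The counts are hypothetical and chosen only to reproduce the reported percentages; XBOW's raw numbers are not given in the report, and the function name is an illustrative assumption.

```python
def relative_reduction(baseline_misses: int, new_misses: int) -> float:
    """Fraction by which missed findings (false negatives) dropped versus a baseline."""
    if baseline_misses <= 0:
        raise ValueError("baseline_misses must be positive")
    return (baseline_misses - new_misses) / baseline_misses


# Hypothetical counts, NOT XBOW data: the baseline model misses 100 known bugs.
baseline = 100
print(f"{relative_reduction(baseline, 58):.0%}")  # 58 misses left is a 42% cut
print(f"{relative_reduction(baseline, 45):.0%}")  # 45 misses left is a 55% cut
```

A relative reduction says nothing about the absolute miss rate, so a 42% cut is consistent with very different levels of coverage depending on where the baseline started.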
Cost Question Tempers Anthropic's Edge
Anthropic has indicated Mythos Preview will be roughly 5 times more expensive than an Opus model, already among the priciest options on the market. That premium prompted XBOW to test whether a cheaper rival could match Mythos given more runtime.
The answer was yes. On a fixed token budget for web vulnerability discovery, Mythos beat Opus 4.6 but lost to OpenAI's GPT-5.5, which XBOW recorded at a 10% miss rate. XBOW noted that Mythos "isn't terribly inefficient" if accuracy is the goal, but it is not best-in-class once costs are normalized.
The firm now recommends running a mix of models rather than relying on one.
Mythos AI Performance In Context
Mythos showed mixed judgment: it rejected false positives better than its predecessors but sometimes discarded genuine findings when the evidence failed to meet its formal criteria. Reverse engineering and native-code analysis ranked among its sharpest skills, and the model was able to triage findings from competing systems.
Anthropic first unveiled Mythos in early April, restricting access to roughly 50 partners and framing the release as a step change in AI cyber capability. The U.K. AI Security Institute later said both Mythos and GPT-5.5 had "substantially exceeded" its accelerated forecast. The agency now estimates cyber capabilities double every 4.7 months, down from an earlier eight-month figure set in November 2025.
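To put the two doubling times in perspective, the sketch below converts each into a growth multiple over a fixed period. It assumes smooth exponential growth, which is a simplification made here for illustration and not a claim from the AI Security Institute; the function name is likewise hypothetical.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Multiple by which a quantity grows over `months` if it doubles
    every `doubling_time_months` (idealized exponential growth)."""
    if doubling_time_months <= 0:
        raise ValueError("doubling_time_months must be positive")
    return 2 ** (months / doubling_time_months)


# Compare the earlier eight-month estimate with the revised 4.7-month one.
for doubling_time in (8.0, 4.7):
    factor = growth_factor(12, doubling_time)
    print(f"doubling every {doubling_time} months: about {factor:.1f}x in a year")
```

Under that assumption, the shorter doubling time roughly doubles the one-year multiple, from just under 3x to just under 6x.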





