Claude Mythos AI Built Working Exploits Across 50 Cloudflare Repos, Then Refused To Demo

Cloudflare confirmed Monday that Anthropic's unreleased Mythos Preview model chained bugs into working exploits across more than 50 of its repositories.

Cloudflare Project Glasswing Findings

The disclosure came in a blog post from Cloudflare Chief Security Officer Grant Bourzikas, who said his team pointed Mythos Preview at production code spanning the runtime, edge data path and protocol stack. Cloudflare joined Project Glasswing, Anthropic's invite-only program for defensive security partners. Bourzikas called the model "a real step forward," citing two capabilities competitors lacked.

Mythos chained several small attack primitives into working proofs of concept. The model also compiled and ran exploit code in a scratch environment, then revised its hypothesis when a run failed.

The post also flagged inconsistent refusals from the preview model.

In one case, Mythos declined to write a demonstration exploit after confirming several memory bugs in a codebase, then complied when the same task was framed differently in a separate session.

Also Read: Crypto Funds Bleed $1.07B As Iran Tensions End Six-Week Inflow Run

Multi-Agent Harness Beats Solo Scanners

Cloudflare said pointing one generic coding agent at a repository did not work for vulnerability research. Bourzikas instead built a multi-stage harness running roughly 50 parallel agents on narrow tasks. The pipeline runs reconnaissance, hunting, adversarial validation, deduplication and reachability tracing.

An independent agent tries to disprove each finding before it enters the triage queue, cutting false positives that plague memory-unsafe code written in C and C++. Anthropic has committed $100 million in model credits and $4 million in donations to open-source security groups under Project Glasswing.

Mythos Preview will not be released publicly.

Crypto Smart Contracts Face AI Exploit Wave

The Cloudflare findings land as on-chain losses mount. The Verus-Ethereum bridge lost $11 million Monday in a cross-chain attack, with proceeds swapped into 5,402 Ether (ETH).

Anthropic researchers previously showed that AI agents could autonomously exploit live contracts at a profit. In one test, models scanned 2,849 deployed contracts and produced exploits worth $3,694 for $3,476 in compute.

CertiK warned on May 15 that legacy smart contracts now sit at the center of an AI-driven hunting wave. DeFi protocols lost more than $605 million across roughly 20 days in April, including the $293 million KelpDAO drain on Apr. 19. Social engineering took another $306 million across the first quarter.