Claude Fable 5 returned on Jul. 1 to sharp user complaints, but benchmark data points to a stricter Anthropic router rather than a weaker model.
Key Points:
- BridgeBench reported a collapse in Fable 5 coding scores after most debugging tasks were routed away from the model.
- Arena.AI found mostly stable blind human-preference results, with gains in document and expert text categories.
- Developers face the clearest disruption because routine debugging prompts can trigger the new classifier.
Fable 5 Routing
Claude Fable 5 came back online on Jul. 1, and users on X quickly described it as broken, nerfed or less capable than before. The strongest evidence for that view came from BridgeMind, which reran its BridgeBench coding suite against the reinstated version.
The results looked severe. Debugging fell from 86.2 to 25.9, refactoring dropped from 73.6 to 38.4, and hallucination resistance declined from 75.9 to 61.7.
Those numbers do not show a clean model-level collapse because BridgeBench said only three of 12 TypeScript debugging tasks actually reached Fable 5. The other nine were intercepted by Anthropic’s new safety classifier and sent to Claude Opus 4.8, with each fallback scored as zero because the evaluated model did not answer.
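A minimal sketch of how zero-scored fallbacks can drag down a suite average. The per-task scores below are hypothetical, and the unweighted mean is an assumption: BridgeBench's actual aggregation method is not described here and may differ.

```python
from statistics import mean


def reported_score(scores_when_answered, total_tasks):
    """Suite score when every rerouted task counts as zero (assumed unweighted mean)."""
    return sum(scores_when_answered) / total_tasks


def answered_only_score(scores_when_answered):
    """Average over only the tasks the evaluated model actually answered."""
    return mean(scores_when_answered)


# Hypothetical scores for the 3 of 12 tasks that reached the model.
answered = [90.0, 85.0, 80.0]
total_tasks = 12

print(reported_score(answered, total_tasks))  # 21.25
print(answered_only_score(answered))          # 85.0
```

Under these illustrative numbers, the same three answers produce a suite score of 21.25 or 85.0 depending on whether the nine rerouted tasks are counted as failures.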
Anthropic Classifier
Arena.AI reached a different conclusion because it measured blind human preferences across a wider mix of prompts, including text, vision, document, code and agent tasks. Its early data showed Fable 5 holding mostly steady against the June version.
Frontend code slipped from 1650 to 1623 Elo, which Arena said remained within the confidence interval while votes accumulated. Document performance rose 34 points, expert text gained 25 points and creative writing increased by 9 points.
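For a sense of scale, the 27-point frontend gap can be run through the conventional Elo expected-score formula. This is a sketch that assumes Arena's ratings sit on the standard 400-point logistic scale, which is not confirmed here. The function name is illustrative.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected win share of A over B under the standard Elo logistic formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# June frontend rating (1650) versus the reinstated version (1623).
p = elo_expected_score(1650, 1623)
print(round(p, 3))  # 0.539
```

On that assumption, a 27-point difference corresponds to roughly a 54-to-46 preference split, which is a small margin while votes are still accumulating.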
The split suggests Fable 5 still performs like Fable 5 when prompts reach it. The problem is that security-adjacent coding work can be diverted before the model responds, especially when prompts contain terms such as vulnerability, exploit, hook or fix.
Anthropic has acknowledged that the new classifiers will generate false positives on ordinary coding and debugging work. The company said it will refine the system over time, but it has not given a target date.
The current setup follows a broader safety dispute after Amazon researchers reported a jailbreak that pushed Fable 5 to identify and demonstrate software vulnerabilities. Anthropic’s answer was a conservative classifier, which now appears to block more than the dangerous prompts it was designed to catch.