Anthropic Reverses Claude Fable 5 Rule That Weakened Results For Rival AI Researchers

Anthropic Reverses Claude Fable 5 Rule That Weakened Results For Rival AI Researchers

Anthropic is reversing a Claude Fable 5 policy that secretly degraded results for researchers building rival AI systems, a restriction the company said touched 0.03% of traffic.

Key Points:

  • Anthropic walked back a Fable 5 policy that silently weakened answers for frontier AI research.
  • The undisclosed limit sat inside a 319-page system card and skipped any user notification.
  • Flagged requests will now fall back openly to Claude Opus 4.8, with the reason shown each time.

Claude Fable 5 Curbs Reversed

The company confirmed the change to Wired this week, which first reported the climbdown after days of mounting anger among researchers, developers and policy analysts online. The retreat trails Tuesday's launch of Fable 5, Anthropic's first publicly available Mythos-class model, a system the lab had long withheld over its sharper knack for finding software flaws. Within hours of release, users spotted that it quietly rerouted or weakened its answers on a narrow band of advanced AI work.

Those tasks covered training competing models, debugging AI code and tuning neural networks, all flagged through a paragraph buried in a 319-page system card. Rather than block them outright, Fable 5 leaned on hidden prompt edits and steering vectors to quietly dull its replies, a curb Anthropic pegged at just 0.03% of traffic.

The fix keeps the safeguard but drops the secrecy that drew the most fire. Anthropic had defended the hidden version on the grounds that visible rules are easier to probe and dodge. Now flagged prompts will fall back openly to Claude Opus 4.8, the same path used for cyber and biology requests, and the API will soon return a clear reason for each refusal.

Also Read: Cardano Whales Roar Back To Life As ADA Tests Multi-Year Lows

Researchers Reject Secret Sabotage

Critics aimed at the secrecy itself, not the limits behind it. Anthropic had framed the curb as an extension of terms that bar using Claude to build rival systems, saying quiet enforcement kept the worst offenders from gaining ground. Dean Ball, a senior fellow at the Foundation for American Innovation, tagged the tactic "secret sabotage" and said it lent weight to the view that parts of the safety push merely shield business interests.

The phrase spread fast.

Others zeroed in on the asymmetry baked into the rule itself. Anthropic kept Fable 5 at full strength for its own staff while throttling outside teams, a split that angered open-source advocates and longtime safety allies alike. Fast AI's Jeremy Howard said the lab had vowed to undercut rivals who tried, while AI2's Nathan Lambert called the covert downgrade appalling and anti-science.

The fight capped a bruising first week for Fable 5, a model Anthropic once judged too risky to ship at all. It cleared the system for public use this week, about a week after filing confidential IPO paperwork, wagering that tighter, better-disclosed guardrails could keep its vulnerability-hunting skills in safe hands.

Read Next: OpenAI Targets Anthropic With Price Cuts Ahead Of A Pivotal IPO

Disclaimer and Risk Warning: The information provided in this article is for educational and informational purposes only and is based on the author's opinion. It does not constitute financial, investment, legal, or tax advice. Cryptocurrency assets are highly volatile and subject to high risk, including the risk of losing all or a substantial amount of your investment. Trading or holding crypto assets may not be suitable for all investors. The views expressed in this article are solely those of the author(s) and do not represent the official policy or position of Yellow, its founders, or its executives. Always conduct your own thorough research (D.Y.O.R.) and consult a licensed financial professional before making any investment decision.
Latest News
Show All News