Anthropic Restores Global Access to Fable 5 After US Government Ban

Anthropic has officially resumed the worldwide rollout of Fable 5, its most powerful AI model, following a two-week suspension mandated by the US government. The ban was triggered by a critical security finding involving a jailbreak vulnerability that allowed the model to bypass established safety guardrails.

The Vulnerability: From Defensive Research to Security Risk

The sudden restriction stemmed from a security report by Amazon researchers who successfully bypassed Fable 5’s safety protocols. The researchers discovered that the model could identify specific software vulnerabilities and, in one notable instance, generate functional code to exploit them.

While Anthropic characterized this as an "edge case" involving routine defensive cybersecurity work, the potential for misuse necessitated a joint investigation between the company and US government agencies. Interestingly, the investigation revealed that the ability to identify these flaws wasn't unique to Fable 5; other models, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, exhibited similar capabilities. Even smaller models like Claude Haiku 4.5 produced the same exploit results during testing.

Implementing New Safety Classifiers and the "False Positive" Trade-off

To remediate the issue, Anthropic has deployed an improved safety classifier designed to block the specific exploitation technique identified in the Amazon report with over 99% accuracy. When a user's request triggers this new layer of defense, they receive a notification, and the query is automatically rerouted to the older, more restricted Claude Opus 4.8 model.

However, this enhanced security comes with a functional cost. Anthropic admitted that the new classifier tends to flag harmless requests more frequently during standard coding and debugging tasks. This "safety margin" creates a tension between robustness and usability—a recurring challenge in frontier model deployment where preventing dangerous outputs often leads to increased "refusals" of legitimate developer queries.

A Push for Industry Standards and Government Oversight

The Fable 5 incident has accelerated Anthropic’s push for formalized, industry-wide safety standards. The company is currently collaborating with Amazon, Microsoft, and Google through the "Glasswing" program to build a framework for rating jailbreaks and triggering standardized countermeasures. To bolster this, Anthropic has launched a dedicated 24/7 monitoring team and a new HackerOne program to incentivize security researchers to report cyber-related jailbreaks.

Furthermore, Anthropic is advocating for "strong regulation" applied equally to all frontier model developers. By offering government partners pre-release access to security-sensitive models and committing significant compute for joint research, Anthropic is positioning itself as a leader in the movement toward transparent, government-aligned AI oversight.

Key Takeaways

  • Restored Access: Fable 5 is available again via Claude.ai, Claude Code, and Claude Cowork, with Pro, Max, and Team plans receiving access through July 7.
  • New Defense Layers: Anthropic implemented a safety classifier that blocks 99% of the identified exploit technique, though it may increase false positives in coding workflows.
  • Collaborative Security: Anthropic is partnering with major tech players and the US government to establish shared industry standards for monitoring and responding to frontier model jailbreaks.