Cybersecurity researchers aren’t happy with Anthropic Fable 5 guardrails, here is why

HIGHLIGHTS

Some researchers say Fable blocks too many normal requests.

Experts believe the AI's safeguards can be overly strict at times.

Others say strong protections are useful while the technology is still new.

Anthropic recently released its latest AI model, Fable 5, a limited version of its Mythos model, and the tool has been attracting attention since launch. While many users have found it helpful, a number of cybersecurity researchers say the extra guardrails and safety controls make it difficult to use even for basic security-related tasks. The company says the restrictions are meant to prevent misuse, but several experts argue that the system is blocking harmless requests and frustrating legitimate users. They add that the current safeguards may be too broad, affecting normal work that has little connection to cyber threats or harmful activities.

Fable was launched on Tuesday as a more widely available version of Mythos, a model that Anthropic first introduced in April through a restricted program. Mythos was designed to help protect software and critical infrastructure and was initially made available only to a select group of organisations.

According to users, Fable 5 often stops conversations when it detects topics linked to cybersecurity or biology. When this happens, the model displays a message saying its safety measures have flagged the discussion. Anthropic created these controls to reduce the risk of its technology being used to develop malware, attack software systems, or support biological weapons research.

However, many security professionals believe the model is being overly cautious. Valentina Chompie Palmiotti, a security researcher at IBM X-Force, said Fable rejects requests that are only loosely connected to cybersecurity. She noted that even asking the model to read a blog post can trigger the restrictions. Others have reported similar experiences while using the tool.

Cybersecurity veteran Matt Suiche said the model sometimes treats requests for secure coding guidance as cybersecurity work and limits its responses. He suggested that the filtering appears to rely heavily on certain keywords, causing ordinary software development discussions to be flagged.
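
To illustrate the kind of behaviour Suiche describes, here is a toy Python sketch of a purely keyword-based filter. It is not Anthropic's actual mechanism, which has not been published; the keyword list, the function name `keyword_filter` and the sample prompts are all hypothetical. The point is only that matching on words alone cannot tell a defensive coding question from a harmful one.

```python
import re

# Hypothetical keyword list, invented for illustration only.
FLAGGED_KEYWORDS = {"exploit", "malware", "injection", "payload", "vulnerability"}


def keyword_filter(prompt: str) -> set[str]:
    """Return the flagged keywords found in the prompt (empty set if none)."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return words & FLAGGED_KEYWORDS


# A defensive, secure-coding question that still contains a flagged word.
benign = "How do I use parameterized queries to prevent SQL injection in my login form?"
# An ordinary programming question with no flagged words.
unrelated = "How do I sort a list of dictionaries by a key?"

print(keyword_filter(benign))     # the secure-coding question is flagged
print(keyword_filter(unrelated))  # nothing is flagged
```

In this sketch, the first prompt is flagged because it contains the word "injection", even though the user is asking how to prevent an attack. A filter that weighed intent or context, rather than vocabulary, would be needed to separate the two cases.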

Despite the criticism, some experts believe the strict approach is understandable during the early stages of deployment. Suiche said it is better for companies to be cautious at first and gradually adjust the safeguards as they learn from real-world use.

Anthropic also offers a Cyber Verification Program that gives approved professionals broader access to cybersecurity capabilities. Similar programmes are being introduced across the industry as AI companies attempt to manage the risks associated with powerful security-focused models.

Bhaskar Sharma

Bhaskar is a senior copy editor at Digit India, where he simplifies complex tech topics across iOS, Android, macOS, Windows, and emerging consumer tech. His work has appeared in iGeeksBlog, GuidingTech, and other publications, and he previously served as an assistant editor at TechBloat and TechReloaded. A B.Tech graduate and full-time tech writer, he is known for clear, practical guides and explainers.