Anthropic says its own new model is too dangerous for the public — but not these Big Tech companies

Anthropic is warning that its new artificial intelligence model is sophisticated enough to find security flaws that have eluded decades of research.
The company operates Claude, the AI chatbot that has been copied and repackaged into free, public models, and it is hoping to work with a consortium of tech companies to button up security measures ahead of the new model's release.
‘It has found vulnerabilities, and in some cases crafted exploits.’
Anthropic’s Mythos model of Claude AI will be available to only 40 select companies, to be used as a force for good, the company claims.
It represents “the starting point for what we think will be an industry change point, or reckoning, with what needs to happen now,” said Logan Graham, head of Anthropic’s vulnerability testing team.
The company fears that its new AI model is so good at finding cracks in cybersecurity that it must only be shared with companies it deems capable and responsible enough to prepare for possible attacks when Mythos goes public.
“This model is good at finding vulnerabilities that would be well understood and findable by security researchers,” Graham said. “At the same time, it has found vulnerabilities, and in some cases crafted exploits, sophisticated enough that they were both missed by literally decades of security researchers, as well as all the automated tools designed to find them.”
Anthropic will reportedly commit up to $100 million in credits to the project, the equivalent of what it would typically charge for that volume of its chatbot's usage.
Labeled Project Glasswing, the initiative to shore up cybersecurity will grant Mythos access to handpicked companies chosen largely from Big Tech like Amazon, Apple, Google, and Microsoft. The group is rounded out by internet infrastructure and cybersecurity giants like Broadcom, Cisco, CrowdStrike, Nvidia, and Palo Alto Networks, along with financial titan JPMorgan Chase and key open-source nonprofit the Linux Foundation.
This is not the first time an AI company has warned that its product is too dangerous for the public, and past examples may help readers gauge whether Claude is as dangerous as its creators purport it to be.
In 2019, OpenAI sent out a warning ahead of its release of GPT-2, claiming that its capabilities — now vastly eclipsed by later models — could be used to mass-produce propaganda or misleading text.
As Wired reported at the time, OpenAI said GPT-2 was too risky to be released to the general public.
Claude has been in the news for alleged missteps, leaks, and accidental postings throughout the past year, and while it may not be a household name yet, it has raced its way through the tech sector as a go-to for “agentic” work building software, apps, and even companies.
In addition to its model being open-sourced and offered to the general public for free, the company has drawn notice for “accidental” postings of its own code.
Anthropic “accidentally uploaded a file to a public repository that’s just meant to help developers understand how to use their product” and “exposed some of the source code of Claude,” reporter Aaron Holmes explained recently.
Proprietary information was further leaked in another alleged accidental posting, this time through a blog draft that revealed “internal source code.”
The company seems poised for continuing marketing battles, both of its own making and not, from its high-stakes lawsuit against the federal government, which labeled it a supply chain risk, to the blowback it has received for putting a woman closely linked to the cultish Effective Altruism movement in charge of its AI’s “Constitution.”
‘Unprecedented’: AI company documents startling discovery after thwarting ‘sophisticated’ cyberattack

In the middle of September, AI company and Claude developer Anthropic discovered “suspicious activity” while monitoring real-world cyberattacks that used artificial intelligence agents. Upon further investigation, however, the company came to realize that this activity was in fact a “highly sophisticated espionage campaign” and a watershed moment in cybersecurity.
AI agents weren’t just providing advice to the hackers, as expected.
‘The key was role-play: The human operators claimed that they were employees of legitimate cybersecurity firms.’
Anthropic’s Thursday report said the AI agents were executing the cyberattacks themselves, adding that it believed that this is the “first documented case of a large-scale cyberattack executed without substantial human intervention.”
The company’s investigation showed that the hackers, whom the report “assess[ed] with high confidence” to be a “Chinese-sponsored group,” manipulated the AI agent Claude Code into running the cyberattack.
The innovation was, of course, not simply using AI to assist in the cyberattack; the hackers directed the AI agent to run the attack with minimal human input.
The human operators tasked instances of Claude Code with operating in groups as autonomous penetration-testing orchestrators and agents, allowing the threat actor to leverage AI to execute 80-90% of tactical operations independently, at request rates no human team could physically match.
In other words, the AI agent was doing the work of a full team of competent cyberattackers, but in a fraction of the time.
While this is potentially a groundbreaking moment in cybersecurity, the AI agents were not 100% autonomous. They reportedly required human verification and struggled with hallucinations, such as presenting publicly available information as significant findings. “This AI hallucination in offensive security contexts presented challenges for the actor’s operational effectiveness, requiring careful validation of all claimed results,” the analysis explained.
Anthropic reported that the attack targeted roughly 30 institutions around the world but did not succeed in every case.
The targets included technology companies, financial institutions, chemical manufacturing companies, and government agencies.
Interestingly, Anthropic said the attackers were able to trick Claude through sustained “social engineering” during the initial stages of the attack: “The key was role-play: The human operators claimed that they were employees of legitimate cybersecurity firms and convinced Claude that it was being used in defensive cybersecurity testing.”
The report also responded to a question that is likely on many people’s minds upon learning about this development: If these AI agents are capable of executing these malicious attacks on behalf of bad actors, why do tech companies continue to develop them?
In its response, Anthropic asserted that while the AI agents are capable of major, increasingly autonomous attacks, they are also our best line of defense against said attacks.