Anthropic's Claude Mythos and the new rules of cybersecurity in the AI era

Erik Davidsson

Head of AI

Key takeaways of this article

  • AI models have crossed a threshold: they can now find and exploit software vulnerabilities at a scale and speed no human team can match.
  • Attacking is becoming a commodity. The mean time between a vulnerability becoming known and being exploited has gone from 63 days in 2018 to -7 days in 2025.
  • The same capability that lowers the cost of attacking lowers the cost of defending. Organizations that move early gain a structural advantage.

In April 2026, Anthropic announced that Mythos, one of its most powerful AI models, would not be released to the public. The reason: Mythos had proven so effective at finding software vulnerabilities that releasing it broadly would have been plainly irresponsible.

In seven weeks of testing, the model had identified thousands of previously unknown flaws across major operating systems and web browsers, including a 27-year-old vulnerability in OpenBSD, an operating system known primarily for its security.

“During our testing, we found that Mythos Preview is capable of identifying and then exploiting zero-day vulnerabilities in every major operating system and every major web browser when directed by a user to do so. The vulnerabilities it finds are often subtle or difficult to detect.”

Frontier Red Team, Anthropic

To understand what Mythos represents, the right analogy is not a new weapon. It is an arsenal. A model of this caliber does not need to be paired with a specific tool or exploit: it can decide, on its own, which technique to use against a given target, when to use it, and how to chain it with others. Mythos is like a cannon that picks its own ammunition.

Cybersecurity in the AI era is not a topic that gets resolved with an annual training course. It spans many fronts, and even the most experienced professionals can fall victim to attacks they did not see coming. Today, executing a cyberattack is a commodity, available to anyone. Every organization needs to be ready.

The first AI-orchestrated cyberattack

Only five months before Anthropic decided to withhold Mythos, the company had already disclosed what happens when a far less capable model is turned against the world.

In November 2025, Anthropic published a report describing the first AI-orchestrated cyber espionage campaign ever documented. The threat actor, attributed with high confidence to a Chinese state-sponsored group, used Claude Code to attack roughly thirty global targets across technology, finance, chemical manufacturing and government.

What made this campaign a milestone was its level of autonomy. According to Anthropic's official report, Claude Code executed 80 to 90% of the operation independently, without human intervention: reconnaissance, vulnerability discovery, exploit development, credential harvesting, lateral movement and data exfiltration. Humans intervened only at a few critical decision points per campaign. At its peak, the model made thousands of requests, often several per second, a tempo no human team could match. The attackers bypassed Claude's safety filters by tricking the model into believing it was conducting authorized defensive testing.

The Claude Code attack was carried out with a tool designed to be helpful, with safety measures actively running, on a frontier system that was not built for offensive use.

The attackers worked with the Claude 4 generation of models available at the time. Months later came Mythos, whose capabilities exceed those models on the specific axis of vulnerability discovery, which is precisely why Anthropic decided not to release it. The gap between what attackers managed to do with a general-purpose AI tool and what would be possible with a model that behaves like an offensive arsenal is reason enough for caution.

The new dynamics of AI-powered cyberattacks

Anthropic acknowledged that Mythos's offensive capabilities were not deliberately trained; rather, they emerged as a consequence of general improvements in coding and reasoning. This is not an isolated case. The UK's AI Security Institute later confirmed comparable capabilities in OpenAI's GPT-5.5 Cyber, suggesting that this is a property of frontier AI models in general.

AI is changing the dynamics of cyberattacks, including the attacker-defender balance. The main barrier that kept malicious actors from conducting cyberattacks used to be the knowledge of how to do it, and that barrier is disappearing. According to the Mandiant M-Trends 2026 report, built on more than 500,000 hours of incident response work in 2025, the mean time to exploit a vulnerability (TTE) has gone negative: from 63 days in 2018 to -7 days in 2025. This means that exploitation now begins, on average, before a patch is even available. Meanwhile, the average time to fix a known high or critical vulnerability sits at 39 days, according to Edgescan's 2026 Vulnerability Statistics Report. This asymmetry favors the attacker on a permanent basis.
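To make the asymmetry concrete, the two averages can be combined into a rough exposure window. The sketch below is only an illustration: it assumes both figures are measured from the same reference date (the day a patch becomes available), which the two reports do not guarantee, and the function name `exposure_window_days` is hypothetical.

```python
def exposure_window_days(time_to_exploit: int, time_to_fix: int) -> int:
    """Days during which exploitation is under way but the fix is not yet deployed.

    Both arguments are assumed to be counted from the same reference date.
    A negative time_to_exploit means attacks start before that date.
    """
    return max(0, time_to_fix - time_to_exploit)


# Averages quoted in the article (Mandiant for TTE, Edgescan for time to fix).
TTE_2018 = 63
TTE_2025 = -7
TIME_TO_FIX = 39

print(exposure_window_days(TTE_2018, TIME_TO_FIX))  # 0
print(exposure_window_days(TTE_2025, TIME_TO_FIX))  # 46
```

Under these assumptions, a team fixing in 39 days was on average ahead of exploitation in 2018, and is exposed for roughly a month and a half in 2025.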

The consequences are already reaching companies that are not security firms. After five years as an open-source project, scheduling platform Cal.com moved its core codebase to closed source in April 2026, citing AI-driven vulnerability discovery as the deciding factor. Bailey Pumfleet, the company's co-founder, stated: "we want to be a scheduling company, not a cybersecurity company". Cal.com is one case, not a universal trend, but it reflects the pressure this scenario is creating. That forces a question many executives have not yet asked: how does a company prepare for a risk whose cycle is measured in hours?

A narrow window to prepare

Anthropic itself estimates that capabilities similar to Mythos will proliferate to other AI labs within 6 to 18 months. OpenAI has already begun rolling out GPT-5.5 Cyber to a restricted set of defenders. The capability is on its way.

This raises (at least) two questions. What is the real window companies have to prepare? And what does “prepared” even mean when the time to exploit is measured in hours?

“Prepared” no longer means what it used to. Modern problems require modern solutions, and this one requires redesigning the operating cycle of security to run at machine speed, with leadership accountability, and with the assumption that the next vulnerability will be found and weaponized sooner rather than later.

There is, however, a silver lining: a structural asymmetry that favors organizations that move early. The same capability that has lowered the cost of attacking has also lowered the cost of defending.

AI labs are collaborating with defenders, producing results at a velocity that was operationally impossible only a year ago. For instance, Mozilla worked with Anthropic to identify 22 vulnerabilities in Firefox in two weeks, 14 of them rated as high severity, and patched them before they reached end users. In addition, Anthropic restricted Mythos to eleven organizations through Project Glasswing, a cybersecurity initiative that uses the model to secure critical software.

The companies that treat AI as an integrated operational capability are finding they can audit their own software, anticipate attacks and respond to incidents at speeds that previously required dedicated cybersecurity teams of significant size. The bottom line: incorporate it before the attacker does.

An industry still writing its own rulebook

The deeper challenge is that nobody has agreed on the rules yet, and recent events suggest agreement is further away than it seemed. Only a few days ago, the White House drafted and then halted a voluntary framework that would have given federal agencies 90 days to review frontier AI models before public release, with no replacement announced. The EU AI Act is phasing in and the UK AI Security Institute conducts independent evaluations, but the United States still has no comparable framework. The labs themselves disagree publicly. OpenAI first criticized Anthropic's restricted-release decision and then, weeks later, announced similar restrictions for its own GPT-5.5 Cyber model. Every jurisdiction and every lab is working out its answers in real time.

Meanwhile, AI agents are entering critical systems inside enterprises faster than any of this can be regulated. Earlier this year, Gartner identified agentic AI as the number one cybersecurity trend for 2026, driven not by external threats but by internal adoption running ahead of oversight. That raises a third question: is there still time to build governance for AI agents before they are embedded in critical systems? Or has that moment already passed?

We are inside the window, but it is narrowing quickly. The moment to establish cybersecurity practices and build defenses is now, not later under pressure.

The good news: defenders can use the same tools

In a March 2026 advisory written jointly by the UK's National Cyber Security Centre and the AI Security Institute, the agencies state the position plainly:

"Since frontier AI capabilities potentially strengthen cyber attackers, cyber defenders must use the same capabilities to drive defensive advantage."

National Cyber Security Centre (NCSC), United Kingdom

They identify a clear set of priorities, or “cybersecurity basics”: accurate asset inventories, robust access controls, secure configuration, and comprehensive logging. They also detail three areas where AI is most likely to deliver gains for defenders:

  1. Reducing the attack surface. AI-enabled tools can scan systems continuously at machine speed, identify vulnerabilities and misconfigurations, and map complex attack paths that would take human testers many hours to uncover. Early examples of autonomous remediation, where the system generates and applies code patches, are already in production through initiatives like Google's CodeMender or OpenAI’s Codex Security.
  2. Leveraging AI in threat detection and investigation. AI can help triage alerts, make sense of patterns across diverse logs, and detect slow, subtle intrusions that traditional approaches miss.
  3. Automating mitigation. AI can block traffic flows, quarantine suspicious processes and revoke user access at machine speed.
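
As a rough illustration of how the basics feed the first area above, the sketch below cross-references an asset inventory with scanner findings and ranks what to fix first. Everything in it is hypothetical: the `Asset` and `Finding` types, the sample hosts, and the scoring rule (severity doubled for internet-facing assets) are assumptions made for the example, not part of the NCSC advisory or of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    internet_facing: bool


@dataclass(frozen=True)
class Finding:
    asset_name: str
    issue: str
    severity: float  # assumed scale: 0.0 (informational) to 10.0 (critical)


def prioritize(
    assets: list[Asset], findings: list[Finding]
) -> tuple[list[tuple[float, Finding]], list[Finding]]:
    """Return (ranked findings, findings on assets missing from the inventory)."""
    inventory = {asset.name: asset for asset in assets}
    ranked: list[tuple[float, Finding]] = []
    unknown: list[Finding] = []
    for finding in findings:
        asset = inventory.get(finding.asset_name)
        if asset is None:
            # A finding on an unlisted host signals a gap in the inventory itself.
            unknown.append(finding)
            continue
        weight = 2.0 if asset.internet_facing else 1.0  # illustrative rule only
        ranked.append((finding.severity * weight, finding))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked, unknown


# Hypothetical sample data.
assets = [Asset("web-01", internet_facing=True), Asset("db-01", internet_facing=False)]
findings = [
    Finding("db-01", "outdated TLS library", 7.5),
    Finding("web-01", "default admin password", 6.0),
    Finding("test-99", "open debug port", 9.0),
]

ranked, unknown = prioritize(assets, findings)
for score, finding in ranked:
    print(f"{score:5.1f}  {finding.asset_name}: {finding.issue}")
for finding in unknown:
    print(f"not in inventory: {finding.asset_name} ({finding.issue})")
```

The point of the example is the dependency, not the scoring: the ranking is only as good as the inventory behind it, which is why the highest-severity finding here cannot be prioritized at all.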

The advisory closes with a clear takeaway: AI will not compensate for weak security foundations; it will amplify both strengths and weaknesses. Companies that get the fundamentals right first, and then layer AI defensive capability on top of them, will have the upper hand. The window to set those foundations is still open. The work begins now.