Claude Mythos: what Anthropic's cyber model means, and how to stay ahead of it

Anthropic is about to open its restricted Mythos cyber model to the public. Here's what it actually does, why the 10,000-vulnerability headline deserves scrutiny, and the one shift that matters for the software you run.

Anthropic confirmed on 22 May that it expects to bring its restricted "Mythos-class" models to all customers in the coming weeks, ending a closed beta that, until now, only a dozen named tech giants and around 40 critical-infrastructure organisations could touch. Mythos is an AI model that finds and weaponises software vulnerabilities autonomously, it is genuinely a capability step, and the marketing around it outruns what anyone can independently verify today. The practical takeaway for everyone else is simpler than the headlines suggest, and I'll get to it.

TL;DR

  • Claude Mythos (formally "Mythos Preview") is an Anthropic model built to find and exploit security vulnerabilities. Its existence leaked via a misconfigured data store in March before the official announcement on 7 April.
  • It was kept behind a closed program, Project Glasswing, because Anthropic says no one yet has safeguards strong enough to stop misuse.
  • Anthropic reports 10,000+ high/critical vulnerabilities found in a month. The publicly traceable footprint is about 40 CVEs, only one of them attributed to Glasswing itself.
  • The autonomous-discovery showcase, a FreeBSD bug, has been credibly challenged as something smaller models already find.
  • The AI-driven surge in vulnerability reports started before Mythos. NIST's vulnerability database already gave up enriching every CVE in April.
  • The real bottleneck is not finding bugs. It's patching them: 75 of 530 disclosed critical bugs fixed at last count. That is the shift that matters for you.

What Mythos actually is, and how it slipped out

Mythos became public the way a lot of things do in 2026: by accident. In March, security researchers found roughly 3,000 unpublished Anthropic assets sitting in a publicly searchable, misconfigured data store, including a draft blog post describing a new model tier (codenamed "Capybara" internally) as "by far the most powerful AI model we've ever developed" and "currently far ahead of any other AI model in cyber capabilities." Anthropic called it a configuration error and pulled the data. Three weeks later it published the real thing.

The official announcement is worth reading for one reason: it carries the bylines of named security researchers, not a communications team. That framing is deliberate, and it tells you Anthropic wants this read as research, not a product launch. The claim at the centre of it is blunt. Mythos can identify and then exploit zero-day vulnerabilities "in every major operating system and every major web browser when directed by a user to do so."

The demonstrations behind that sentence are the strongest part of the story, because they are specific and partly verifiable. Anthropic says Mythos surfaced a 27-year-old denial-of-service flaw in OpenBSD's TCP stack, autonomously exploited a 17-year-old remote code execution bug in FreeBSD's NFS server (now CVE-2026-4747), and against Firefox 147 produced working exploits in 181 of 210 attempts where the previous flagship managed two. Mozilla independently confirmed the collaboration: 271 vulnerabilities found, 423 fixes shipped in April against 31 a year earlier, with a false-positive rate under 5%. That is not a press release talking. That is the organisation that ships Firefox describing real bugs it patched.

Project Glasswing: a closed beta for the people who run the internet

Anthropic did not put Mythos on the open market. It built Project Glasswing around it: a restricted program whose launch partners read like a roll call of who keeps global infrastructure running, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA and Palo Alto Networks, plus roughly 40 more critical-software organisations. Access is gated to defensive work, priced at $25 and $125 per million input and output tokens, and backed by $100M in usage credits and $4M in donations to open-source security groups.

The stated reason for the wall is unusually candid for a company about to sell you the thing behind it: "At present, no company, including Anthropic, has developed safeguards strong enough to prevent such models from being misused." That is a direct quote from the Glasswing update. So the rollout Anthropic just announced is explicitly of "Mythos-class models," language that implies a safeguarded derivative rather than the exact tool the closed partners have been using. Cloudflare, one of those partners, noted in its own write-up that the same task "framed differently or presented in a different context, could produce completely different outcomes," which is a polite way of saying the guardrails are not consistent yet.

The closed beta has not stayed sealed. Around 23 May, users briefly spotted a claude-mythos-1-preview toggle in the public builds of Claude Code and the new Claude Security product before Anthropic pulled it. And on 21 April, a group reportedly gained unauthorised access to Mythos through a third-party vendor environment by guessing the model's URL from Anthropic's naming conventions. Anthropic said it was investigating and found no evidence its own systems were affected. It never confirmed a breach. Both incidents matter less for what they revealed than for what they signal: a tool this valuable leaks pressure at every seam.

The CVE storm is real. It didn't start with Mythos

The surge in vulnerability reports is real and measurable. CVE submissions grew 263% between 2020 and 2025, the first quarter of 2026 ran roughly a third ahead of the same period in 2025, and FIRST forecasts more than 50,000 new CVEs this year. In April, NIST's National Vulnerability Database admitted it can no longer enrich every CVE and started triaging toward known-exploited and federal software. The institution the entire industry leans on for vulnerability metadata is underwater.

But none of that is downstream of Mythos. The acceleration was already running. Google's Big Sleep found a SQLite zero-day being prepared for exploitation in July 2025 and went on to find 20+ more bugs in widely used libraries. XBOW, an autonomous AI pentester, topped HackerOne's US leaderboard in June 2025 with over a thousand submissions. Google's AI-assisted OSS-Fuzz has been open source since early 2024. A month before Glasswing launched, Linux kernel maintainer Greg Kroah-Hartman told a KubeCon audience that "something happened a month ago, and the world switched" as genuinely useful AI-found bug reports started landing. The same tooling has a darker mirror: curl shut down its bug bounty under a flood of AI-generated junk reports, the noise side of the same coin.

So Mythos is not the origin of the storm. It's an accelerant poured on a fire that was already spreading, and that distinction changes how you should weight the next part.

Where the marketing outruns the evidence

Anthropic's most-quoted number is 10,000+ high or critical vulnerabilities found in the first month, drawn from a 23,019-figure scan across 1,000+ open-source projects. An external firm sampled 1,752 of those findings and confirmed 90.6% as valid, which is a respectable rate. The problem is what you can check independently. VulnCheck's analysis of the public CVE record finds about 40 CVEs crediting Anthropic researchers, and exactly one attributed to Project Glasswing as an entity: the FreeBSD bug. CSO Online's read was that "the publicly attributable impact of Glasswing itself remains limited so far".

There's a fair defence here: CVE assignment lags discovery, embargoes hide work in progress, and Anthropic has promised a fuller accounting in July. The gap between 10,000 found and 40 publicly traceable is partly a timing artefact. It is also exactly the kind of gap that should make you slow down before repeating a vendor's headline number as fact.

The flagship demo deserves the same scepticism. Security researcher Davi Ottenheimer traced CVE-2026-4747 to University of Michigan code from 2000, patched on another branch in 2007, with the fix sitting in Mythos's likely training corpus. He reports that eight much smaller open-weight models all detected the same bug, one of them a 3.6-billion-parameter model at eleven cents per million tokens, and that a third party reproduced a working exploit using the previous public model within hours of the advisory. His conclusion is that this is scaffold-driven discovery against a chosen target, not the end-to-end frontier magic the framing implies.

The most credible counterweight to the hype is the only independent evaluation. The UK's AI Safety Institute (AISI) tested Mythos Preview and found it genuinely strong: 73% on expert capture-the-flag tasks, and completion of a 32-step network-attack simulation in 3 of 10 runs. Then it added the caveat everyone quoting the offensive numbers tends to drop. Its test environments "lack security features that are often present, such as active defenders and defensive tooling," and it "cannot say for sure whether Mythos Preview would be able to attack well-defended systems." The honest summary is that the model is dangerous against small, weakly defended targets where access has already been gained. That is a real threat. It is not the same as "breaks everything."

Even the industry theatre points the same way. Sam Altman called Anthropic's restricted release "incredible marketing", comparing it to selling a bomb shelter after announcing the bomb. Nine days later OpenAI restricted access to its own cyber model, GPT-5.5-Cyber, the same way. And a 25-year security veteran put the deflating version best to Fortune: "We've never had a problem finding vulnerabilities. We find them every day. We actually have a pile of them that we just don't fix."

What the people doing the work are saying

The most damaging review came from someone with direct access. Daniel Stenberg, who maintains curl and got into Glasswing through the Linux Foundation, signed the contract, waited weeks, and was eventually handed a proxy scan instead of the API access he was promised. Of five "confirmed security vulnerabilities" his team triaged for hours, one low-severity finding survived. His verdict was blunt: "the big hype around this model so far was primarily marketing," and "an amazingly successful marketing stunt for sure." He saw no evidence Mythos reports anything "of a novel kind or something totally new," just fresh instances of the bug classes existing tools already catch. Michal Zalewski, the author of the afl fuzzer, put a number on that: "80%+" of vulnerability research "was already automated with fuzzers."

The business-model optics drew the sharpest cynicism. A quip that recurred across the Hacker News, Tildes and Lobsters threads summed up the mood: "We sold you landmines and now you need them removed? Lucky you we also have mine clearance products." Claude Code writes the bug, Claude Security finds it, Claude Code patches it, everyone bills tokens. It's gallows humor, but it captures why a chunk of the community refuses to read this as straightforward good news.

Not everyone is sour, and the dissent is worth taking seriously. Simon Willison, no reflexive Anthropic cheerleader, landed on "I can live with that," arguing the security risks are credible enough that giving trusted teams a head start is a reasonable trade-off, even with the obvious PR-play optics. That's the honest split in the room: most practitioners accept the underlying capability is real, and still distrust the packaging wrapped around it.

The bottleneck was never finding bugs. It's fixing them

The veteran's line about the pile of bugs nobody fixes is the whole game. Anthropic's own numbers prove it: of 530 high or critical bugs disclosed to maintainers, 75 had been patched as of 22 May. A 14% fix rate. Some open-source maintainers asked Anthropic to slow its disclosures down because they had no capacity to keep up. Anthropic's own words: "Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it's limited by how quickly we can verify, disclose, and patch." Chris Hughes, who writes the Resilient Cyber newsletter, put the structural problem more sharply: "Maintainers aren't asking for better vulnerability reports. They're asking for fewer of them."

Sit with what that creates. A small group of large firms now holds detailed knowledge of unpatched vulnerabilities in software that runs nearly everywhere, while the volunteers who maintain that software cannot fix it fast enough. That is an information-asymmetry problem dressed up as a security win, and to its credit Anthropic loosened the Glasswing NDAs after criticism, explicitly permitting partners to share findings with outside teams and maintainers. Bruce Schneier's framing is the one I'd bet on: a few turbulent years while we move to "a new normal where verification is paramount and software is patched continuously." Systems that are easy to patch and easy to verify (your phone, your browser, managed services) come out ahead. Systems that can't be patched or can't be verified (cheap IoT, abandoned dependencies, unmaintained plugins) are where offence wins.

What this means for the software you actually run

The defensive move is not to panic about Mythos. It's to assume the capability is now permanent and cheap, and to act on the one thing entirely within your control: shrink your own attack surface before someone else's autonomous tooling maps it for you.

The logic is the same one defenders have always used, just with the clock sped up. If a model can scan a thousand open-source projects and surface real bugs, the projects you depend on, your CMS, your plugins, your container base images, your forgotten dependencies, are on that list whether you participate or not. The rational response is to run the same kind of audit on your own stack first. That means knowing exactly what you're running and on which versions, killing dependencies that no longer get security updates (the kernel bug I covered in copy.fail was itself surfaced by AI-assisted analysis, and the unmaintained-dependency angle is the soft underbelly here), and treating patch cadence as a first-class operational metric rather than a chore. For WordPress specifically, this is the same argument I've made before: plugins are the dominant attack surface, and once an attacker gets a foothold they move exactly the way I walked through in this hack post-mortem. What changes is how quickly the foothold gets found.

When is the panic genuinely unwarranted? If you run a hardened, well-maintained stack with no untrusted code execution, a tight dependency list, and a patch process that doesn't skip, the AISI caveat is your friend: these tools are dangerous against weakly defended systems, and you are not one. The asymmetry that should worry you isn't attacker-versus-defender in the abstract. It's well-maintained-versus-neglected. Mythos rewards the people who already did the boring work and punishes the ones who let things rot. That has always been true. It's just that the cost of neglect used to be a slow leak, and now it's a faster one.

The other shift worth internalising, beyond your own perimeter: the EU is already legislating the AI-disclosure layer of all this. If you run a chatbot or AI feature on a site, the AI Act transparency obligation lands in August 2026, and the same regulatory reflex that produced it will eventually reach offensive-capability models too. The companies that treat AI security as a deliberate, documented practice now will spend the next two years adjusting. The ones treating it as someone else's problem will spend those years reacting.

Key takeaways

  • Claude Mythos is a real capability step: an AI that autonomously finds and exploits vulnerabilities, with Mozilla-confirmed bugs to show for it. It is not science fiction and not pure marketing.
  • The 10,000-vulnerability headline is not yet independently verifiable. Roughly 40 CVEs publicly credit Anthropic researchers, and the flagship FreeBSD demo has been credibly challenged.
  • The AI-driven CVE surge predates Mythos. Google Big Sleep, XBOW and OSS-Fuzz were already running, and NIST's NVD is already overwhelmed.
  • The genuine crisis is the patching bottleneck, not the discovery capability. A 14% fix rate on disclosed critical bugs is the number that should worry you.
  • The defensive response is unglamorous and entirely in your control: know what you run, drop unmaintained dependencies, and treat patch cadence as a metric. Mythos rewards maintained systems and punishes neglected ones.

Trying to figure out what AI means for your platform or software?

AI is changing how platforms and software get built, broken and defended. I help teams separate the signal from the hype: what to adopt, what to ignore, and how to stay secure while everyone else is panicking.

Talk through your AI question

Search this site

Start typing to search, or browse the knowledge base and blog.