Claude Mythos & Project Glasswing: Anthropic's Most Powerful Model Is Too Dangerous to Release

April 8, 2026

TL;DR

  • On April 7, 2026, Anthropic unveiled Claude Mythos Preview, an unreleased frontier model that has already found thousands of zero-day vulnerabilities in every major operating system and web browser — including a bug hiding in OpenBSD for 27 years.
  • Anthropic is not making Mythos publicly available. Instead, it launched Project Glasswing — a $100M initiative with AWS, Apple, Google, Microsoft, NVIDIA, Cisco, CrowdStrike, Broadcom, JPMorgan Chase, Palo Alto Networks, and the Linux Foundation — to use the model defensively before similar capabilities reach attackers.
  • Mythos doesn't just edge out Claude Opus 4.6. On SWE-bench Pro it scores 77.8% vs 53.4% — a 24-point leap on a benchmark designed to be brutally hard.
  • The strategic message for every business owner: the "AI cybersecurity gap" between attackers and defenders is about to widen dramatically. Most SMBs are not ready.
  • This post breaks down what we actually know (with sources), what's hype, and the three concrete things European B2B businesses should be doing right now.

What Is Claude Mythos Preview?

Claude Mythos Preview is a new general-purpose frontier model from Anthropic, a tier above the currently public Claude Opus 4.6. The company describes it as an unreleased model showing that AI systems have reached coding capabilities that surpass all but the most skilled humans at finding and exploiting software vulnerabilities.

The existence of Mythos wasn't supposed to be public yet. A misconfiguration in Anthropic's content management system in late March accidentally revealed the company was working on a new model tier larger and more capable than Opus. That leak forced the conversation into the open earlier than Anthropic likely planned.

What makes Mythos different from every other frontier launch in the last two years is that Anthropic is explicitly choosing not to release it to the public. Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, told VentureBeat the company does not plan to make Claude Mythos Preview generally available because of its cybersecurity capabilities. This is one of the first times a major AI lab has held back a flagship model over a specific, demonstrated societal risk rather than as safety-policy theater.


The Benchmarks: A Generational Jump, Not an Increment

Anthropic published a benchmark sheet alongside the announcement, and the gap between Mythos Preview and the current public frontier (Claude Opus 4.6) isn't subtle. It's the kind of jump we last saw between GPT-3.5 and GPT-4.

| Benchmark | Claude Mythos Preview | Claude Opus 4.6 | What It Measures |
|---|---|---|---|
| SWE-bench Verified | 93.9% | 80.8% | Real-world software engineering tasks |
| SWE-bench Pro | 77.8% | 53.4% | The hardest tier of coding benchmarks |
| SWE-bench Multilingual | 87.3% | 77.8% | Coding across programming languages |
| CyberGym | 83.1% | 66.6% | Vulnerability analysis by AI agents |
| Humanity's Last Exam (no tools) | 56.8% | 40.0% | Raw reasoning on near-impossible problems |
| Humanity's Last Exam (with tools) | 64.7% | 53.1% | Tool-augmented reasoning |
| BrowseComp | 86.9% | 83.7% | Multi-step web research |

Numbers sourced from Anthropic's own benchmark publication, reported by OfficeChai, which noted that on SWE-bench Pro Mythos beats Opus 4.6 by 24 points and exceeds GPT-5.3-Codex's previous leading score by more than 21 points.

To put that in perspective: on SWE-bench Verified alone, Mythos's 93.9% would sit more than 13 points above any publicly available model on the market today. This isn't a quarterly update. It's a reset of the leaderboard.
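For readers who want to check the gaps themselves, here is a minimal sketch that recomputes the point differences from the published scores. The numbers are copied from the benchmark table in this post; the names `SCORES` and `point_gaps` are just illustrative.

```python
# Scores copied from the benchmark table in this post:
# (Claude Mythos Preview, Claude Opus 4.6), in percent.
SCORES = {
    "SWE-bench Verified": (93.9, 80.8),
    "SWE-bench Pro": (77.8, 53.4),
    "SWE-bench Multilingual": (87.3, 77.8),
    "CyberGym": (83.1, 66.6),
    "Humanity's Last Exam (no tools)": (56.8, 40.0),
    "Humanity's Last Exam (with tools)": (64.7, 53.1),
    "BrowseComp": (86.9, 83.7),
}


def point_gaps(scores):
    """Return {benchmark: Mythos minus Opus} in percentage points, rounded to 1 decimal."""
    return {name: round(mythos - opus, 1) for name, (mythos, opus) in scores.items()}


GAPS = point_gaps(SCORES)

if __name__ == "__main__":
    # Print the benchmarks from largest gap to smallest.
    for name, gap in sorted(GAPS.items(), key=lambda item: item[1], reverse=True):
        print(f"{name:<36} +{gap:.1f} pts")
```

The largest gap is on SWE-bench Pro and the smallest on BrowseComp, which fits the framing that the jump is concentrated in coding and security work rather than spread evenly.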

One important caveat that doesn't show up in most coverage: Anthropic itself flagged that Mythos performs well on Humanity's Last Exam even at low compute effort, which the company notes as a possible sign of some memorization. Read those HLE numbers with a grain of salt — but the SWE-bench and CyberGym jumps are very real.


Project Glasswing: The $100M Defensive Pact

Instead of shipping Mythos to ChatGPT-style users, Anthropic is routing it into a coalition. Project Glasswing pairs the unreleased Mythos Preview with eleven major technology and finance organizations (Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks) to find and patch software vulnerabilities in critical infrastructure before adversaries can exploit them.

The financial commitment matters: Anthropic has extended access to more than 40 additional organizations that build or maintain critical software, committed up to $100 million in usage credits, and donated $4 million to open-source security organizations including $2.5 million to Alpha-Omega and OpenSSF and $1.5 million to the Apache Software Foundation.

Why does the partner list read like a Who's Who? Because cybersecurity companies that have long prided themselves on proprietary AI are publicly admitting that Anthropic's latest release is catching zero-days no other tools have. CrowdStrike and Palo Alto Networks signing on is the part nobody's talking about loudly enough — those are firms whose entire moat is "we have better security AI than you do."


The Zero-Day Discoveries: Bugs That Hid for Decades

Here's where the story stops feeling like a benchmark update and starts feeling like a turning point.

Over the past few weeks, Anthropic used Claude Mythos Preview to identify thousands of zero-day vulnerabilities — flaws previously unknown to the software's developers — many of them critical, in every major operating system and every major web browser.

The headline example: the oldest bug Mythos discovered was a vulnerability in OpenBSD that had remained unknown and unpatched for 27 years, and the model also chained together several flaws in the Linux kernel to gain superuser access. OpenBSD is the operating system whose entire reputation is built on being audited to death. A 27-year-old hole in it is the kind of finding that makes career security researchers stop and stare.

According to 9to5Mac's coverage of the announcement, some of these vulnerabilities had survived decades of human review and millions of automated security tests. That phrase, "millions of automated security tests," is the part business leaders should sit with for a minute. The existing arsenal of static analyzers, fuzzers, and other automated scanners didn't catch these. A general-purpose language model did.


How Anthropic Actually Tested It

The methodology is worth understanding because it's the template every defensive security team will be copying for the next 18 months. Anthropic launches a container isolated from the internet that runs the project under test along with its source code, then invokes Claude Code with Mythos Preview and prompts it with essentially "find a security vulnerability in this program".

From there, the model runs agentically: it reads the code to hypothesize vulnerabilities, runs the project to confirm or reject its suspicions, adds debug logic or uses debuggers as needed, and finally outputs either that no bug exists or a bug report with a proof-of-concept exploit and reproduction steps.
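The isolation step can be sketched as below. This is a rough illustration, not Anthropic's actual setup: it only builds a `docker run` command line with networking disabled and does not execute it. The image name, the mount path, and the `audit-agent` command are hypothetical placeholders. How the agent reaches the model from an offline container is not described in the announcement, so that part is left out.

```python
# Prompt paraphrased from the announcement.
PROMPT = "Find a security vulnerability in this program."


def build_sandbox_command(image, source_dir, agent_command):
    """Build (but do not run) a `docker run` argv for an offline audit container.

    image, source_dir and agent_command are supplied by the caller; nothing
    here is specific to Anthropic's tooling.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",         # container gets no network access
        "-v", f"{source_dir}:/src",  # mount the project under test with its source
        "-w", "/src",                # start the agent inside the source tree
        image,
        *agent_command,
    ]


if __name__ == "__main__":
    # Hypothetical image and agent names, for illustration only.
    argv = build_sandbox_command(
        image="project-under-test:latest",
        source_dir="/tmp/project",
        agent_command=["audit-agent", PROMPT],
    )
    print(" ".join(argv))
```

Building the command as a list rather than a shell string avoids quoting problems when it is later passed to something like `subprocess.run`.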

To make this scale, Anthropic added two clever optimizations. First, they ask Claude to rank each file in the project on a 1-to-5 scale for how likely it is to contain interesting bugs, then start agents on the highest-priority files first. Second, they run a separate validator agent at the end whose only job is to confirm whether each reported bug is real and worth caring about, filtering out technically-valid-but-irrelevant findings.
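The two optimizations have a simple shape, sketched below under heavy assumptions. The `score_file`, `hunt`, and `validate` callables stand in for model calls and are hypothetical; in a real setup each would wrap an agent invocation. Only the ordering and filtering logic is shown.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    path: str      # file the bug was reported in
    summary: str   # short description from the hunting agent


def prioritize(files, score_file):
    """Order files by a 1-to-5 'likely to contain interesting bugs' score, highest first."""
    scored = []
    for path in files:
        score = score_file(path)
        if not 1 <= score <= 5:
            raise ValueError(f"score for {path} must be between 1 and 5, got {score}")
        scored.append((score, path))
    # Highest score first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [path for _, path in scored]


def run_pipeline(files, score_file, hunt, validate):
    """Hunt for bugs in priority order, then keep only findings the validator confirms."""
    reported = []
    for path in prioritize(files, score_file):
        reported.extend(hunt(path))
    # Validation runs at the end, as a separate pass over everything reported.
    return [finding for finding in reported if validate(finding)]


if __name__ == "__main__":
    # Toy stand-ins for the model calls.
    toy_scores = {"auth.c": 5, "parser.c": 4, "README.md": 1}
    confirmed = run_pipeline(
        files=list(toy_scores),
        score_file=toy_scores.get,
        hunt=lambda path: [Finding(path, "possible overflow")] if path.endswith(".c") else [],
        validate=lambda finding: finding.path == "auth.c",
    )
    print(confirmed)
```

Keeping the validator as a separate pass means the hunting agents can be noisy without flooding whoever triages the final reports.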

This is a workflow any competent engineering team can replicate today using publicly available models — at lower capability, but the same shape. That's the actionable insight buried in the announcement.


Why Anthropic Is Holding Mythos Back

Two reasons, one stated and one obvious.

The stated reason: dual-use risk. Anthropic plans to launch new safeguards with an upcoming Claude Opus model first, allowing the company to refine those safeguards on a model that does not pose the same level of risk as Mythos Preview. Translation: the same capability that lets defenders patch zero-days lets attackers find them. Until Anthropic has reliable ways to detect and block offensive use, public release would mean handing a master key to anyone with a credit card.

The obvious reason: compute. A draft blog post that leaked in March described Mythos as a large, compute-intensive model that would be expensive for both Anthropic and its customers to serve. The same week Glasswing launched, Broadcom signed an expanded deal giving Anthropic access to about 3.5 gigawatts of computing capacity drawing on Google's AI processors. At roughly 1 gigawatt per large reactor, 3.5 gigawatts is about the output of three to four nuclear reactors. Even with that, serving Mythos to 20 million ChatGPT-style users would be impossible at current efficiency.

So the "we're being careful" framing is real — and it's also a convenient way to launch a model the company couldn't serve anyway.


What This Actually Means for Your Business

This is the part most coverage skips. Strip out the fanboy benchmark worship and the doomer takes, and three things change for European B2B companies starting now.

1. The "AI Cybersecurity Gap" Is About to Widen Dramatically

For two years, the assumption has been that AI helps both attackers and defenders roughly equally. Mythos is the first credible signal that defenders may pull ahead — but only the defenders inside the Glasswing coalition. Everyone else is operating on Claude Opus 4.6, GPT-5, and Gemini 3 Pro — capable models, but not Mythos-class.

If you run a SaaS, an e-commerce store, or any business with a customer database, the practical implication is this: in 12–18 months, the public versions of these models will catch up to where Mythos is today. Your attackers will have access at the same time you do. The window to harden your systems is now, while the asymmetry still favors defenders who act early.

2. Code-Level Security Audits Just Became Affordable

Until this announcement, a serious application security audit cost €15,000–€80,000 and took weeks. The Mythos methodology — containerized code, agentic vulnerability hunting, validator agents — can be run today on Claude Opus 4.6 or Sonnet 4.6 at a fraction of that cost. Not at Mythos quality, but at "catches the obvious stuff that would otherwise become a breach" quality.

For most European SMBs running Next.js, Laravel, or Rails apps, that's the difference between never having a security audit and having one every quarter. This is where AI consultancies — including ours at areza.digital — should be building productized offerings right now.

3. The Software Supply Chain Question Gets Sharper

Project Glasswing's stated focus is open-source and critical infrastructure. Linux Foundation Executive Director Jim Zemlin pointed out that security expertise has historically been a luxury reserved for organizations with large security teams, while open-source maintainers have been left to figure out security on their own.

If you build on open source, and every modern business does, your supply chain is about to get audited by Mythos whether you participate or not. Vulnerabilities in your dependencies will be found and patched upstream faster than most teams currently apply the fixes. Your patch cadence becomes a competitive risk factor. Companies that auto-update dependencies weekly will be safer than companies that update quarterly. That's a process change more than a tooling change, and it costs little to implement.
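To put a rough number on the weekly-versus-quarterly difference, here is a back-of-the-envelope sketch. It assumes a fixed update schedule and that an upstream fix lands at a uniformly random point in the cycle, so the average wait is half the interval. That is a simplification for illustration, not a measured figure.

```python
def mean_exposure_days(update_interval_days):
    """Average days a released fix waits before the next scheduled update applies it.

    Assumes fixes land at a uniformly random point within the update cycle,
    so the expected wait is half the interval.
    """
    if update_interval_days <= 0:
        raise ValueError("update interval must be positive")
    return update_interval_days / 2


WEEKLY = mean_exposure_days(7)       # weekly dependency updates
QUARTERLY = mean_exposure_days(91)   # quarterly updates, taken as 13 weeks

if __name__ == "__main__":
    print(f"weekly cadence:    {WEEKLY} days average exposure")
    print(f"quarterly cadence: {QUARTERLY} days average exposure")
    print(f"ratio:             {QUARTERLY / WEEKLY:.0f}x")
```

Under these assumptions a quarterly cadence leaves a known, already-patched vulnerability open about 13 times longer on average than a weekly one, and the worst case is the full interval.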


What's Hype, What's Real

A few claims circulating on LinkedIn and X today that are worth flagging carefully:

  • "Mythos is 5x more expensive than Opus 4.6" — Not officially confirmed in any of Anthropic's published materials. Originated from secondary commentary. Treat as speculation.
  • "Zero security training engineer found exploits overnight" — Anecdote reportedly shared in Anthropic's internal materials, repeated in viral posts but not in the official announcement. Plausible, not verified.
  • "USAMO math olympiad 97.6%" and "Cybench 100% solve rate" — These appear in some social posts but could not be confirmed in Anthropic's published benchmark sheet. The verified benchmarks are the ones in the table above.

If you're writing about Mythos for your own audience, stick to the sourced numbers. The verified story is dramatic enough — you don't need the embellishments.


FAQ

What is Claude Mythos Preview? Claude Mythos Preview is an unreleased frontier AI model from Anthropic, more capable than the currently public Claude Opus 4.6. It demonstrates significant improvements in coding, reasoning, and especially cybersecurity vulnerability discovery.

Can I use Claude Mythos? No. Anthropic is not making Mythos generally available. Access is limited to Project Glasswing participants: the launch partners plus more than 40 additional organizations that build or maintain critical software.

What is Project Glasswing? Project Glasswing is a $100M defensive cybersecurity initiative launched by Anthropic on April 7, 2026, partnering with AWS, Apple, Google, Microsoft, NVIDIA, Broadcom, Cisco, CrowdStrike, JPMorgan Chase, the Linux Foundation, and Palo Alto Networks to use Claude Mythos Preview to find and patch vulnerabilities in critical infrastructure.

How many vulnerabilities did Mythos find? Anthropic reports thousands of zero-day vulnerabilities discovered across every major operating system and web browser, including a 27-year-old bug in OpenBSD and a chain of Linux kernel flaws that allowed superuser escalation.

When will Mythos be publicly released? Anthropic has not committed to a public release date. The company plans to develop new safeguards using an upcoming Claude Opus model first before considering broader Mythos-class deployment.

What should my business do about this? Three things: tighten your dependency update cadence, run vulnerability scans on your own codebase using currently available models (Opus 4.6 or Sonnet 4.6 with the methodology Anthropic published), and treat the next 12–18 months as the window to harden your systems before equivalent capabilities reach attackers.


The Bottom Line

Claude Mythos isn't just another model launch. It's the first time a frontier AI lab has said, on the record, that one of its own models is too capable to release — and meant it enough to give it away to a coalition of competitors instead. Whether you read that as responsible stewardship or strategic theater, the underlying capability is real, and the asymmetry it creates between coalition insiders and everyone else is going to shape European B2B software security for the next 18 months.

If you're a founder, CTO, or operator trying to figure out what to actually do with this — not just nod knowingly on LinkedIn — that's the conversation we have at areza.digital every week. We help European businesses translate frontier AI capabilities into systems that ship, secure, and scale. Book a 30-minute discovery call →


Written by Nikita Janochkin, founder of areza.digital. Sources: Anthropic Frontier Red Team blog, Anthropic Glasswing announcement, VentureBeat, TechCrunch, The New Stack, OfficeChai, IT Pro, 9to5Mac, Axios. Last updated April 8, 2026.
