Anthropic's Claude Mythos reportedly unearths tens of thousands of critical bugs — now humans are struggling to keep up

A shock to the security ecosystem

Anthropic's next‑generation model, Claude Mythos Preview, was reportedly deployed in a secret operation called Project Glasswing and, within 30 days, scanned more than 1,000 core open‑source projects, flagging what the company says were 23,019 vulnerabilities, including 6,202 rated high or severe. Cloudflare, Mozilla and OpenBSD are among the organizations cited: Cloudflare reportedly found 2,000 issues (400 of them high or severe) in core systems, Mozilla saw 271 high‑severity flaws in Firefox testing, and OpenBSD turned up a bug that had allegedly gone unnoticed for 27 years. The scale is staggering. Who will fix all of these flaws?

An AI that finds and even weaponizes bugs

Mythos reportedly did more than spot flaws. In at least one case, involving the wolfSSL cryptography library, it allegedly generated exploit code capable of forging certificates; in another, the model helped intercept a $1.5 million fraud attempt by recognizing an anomalous transaction chain in real time. Anthropic says it cross‑validated the findings with six independent security firms; those reviews reportedly produced a 90.6% true‑positive rate and confirmed 1,094 high or severe defects. For defenders, the bottleneck has flipped: discovery is almost instantaneous, but patching is not.

Industry strain and ethical safeguards

According to the report, open‑source maintainers have been overwhelmed and have asked Anthropic to slow disclosure because project teams cannot patch fast enough: of 1,129 reported vulnerabilities, only 75 high‑severity fixes had been implemented at the time of reporting. Anthropic has also permitted vetted security teams to relax some of Claude's safety constraints for legitimate red‑team and penetration testing, a move that aims to accelerate defensive work but raises questions about control and misuse. Is it responsible to automate vulnerability hunting without a matching surge in remediation resources?

Geopolitics and policy questions

The operation and its results are reportedly already reverberating beyond technical circles and into policy discussions. Advanced models that can both find flaws and craft exploits are dual‑use: they strengthen defenders but could also empower attackers and complicate export‑control and sanctions regimes. Western cloud and software vendors, governments and open‑source communities now face a fast‑moving dilemma: how to harness powerful AI to secure infrastructure while preventing its misuse across borders in an increasingly fraught geopolitical landscape.