Anthropic's secretive "Mythos" model reportedly cracks enterprise networks — and re-ignites an architecture debate
Anthropic has stunned the security and AI communities by withdrawing public access to its latest model, Claude Mythos Preview, saying its network-attack capabilities pose an “unprecedented cybersecurity risk.” It has been reported that the UK-based AI Safety Institute (AISI) ran Mythos through a high-fidelity adversarial exercise called “The Last Ones” (TLO) and found the model could autonomously complete long, multi-step enterprise intrusions — a performance leap that researchers say outstrips prior flagship systems by an order of magnitude.
What the tests showed
The AISI exercise reportedly used a 32-step simulated corporate breach that top human red‑teamers typically need 14–20 hours to complete. Prior leaders like GPT‑4o averaged fewer than two steps in the same environment; Claude Opus 4.6 reached 22 steps under heavy compute budgets. Mythos, however, reportedly achieved perfect runs in three of ten independent trials, completing all 32 steps from reconnaissance through credential theft to exfiltration. If verified, this suggests an AI moving from an assistant role toward an autonomous “digital mercenary.” How do you police a tool that can chain together and execute complex network operations without human prompting?
The architecture question
Researchers are debating why Mythos improved so fast. Some engineering sources — among them a former Meta engineer now at OpenAI — point to “looped” or iterative internal reasoning architectures, ideas described last year by ByteDance researchers, in which inputs are reprocessed within the model’s core layers rather than expanded into verbose external chain‑of‑thought outputs. It has been reported that Mythos consumes far fewer output tokens than its predecessors while spending more internal compute on iterative inference, hinting at a qualitative shift in how these models reason.
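To make the “looped reasoning” idea concrete: the sketch below is a toy illustration, not Anthropic’s actual architecture (which is undisclosed). It assumes a single shared “core block” standing in for a transformer’s attention and MLP layers, and shows how the same parameters can be applied repeatedly to a hidden state — internal compute scales with the loop count, while zero reasoning tokens are ever emitted, in contrast to chain‑of‑thought, where each step costs visible output tokens.

```python
import math
import random

random.seed(0)
D = 8  # toy hidden width

# Shared "core block" weights: one matrix standing in for attention + MLP layers.
W = [[random.gauss(0, 1) / math.sqrt(D) for _ in range(D)] for _ in range(D)]

def core_block(h):
    """One pass of the hidden state through the shared core layers (toy stand-in)."""
    return [math.tanh(sum(h[j] * W[j][i] for j in range(D))) for i in range(D)]

def looped_forward(x, n_loops):
    """Reprocess the hidden state through the SAME block n_loops times.

    Compute grows linearly with n_loops, but no intermediate tokens are
    produced -- unlike external chain-of-thought, where every reasoning
    step is written out as (billable, observable) output tokens.
    """
    h = list(x)
    for _ in range(n_loops):
        h = core_block(h)
    return h

x = [random.gauss(0, 1) for _ in range(D)]
shallow = looped_forward(x, 1)  # 1x internal compute
deep = looped_forward(x, 8)     # same parameters, 8x internal compute
```

The design point the sources are gesturing at is exactly this trade: deeper iterative inference inside the network buys reasoning depth without a verbose, inspectable token trail — which is also why such models are harder to audit from their outputs alone.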
Policy implications and what comes next
The political stakes are already high: U.S. officials including the vice president and treasury secretary convened top AI CEOs to discuss Mythos-level risks, and Anthropic has reportedly limited access to a handful of firms including Apple, Google, Microsoft and NVIDIA while evaluating safeguards. Export controls on chips and sanctions regimes have shown how geopolitics shapes who can build and deploy advanced models; now regulators must weigh national security, corporate responsibility and cyber‑defense readiness. For defenders, the immediate question is blunt: can existing detection and containment playbooks survive an era when frontier models can plan and execute multi-step intrusions on their own?
