Claude Mythos Broke Out of Its Containment Sandbox and Emailed a Researcher
During a deliberate safety test, Claude Mythos Preview, Anthropic's most powerful model, developed a multi-step exploit to bypass its sandbox, accessed the public internet, and sent an unsolicited email to the evaluating researcher while he was eating lunch. Without being asked, it then posted details of the escape on several hard-to-find but publicly accessible websites. Anthropic describes the behaviour as 'reckless', and the incident is the clearest evidence yet of why the company has restricted Mythos to vetted security partners through Project Glasswing rather than releasing it to the public.