April 18, 2026 ChainGPT

Public LLMs Reproduce Anthropic Mythos Bugs — Crypto Libraries Exposed for Under $30

Locked “Mythos” No Longer Unique: Researchers Reproduce Anthropic’s Dangerous Bug Finds Using Public Models for Under $30

When Anthropic launched Claude Mythos earlier this month, the company kept the model behind a tightly vetted coalition of tech firms and warned it was too risky for general release. The announcement prompted emergency talks, reportedly involving Treasury Secretary Scott Bessent and Fed Chair Jerome Powell with Wall Street CEOs, and revived the buzzword “vulnpocalypse” across security circles. Now a team from Vidoc Security has muddied the narrative: the core discoveries Anthropic showcased can be reproduced with public models and off-the-shelf tooling.

What Vidoc did

- The researchers used widely available models (GPT-5.4 and Claude Opus 4.6) inside an open-source coding agent called opencode: no private Anthropic stack, no Glasswing invites, just public APIs and open tooling.
- They targeted the exact cases Anthropic highlighted: a server file-sharing protocol; the networking stack of a security-focused OS; FFmpeg video-processing code used across media platforms; and two cryptographic libraries (including wolfSSL) used in signature verification and digital identity checks.
- Both GPT-5.4 and Claude Opus 4.6 reproduced two of the bug cases in all three runs each. Claude Opus 4.6 independently rediscovered an OpenBSD bug three times; GPT-5.4 failed on that one. Some findings were partial: the models flagged the right code surface but did not always identify the exact root cause.
- Cost: every file scan stayed under $30, meaning the team replicated Anthropic’s vulnerability discoveries at very low expense.

How they did it

The workflow mirrored Anthropic’s own public description: give a model a codebase, let it explore, parallelize attempts, and filter for signal. Vidoc built a planner/detector pipeline in open tooling:

- A planning agent split files into chunks and assigned line ranges automatically.
- Detection agents analyzed each chunk and cross-referenced related files to confirm or refute findings.
- The researchers emphasize that the chunking and orchestration were algorithmic, not manually curated, an important factor shaping what each detection agent saw.

What the replication did and didn’t show

- The big takeaway: public models can already reduce the search space, surface real leads, and sometimes recover full root causes in battle-tested code. As Vidoc’s Dawid Moczadło put it, “The economics of vulnerability discovery are changing.”
- But there’s a gap. Anthropic’s Mythos reportedly went further in at least one case: beyond spotting a FreeBSD bug, it produced a working attack blueprint showing how an attacker could chain code fragments across network packets to seize remote control. Vidoc’s models found the flaw but did not build that exploit chain. In short: finding bugs is getting cheaper; constructing reliable, weaponized exploits, and validating them, remains harder.

Broader implications

- Anthropic itself acknowledged that Cybench (the benchmark used to measure cyber-risk) “is no longer sufficiently informative of current frontier model capabilities,” and predicted similar capabilities would spread across labs within 6–18 months. Vidoc’s study suggests the discovery side of that spread may already be happening outside gated programs.
- For crypto and other security-sensitive systems, this matters. The study included cryptographic libraries used for digital signatures and identity verification, components central to wallets, blockchains, authentication, and other crypto infrastructure. If public LLMs can cheaply surface vulnerabilities in these libraries, defenders must shift resources toward rapid validation, hardened supply-chain checks, and automated verification pipelines.
- Vidoc published full prompt excerpts, model outputs, and a methodology appendix on their site, making the experiment reproducible.
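The planner/detector workflow Vidoc describes can be sketched in a few dozen lines. This is a hypothetical reconstruction, not Vidoc’s published code: the chunk size, overlap, prompt wording, and every name here (plan_chunks, detect, ask_model) are assumptions for illustration.

```python
# Minimal sketch of an algorithmic planner/detector pipeline, assuming a
# generic chat-style model callable. All parameters are illustrative.
from dataclasses import dataclass

CHUNK_LINES = 200  # hypothetical chunk size; Vidoc does not state theirs
OVERLAP = 20       # overlap so findings spanning a boundary are not missed


@dataclass
class Chunk:
    path: str
    start: int  # 1-indexed first line of the chunk
    end: int    # inclusive last line of the chunk
    text: str


def plan_chunks(path: str, source: str) -> list[Chunk]:
    """Planning agent: split a file into line ranges, purely algorithmically."""
    lines = source.splitlines()
    chunks, start = [], 0
    while start < len(lines):
        end = min(start + CHUNK_LINES, len(lines))
        chunks.append(Chunk(path, start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break
        start = end - OVERLAP  # step back to create the overlap window
    return chunks


def detect(chunk: Chunk, ask_model) -> list[str]:
    """Detection agent: ask a model to flag suspect code in one chunk.

    ask_model is any callable taking a prompt string and returning text,
    e.g. a wrapper around a public chat-completion API.
    """
    prompt = (
        f"Audit {chunk.path} lines {chunk.start}-{chunk.end} for memory-safety "
        f"and logic bugs. Report one finding per line, or NONE.\n\n{chunk.text}"
    )
    reply = ask_model(prompt)
    return [l for l in reply.splitlines() if l.strip() and l.strip() != "NONE"]
```

In a real run, each chunk’s detect call would be dispatched in parallel and the surviving findings cross-referenced against related files, which is the filtering-for-signal step the article attributes to Anthropic’s own description.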
Bottom line

Anthropic’s Mythos may have been framed as a sealed, uniquely risky model, but Vidoc’s work shows that the ability to discover security-relevant bugs is now accessible with public models and open tools, and at low cost. The competitive advantage is shifting away from mere model access: the hard work is validation, exploitation engineering, and turning leads into trusted security fixes. For the crypto sector, that means accelerating robust audits and automated verification for any code that handles signatures, keys, or identity.