The Irony Problem: AI Security Firm Leaked 500K Lines of Its Own Code

By Srikanth

On the morning of March 31, 2026, Anthropic published a routine update to Claude Code, its widely used AI coding CLI, to the public npm registry. Deep inside version 2.1.88 of the @anthropic-ai/claude-code package was a 59.8 MB file: a JavaScript source map revealing the tool’s complete, unobfuscated TypeScript source code – roughly 512,000 lines across nearly 1,900 files. It was a crucial oversight with possibly far-reaching consequences. To a trained eye, this was a complete blueprint of the product.

The leak was first spotted by security researcher Chaofan Shou, whose post about the discovery on X drew more than 27 million views within hours, TechCrunch reported. Anthropic’s response came too late; by the time it moved to pull the package, GitHub mirrors had already multiplied into 41,500+ forks – faster than the company could file takedown notices.

What set the incident apart was its contents. Buried inside the leaked code was “Undercover Mode” – a subsystem purpose-built to prevent Claude Code from leaking Anthropic’s internal secrets into public commits when employees deploy the tool on open-source projects. InfoQ and multiple independent technical breakdowns verified that the mode injects instructions into the model’s system prompt blocking it from referencing internal codenames, unreleased version numbers, or the fact that it is an AI at all. Ironically, Anthropic built a dedicated secrecy layer to prevent exactly this kind of exposure – then exposed the rest of its proprietary codebase in plain text anyway.

How the leak happened

NodeSource delivers the most granular technical account of this failure, tracing it back to Anthropic’s own infrastructure choices. Anthropic absorbed the Bun team in late 2025 and adopted Bun as its JavaScript runtime and bundler. That move came with an inherited bug: even when developers explicitly disabled production source maps, Bun’s bundler continued generating them. Even if that bug were the sole cause, a correctly configured .npmignore file would still have blocked .map files from the published package. Claude Code’s package had no such rule. The result was a simple, preventable failure: not a breach or an exploit, but an omitted exclusion rule that escaped everyone’s attention before publishing.

According to a widely referenced breakdown of the leak, the map file also linked out to a ZIP archive hosted on an Anthropic-owned Cloudflare R2 bucket that required no credentials. The same material was therefore reachable without touching the npm package at all.
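To see why one stray .map file can amount to a full source disclosure, it helps to look at the file format. A version 3 source map is a JSON document whose `sources` array lists original file paths and whose optional `sourcesContent` array can embed the text of those files verbatim. The sketch below is illustrative only: the file names and contents are made up, and the reporting cited here does not settle whether this particular map embedded its sources or merely pointed to them.

```python
import json


def recover_sources(source_map_text: str) -> dict[str, str]:
    """Return {original path: original text} for files embedded in a v3 source map."""
    source_map = json.loads(source_map_text)
    paths = source_map.get("sources", [])
    contents = source_map.get("sourcesContent") or []
    recovered = {}
    for path, content in zip(paths, contents):
        # An entry is null when that file's text was not embedded in the map.
        if content is not None:
            recovered[path] = content
    return recovered


# Tiny hand-written map standing in for a real one (hypothetical file names).
demo_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/main.ts", "src/util.ts"],
    "sourcesContent": ["export const answer: number = 42;\n", None],
    "mappings": "",
})

recovered = recover_sources(demo_map)
print(sorted(recovered))
```

No decoding of the `mappings` field is needed to read embedded sources, which is why a map shipped alongside a minified bundle can undo the minification entirely.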

What the code revealed

The leak exposed not only the tool’s current architecture but also a window into Claude Code’s internal roadmap. Multiple independent technical writeups covered it in detail – including analyses from The New Stack and InfoQ, alongside developer blogs that combed through the codebase and arrived at largely consistent findings.

Deep within the code were dozens of feature flags gating unshipped capabilities (estimates across sources range from roughly 40 to over 100, depending on how modules are counted). These flags switch on functionality withheld from the public release.

The leaked material centered on three standout projects, led by KAIROS, which was referenced over 150 times. KAIROS is an unreleased “daemon mode” that would let Claude Code run continuously in the background, logging observations and periodically consolidating them into memory through what internal comments describe as a “dreaming” process. ULTRAPLAN would complement this by offloading complex planning to a remote Claude Opus instance for up to 30 minutes before returning the completed plan to the user’s terminal. By contrast, BUDDY was a Tamagotchi-style AI companion pet system with multiple species and rarity tiers – an unlikely social media highlight that users described as “adorable” even amid ethical concerns over Undercover Mode.

The code also referred to the internal model codenames Capybara and Fennec – names that appear to identify unreleased Claude 4.6-generation models. That detail mattered because only days earlier, a separate incident had exposed Anthropic’s internal model naming, according to multiple outlets. In that incident, a content management system misconfiguration exposed internal documents referencing an upcoming, more capable model; Anthropic later confirmed to Fortune that it was testing the model with early customers.

Anthropic’s response, and the real security risk

Anthropic maintained a brief, consistent public response across all media outlets that received a statement, including CNET: “This was a release packaging issue caused by human error, not a security breach. No sensitive customer data or credentials were involved.” It later withdrew the affected package version and pursued DMCA takedown requests for GitHub mirrors, with a comprehensive public post-mortem yet to be published.

According to security researchers who examined the incident’s aftermath, that framing overlooks the second-order risks that emerged after the leak. The Hacker News and cybersecurity firm Zscaler reported that attackers rapidly registered npm package names closely matching Anthropic’s internal or placeholder packages in an apparent attempt to exploit developer confusion. Analysts identified the tactic as dependency confusion, targeting developers compiling or extending the leaked code. Separately, Zscaler documented fraudulent “leaked Claude Code” GitHub repositories distributing trojanized installers carrying the Vidar credential stealer and the GhostSocks proxy tool – malware that, while unrelated to the original leak, quickly capitalized on the attention surrounding it.
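A rough way to picture the defensive side of dependency confusion: before installing, compare every dependency name in a manifest against the names a team uses internally, and flag any that match but do not sit under the team’s own npm scope. The sketch below assumes a hypothetical scope (`@example-corp`) and hypothetical internal package names; it is a minimal illustration of the idea, not a description of any tooling Anthropic or Zscaler uses.

```python
import json


def find_confusable_dependencies(
    package_json_text: str, internal_names: set[str], trusted_scope: str
) -> list[str]:
    """Flag dependencies that reuse an internal package name outside the trusted scope."""
    manifest = json.loads(package_json_text)
    flagged = set()
    for section in ("dependencies", "devDependencies", "optionalDependencies"):
        for name in manifest.get(section, {}):
            if name.startswith(trusted_scope + "/"):
                continue  # published under the scope the team controls
            bare_name = name.split("/")[-1]
            if bare_name in internal_names:
                # Same name as an internal package, but resolved from elsewhere.
                flagged.add(name)
    return sorted(flagged)


# Hypothetical manifest and internal names, for illustration only.
demo_manifest = json.dumps({
    "dependencies": {
        "@example-corp/build-tools": "1.0.0",
        "telemetry-core": "^2.0.0",
        "example-lib": "1.3.0",
    },
    "devDependencies": {"@someone-else/build-tools": "0.0.1"},
})

suspicious = find_confusable_dependencies(
    demo_manifest, {"build-tools", "telemetry-core"}, "@example-corp"
)
print(suspicious)
```

An exact-name match is the simplest possible rule; catching near-miss spellings would need a similarity measure on top of it.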

A separate, unrelated compromise of the axios package during the same window created a second, independent point of failure. Because Claude Code depends on axios, developers updating the tool during that window risked installing the compromised dependency entirely independent of the source map issue – a coincidence that made a bad day for Anthropic’s supply chain considerably worse.

Why it matters beyond the embarrassment

For enterprise teams evaluating or already running Claude Code in production, the leak brings concrete, if contained, consequences. Both Anthropic and outside researchers agree that no model weights, safety training data, or customer information were exposed – only the client application layer was leaked, not the underlying models it calls. But readable, commented source code makes vulnerabilities meaningfully easier to locate in Claude Code’s permission system, its bash-command validation logic, and its multi-stage context management pipeline, all of which now draw closer scrutiny from defenders and attackers alike than an obfuscated bundle would have invited.

Still, Menlo Ventures data shows Claude Code commanding 54% of the enterprise AI coding market (Neura Market) as of early 2026 – a lead position earned through model quality and agentic depth, not the CLI interface exposed by the leak. Whether a source-map disclosure weakens that position is a separate question from whether it equips rivals to catch up.

The more durable takeaway, though, is the one implied by the “Undercover Mode” irony itself: sophisticated internal controls designed to prevent one category of leak are no substitute for basic release hygiene. A missing line in a config file defeated a purpose-built secrecy subsystem in a single npm publish. For any enterprise shipping AI tooling of its own, the incident is a clear reminder that build-pipeline discipline – exclusion rules, CI checks that fail on stray .map files, manual review before publishing – stays unglamorous, unavoidable, and fully capable of undoing far more sophisticated security investments elsewhere in the stack.
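A CI check of the kind described above can be very small. The sketch below inspects a gzip-compressed package tarball, such as the one `npm pack` produces, and lists any .map entries; a pipeline step could refuse to publish when the list is non-empty. The helper that builds a demo tarball and the file names in it are hypothetical and exist only to make the example self-contained.

```python
import io
import tarfile


def find_source_maps(tarball_bytes: bytes) -> list[str]:
    """List .map files inside a gzip-compressed package tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".map")
        )


def build_demo_tarball(files: dict[str, bytes]) -> bytes:
    """Build an in-memory .tgz so the check can be exercised without npm."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical package contents, for illustration only.
demo = build_demo_tarball({
    "package/package.json": b"{}",
    "package/cli.js": b"console.log('hi');",
    "package/cli.js.map": b"{}",
})

stray = find_source_maps(demo)
print(stray)  # ['package/cli.js.map']
```

Checking the packed artifact rather than the source tree is a deliberate choice in this sketch: it tests what would actually ship, regardless of which ignore rules or bundler settings were supposed to apply.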

Srikanth is the founder and editor-in-chief of TechStoriess.com — India's emerging platform for verified AI implementation intelligence from practitioners who are actually building at the frontier. Based in Bengaluru, he has spent 5 years at the intersection of enterprise technology, emerging markets, and the human stories behind AI adoption across India and beyond.