Daybreak Is the Avengers Initiative for Cybersecurity. The Problem Is, AI Arms Both Sides.
By Addy · May 12, 2026
On May 11, 2026, Sam Altman posted twelve words on X: "AI is already good and about to get super good at cybersecurity."
He meant it as a reassurance. Read it a second time.
OpenAI launched Daybreak the same day - a platform combining its language models, Codex as an agentic harness, and security partner integrations to help companies continuously secure their software. The framing is unambiguous: AI for defense, deployed broadly, open to as many companies as possible. The Daybreak page describes a vision of software that is "resilient by design" rather than patched reactively. OpenAI's cyber researcher Fouad Matin said it directly in April: "No one should be in the business of picking winners and losers when it comes to cybersecurity."
Both statements are true and both miss the harder point. The same capability that finds a vulnerability to patch it finds a vulnerability to use it. The same model that generates a patch generates an exploit. The same context window that lets a defender reason across an entire codebase lets an attacker reason across an entire codebase. Daybreak does not change this. It accelerates both sides simultaneously and bets that defenders will benefit more from the acceleration than attackers will.
That is a reasonable bet. It is not a guaranteed one. And the evidence from the past six weeks suggests the bet is being tested in real time, on infrastructure that most of the world's software depends on.
What the Last Six Weeks Actually Looked Like
On March 31, 2026, the axios npm package - a JavaScript HTTP client with over 100 million weekly downloads - was compromised. An attacker hijacked the lead maintainer's npm account and published two malicious versions containing a cross-platform remote access trojan. The attack introduced a phantom dependency that executed a postinstall hook to deliver persistent malware on macOS, Windows, and Linux, then erased its own evidence by replacing its files with clean decoys. The attacker bypassed GitHub Actions' OIDC Trusted Publisher safeguards by manually publishing the poisoned versions with a stolen npm token, leaving no trace in the official GitHub repository. That same day, the Claude Code source code leaked via a misconfigured npm package. Two separate incidents on the same infrastructure on the same day - one deliberate, one accidental.
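The delivery mechanism here is mundane: npm runs any preinstall, install, or postinstall script a package declares, automatically, at install time. A minimal audit sketch of that mechanism follows - hypothetical file name, Node built-ins only, and a review queue rather than a detector for this specific attack:

```ts
// audit-lifecycle-scripts.ts (hypothetical) -- walks node_modules and
// flags any dependency that declares an npm lifecycle script, the hook
// the axios attack used to run code at install time. A package showing
// up here is not proof of compromise; plenty of legitimate packages
// compile native code this way. It is a review queue, not a verdict.
import { readdirSync, readFileSync, existsSync } from "node:fs";
import { join } from "node:path";

const LIFECYCLE = ["preinstall", "install", "postinstall"] as const;

function audit(dir: string): void {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const pkgDir = join(dir, entry.name);
    if (entry.name.startsWith("@")) {
      audit(pkgDir); // scoped packages nest one level deeper
      continue;
    }
    const manifest = join(pkgDir, "package.json");
    if (existsSync(manifest)) {
      const pkg = JSON.parse(readFileSync(manifest, "utf8"));
      const hooks = LIFECYCLE.filter((h) => pkg.scripts?.[h]);
      if (hooks.length > 0) {
        console.log(`${pkg.name}@${pkg.version}: ${hooks.join(", ")}`);
        for (const h of hooks) console.log(`  ${h}: ${pkg.scripts[h]}`);
      }
    }
    // older npm layouts nest node_modules; audit those too
    const nested = join(pkgDir, "node_modules");
    if (existsSync(nested)) audit(nested);
  }
}

const root = join(process.cwd(), "node_modules");
if (existsSync(root)) audit(root);
```

Blunter mitigations exist: npm's --ignore-scripts flag disables lifecycle scripts outright, at the cost of breaking packages that genuinely need to build native code on install.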
On April 3, Trend Micro reported that threat actors had been running AI-themed malware lures since at least February 2026 - cycling through fake tools and repositories designed to attract developer interest, exploiting the trust signals that developer tools carry - with the Claude Code leak accelerating the cycle.
In April, a sustained npm supply chain campaign published malicious packages carrying multi-stage credential-theft pipelines, disguised as legitimate telemetry modules to target developers.
On April 28, Wiz disclosed CVE-2026-3854 - a critical remote code execution vulnerability in GitHub's internal git infrastructure. By exploiting an injection flaw in GitHub's internal protocol, any authenticated user could execute arbitrary commands on GitHub's backend servers with a single git push, using nothing but a standard git client. On GitHub.com, the impact was cross-tenant exposure: an attacker could read millions of repositories on shared storage nodes, regardless of organization or user. Weeks after the fix was available, 88% of GitHub Enterprise Server instances had not patched.
Notably, CVE-2026-3854 was one of the first critical vulnerabilities in closed-source binaries to be discovered using AI - highlighting a shift in how these flaws are identified.
That last sentence is the one that matters most. A security firm used AI to find a critical vulnerability in GitHub. The same methodology, applied by a different actor with different intentions, finds a different critical vulnerability in a different piece of infrastructure. The finding technique does not know which side hired it.
The Number That Describes the Problem
In 2020, the average time between a vulnerability being disclosed and an attacker developing a working exploit was over 700 days. By 2025, that number had compressed to 44 days. By 2026, Mandiant's M-Trends report found that time-to-exploit had gone effectively negative - exploits are now routinely arriving before patches, with 28.3% of CVEs exploited within 24 hours of disclosure.
Read that again. 28.3% of publicly disclosed vulnerabilities are being exploited within 24 hours. Not within a week. Not within a month. Within the same day.
The human security researcher who reads a CVE disclosure, understands the vulnerability, writes a proof-of-concept exploit, tests it, and deploys it in under 24 hours is extraordinarily skilled and fast. There are not many of them. The AI model that reads the same CVE disclosure, reasons about the affected code path, generates a working exploit, and tests it is doing the same work - faster, at greater scale, without needing sleep.
Throughout 2024, 2025, and early 2026, the performance of frontier models on SWE-bench - a test of software development capability - went from 33% in August 2024 to just under 81% by December 2025. The capability that resolves 81% of real GitHub issues is, to a large degree, the same capability needed to read a publicly disclosed vulnerability and understand the affected code path. The overlap between "good at software engineering" and "good at finding exploits" is not total, but it is substantial. The benchmarks are not separated by a moral gap. They are the same skill applied to different inputs.
What Makes AI Different From Every Previous Attack Tool
Every generation of attack automation has been compared to the previous one. Script kiddies running existing exploits. Toolkits that packaged known techniques for non-expert attackers. Botnets that scaled human-designed attacks across millions of infected hosts. Each generation expanded the attacker pool without fundamentally changing the ceiling of what was possible. The most sophisticated attacks still required human expertise to design.
AI changes the ceiling.
A human security researcher has a context window measured in what they can hold in working memory while reading code. A capable language model has a context window measured in millions of tokens. In 2025, malicious npm packages posing as popular libraries included documentation, unit tests, and code structured to appear as legitimate telemetry modules. Static analysis and signature scanners missed them entirely - because the code, likely AI-generated, looked like real software.
That is the specific threat that has no clean precedent. A malicious package that passes code review because it looks like real software - structured, documented, tested, indistinguishable from a legitimate dependency. The detection tools that exist were designed to find things that look wrong. AI-generated malicious code can be designed to look right.
A model with a million-token context window reading a large codebase is not doing what a human attacker does when they scan for vulnerabilities. It is doing something more thorough. A human attacker looks for patterns they recognize, techniques they have learned, entry points that match their mental model of how systems break. A model with sufficient context can hold the entire dependency graph of a large application in active attention, reason about interactions between components that are separated by thousands of lines of code, and identify vulnerabilities in the space between components - where no single piece of code looks wrong but the interaction between pieces creates an exploitable state.
The GitHub RCE that Wiz found using AI is the proof of concept for this. A single git push command was enough to exploit a flaw in GitHub's internal protocol and achieve code execution on backend infrastructure. The flaw was in the interaction between user-supplied input and an internal service header - not in any single function, but in the gap between them. That gap was visible to a model reasoning across the full system in a way it was not visible to the engineers who built each component separately.
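Wiz's write-up is the source for the specifics; what follows is a deliberately simplified, hypothetical illustration of the vulnerability class - user-controlled data crossing into a line-based internal protocol - and not GitHub's actual code, protocol, or field names:

```ts
// Hypothetical sketch of the bug class, not GitHub's real code.
// An internal service speaks a line-based protocol: "key: value"
// headers separated by newlines, then a blank line, then a body.
function buildInternalRequest(repoName: string, body: string): string {
  // BUG: repoName comes from the pushing user and is never validated.
  // Because the receiving parser splits on newlines, a name containing
  // "\n" smuggles extra headers into the request.
  return `repo: ${repoName}\nrole: user\n\n${body}`;
}

// What the fix looks like: refuse control characters before the value
// crosses the trust boundary.
function safeHeaderValue(value: string): string {
  if (/[\r\n\0]/.test(value)) {
    throw new Error("control characters not allowed in header values");
  }
  return value;
}

// A ref name chosen by the attacker...
const attackerRef = "innocent-repo\nrole: admin\ncmd: touch /tmp/pwned";
// ...arrives at the backend as three headers instead of one.
console.log(buildInternalRequest(attackerRef, ""));
// The fixed path rejects the same input:
// buildInternalRequest(safeHeaderValue(attackerRef), "") -> throws
```

It is the same shape as classic HTTP header injection: a value that is inert data in one layer becomes live syntax in the next, and neither layer's author sees the whole picture.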
The same reasoning that found it defensively can find different gaps offensively. The white-hat researchers who discovered CVE-2026-3854 work for Wiz. The methodology they used is not proprietary to Wiz.
The Distance Between Glasswing and Daybreak
Anthropic's answer to this problem, announced April 7, was Project Glasswing - a coalition of twelve pre-selected technology companies, $100 million in usage credits, a model described as too dangerous to release broadly, and a controlled deployment with explicit safety gates. The justification was that Claude Mythos Preview's cybersecurity capabilities were sophisticated enough that releasing them without restriction would help attackers faster than it helped defenders. The gate was the product.
OpenAI's answer, announced May 11, was Daybreak - open to as many companies as possible, built on the premise that the right response to AI-enabled attacks is AI-enabled defense at scale, deployed before the attackers can consolidate their advantage. The Daybreak page explicitly positions "democratized access" as a design principle. The justification is that you cannot pick winners and losers in cybersecurity - everyone defending infrastructure needs the same tools.
Both positions are internally coherent. The tension between them describes the actual hard problem.
Anthropic's position assumes that the most capable cybersecurity AI is also the most dangerous, and therefore controlled deployment is worth the cost in accessibility. The cost is that most companies - the ones without the relationships or the profile to join a Project Glasswing coalition - defend themselves with inferior tools while sophisticated attackers develop their own capabilities regardless of what Anthropic does with Mythos.
OpenAI's position assumes that broad access to good defensive tools matters more than the marginal additional capability that bad actors gain from the same access. The cost is that the defensive tools and the offensive methodology are the same thing, and broad access means broad access.
Neither company has stated the uncomfortable implication of the other's position clearly. Anthropic has not said that OpenAI is arming attackers. OpenAI has not said that Anthropic is leaving most defenders without adequate tools. But both implications are present, and the infrastructure incidents of the past six weeks suggest that the window in which either position can be fully vindicated is closing.
The Infrastructure That Was Not Designed for This
The GitHub squash merge bug that this publication covered two weeks ago - the one that silently reverted committed code across 2,092 pull requests - is part of the same story as the RCE vulnerability, the axios supply chain attack, and the npm malware campaigns. They are not separate events. They are evidence of a single structural condition: the infrastructure that the world's software runs on was not designed to operate at the speed, volume, or threat level that AI has introduced.
As Chainguard CEO Dan Lorenc observed: "The complexity and scale of vulnerability management has outgrown the capabilities of most organizations to manage on their own."
GitHub is processing merged pull requests at peak volumes of 90 million - driven primarily by AI agents opening, reviewing, and merging code faster than the platform's architecture was designed to handle. The merge queue that broke under that load is the same infrastructure that processes every PR. The RCE vulnerability that Wiz found using AI had been present in the codebase for an unknown length of time before it was discovered. The axios attack succeeded because the attacker bypassed the safeguards that were supposed to catch exactly this kind of supply chain compromise.
These are not edge cases in a stable system. They are symptoms of a system operating past its designed limits in an environment that did not exist when the system was designed.
The developers who installed Claude Code between 00:21 and 03:29 UTC on March 31 - the window when the compromised axios package was live - did not make a mistake. They installed a tool from a trusted publisher using the standard package manager. The npm package they downloaded had been reviewed. The dependency it installed looked like real software. The malware executed on their machines before any human at any organization had time to react.
What Daybreak Is Actually Solving For
Daybreak is not a naive product. The technical capabilities it ships - secure code review, threat modeling, patch validation, dependency risk analysis - are the right tools for the problem that Wiz demonstrated when it found CVE-2026-3854 using AI. If AI is going to be used to find vulnerabilities, then defenders need AI to find them first. The race is real and the tooling gap between sophisticated defenders and everyone else is real.
The part Daybreak does not solve - and cannot solve through product design alone - is the asymmetry between attack and defense. Defense requires finding every exploitable path. Attack requires finding one. A model reasoning across a million tokens of codebase for defensive purposes needs to identify every vulnerable interaction. The same model reasoning across the same codebase for offensive purposes needs to identify one that works. The attacker's problem is strictly easier.
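A stylized model makes the asymmetry concrete. Assume a codebase with n independently exploitable paths, a review process (human or AI) that catches any given path with probability p, and an attacker who will find whatever the review misses - all assumed numbers, not measurements:

```ts
// A stylized model of the attack/defense asymmetry, not a measurement.
// Assumptions: n independent exploitable paths; review catches each
// with probability p; the attacker finds any path the review misses.
const p = 0.95; // chance any single vulnerable path is caught in review
const n = 40;   // number of exploitable paths in the codebase

// Defense succeeds only if every path is caught: p^n.
const defenderWins = Math.pow(p, n);

// Attack succeeds if at least one path slips through: 1 - p^n.
const attackerWins = 1 - defenderWins;

console.log(defenderWins.toFixed(3)); // ~0.129
console.log(attackerWins.toFixed(3)); // ~0.871
```

At a 95% per-path catch rate across 40 paths, the defender clears the whole board about 13% of the time; the attacker gets in the other 87%. The defender's odds decay exponentially with the size of the attack surface. The attacker's improve.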
The exploit window - the time between disclosure and active exploitation - has compressed from 700 days to effectively zero for the most critical vulnerabilities. The AI that accelerates patch development also accelerates exploit development. The question is which side can iterate faster given the same underlying capability.
Anthropic's Mythos Preview found thousands of zero-day vulnerabilities across every major operating system and browser. It reported them. The patches are being developed. The same capability, pointed at systems that have not been patched yet, does the same reasoning in the other direction.
Daybreak and Glasswing are both correct that something needs to be done. They disagree about what. The incident record of the past six weeks does not clearly vindicate either position. It suggests that both positions are being tested against an adversarial environment that does not pause while the industry debates the right access model.
The Question Neither Company Is Asking Loudly Enough
Altman's twelve words - "AI is already good and about to get super good at cybersecurity" - describe a trajectory. The question underneath the trajectory is: good for whom, and at what pace on each side?
The answer, based on available evidence, is: for both sides, faster than most defenders are prepared for.
The npm ecosystem that processes billions of package installations per week does not have a real-time AI review layer. The GitHub infrastructure that hosts 420 million repositories had an RCE vulnerability that 88% of enterprise instances had not patched weeks after a fix was available. The developers installing packages from trusted publishers have no reliable signal that the package they are installing was generated by an AI instructed to look legitimate.
Daybreak is the right vision. Broad access to AI-powered defensive tooling, continuously applied, integrated into the development loop from the start. The problem is not the vision. The problem is the deployment gap between the vision and the current state of the infrastructure it is supposed to protect.
The next major supply chain attack will not look like the axios incident. It will look like something nobody has seen before - because the AI designing it will have read every published post-mortem about every previous attack, reasoned across every known detection pattern, and designed something that does not match any of them.
That is not a prediction. It is the logical output of the same capability curve that made SWE-bench scores go from 33% to just under 81% in sixteen months.
The defenders just got better tools. So did everyone else. The race is not new. The speed of it is.
Sources:
- Daybreak - OpenAI
- Sam Altman on X, May 11, 2026 - X
- Cybersecurity in the Intelligence Age - OpenAI
- OpenAI rolls out tiered access to advanced AI cyber models - Axios
- Axios npm supply chain compromise - Datadog Security Labs
- Weaponizing Trust Signals: Claude Code Lures and GitHub Release Payloads - Trend Micro
- GitHub RCE Vulnerability: CVE-2026-3854 Breakdown - Wiz Research
- 2026: The year of AI-assisted attacks - Chainguard
- Project Glasswing - Anthropic
- An update on GitHub availability - GitHub Blog
Previously on TheQuery: Anthropic Built a Model Too Dangerous to Sell. So It Gave It Away to Fix the Internet. and Vibe Coding Broke GitHub. That Is Not the Surprising Part.