
The US Government Is Now Approving AI Models Before They Ship. The Question Nobody Is Asking Is Why Not Fix the Software Instead.

By Addy · June 27, 2026

On June 12, the US Commerce Department issued an export control directive requiring Anthropic to suspend access to Claude Fable 5 and Mythos 5 for every foreign national, including foreign Anthropic employees. Anthropic said the only immediate way it could comply was to disable both models for every customer on Earth.

On June 26, the Trump administration asked OpenAI to restrict GPT-5.6, a family of three models named Sol, Terra, and Luna, to approximately twenty government-vetted partners before any broader rollout. OpenAI complied while saying this should not become the long-term default.

Hours later, the government partially reversed the Anthropic shutdown. Commerce cleared Mythos 5 to return for a limited list of approved cyber defenders, infrastructure providers, and their foreign employees. Export controls remain in place for everyone outside that list, and Fable 5 remains restricted.

That update does not weaken the story. It clarifies it.

The US government is no longer only setting rules for what frontier AI companies may export. It is deciding which models may operate, which organizations may use them, and which customers appear on the approved list. The model may already exist. The lab may believe its safeguards are sufficient. The product may already have shipped. Access now depends on government review.

Two frontier model families gated within fourteen days. The same justification both times: these models have advanced cybersecurity capability. They can find software vulnerabilities at a speed and scale that humans cannot match. That capability is too dangerous to distribute broadly before the government has assessed it.

That argument is coherent. It is also incomplete. And the question it avoids is the one that deserves the loudest attention: if AI models are dangerous because the software infrastructure underneath them is full of exploitable vulnerabilities, why is the response to gate the AI rather than fix the software?

What the Government Is Actually Saying

The official justification for both restrictions is specific and consistent.

Officials are concerned that GPT-5.6 Sol and Mythos-class models can identify software vulnerabilities and help navigate complex attack chains with less human effort. Anthropic said the June 12 directive followed government concern about a possible Fable jailbreak. OpenAI said Sol is better at helping people find and fix vulnerabilities than carrying out end-to-end attacks, but acknowledged that unforeseen risks could emerge when the model is combined with other tools.

The pattern suggests pre-release government review is becoming a recurring feature of frontier model launches rather than a one-off event. Three indicators point in that direction: government briefing on model capabilities before launch, customer-by-customer access during preview, and formal coordination between AI labs and the Office of the National Cyber Director and Office of Science and Technology Policy before public availability.

The June executive order described participation in federal model vetting as voluntary and called for a formal process that has not yet been fully built. In practice, the current process is already consequential. OpenAI limited GPT-5.6. Anthropic disabled Fable 5 and Mythos 5, then received permission to restore Mythos only to entities named in a government annex. Commerce reserved the right to alter that list at any time.

OpenAI says GPT-5.6 Sol is competitive with Mythos Preview while using roughly a third of the output tokens. That efficiency claim matters because it means the capability the government is concerned about is becoming cheaper and more accessible with every generation, not more restricted. Gating one model buys time. It does not change the trajectory.

The government's argument is essentially this: the software infrastructure that runs the world's critical systems is brittle, contains vulnerabilities that have existed for years, and is not being patched fast enough. A model that can find and chain those vulnerabilities at AI speed is a force multiplier for whoever uses it offensively. Until defensive infrastructure catches up, the model should not be broadly available.

This is a reasonable argument. The premise is accurate. The conclusion is where the disagreement begins.

The Software Is Old. Very Old.

Here is the premise the government's argument is built on, stated plainly: the critical software infrastructure that runs hospitals, power grids, financial systems, government networks, and the internet's core protocols is full of vulnerabilities that have been there for years, sometimes decades, and are not being fixed.

This is not speculation. It is a documented condition with a paper trail going back to the earliest era of networked computing.

Log4Shell was disclosed in 2021 after vulnerable behavior had sat inside a widely deployed logging library for years. The OpenSSL Heartbleed bug disclosed in 2014 had been present for roughly two years before anyone found it, in a library responsible for encryption across a significant portion of internet traffic. The SolarWinds compromise gave a foreign state actor prolonged access to government and enterprise networks through a software supply chain so complex that thousands of organizations did not know they were running the affected code.

CVE-2026-3854, the GitHub remote code execution vulnerability TheQuery covered in April, was found in the git push pipeline after it had been present for an unknown period. The npm Shai-Hulud campaign ran across multiple waves and turned trusted package infrastructure into a malware distribution layer. DeepSWE found verifier infrastructure accepting wrong implementations 8.5% of the time and rejecting correct ones 24% of the time on a benchmark the AI industry relied on to measure coding capability.

The pattern is not a series of isolated incidents. It is a structural condition. Software accretes complexity faster than it accretes security. Dependencies compound. Legacy code survives because replacement is expensive and risky. Security patches get deferred because they require downtime in systems that cannot afford downtime. The result is an infrastructure layer that looks modern from the outside (cloud-native, containerized, API-driven) but contains code written decades ago at its foundations.

The government's concern about Mythos-class AI is that it can inspect far more of this system at once than a human researcher can. A model with a million-token context window can reason across large dependency graphs and look for interactions between components separated by thousands of lines of code. The GitHub RCE that Wiz found using AI sat in the gap between user-supplied input and an internal service header: the flaw was not in any single function but in the interaction between them.

The government looked at this capability and decided to gate the model. The question is whether that decision addresses the structural condition or merely delays the moment when someone else, with a different model or the same techniques, does the same reasoning without asking permission.

The Gatekeeping Argument and Its Limits

The case for restricting frontier AI before broad release is not stupid. It deserves engagement rather than dismissal.

The export-control logic is real. A model that can chain zero-day vulnerabilities in major operating systems provides genuine offensive uplift to any state or non-state actor that gets access to it. The difference between a sophisticated attacker without Mythos-class AI and one with it may be the difference between months of human expert work and days of AI-assisted research. The Calif team's Apple M5 bypass happened in less than a week. The gap is real, and the government is correct to take it seriously.

The controlled-deployment model has precedent. Nuclear technology is not freely distributed. Certain biological research requires biosafety classification. Export controls on advanced semiconductors can slow adversaries' access to compute. There is a legitimate tradition of restricting dangerous dual-use technologies while defensive infrastructure catches up.

OpenAI itself says GPT-5.6 Sol introduces an ultra mode that uses coordinated subagents to solve highly complex tasks. A multi-agent system designed for complex task resolution, with advanced cybersecurity capability, is not a consumer product in the same category as a better autocomplete. The government's instinct that this requires scrutiny before broad release is not irrational.

But here is where the argument runs into its own limits.

Restricting Fable 5 and GPT-5.6 Sol does not restrict the underlying techniques. It restricts access to those specific models. DeepSeek V4 is MIT-licensed and available on Hugging Face. GLM 5.2 is MIT-licensed and beats GPT-5.5 on SWE-bench Pro in its published comparison. Other foreign and open-weight models are improving on timelines that US access-control policy does not govern.

None of these models needs to match Mythos exactly to weaken the gatekeeping logic. They only need to provide enough uplift for vulnerability discovery that restricting one American endpoint no longer restricts the capability globally.

The argument assumes that controlling US-jurisdiction frontier models meaningfully controls access to advanced cyber capability. That assumption was more defensible in 2023, when American labs held a clear lead. It is less defensible in June 2026, when capable open-weight and foreign models sit much closer to the gated frontier.

Fix the Software

Here is the argument the government's framework does not engage with, stated as directly as possible.

The reason Mythos-class AI is dangerous for cybersecurity is that the software infrastructure it would be used against is full of vulnerabilities that have existed for years and are not being fixed at the speed required. The AI did not create those vulnerabilities. The AI can find them faster than humans can. Gating the AI does not patch the vulnerabilities.

Every day that vulnerabilities like CVE-2026-3854 remain in production systems, every day that package ecosystems operate without reforms that make Shai-Hulud-class attacks harder, every day that legacy government systems depend on unsupported software, is a day that the threat grows independently of what any AI lab ships.

The question is not whether dangerous AI should ever be gated. The question is whether gating dangerous AI while leaving the vulnerable infrastructure it would exploit unchanged makes that infrastructure more secure or merely delays the acknowledgment that it is not.

OpenAI's Daybreak initiative and Anthropic's Project Glasswing were built on the opposite premise: use the AI that finds vulnerabilities to find them defensively, report them to vendors, and get them patched before attackers find them. OpenAI's June 22 Daybreak update describes Codex Security, GPT-5.5-Cyber, a cyber partner program, and Patch the Planet, an initiative with Trail of Bits, HackerOne, Calif, and open-source maintainers to move from vulnerability findings to deployed fixes.

That is the part of this debate that matters most. OpenAI says the bottleneck has shifted from finding vulnerabilities to patching them. If that is true, then a policy focused mainly on restricting discovery capability is acting on the previous bottleneck.

Project Glasswing had expanded to roughly 150 organizations before the June 12 shutdown. Commerce has now allowed Mythos 5 to return only for an approved subset. The vulnerable infrastructure that hundreds or thousands of qualified security teams could be auditing is still being audited by a small, government-selected group.

Three indicators deserve attention: whether the US formalizes review procedures under the new cyber executive order, how quickly the approved-partner lists expand, and how fast OpenAI and Anthropic transition from staged access to broader availability.

The most important indicator is one those three leave out: how many critical vulnerabilities get patched during the preview period. That number is the real test of whether the policy is making the world safer or merely making the most capable defensive tools harder to access.

Who the Restriction Actually Affects

The initial partner list for GPT-5.6 contains approximately twenty organizations. Their participation was shared with and approved by the government. They are, by definition, large enough and connected enough to be in the room when these decisions are made.

The developer at a midsize security firm auditing client infrastructure is not necessarily in the room. The university researcher building tools to help hospitals secure patient-data systems is not necessarily in the room. The engineer at a healthcare startup trying to comply with HIPAA while defending a small team is not necessarily in the room. The independent penetration tester finding vulnerabilities in small-business websites that would otherwise never be audited is not necessarily in the room.

These are the users who lose access when a model is gated. A sophisticated state-level adversary may have its own research programs, compute, access to foreign alternatives, and a timeline that does not depend on a ChatGPT subscription or an Anthropic API key.

The restriction widens the gap between what large vetted institutions can do with AI and what everyone else can do. In cybersecurity, that gap can be dangerous. The hospitals attacked by ransomware are often not large institutions with Glasswing access. They are organizations running old imaging systems because replacement is expensive and downtime is unacceptable. The midsize manufacturer hit by a supply-chain attack may have three IT staff and no dedicated security team.

Those are the organizations that need AI-assisted security audits most. Those are also the organizations least likely to appear on an approved list.

What a Better Policy Looks Like

The government's instinct that Mythos-class AI requires scrutiny before broad release is not wrong. The implementation (gating the model, approving access customer by customer, and restricting access through an ad hoc process) is the wrong instrument for the whole problem.

A better policy would include mandatory disclosure of critical vulnerabilities found by AI systems, with defined timelines for vendor response before public disclosure. This is the responsible-disclosure framework Glasswing uses, applied at scale rather than limited to a small group. Software vendors whose products run in critical infrastructure would have a defined window to ship a patch before the finding becomes public. The AI tool can be available to verified defenders while the vulnerability information is controlled.
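To make the "defined window" idea concrete, here is a minimal sketch of how a disclosure clock could be tracked. The 90-day figure is an assumption borrowed from common industry practice, not something any of the proposals above specify, and the function name and status labels are hypothetical.

```python
from datetime import date, timedelta

# Assumption: a 90-day window. The article does not specify a length.
DISCLOSURE_WINDOW_DAYS = 90


def disclosure_status(reported_on, today, patched_on=None,
                      window_days=DISCLOSURE_WINDOW_DAYS):
    """Return (status, deadline) for a finding reported to a vendor.

    status is one of three hypothetical labels:
      "patched-in-window"               vendor shipped a fix before the deadline
      "eligible-for-public-disclosure"  the window has elapsed without a timely fix
      "embargoed"                       the window is still open
    """
    deadline = reported_on + timedelta(days=window_days)
    if patched_on is not None and patched_on <= deadline:
        return "patched-in-window", deadline
    if today >= deadline:
        return "eligible-for-public-disclosure", deadline
    return "embargoed", deadline


if __name__ == "__main__":
    status, deadline = disclosure_status(date(2026, 6, 1), today=date(2026, 7, 1))
    print(status, deadline.isoformat())
```

The point of the sketch is that the controlled object is the finding and its clock, not the tool that produced it.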

A better policy would require security audits for software running in critical infrastructure, using AI-assisted tools, with government funding for organizations that cannot afford commercial security teams. A rural hospital running outdated systems gets an AI-assisted security audit the same way it gets inspections for other health and safety requirements. The audit is subsidized. The finding triggers a remediation requirement. The AI is the tool, not the target of restriction.

A better policy would invest in open-source security infrastructure: package signing, provenance attestation, reproducible builds, supply-chain monitoring, and maintainers responsible for software the entire economy depends on. OpenAI's Patch the Planet effort is directionally right. It is still tiny relative to the size of the dependency ecosystem.
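As a small illustration of what this category of infrastructure does, the sketch below checks a downloaded artifact against a pinned SHA-256 digest. This is plain digest pinning, a much simpler integrity check than full package signing or provenance attestation, and the function name and sample bytes are hypothetical.

```python
import hashlib
import hmac


def matches_pinned_digest(artifact_bytes, pinned_sha256_hex):
    """Return True if the artifact's SHA-256 digest equals the pinned value."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, pinned_sha256_hex.strip().lower())


if __name__ == "__main__":
    # Hypothetical artifact contents; in practice these would be the bytes
    # of a downloaded package, and the pin would come from a lockfile.
    original = b"example package contents"
    pin = hashlib.sha256(original).hexdigest()

    print(matches_pinned_digest(original, pin))
    print(matches_pinned_digest(original + b" tampered", pin))
```

A pin only proves the bytes have not changed since the pin was recorded. Signing and provenance attestation go further by tying the artifact to who built it and from what source.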

None of these policies requires pretending access controls have no role. They change what the policy optimizes for. Instead of measuring success by how few people can reach the model, they measure success by how quickly discovered vulnerabilities become deployed patches.
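The success measure described above can be computed from very little data. The following is a sketch under stated assumptions: the records are invented for illustration, and the choice of the median as the summary statistic is ours, not something the article or any agency prescribes.

```python
from datetime import date
from statistics import median

# Hypothetical records: (identifier, date found, date patch deployed or None if open).
findings = [
    ("finding-a", date(2026, 6, 1), date(2026, 6, 8)),
    ("finding-b", date(2026, 6, 3), date(2026, 6, 24)),
    ("finding-c", date(2026, 6, 10), None),
    ("finding-d", date(2026, 6, 15), date(2026, 6, 19)),
]


def patch_metrics(records):
    """Return (share of findings patched, median days from discovery to patch)."""
    days = [(patched - found).days
            for _, found, patched in records
            if patched is not None]
    share = len(days) / len(records) if records else 0.0
    median_days = median(days) if days else None
    return share, median_days


if __name__ == "__main__":
    share, median_days = patch_metrics(findings)
    print(f"patched: {share:.0%}, median days to patch: {median_days}")
```

A review process judged on these two numbers has a reason to widen defensive access. One judged on the length of the approved list does not.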

The Precedent That Is Being Set

The era when the most capable AI systems arrived as ordinary consumer product drops may be ending faster than the public understands. Frontier models are becoming infrastructure, and Washington has begun treating access to them as a national-security decision rather than a software subscription.

The precedent being set in June 2026 is that frontier AI capability is an asset the government can review, gate, and approve on its own timeline. OpenAI's statement that this should not become the long-term default is a preference, not a veto. The government asked. OpenAI complied. Anthropic disagreed publicly with the June 12 order. It complied too.

The fact that Mythos 5 returned on June 27 does not undo the precedent. It reinforces it. Commerce decided which entities regained access, left Fable 5 restricted, and reserved the right to change the list later.

OpenAI is working with the administration on a repeatable process for future model releases. The word repeatable is the tell. This is not a one-time exception. It is the design of a new system.

The question is what that system optimizes for. If it optimizes for controlling who has access to the most capable AI, it will consistently favor large, well-connected, well-funded institutions over the distributed, smaller organizations that constitute much of the attack surface the government claims to be protecting.

If it optimizes for reducing the number of exploitable vulnerabilities in critical infrastructure, then the AI is not the only problem to control. The software is the problem to fix.

The AI can help fix it. That is the capability the government is gating. The alternative is not reckless public access with no safeguards. It is controlled defensive access at a scale that matches the problem, paired with disclosure requirements, patch deadlines, monitoring, and funding for the maintainers and institutions that cannot afford this work alone.

The software that advanced AI could exploit is old, brittle, and everywhere. The AI that could find and help fix its vulnerabilities is here, capable, and available only through government-approved lists while the vulnerabilities continue to exist.

Both statements are true. The policy that follows from the first is different from the policy that follows from the second. Washington is currently building the first. The case for the second has never been stronger.


Previously on TheQuery: "The Defenders Just Got Better Tools. So Did Everyone Else." and "Claude Mythos Cracked Apple M5 Security in Five Days. Apple Spent Five Years Building It."