>_TheQuery

Codex vs Claude Code: Who's Winning?

By Addy · March 25, 2026

OpenAI has 1.6 million weekly active Codex users. Usage measured in tokens has grown 5x since February. Sam Altman announced a "bunch of new products" coming in the next month. The company is merging ChatGPT, its browser, and Codex into a single desktop super app. The marketing machine is running at full speed.

Meanwhile, Anthropic shipped auto mode for Claude Code on March 25. Quietly. In an engineering blog post. No press release. No Sam Altman tweet. Just a technical writeup about classifiers, threat models, and a 17% false-negative rate, published because the team decided developers should know.

This has been the pattern of the last three months. OpenAI talks loudly. Anthropic ships quietly. The market is noticing.


What OpenAI Is Doing

The Codex story is genuinely impressive on its own terms. GPT-5.3-Codex launched February 5, minutes before Anthropic launched Claude Opus 4.6, a timing that did not look accidental. One million downloads of the Codex desktop app. 1.6 million weekly active users, more than tripling since launch. Major enterprise deployments at Cisco, NVIDIA, Ramp, Rakuten, and Harvey.

OpenAI is also eating its own cooking. An internal researcher reported that Codex now writes 100% of their code. The company used early versions of GPT-5.3-Codex to debug its own training, manage deployment, and scale GPU clusters. Sam Altman: "It was amazing to watch how much faster we were able to ship 5.3-Codex by using 5.3-Codex."

The super app announcement, merging ChatGPT, Codex, and the Atlas browser into one interface, is the consolidation move of a company trying to reduce product fragmentation while preparing for a public-market story. Fidji Simo described it as doubling down on what is working.

The marketing is loud because OpenAI needs it to be. The company is operating under what multiple reports describe as a competitive pressure cycle: Claude surged in app rankings, and business market share shifted rapidly over the last year.

Codex growth is real. It is also the story OpenAI needs to tell to counter a narrative that is not moving in its favor.


What Anthropic Is Actually Shipping

Between February and today, Claude Code shipped the following without a major press event:

Code review (March 9): A multi-agent system dispatching parallel AI agents on pull requests, with a verification layer that filters false positives. Substantive review coverage at Anthropic reportedly jumped from 16% to 54% of PRs after adoption.

Swarm mode: Parallel agent orchestration for complex multi-repository tasks. Shipped quietly alongside OpenAI's Symphony announcement, two companies solving the same problem from opposite architectural directions.

Auto mode (March 25): By default Claude Code asks for human approval before running commands or modifying files. Auto mode replaces that human approval with a classifier, a middle ground between manually approving every action and the --dangerously-skip-permissions flag that disables guardrails entirely. The classifier evaluates each action against user intent before execution. Anthropic documented two defensive layers: a prompt injection probe at input and a transcript classifier at output. Reported rates on production traffic were a 0.4% false-positive rate and a 17% false-negative rate on genuinely overeager actions.
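The code review pipeline described above, parallel specialized agents feeding a verification layer that filters false positives, can be sketched in a few lines. Everything here is illustrative: the agent names, the `Finding` type, and the confidence-threshold filter are my stand-ins, since Anthropic has not published the actual architecture.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical finding emitted by a single review agent.
@dataclass(frozen=True)
class Finding:
    file: str
    message: str
    confidence: float  # the agent's own confidence in the issue

# Stand-ins for specialized review agents; a real system would call
# a model with a focused prompt per concern (security, style, etc.).
def security_reviewer(diff: str) -> list[Finding]:
    if "eval(" in diff:
        return [Finding("app.py", "possible eval injection", 0.9)]
    return []

def style_reviewer(diff: str) -> list[Finding]:
    if "TODO" in diff:
        return [Finding("app.py", "unresolved TODO", 0.3)]
    return []

def verify(finding: Finding) -> bool:
    """Second-pass filter: drop low-confidence findings to cut false positives."""
    return finding.confidence >= 0.5

def review(diff: str) -> list[Finding]:
    agents = [security_reviewer, style_reviewer]
    # Dispatch all agents on the same diff in parallel.
    with ThreadPoolExecutor() as pool:
        raw = [f for result in pool.map(lambda a: a(diff), agents) for f in result]
    # Verification layer: only confident findings survive.
    return [f for f in raw if verify(f)]

print(review("eval(user_input)  # TODO: sanitize"))
# → only the high-confidence security finding; the TODO nit is filtered out
```

The interesting design choice is the second pass: parallel agents maximize coverage, and the verifier pays for that breadth by discarding noise, which is plausibly how a 16%-to-54% jump in substantive review coverage stays tolerable.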

The auto mode engineering post is worth reading for what it reveals about real agent failures. Anthropic's own examples include deleting remote git branches from a misinterpreted instruction, uploading a GitHub auth token to internal compute, and attempting migrations against a production database.

Anthropic's response was to build a classifier that catches many of those failures, and to publish how well it works, including where it still misses.
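The two-layer permission flow reads roughly like the sketch below: probe the input for injection, classify the proposed action against stated intent, and fall back to asking the human when either check fails. The regex patterns and keyword lists are crude placeholders for what are actually model-based classifiers; none of these heuristics come from Anthropic's post.

```python
import re

# Layer 1: crude prompt-injection probe over the context that produced
# the action. Placeholder patterns; the real probe is a classifier.
INJECTION_PATTERNS = [r"ignore (all )?previous instructions", r"disregard the user"]

def input_looks_injected(context: str) -> bool:
    return any(re.search(p, context, re.IGNORECASE) for p in INJECTION_PATTERNS)

# Layer 2: stand-in for the transcript classifier that scores whether the
# proposed action matches the user's stated intent. Keyword matching here;
# the real system evaluates the full transcript with a model.
RISKY_COMMANDS = ("rm -rf", "git push --force", "DROP TABLE")

def action_matches_intent(action: str, intent: str) -> bool:
    return not any(cmd in action for cmd in RISKY_COMMANDS)

def gate(action: str, intent: str, context: str) -> str:
    """Return 'run' to execute, or 'ask' to fall back to human approval."""
    if input_looks_injected(context):
        return "ask"
    if not action_matches_intent(action, intent):
        return "ask"
    return "run"

print(gate("pytest tests/", "run the test suite", "please run the tests"))    # run
print(gate("git push --force origin main", "fix a typo", "please fix it"))    # ask
```

The failure rates map directly onto this structure: a false positive is `gate` returning "ask" on a benign action (0.4%), a false negative is "run" on an overeager one (17%). Publishing both numbers is the unusual part.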


The Market Numbers

The contrast in communication style would be interesting on its own. The revenue and spending numbers made it a bigger story.

Claude Code reportedly crossed $2.5 billion in annualized revenue. OpenAI Codex reportedly reached $1 billion annualized by early 2026. Business invoice share data cited in market reporting showed a large shift toward Anthropic over the same period.

The developer community noticed a product-level shift. Spending datasets that track actual enterprise software usage, rather than self-reported sentiment, pointed to sustained Claude Code adoption across many teams.

One developer on X captured the tone: "OpenAI is sprinting to catch up to where Claude Code was six months ago."

That framing is probably too harsh. GPT-5.3-Codex is a genuinely capable model and Codex has real enterprise traction. But the sequence is broadly accurate: Claude Code became the reference point for many technical teams, and OpenAI's communication volume appears partly responsive to that reality.


Two Different Theories of the Market

The product difference is not just features. It is what each company thinks developers want.

OpenAI's theory: developers want a complete integrated product. One app. One subscription. Code, chat, browser, and agent in a unified experience that is easy to start and hard to leave. The super app move reflects this. So does the emphasis on large user metrics.

Anthropic's theory: developers want a tool that does not break production infrastructure and is candid about risk. The auto mode blog post, with incident examples, classifier tables, and an explicit 17% miss rate, is a product pitch to engineers who have already seen an agent go rogue.

Both theories can be right for different buyers. Teams that want a unified managed experience may prefer OpenAI's packaging. Teams deploying AI agents in high-stakes environments may prefer Anthropic's transparency.


The Honest Read

OpenAI is not losing in absolute terms. The Codex numbers are real, the enterprise footprint is real, and the product is capable.

But the dynamic of the last three months is clear. One company is publishing implementation details and failure rates. The other is emphasizing growth metrics and ecosystem consolidation.

In developer tools, the first approach often wins the long game. Developers distrust marketing and trust engineers who show their work.

The auto mode post is the kind of artifact that gets bookmarked, shared in team channels, and referenced later when someone asks why a team switched to Claude Code. It is not a headline play. It is evidence.

OpenAI has the louder voice right now. Anthropic has the receipts.


Sources:

Previously on TheQuery: Claude Code Review vs CodeRabbit: Two Philosophies of AI Code Review - where this story started in March.