Claude Opus 4.7 Proved the Race Has No Finish Line
By Addy · April 16, 2026
On April 15, this publication wrote that the cheap model was winning.
The point of that piece was simple: open and open-weight systems like Kimi are compressing the price of intelligence faster than frontier labs expected, and any company still charging a premium now has to justify that premium every single week.
On April 16, Anthropic answered.
Claude Opus 4.7 shipped hours later.
This is the AI industry in April 2026. One day an open model narrows the gap. The next day a frontier lab moves the line again. Then someone else ships something cheaper. There is no stable news cycle anymore. There is only velocity.
What Opus 4.7 Actually Ships
Anthropic's official announcement is direct. Claude Opus 4.7 is generally available today across Claude products, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. The price is unchanged from Opus 4.6: $5 per million input tokens and $25 per million output tokens.
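At those rates, per-request cost is simple arithmetic. A minimal sketch (the helper function and the example token counts are illustrative, not from Anthropic):

```python
def opus_47_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate Opus 4.7 API cost in USD at the announced rates:
    $5 per million input tokens, $25 per million output tokens."""
    return (input_tokens / 1_000_000) * 5 + (output_tokens / 1_000_000) * 25

# e.g. a coding-agent turn with a 100k-token context and 10k tokens of output
print(f"${opus_47_cost(100_000, 10_000):.2f}")  # → $0.75
```

At these prices, a long agentic session that churns through a few million tokens lands in the tens of dollars, which is exactly the spend that the token controls below are designed to make predictable.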
The capability, however, is not unchanged.
Anthropic positions Opus 4.7 as its most capable generally available model, with specific gains in advanced software engineering, long-horizon agentic work, knowledge work, memory, and vision. In the same announcement, the company also says something unusual: Opus 4.7 is still less broadly capable than Mythos Preview, the model Anthropic is keeping behind Project Glasswing's controlled-release gate.
That framing matters as much as the benchmarks.
Anthropic is not just launching a better model. It is explicitly telling the market that the thing it is willing to sell broadly is not the ceiling of what it can build.
The Benchmark Picture
The benchmark story comes from Anthropic's published comparison charts and the reporting that unpacked them.
According to Anthropic's release data as reported by VentureBeat, Opus 4.7 scores 87.6% on SWE-bench Verified, up from 80.8% for Opus 4.6. On SWE-bench Pro, the harder private-set evaluation built specifically to reduce memorization effects, it reaches 64.3%, versus 53.4% for its predecessor. On GPQA Diamond, it hits 94.2%.
Those are not cosmetic gains.
They mean Anthropic did not just make Opus 4.6 slightly more polished. It moved the model materially on the hardest coding and reasoning tasks people actually use to compare frontier systems.
The official release also includes partner-reported improvements that are arguably more useful than the headline tables.
Cursor says Opus 4.7 clears 70% on CursorBench versus 58% for Opus 4.6. Notion reports a 14% lift on complex multi-step workflows using fewer tokens and one-third the tool errors, calling it the first model to pass its implicit-need tests. Rakuten says Opus 4.7 resolves three times more production tasks than Opus 4.6 on Rakuten-SWE-Bench. CodeRabbit reports recall improving by more than 10% on difficult bugs while precision stayed stable.
These are not toy metrics. They are signals from products already built around long-running coding workflows.
The Features Developers Will Actually Notice
Benchmarks get headlines. Product features decide whether teams upgrade.
The most important official addition is xhigh, a new effort level between high and max. Anthropic's platform docs describe it as a finer-grained tradeoff between intelligence, latency, and token spend for hard problems, and Claude Code's official changelog adds support for it across /effort, the model picker, and related workflows.
That is a bigger change than it sounds.
It means Anthropic is not treating deeper inference as an edge case anymore. It is making a higher-effort reasoning mode a first-class part of coding work.
Second, Anthropic's API docs add task budgets in public beta. Instead of letting a long agentic loop consume tokens blindly, task budgets give the model a visible advisory budget across thinking, tool calls, tool results, and final output. That is a direct answer to one of the most consistent complaints about production AI workflows: unpredictability.
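Taken together, the two controls above suggest a request shape like the following. This is a hypothetical sketch only: the field names (effort, task_budget, max_total_tokens) are assumptions based on the announcement's wording, not a confirmed API schema.

```python
# Hypothetical request body combining the two controls described above.
# Field names are assumed from the announcement's description, not from
# verified API documentation.
request = {
    "model": "claude-opus-4-7",
    "effort": "xhigh",  # the new tier between "high" and "max"
    "task_budget": {
        # Advisory token cap spanning thinking, tool calls,
        # tool results, and the final output.
        "max_total_tokens": 200_000,
    },
    "messages": [
        {"role": "user", "content": "Refactor the payment retry logic."}
    ],
}
```

The detail that matters is that the budget is advisory and visible to the model: a long agentic loop can plan its remaining spend instead of consuming tokens blindly and being cut off mid-task.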
Third, Claude Code's official changelog adds /ultrareview, a cloud-based review mode using parallel multi-agent analysis and critique. Anthropic's release notes position it as a more comprehensive review pass; CodeRabbit's launch feedback makes the intended use case clear. This is supposed to catch the bugs and design problems a careful senior reviewer would normally catch, not just the easy syntax failures.
Then there is vision.
Anthropic's platform docs say Opus 4.7 is the first Claude model with high-resolution image support up to 2,576 pixels on the long edge, or about 3.75 megapixels. In Anthropic's own release, XBOW reports its visual-acuity benchmark jumping from 54.5% on Opus 4.6 to 98.5% on Opus 4.7. For teams building computer-use agents, that is the difference between "sometimes sees enough" and "can actually navigate dense interfaces."
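The two stated numbers are consistent with each other: a frame whose long edge is 2,576 pixels at a roughly 16:9 aspect ratio comes out to about 3.75 megapixels. The short edge below is an illustrative companion dimension, not a documented limit:

```python
LONG_EDGE = 2_576    # documented maximum long edge for Opus 4.7
short_edge = 1_456   # illustrative short edge (~16:9), not from the docs

megapixels = LONG_EDGE * short_edge / 1_000_000
print(f"{megapixels:.2f} MP")  # → 3.75 MP
```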
The Honest Benchmark Story
Anthropic did something unusual in this launch.
It shipped Opus 4.7 and, in the same breath, reminded everyone that Mythos Preview is still stronger.
That is not normal behavior for labs. Most companies do not publish a release and simultaneously clarify that their gated model still wins the internal family race. Anthropic did it anyway.
The message is clear: Opus 4.7 is the best model Anthropic is willing to release broadly, not the strongest model it has in the lab.
That creates a new kind of product segmentation.
This is not just free versus paid, or consumer versus API. It is risk-tiered capability. Anthropic says it deliberately reduced Opus 4.7's cyber capability relative to Mythos during training, and it is using Opus 4.7 as the first broad commercial deployment of new cyber safeguards while Mythos remains restricted.
In other words, broad availability now comes with a capability ceiling by design.
That is a very different kind of moat from simple pricing.
The Week That Describes the Industry
On April 13, Moonshot quietly confirmed k2.6-code-preview to beta testers. A cheap, open-weight-adjacent coding model kept improving without a formal launch.
On April 16, Anthropic shipped Opus 4.7. Same price as Opus 4.6. Materially stronger. Better coding, better tools, better vision, better control over token usage, and a clearer product story around long-horizon AI agents.
Both things are true at once.
The open models are compressing the price of intelligence. The frontier labs are still moving the capability frontier. Neither dynamic cancels the other.
That is what makes this market so hard to read with old instincts. Open models are not "winning" because closed models stopped improving. Closed models are not "safe" because open models are still behind on some tasks. The whole system is moving in parallel: cheaper models are catching up while frontier labs keep re-earning the premium.
The question is which curve compounds faster.
What Opus 4.7 Does Not Tell You
Two absences matter.
First, the broader design-tool story around Anthropic that surfaced in reporting this week did not ship here. April 16 was the model release, not the bigger product stack people were speculating about.
Second, Mythos Preview remains behind the Glasswing gate. The model Anthropic says is more capable than Opus 4.7 is still accessible only to a small set of preselected partners and critical-software organizations.
That means Opus 4.7 is best understood as the answer to a narrower question: what is the strongest model Anthropic can release broadly, safely, and at the same price point as Opus 4.6?
The answer is impressive.
But the fact that Anthropic is now visibly separating "best generally available" from "best overall" may be the more important long-term signal.
The Race, At This Speed
The industry does not really have quarters anymore. It barely has weeks.
Kimi K2.6 confirmed on April 13. Opus 4.7 shipped on April 16. Something else will ship next week.
The benchmarks are not settling. The pricing is not settling. The feature sets are not converging. The most capable model you can buy and the most capable model that exists are now openly different categories.
That is what Claude Opus 4.7 proved.
Not just that Anthropic can still move the frontier. Not just that frontier pricing can still buy real performance. Not just that the open-model pressure is real.
It proved that none of these questions are close to resolved.
The race has no finish line.
It has faster runners every week.
Sources:
- Introducing Claude Opus 4.7 - Anthropic
- What's new in Claude Opus 4.7 - Claude API Docs
- Anthropic releases Claude Opus 4.7, narrowly retaking lead for most powerful generally available LLM - VentureBeat
Previously on TheQuery: The Cheap Model Is Winning and AI Can Write Your Code. It Just Cannot Understand It. - the cost and benchmark context this release landed into.