MiniMax M3
MiniMax's June 2026 frontier open-weight model claim, combining coding strength, 1M context, native multimodality, and low-cost agentic API access.
A promising research prototype shown on stage with the test results, roadmap, and price tag already posted, but the keys to inspect the engine have not been handed over yet.
MiniMax M3 is a June 2026 model from Shanghai-based AI lab MiniMax. MiniMax positions it as the first open-weight model to combine three frontier-style capabilities in one system: strong coding and agentic benchmark results, a native 1 million token context window, and native multimodal input for text, images, and video. MiniMax also lists desktop computer operation among its capabilities.
What it claims
MiniMax reports that M3 scores 59.0% on SWE-bench Pro, narrowly ahead of GPT-5.5 at 58.6% and further ahead of Gemini 3.1 Pro at 54.2% in that vendor-reported comparison. It also reports 66.0% on Terminal-Bench 2.1, 74.2% on MCP Atlas, and a leading result on SVG-Bench against Claude Opus 4.7.
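To put the SWE-bench Pro comparison in perspective, the short sketch below computes M3's lead over each competitor in percentage points. It uses only the vendor-reported scores quoted above; the helper name `margins_over` is introduced here purely for illustration.

```python
# Vendor-reported SWE-bench Pro scores quoted in the text (percent).
swe_bench_pro = {
    "MiniMax M3": 59.0,
    "GPT-5.5": 58.6,
    "Gemini 3.1 Pro": 54.2,
}


def margins_over(baseline, scores):
    """Return the baseline's lead over every other model, in percentage points."""
    base = scores[baseline]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != baseline
    }


leads = margins_over("MiniMax M3", swe_bench_pro)
for name, lead in leads.items():
    print(f"M3 lead over {name}: {lead} points")
```

The lead over GPT-5.5 comes out to 0.4 points, a margin small enough that differences in harness or scaffolding could plausibly change the ordering.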
The model uses MiniMax Sparse Attention, or MSA, a sparse attention architecture designed to make million-token contexts cheaper and faster than dense attention. MiniMax says MSA gives M3 up to a 1M token context window and improves long-context prefill and decoding speed compared with its previous generation.
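A rough way to see why sparse attention matters at this length is to count query-key score computations. The sketch below compares dense attention, where every query scores every key, with a generic sparse scheme in which each query scores a fixed number of keys. The text does not describe how MSA selects keys, so the per-query budget of 4,096 is a hypothetical value chosen for illustration, not a MiniMax figure, and the count ignores causal masking and any cost of choosing which keys to keep.

```python
from typing import Optional


def attention_score_count(seq_len: int, keys_per_query: Optional[int] = None) -> int:
    """Count query-key score computations for one attention head.

    keys_per_query=None models dense attention (each query scores all keys).
    An integer models a generic fixed-budget sparse scheme. This is an
    illustrative assumption, not a description of MiniMax's MSA.
    """
    if keys_per_query is None:
        return seq_len * seq_len
    return seq_len * min(keys_per_query, seq_len)


n = 1_000_000  # the 1M-token context from the text
k = 4_096      # hypothetical per-query key budget (assumption)

dense = attention_score_count(n)
sparse = attention_score_count(n, k)
ratio = dense / sparse

print(f"dense:  {dense:,} scores")
print(f"sparse: {sparse:,} scores")
print(f"dense / sparse = {ratio:.1f}x")
```

Under these assumed numbers the dense count is roughly 244 times larger. The general point is that a fixed budget makes cost grow linearly with context length instead of quadratically.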
Why it matters
The important part is not only the score. It is the combination. Before M3, open-weight models usually forced a tradeoff: strong coding but shorter context, long context but weaker agentic behavior, or multimodality without frontier coding. M3 claims all three at once.
If the weights ship as promised and independent evaluations hold up, M3 becomes a serious option for local or self-hosted coding agents, document analysis, long-context retrieval, and multimodal workflows where code or data cannot leave the organization.
The verification caveat
At launch, M3 was available through MiniMax products and API access, but the promised model weights had not yet been released on Hugging Face. That makes the open-weight label a commitment rather than a fully verifiable fact until the weights, license, and technical report are public.
The benchmark caveat matters too. MiniMax disclosed that several results were run on its own infrastructure and sometimes used scaffolding such as Claude Code, Mini-SWE-Agent, Terminus, and Codex. That does not make the numbers false, but it means they should be read as launch-day vendor results until independent teams reproduce them under standardized harnesses such as DeepSWE.
Practical tradeoffs
MiniMax M3's API economics are aggressive. Launch pricing on routing platforms was reported around USD 0.30 per million input tokens and USD 1.20 per million output tokens during the promotion, while MiniMax token plans started at USD 20 per month for very large token quotas.
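As a quick illustration of what those per-token rates mean for a single request, the sketch below estimates the cost of one long-context call at the reported promotional prices. The workload (800,000 input tokens and 20,000 output tokens) is a hypothetical example, and actual prices may differ by platform or after the promotion ends.

```python
# Reported promotional prices from the text, in USD per million tokens.
INPUT_USD_PER_MILLION = 0.30
OUTPUT_USD_PER_MILLION = 1.20


def estimate_cost_usd(input_tokens, output_tokens,
                      input_price=INPUT_USD_PER_MILLION,
                      output_price=OUTPUT_USD_PER_MILLION):
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Hypothetical long-context request: a large codebase in, a short patch out.
cost = estimate_cost_usd(input_tokens=800_000, output_tokens=20_000)
print(f"Estimated cost: USD {cost:.3f}")
```

At these rates a request that fills most of the 1M context costs about a quarter of a dollar, with input tokens dominating the bill.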
The tradeoff is trust and timing. Until the weights are public, self-hosting is not available. Until independent benchmarks run, the coding claim is not settled. And because MiniMax is a Chinese company, API use also carries the jurisdictional and procurement considerations that apply to Chinese AI vendors.
Last updated: June 2, 2026