GPT-5
OpenAI's major generational leap, released in August 2025, achieving 94.6% on AIME 2025 and 45% fewer factual errors than GPT-4o.
GPT-5, released in August 2025, is a major generational leap in OpenAI's language model capabilities. The model scores 94.6% on AIME 2025 without tools and 74.9% on SWE-bench Verified, substantial improvements in mathematical reasoning and software engineering over previous generations.
A critical advancement in GPT-5 is its reduction in factual errors: 45% fewer than GPT-4o when using web search, and 80% fewer than o3 when reasoning. On specialized benchmarks, GPT-5 scores 46.2% on HealthBench Hard (medical reasoning) and 84.2% on MMMU (massive multi-discipline multimodal understanding), showing strong performance across diverse domains.
GPT-5 also demonstrates improved efficiency, performing better than o3 while using 50-80% fewer output tokens for the same tasks. This combination of higher accuracy, reduced hallucinations, and greater efficiency makes GPT-5 a significant milestone in language model development, suitable for high-stakes applications in healthcare, legal reasoning, and other domains where factual accuracy is paramount.
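To make the "50-80% fewer output tokens" figure concrete, here is a minimal sketch of the arithmetic behind a relative reduction. The 1,000-token baseline is a hypothetical number chosen for illustration, not a measured o3 result, and `reduced_amount` is a helper name introduced here.

```python
def reduced_amount(baseline: float, reduction_pct: float) -> float:
    """Return what remains of `baseline` after a relative reduction of `reduction_pct` percent."""
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must be between 0 and 100")
    return baseline * (100 - reduction_pct) / 100


# Hypothetical baseline: an o3 response that uses 1,000 output tokens (illustrative only).
baseline_tokens = 1000

# The quoted range is 50-80% fewer tokens, so compute both ends of it.
most_tokens = reduced_amount(baseline_tokens, 50)
fewest_tokens = reduced_amount(baseline_tokens, 80)

print(f"GPT-5 would use between {fewest_tokens:.0f} and {most_tokens:.0f} output tokens")
```

The same calculation applies to the error-rate claims: "45% fewer factual errors" is relative to the baseline model's error rate, so the absolute improvement depends on how often that baseline errs.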
Last updated: February 22, 2026