o3
Status: Active
OpenAI's most powerful reasoning model and the successor to o1.
Overview
o3 is OpenAI's top-tier reasoning model, trained with reinforcement learning on chain-of-thought reasoning. As of mid-2025, it leads all published models on AIME, GPQA, and SWE-bench.
Benchmarks
| Benchmark | Category | Score | Source |
|---|---|---|---|
| AIME 2024 | Math | 96.7% accuracy | Self-reported (OpenAI o3 system card) |
| GPQA Diamond | Reasoning | 87.7% accuracy | Self-reported (OpenAI o3 system card) |
| MATH | Math | 97.8% accuracy | Self-reported (OpenAI o3 system card) |
| MMLU | General knowledge | 91% accuracy | Third-party (Artificial Analysis) |
| MMLU-Pro | General knowledge | 81.2% accuracy | Third-party (Artificial Analysis) |
| SWE-bench Verified | Coding | 71.7% resolved | Third-party (Papers With Code) |
Integrations & tooling support
- Tool calling: Supported
- Structured outputs: Supported
Price vs quality
Great value: strong performance at mid-tier pricing.
- Quality percentile: 97.2% (vs 6 benchmarks)
- Effective price: $6.5 / 1M tokens (input + 3× output)
- Pricing breakdown: $2 / 1M input tokens, $8 / 1M output tokens
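The effective price appears to be a blended rate that counts output tokens three times as heavily as input tokens. The page gives only the label "input + 3× output", so dividing by the total weight of 4 is an assumption, although it does reproduce the listed $6.5 from the $2 input and $8 output prices. A minimal sketch, where `effective_price` and its parameter names are hypothetical:

```python
def effective_price(input_price: float, output_price: float,
                    output_weight: float = 3.0) -> float:
    """Blend per-1M-token prices, weighting output `output_weight` times input.

    Assumption: the weighted sum is normalized by the total weight
    (1 + output_weight). The page does not state this explicitly.
    """
    weighted_sum = input_price + output_weight * output_price
    return weighted_sum / (1.0 + output_weight)


# Prices from the pricing breakdown: $2 per 1M input, $8 per 1M output.
print(effective_price(2.0, 8.0))  # 6.5
```

With a different input-to-output mix, for example a retrieval-heavy workload dominated by input tokens, the blended figure would sit closer to the $2 input price.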