o3

Active

OpenAI's most powerful reasoning model, successor to o1.

Overview

o3 is OpenAI's top-tier reasoning model trained with reinforcement learning on chain-of-thought. It leads on AIME, GPQA, and SWE-bench across all published models as of mid-2025.

Benchmarks

Benchmark	Score	Source
AIME 2024Math	96.7% accuracy	Self-reported OpenAI o3 system card
GPQA DiamondReasoning	87.7% accuracy	Self-reported OpenAI o3 system card
MATHMath	97.8% accuracy	Self-reported OpenAI o3 system card
MMLUGeneral knowledge	91% accuracy	Third-party Artificial Analysis
MMLU-ProGeneral knowledge	81.2% accuracy	Third-party Artificial Analysis
SWE-bench VerifiedCoding	71.7% resolved	Third-party Papers With Code

Integrations & tooling support

Tool calling: Supported
Structured outputs: Supported

Price vs quality

Great value

Strong performance at mid-tier pricing.

Quality percentile: 97.2%
Effective price: $6.5/1M
Pricing breakdown: $2/1M in
$8/1M out

Community ratings

No ratings yet. Be the first to rate o3.

o3