o3

Active

OpenAI's most powerful reasoning model, successor to o1.

Overview

o3 is OpenAI's top-tier reasoning model, trained with reinforcement learning to work through a chain of thought before answering. As of mid-2025, it is listed as leading all published models on AIME, GPQA, and SWE-bench.

Benchmarks

Benchmark            Category            Score             Source
AIME 2024            Math                96.7% accuracy    Self-reported (OpenAI o3 system card)
GPQA Diamond         Reasoning           87.7% accuracy    Self-reported (OpenAI o3 system card)
MATH                 Math                97.8% accuracy    Self-reported (OpenAI o3 system card)
MMLU                 General knowledge   91% accuracy      Third-party (Artificial Analysis)
MMLU-Pro             General knowledge   81.2% accuracy    Third-party (Artificial Analysis)
SWE-bench Verified   Coding              71.7% resolved    Third-party (Papers With Code)

Integrations & tooling support

Tool calling: Supported
Structured outputs: Supported

Price vs quality

Great value

Strong performance at mid-tier pricing.

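Quality percentile: 97.2% (across 6 benchmarks)
Effective price: $6.50 per 1M tokens (input and output prices blended at a 1:3 weighting, i.e. (input + 3 × output) / 4)
Pricing breakdown: $2 per 1M input tokens, $8 per 1M output tokens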
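
The effective price appears to be a weighted average of the input and output prices, with output counted three times as heavily as input. A minimal sketch of that arithmetic follows; the function name `effective_price` and the division by the total weight are assumptions inferred from the listed figures, not a formula documented by the page.

```python
def effective_price(input_price: float, output_price: float,
                    output_weight: float = 3.0) -> float:
    """Blend per-1M-token prices, weighting output `output_weight` times input.

    Assumption: the blend is normalized by the total weight (1 + output_weight),
    which is what reproduces the listed $6.50 figure from $2 in / $8 out.
    """
    return (input_price + output_weight * output_price) / (1.0 + output_weight)


# Listed o3 prices: $2 per 1M input tokens, $8 per 1M output tokens.
blended = effective_price(2.0, 8.0)
print(f"${blended:.2f} per 1M tokens")  # $6.50 per 1M tokens
```

With the listed prices this gives (2 + 3 × 8) / 4 = 6.5, matching the effective price shown. A workload with a different input-to-output ratio would need a different `output_weight`.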
