# Best AI models for coding
Ranked AI models for writing, reviewing, and debugging code, scored on coding benchmarks (SWE-bench, HumanEval, Aider) and listed with pricing.

These models excel at software engineering tasks: generating code, explaining unfamiliar codebases, writing tests, and fixing bugs. Rankings are based on coding-heavy benchmarks and are updated automatically.
| Rank | Model | Score | Tier | Provider | Context | Input price | Output price | Capabilities |
|---|---|---|---|---|---|---|---|---|
| 1 | DeepSeek V3 | 84.0 | Frontier | DeepSeek | 128K | $0.27/1M | $1.10/1M | Code, Math, Agentic, Open source |
| 2 | Gemini 2 Pro | 82.8 | Frontier | Google | 2M | $1.25/1M | $5/1M | Code, Vision, Math, Agentic, Long context |
| 3 | GPT-5 | 77.9 | Strong | OpenAI | 272K | $1.25/1M | $10/1M | Code, Vision, Math, Agentic, Long context, Reasoning |
| 4 | Claude Opus 4 | 75.7 | Strong | Anthropic | 200K | $5/1M | $25/1M | Code, Vision, Agentic, Long context, Reasoning |
| 5 | o3-mini | 74.5 | Strong | OpenAI | 200K | $1.10/1M | $4.40/1M | Code, Math, Agentic, Long context |
| 6 | Claude Sonnet 4.6 | 44.8 | Basic | Anthropic | 1M | n/a | n/a | Code, Vision, Agentic, Long context |

Prices are USD per million tokens; pricing for Claude Sonnet 4.6 was not listed.
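To compare models on cost, per-request spend follows directly from the per-million-token prices in the table: cost = input_tokens / 1,000,000 × input price + output_tokens / 1,000,000 × output price. A minimal sketch, using the prices listed above (the `PRICES` dict and `request_cost` helper are illustrative, not part of any provider's API):

```python
# Estimate per-request cost from the per-1M-token prices in the table above.
# Prices are copied from the table; models without listed pricing are omitted.
PRICES = {
    "DeepSeek V3":   {"input": 0.27, "output": 1.10},
    "Gemini 2 Pro":  {"input": 1.25, "output": 5.00},
    "GPT-5":         {"input": 1.25, "output": 10.00},
    "Claude Opus 4": {"input": 5.00, "output": 25.00},
    "o3-mini":       {"input": 1.10, "output": 4.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: (tokens / 1M) * price-per-1M, summed over input and output."""
    p = PRICES[model]
    return input_tokens / 1e6 * p["input"] + output_tokens / 1e6 * p["output"]

# Example: a 20K-token prompt with a 2K-token completion on DeepSeek V3.
cost = request_cost("DeepSeek V3", 20_000, 2_000)
print(f"${cost:.4f}")  # 0.02 * 0.27 + 0.002 * 1.10 = $0.0076
```

Note the spread this produces in practice: the same 20K-in / 2K-out request costs roughly 20x more on Claude Opus 4 than on DeepSeek V3 at these prices, which is why the pricing columns matter as much as the score for high-volume coding workloads.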