# Best AI models for coding
Ranked AI models for writing, reviewing, and debugging code, scored on coding benchmarks (SWE-bench, HumanEval, Aider) and listed with pricing.

These models excel at software engineering tasks: generating code, explaining unfamiliar codebases, writing tests, and fixing bugs. Rankings are based on coding-heavy benchmarks and are updated automatically.
| Rank | Model | Score | Tier | Provider | Context | Input price | Output price | Capabilities |
|---|---|---|---|---|---|---|---|---|
| 1 | DeepSeek V3 | 84.0 | Frontier | DeepSeek | 128K | $0.27/1M | $1.10/1M | Code, Math, Agentic, Open source |
| 2 | Gemini 2 Pro | 82.8 | Frontier | Google | 2M | $1.25/1M | $5/1M | Code, Vision, Math, Agentic, Long context |
| 3 | GPT-5 | 77.9 | Strong | OpenAI | 272K | $1.25/1M | $10/1M | Code, Vision, Math, Agentic, Long context, Reasoning |
| 4 | Claude Opus 4 | 75.7 | Strong | Anthropic | 200K | $5/1M | $25/1M | Code, Vision, Agentic, Long context, Reasoning |
| 5 | o3-mini | 74.5 | Strong | OpenAI | 200K | $1.10/1M | $4.40/1M | Code, Math, Agentic, Long context |
| 6 | Claude Sonnet 4.6 | 44.8 | Basic | Anthropic | 1M | n/a | n/a | Code, Vision, Agentic, Long context |

Prices are USD per million tokens; pricing for Claude Sonnet 4.6 was not listed.
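To compare models on cost, per-request spend follows directly from the per-million-token prices in the table: cost = input_tokens / 1,000,000 × input price + output_tokens / 1,000,000 × output price. A minimal sketch, using the prices listed above (the `PRICES` dict and `request_cost` helper are illustrative, not part of any provider's API):

```python
# Estimate per-request cost from the per-1M-token prices in the table above.
# Prices are copied from the table; models without listed pricing are omitted.
PRICES = {
    "DeepSeek V3":   {"input": 0.27, "output": 1.10},
    "Gemini 2 Pro":  {"input": 1.25, "output": 5.00},
    "GPT-5":         {"input": 1.25, "output": 10.00},
    "Claude Opus 4": {"input": 5.00, "output": 25.00},
    "o3-mini":       {"input": 1.10, "output": 4.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: (tokens / 1M) * price-per-1M, summed over input and output."""
    p = PRICES[model]
    return input_tokens / 1e6 * p["input"] + output_tokens / 1e6 * p["output"]

# Example: a 20K-token prompt with a 2K-token completion on DeepSeek V3.
cost = request_cost("DeepSeek V3", 20_000, 2_000)
print(f"${cost:.4f}")  # 0.02 * 0.27 + 0.002 * 1.10 = $0.0076
```

Note the spread this produces in practice: the same 20K-in / 2K-out request costs roughly 20x more on Claude Opus 4 than on DeepSeek V3 at these prices, which is why the pricing columns matter as much as the score for high-volume coding workloads.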