Claude Opus 4
Anthropic's most capable model for complex reasoning and long-context work.
Overview
Claude Opus 4 is Anthropic's flagship model, optimized for deep reasoning, coding, and extended agentic workflows. It supports a 200K-token context window, tool use, and structured outputs.
Benchmarks
| Benchmark | Category | Score | Source |
|---|---|---|---|
| AIME 2024 | Math | 48% accuracy | Third-party (Artificial Analysis) |
| GPQA Diamond | Reasoning | 74% accuracy | Self-reported (Anthropic model card) |
| GSM8K | Math | 95.4% accuracy | Self-reported (Anthropic model card) |
| HumanEval | Coding | 93% pass@1 | Self-reported (Anthropic model card) |
| MATH | Math | 78.2% accuracy | Self-reported (Anthropic model card) |
| MMLU | General knowledge | 87.5% accuracy | Self-reported (Anthropic model card) |
| MMLU-Pro | General knowledge | 77.5% accuracy | Self-reported (Anthropic model card) |
| SWE-bench Verified | Coding | 52% resolved | Self-reported (Anthropic Claude 4 announcement) |
Integrations & tooling support
- Tool calling: Supported
- Structured outputs: Supported
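As a sketch of what tool calling looks like against Anthropic's Messages API: a tool is declared with a name, a description, and a JSON Schema for its input. The `get_weather` tool below is a hypothetical example (not from the source), and the model identifier is an assumption to be checked against the live model list.

```python
# Hypothetical tool definition in the shape the Anthropic Messages API expects:
# a name, a description, and a JSON Schema describing the tool's input.
get_weather_tool = {
    "name": "get_weather",  # hypothetical tool name, for illustration only
    "description": "Return the current weather for a given city.",
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
        },
        "required": ["city"],
    },
}

# Request payload sketch. The model id "claude-opus-4-0" is an assumption;
# consult Anthropic's model documentation for the exact identifier.
request = {
    "model": "claude-opus-4-0",
    "max_tokens": 1024,
    "tools": [get_weather_tool],
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
}
```

The model responds with a `tool_use` content block naming the tool and its arguments; the caller executes the tool and returns the result in a follow-up `tool_result` message.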
Price vs quality
Overpriced: mid-tier performance at frontier pricing.
- Quality percentile: 44.1% (across 8 benchmarks)
- Effective price: $20 per 1M tokens, a blend weighted as (input + 3× output) / 4
- Pricing breakdown: $5/1M input, $25/1M output
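The effective price appears to be a weighted average that counts output tokens three times as heavily as input tokens; a minimal sketch, assuming that 1:3 weighting:

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Blend per-million-token prices with a 1:3 input:output weighting
    (an assumption inferred from the $5 in / $25 out / $20 effective figures)."""
    return (input_per_m + 3 * output_per_m) / 4

print(blended_price(5.0, 25.0))  # 20.0
```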