Best vision AI models

AI models that understand images: chart reading, OCR, UI understanding, and visual reasoning.

All models below accept image input and are ranked by overall intelligence and multimodal benchmark performance.

  1. o3

    83.5 Frontier
    OpenAI
    Context: 200K
    Input: $2/1M tokens
    Output: $8/1M tokens
    Vision, Math, Agentic, Long context, Frontier, Reasoning
  2. Gemini 2 Pro

    82.8 Frontier
    Google
    Context: 2M
    Input: $1.25/1M tokens
    Output: $5/1M tokens
    Vision, Math, Agentic, Long context, Frontier, Code
  3. Grok 3

    80.1 Frontier
    xAI
    Context: 131K
    Input: $3/1M tokens
    Output: $15/1M tokens
    Vision, Math, Agentic, Frontier, Reasoning
  4. GPT-5

    77.9 Strong
    OpenAI
    Context: 272K
    Input: $1.25/1M tokens
    Output: $10/1M tokens
    Vision, Math, Agentic, Long context, Reasoning, Code
  5. GPT-4o mini

    77.1 Strong
    OpenAI
    Context: 128K
    Input: $0.15/1M tokens
    Output: $0.6/1M tokens
    Vision, Math, Agentic, Budget
  6. o1

    76.3 Strong
    OpenAI
    Context: 200K
    Input: $15/1M tokens
    Output: $60/1M tokens
    Vision, Math, Agentic, Long context, Reasoning
  7. Claude Opus 4

    75.7 Strong
    Anthropic
    Context: 200K
    Input: $5/1M tokens
    Output: $25/1M tokens
    Vision, Agentic, Long context, Reasoning, Code
  8. Claude Sonnet 4

    66.2 Competent
    Anthropic
    Context: 200K
    Input: $3/1M tokens
    Output: $15/1M tokens
    Vision, Agentic, Long context
  9. GPT-5.4

    59.3 Competent
    OpenAI
    Context: 1.1M
    Vision, Agentic, Long context
  10. Claude Opus 4.7

    57.2 Competent
    Anthropic
    Context: 1M
    Vision, Agentic, Long context
  11. Qwen3.5-27B

    53.6 Basic
    Alibaba
    Context: 262K
    Input: $0.195/1M tokens
    Output: $1.56/1M tokens
    Vision, Long context
  12. Gemma 4 31B

    45.1 Basic
    Google
    Context: 262K
    Input: $0.13/1M tokens
    Output: $0.38/1M tokens
    Vision, Long context, Budget
  13. Claude Sonnet 4.6

    44.8 Basic
    Anthropic
    Context: 1M
    Vision, Agentic, Long context, Code
  14. GPT-5.4 nano

    43.3 Basic
    OpenAI
    Context: 272K
    Vision, Agentic, Long context
  15. GPT-5.4 mini

    34.6 Limited
    OpenAI
    Context: 272K
    Vision, Agentic, Long context
  16. Claude Haiku 4.5

    31.1 Limited
    Anthropic
    Context: 200K
    Input: $1/1M tokens
    Output: $5/1M tokens
    Vision, Agentic, Long context
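
The per-1M-token prices in the listing can be turned into a rough per-request estimate. The sketch below is only an illustration: the `PRICES` table and the `request_cost` helper are hypothetical names, the prices are copied from a few entries above, and it assumes the caller already knows the total input token count (including whatever tokens the provider charges for the image, which varies by provider and is not stated in the listing). Models without listed prices are left out.

```python
# Hypothetical helper: estimate request cost from the listed prices.
# Values are (input, output) in USD per 1M tokens, copied from the listing.
PRICES = {
    "o3": (2.00, 8.00),
    "Gemini 2 Pro": (1.25, 5.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request.

    Assumes input_tokens already includes any tokens billed for images;
    how images map to tokens is provider-specific and not modeled here.
    """
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 2,000-token prompt (image included) with a 500-token answer.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 2_000, 500):.6f}")
```

Comparing the result across models for a typical request size shows how the listed input and output prices combine, since output tokens cost several times more than input tokens for every priced entry above.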