Generative AI Model Ranking Matrix

Snapshot · April 2026

Side-by-side comparison of frontier language, video, and image generation models — benchmarks, capabilities, release dates, and price.

Frontier LLMs — coding & reasoning

Higher is better on benchmarks. Price is $/1M input tokens (lower is better).
| Model | Vendor | License | Capabilities | Released | Context | SWE-Bench Verified | SWE-Bench Pro | Terminal-Bench | Reasoning* | $/M input | Best for |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Claude Opus 4.7 | Anthropic | closed | VTRC | 2026-04-16 | 1M | 87.6 | 64.3 | 69.4 | 95 | 5 | Leads SWE-Bench Verified (87.6) & Pro (64.3); new tokenizer (~35% more tokens) |
| Claude Opus 4.6 | Anthropic | closed | VTRC | 2026-02 | 1M | 80.8 | 57.3 |  | 92 | 15 | Large-codebase reasoning, multi-file refactors |
| Claude Sonnet 4.6 | Anthropic | closed | VTRC | 2025-12 | 1M | 79.6 |  |  | 86 | 3 | Default coding driver: 98% of Opus at ⅕ cost |
| Claude Haiku 4.5 | Anthropic | closed | VTRC | 2025-10-15 | 200k | 73.3 |  |  | 78 | 1 | 4–5× faster than Sonnet 4.5; cheap multi-agent driver |
| GPT-5.5 | OpenAI | closed | VATRC | 2026-04-24 | 1M | 87.6 | 58.6 | 82.7 | 95 | 5 | SOTA Terminal-Bench 82.7; GDPval 84.9; OSWorld 78.7; ties Opus 4.7 on SWE-V |
| GPT-5.3 Codex | OpenAI | closed | VTRC | 2026-03 | 400k | 80.0 | 56.8 | 77.3 | 91 | 10 | Full SDLC agent: debug, terminal, PRDs, tests |
| Gemini 3.1 Pro | Google | closed | VATRC | 2026-03 | 2M | 78.0 |  |  | 94 | 7 | Adjustable Deep Think; agentic browsing (BrowseComp 85.9) |
| Grok 4.20 Beta 2 | xAI | closed | VT | 2026-03-03 | 256k | 75 |  |  | 80 | 2 | 4-agent backbone · IFBench #1 (83); 8% of Opus cost |
| Qwen3.6-Max-Preview | Alibaba | closed | VTR | 2026-04-20 | 1M | 79.5 |  | 77.1 | 89 | 6 | Agent loops: preserve_thinking across tool calls |
| DeepSeek V4-Pro | DeepSeek | open | TR | 2026-04-27 | 128k | 80.6 | 55.4 | 67.9 | 92 | 0.9 | 1.6T MoE / 49B active · matches Opus 4.6 at fraction of cost |
| DeepSeek V4-Flash | DeepSeek | open | TR | 2026-04-27 | 128k |  |  |  |  | 0.3 | Fresh · 158B fast variant of V4 |
| Kimi K2.6 | Moonshot | open | VTR | 2026-04-29 | 256k | 80.2 | 58.6 | 66.7 | 94 | 0.6 | 1.1T MoE · ties GPT-5.5 on SWE-Pro at $0.6/M; HLE 54 leads all |
| GLM-5.1 | Zhipu / Z.ai | open (MIT) | VTR | 2026-04-07 | 128k | 79.0 | 58.4 |  | 88 | 0.11 | 754B · #1 SWE-Bench Pro among open weights |
| MiniMax M2.7 | MiniMax | open | TR | 2026-04-20 | 205k | 78 | 56.22 | 57.0 | 87 | 0.3 | 229B · self-evolving; matches Codex on SWE-Pro |
| Qwen3.5-397B-A17B | Alibaba | open | VTR | 2026-04-24 | 256k | 80.0 | 54.0 |  | 92 |  | 403B MoE / 17B active · GPQA 88.4, MMLU-Pro 87.8, SWE-V 80.0 |
| Qwen3.6-35B-A3B | Alibaba | open | VTR | 2026-04-24 | 256k | 73.4 | 49.5 |  | 84 |  | 36B / 3B active · 73.4 SWE-V on tiny active params |
| MiMo-V2.5-Pro | Xiaomi | open | TR | 2026-04-29 | 1M |  | 57.2 |  | 89 |  | 1.02T MoE / 42B active · matches Opus 4.6 on SWE-Pro w/ 40% fewer tokens |
| Tencent Hy3-preview | Tencent | open (preview) | VTR | 2026-04-24 | 128k | 74.4 | 54.4 |  | 88 |  | 295B / 21B active · +40pts SWE-V vs Hy2; topped Tsinghua math PhD exam |
| Mistral Large 3 | Mistral | open (Apache 2.0) | VT | 2025-12-02 | 256k |  |  |  | 50 | 2 | 675B MoE / 41B active · non-reasoning (AIME ~40, GPQA ~44) · Apache 2.0 |
| Mistral Medium 3.5 | Mistral | open weights | TR | 2026-04-30 | 128k | 77.6 |  |  | 80 | 0.6 | 128B · SWE-V 77.6 beats Devstral 2 + Qwen3.5; τ³-Telecom 91.4 |
| Devstral-2-123B | Mistral | open weights | TC | 2026-02-25 | 128k |  |  |  |  | 0.6 | 125B · code-specialized variant |
| Gemma 4 31B-it | Google | open | VT | 2026-04-29 | 256k | 52 |  |  | 87 |  | 31B dense · AIME 89.2, GPQA 84.3, τ²-Retail 86.4 (12× jump from G3) |
| Nemotron-3-Nano-Omni | NVIDIA | open | VATR | 2026-04-29 | 128k |  |  |  |  |  | Fresh (~17h) · 30B-A3B any-to-any reasoning |
| Meta Avocado | Meta | rumored / closed | tbd | 2026-Q2 (est.) |  |  |  |  |  |  | Llama successor, slipping to May/Jun 2026 |
Capabilities key: V = Vision (image input) · A = Audio · T = Tools / function calling · R = Extended reasoning · C = Computer / browser use
*Reasoning is a composite score (GPQA Diamond / HLE / AIME / ARC-AGI-2, normalized 0–100, directional).
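
The footnote does not say how the composite is weighted, so the following is only a minimal sketch of one way such a score could be built: clamp each benchmark result to a 0–100 scale and take an unweighted mean over the benchmarks a model actually reports. The function name, the equal weighting, and the sample numbers are all assumptions for illustration, not the method or data behind the table.

```python
from statistics import mean

# Benchmarks named in the footnote; the equal weighting below is an assumption.
BENCHMARKS = ("GPQA Diamond", "HLE", "AIME", "ARC-AGI-2")


def composite_reasoning(scores: dict[str, float]) -> float:
    """Unweighted mean of the available benchmark scores, each clamped to 0-100.

    Benchmarks missing from `scores` are skipped rather than counted as zero.
    """
    available = [
        min(100.0, max(0.0, scores[name]))
        for name in BENCHMARKS
        if name in scores
    ]
    if not available:
        raise ValueError("no known benchmark scores supplied")
    return round(mean(available), 1)


# Hypothetical inputs, not figures from the table.
example = {"GPQA Diamond": 90.0, "HLE": 50.0, "AIME": 100.0, "ARC-AGI-2": 60.0}
print(composite_reasoning(example))
```

Because missing benchmarks are skipped, scores built this way are only loosely comparable across models, which fits the footnote's "directional" caveat.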

Video generation models

Quality is the Artificial Analysis Image-to-Video rank (1 = best).
| Model | Vendor | License | Released | I2V Rank | Max len (s) | Resolution | Audio | $/gen | Best for |
|---|---|---|---|---|---|---|---|---|---|
| Kling 3.0 | Kuaishou | closed | 2026-02 | 1 | 15 (multi-shot) | 1080p | synced dialogue + SFX | 1.00 | Cinematic shots; #1 general-purpose |
| Veo 3.1 | Google | closed | 2026-01 | 2 | 8 | 1080p | yes | 0.75 | High-fidelity realism, prompt adherence |
| Sora 2 | OpenAI | closed | 2025-10 | 3 | 12 | 1080p | yes | 0.90 | Imaginative T2V; ChatGPT-integrated |
| Seedance 2.0 | ByteDance | closed | 2026-04 | 4 | 15 | 1080p | yes | 0.50 | Product ads, e-comm, character consistency |
| LTX-2 | Lightricks | open | 2025-Q4 | 5 | 10 | 4K@50fps | native sync | 0.20 | Open-weights leader; 4K + audio |
| Wan 2.2 | Alibaba | open | 2025-Q3 | 6 | 6 | 720p |  | 0.05 | Runs on a 4070; novel MoE denoiser |
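
Because $/gen covers clips of different maximum lengths, a per-second figure can make the prices easier to compare. The sketch below divides each model's $/gen by its Max len (s), using the values from the table above. It assumes the listed price buys one clip of maximum length, which the table does not state, so the output is best read as a rough lower bound on cost per second.

```python
# (model, max clip length in seconds, $ per generation), copied from the table.
VIDEO_MODELS = [
    ("Kling 3.0", 15, 1.00),
    ("Veo 3.1", 8, 0.75),
    ("Sora 2", 12, 0.90),
    ("Seedance 2.0", 15, 0.50),
    ("LTX-2", 10, 0.20),
    ("Wan 2.2", 6, 0.05),
]


def cost_per_second(max_len_s: float, price_per_gen: float) -> float:
    """Dollars per second of video, assuming one generation yields a max-length clip."""
    if max_len_s <= 0:
        raise ValueError("max_len_s must be positive")
    return price_per_gen / max_len_s


# Cheapest per second first.
ranked = sorted(
    ((name, cost_per_second(length, price)) for name, length, price in VIDEO_MODELS),
    key=lambda item: item[1],
)

for name, dollars in ranked:
    print(f"{name:<14} ${dollars:.4f}/s")
```

On this rough measure the two open models are the cheapest per second and Veo 3.1 is the most expensive, while rank-1 Kling 3.0 sits mid-pack.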

Image generation models

Quality is a directional Elo-style score from public leaderboards. Speed is typical wall-clock time per image.
| Model | Vendor | License | Released | Max res | Quality* | Speed (s) | $/img | Best for |
|---|---|---|---|---|---|---|---|---|
| Midjourney v8 | Midjourney | closed | 2026-03 | 2K | 95 | 10 | 0.04 | Aesthetic king · rewritten engine, 5× faster than v7 |
| FLUX.2 [pro] | Black Forest Labs | closed (API) | 2026-Q1 | 2K | 93 | 4.5 | 0.04 | Photoreal commercial: best skin, lighting, materials |
| FLUX.2 [dev] | Black Forest Labs | open weights | 2026-Q1 | 2K | 89 | 6 | 0 | Top open-weights photoreal model |
| GPT Image 2 | OpenAI | closed | 2026-Q1 | 1K+ | 92 | 6 | 0.04 | Best prompt adherence: complex composed scenes |
| Imagen 4 Ultra | Google | closed | 2025-Q4 | 2K | 94 | 5 | 0.04 | Photoreal flagship; strong text rendering |
| Imagen 4 Fast | Google | closed | 2026-03 | 1K | 86 | 2 | 0.02 | Cheapest fast quality at $0.02/img |
| Nano Banana 2 | Google (Gemini 3.1 Flash Image) | closed | 2026-03 | 1K | 82 | 1.5 | 0.02 | Fastest end-to-end (~1–3s) · in-chat editing |
| Recraft V4 | Recraft | closed | 2025-Q4 | 2K | 88 | 5 | 0.04 | Brand / design assets, vector-style outputs |
| Ideogram 3 | Ideogram | closed | 2025-Q4 | 1K | 84 | 5 | 0.03 | Typography & in-image text rendering |
| Seedream 4.5 | ByteDance | closed | 2026-Q1 | 2K | 90 | 5 | 0.03 | Asian aesthetics, character consistency |
| Stable Diffusion 4 | Stability AI | open | 2026-Q1 | 2K | 80 | 6 | 0 | Open ecosystem · ControlNet / LoRA backbone |
| Hunyuan Image 3 | Tencent | open | 2026-Q1 | 2K | 83 | 5 | 0 | Open Tencent flagship; bilingual prompts |
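
One way to read the quality and price columns together is to look for models that no other model beats on both at once. The sketch below computes that Pareto frontier for the closed, API-priced rows only, using the Quality* and $/img values from the table above. The function name, the restriction to closed models, and the choice to ignore Speed are assumptions made for illustration.

```python
# (model, quality score, $ per image) for the closed, API-priced rows of the table.
IMAGE_MODELS = [
    ("Midjourney v8", 95, 0.04),
    ("FLUX.2 [pro]", 93, 0.04),
    ("GPT Image 2", 92, 0.04),
    ("Imagen 4 Ultra", 94, 0.04),
    ("Imagen 4 Fast", 86, 0.02),
    ("Nano Banana 2", 82, 0.02),
    ("Recraft V4", 88, 0.04),
    ("Ideogram 3", 84, 0.03),
    ("Seedream 4.5", 90, 0.03),
]


def pareto_frontier(models):
    """Return models not dominated on (higher quality, lower price).

    A model is dominated if another is at least as good on both axes
    and strictly better on at least one.
    """
    frontier = []
    for name, quality, price in models:
        dominated = any(
            q >= quality and p <= price and (q > quality or p < price)
            for _, q, p in models
        )
        if not dominated:
            frontier.append((name, quality, price))
    # Cheapest first.
    return sorted(frontier, key=lambda row: row[2])


for name, quality, price in pareto_frontier(IMAGE_MODELS):
    print(f"{name}: quality {quality}, ${price:.2f}/img")
```

With these figures the frontier is Imagen 4 Fast, Seedream 4.5, and Midjourney v8; every other listed API model is matched or beaten on both quality and price by one of those three.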