Generative AI Model Ranking Matrix

Snapshot · April 2026

Side-by-side comparison of frontier language, video, and image generation models — benchmarks, capabilities, release dates, and price.

Frontier LLMs — coding & reasoning

Model Vendor Params (B) License Capabilities Released Context SWE-Bench Verified SWE-Bench Pro Terminal-Bench Reasoning* $/M in Best for
Claude Opus 4.7 Anthropic closed VTRC 2026-04-16 1M 87.6 64.3 69.4 95 5 Leads SWE-Bench Verified (87.6) & Pro (64.3); new tokenizer (~35% more tokens)
Claude Opus 4.6 Anthropic closed VTRC 2026-02 1M 80.8 57.3 92 15 Large-codebase reasoning, multi-file refactors
Claude Sonnet 4.6 Anthropic closed VTRC 2025-12 1M 79.6 86 3 Default coding driver; 98% of Opus at ⅕ cost
Claude Haiku 4.5 Anthropic closed VTRC 2025-10-15 200k 73.3 78 1 4–5× faster than Sonnet 4.5; cheap multi-agent driver
GPT-5.5 OpenAI closed VATRC 2026-04-24 1M 87.6 58.6 82.7 95 5 SOTA Terminal-Bench 82.7; GDPval 84.9; OSWorld 78.7; ties Opus 4.7 on SWE-V
GPT-5.3 Codex OpenAI closed VTRC 2026-03 400k 80.0 56.8 77.3 91 10 Full SDLC agent: debug, terminal, PRDs, tests
Gemini 3.1 Pro Google closed VATRC 2026-03 2M 78.0 94 7 Adjustable Deep Think; agentic browsing (BrowseComp 85.9)
Grok 4.20 Beta 2 xAI closed VT 2026-03-03 256k 75 80 2 4-agent backbone · IFBench #1 (83); 8% of Opus cost
Qwen3.6-Max-Preview Alibaba closed VTR 2026-04-20 1M 79.5 77.1 89 6 Agent loops: preserve_thinking across tool calls
DeepSeek V4-Pro DeepSeek 861.6 open TR 2026-04-27 128k 80.6 55.4 67.9 92 0.9 1.6T MoE / 49B active · matches Opus 4.6 at fraction of cost
DeepSeek V4-Flash DeepSeek 158.1 open TR 2026-04-27 128k 0.3 Fresh · 158B fast variant of V4
Kimi K2.6 Moonshot 1000 open VTR 2026-04-29 256k 80.2 58.6 66.7 94 0.6 1.1T MoE · ties GPT-5.5 on SWE-Pro at $0.6/M; HLE 54 leads all
GLM-5.1 Zhipu / Z.ai 753.9 open (MIT) VTR 2026-04-07 128k 79.0 58.4 88 0.11 754B · #1 SWE-Bench Pro among open weights
MiniMax M2.7 MiniMax 228.7 open TR 2026-04-20 205k 78 56.22 57.0 87 0.3 229B · self-evolving; matches Codex on SWE-Pro
Qwen3.5-397B-A17B Alibaba 397 open VTR 2026-04-24 256k 80.0 54.0 92 403B MoE / 17B active · GPQA 88.4, MMLU-Pro 87.8, SWE-V 80.0
Qwen3.6-35B-A3B Alibaba 35 open VTR 2026-04-24 256k 73.4 49.5 84 36B / 3B active · 73.4 SWE-V on tiny active params
MiMo-V2.5-Pro Xiaomi 1023.2 open TR 2026-04-29 1M 57.2 89 1.02T MoE / 42B active · matches Opus 4.6 on SWE-Pro w/ 40% fewer tokens
Tencent Hy3-preview Tencent 298.8 open (preview) VTR 2026-04-24 128k 74.4 54.4 88 295B / 21B active · +40pts SWE-V vs Hy2; topped Tsinghua math PhD exam
Mistral Large 3 Mistral 675 open (Apache 2.0) VT 2025-12-02 256k 50 2 675B MoE / 41B active · non-reasoning (AIME ~40, GPQA ~44) · Apache 2.0
Mistral Medium 3.5 Mistral 128 open weights TR 2026-04-30 128k 77.6 80 0.6 128B · SWE-V 77.6 beats Devstral 2 + Qwen3.5; τ³-Telecom 91.4
Devstral-2-123B Mistral 123 open weights TC 2026-02-25 128k 0.6 125B · code-specialized variant
Gemma 4 31B-it Google 31 open VT 2026-04-29 256k 52 87 31B dense · AIME 89.2, GPQA 84.3, τ²-Retail 86.4 (12× jump from G3)
Nemotron-3-Nano-Omni NVIDIA 30 open VATR 2026-04-29 128k Fresh (~17h) · 30B-A3B any-to-any reasoning
Meta Avocado Meta rumored / closed tbd 2026-Q2 (est.) Llama successor, slipping to May/Jun 2026
Ring-2.6-1T live InclusionAI 1025.7 open 2026-05-18 Trillion-parameter MoE reasoning model with novel async RL training (IcePop algorithm), agent-focused capabilities, and adjustable reasoning effort; a frontier-scale release from InclusionAI.
Ling-2.6-1T live InclusionAI 1025.7 open 2026-05-03 Trillion-parameter MoE model with a hybrid MLA + linear attention architecture; claims open-source SOTA on multiple agentic benchmarks; a frontier-scale release with architectural innovation.
Kimi-K2.6-NVFP4 live trusted-lab NVIDIA open 2026-05-15 FP4 quantization of Kimi-K2.6 (1T-parameter MoE, 32B active); a repackaged quantization of a frontier-scale model rather than a new architecture or capability advance.
Nemotron-Labs-Diffusion-14B live trusted-lab NVIDIA 13.5 open 2026-05-19 Tri-mode architecture (AR + diffusion + self-speculation) with a shared KV cache is a novel capability from a major lab, but 13.5B params and very low adoption keep its composite score at 21, right at the threshold requiring review.
Capabilities key: V = Vision (image input); A = Audio; T = Tools / function calling; R = Extended reasoning; C = Computer / browser use
*Reasoning is a composite score (GPQA Diamond / HLE / AIME / ARC-AGI-2, normalized 0–100, directional).
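The Reasoning footnote names the component benchmarks but not how they are combined. As a minimal sketch only, assuming an equal-weight mean over whichever component scores are available (the weighting, the handling of missing scores, and the function name `composite_reasoning` are all assumptions, not the method used for this table), a composite could be computed like this. The example input uses the AIME and GPQA figures quoted in the Gemma 4 31B-it row.

```python
from statistics import mean
from typing import Dict, Optional


def composite_reasoning(scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Equal-weight mean of the available benchmark scores (each on 0-100).

    Assumption: missing benchmarks (None) are skipped rather than counted
    as zero. The table's real weighting is not documented.
    """
    available = [s for s in scores.values() if s is not None]
    if not available:
        return None
    if any(not 0 <= s <= 100 for s in available):
        raise ValueError("benchmark scores are expected on a 0-100 scale")
    return mean(available)


# AIME and GPQA values are quoted in the Gemma 4 31B-it row; HLE and
# ARC-AGI-2 are not listed there, so they are left as None.
gemma_4_31b = {"GPQA Diamond": 84.3, "HLE": None, "AIME": 89.2, "ARC-AGI-2": None}
print(f"{composite_reasoning(gemma_4_31b):.2f}")
```

Because the score is described as directional, small differences between rows should probably not be read as meaningful under any weighting.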

Video generation models

I2V rank: 1 = best
Model Vendor License Released I2V Rank Max len (s) Resolution Audio $/gen Best for
Kling 3.0 Kuaishou closed 2026-02 1 15 (multi-shot) 1080p synced dialogue + SFX 1.00 Cinematic shots; #1 general-purpose
Veo 3.1 Google closed 2026-01 2 8 1080p yes 0.75 High-fidelity realism, prompt adherence
Sora 2 OpenAI closed 2025-10 3 12 1080p yes 0.90 Imaginative T2V; ChatGPT-integrated
Seedance 2.0 ByteDance closed 2026-04 4 15 1080p yes 0.50 Product ads, e-comm, character consistency
LTX-2 Lightricks open 2025-Q4 5 10 4K@50fps native sync 0.20 Open-weights leader; 4K + audio
Wan 2.2 Alibaba open 2025-Q3 6 6 720p 0.05 Runs on a 4070; novel MoE denoiser
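The $/gen and Max len columns can be combined into a rough cost per second of footage. The sketch below copies both columns from the table and assumes, without confirmation from the source, that one generation at the listed price yields a clip of the maximum length; real pricing may vary with resolution, audio, and duration.

```python
# (model, $/gen, max length in seconds), copied from the video table above.
VIDEO_MODELS = [
    ("Kling 3.0", 1.00, 15),
    ("Veo 3.1", 0.75, 8),
    ("Sora 2", 0.90, 12),
    ("Seedance 2.0", 0.50, 15),
    ("LTX-2", 0.20, 10),
    ("Wan 2.2", 0.05, 6),
]


def cost_per_second(price_per_gen: float, max_len_s: float) -> float:
    """Dollars per second, assuming one generation produces a max-length clip."""
    if max_len_s <= 0:
        raise ValueError("max length must be positive")
    return price_per_gen / max_len_s


# Cheapest footage first.
ranked = sorted(VIDEO_MODELS, key=lambda m: cost_per_second(m[1], m[2]))
for name, price, seconds in ranked:
    print(f"{name:<14} ${cost_per_second(price, seconds):.4f}/s")
```

This ordering ignores the I2V rank, so it is a cost view only, not a quality-adjusted one.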

Image generation models

*Quality is a directional leaderboard score.
Model Vendor License Released Max res Quality* Speed (s) $/img Best for
Midjourney v8 Midjourney closed 2026-03 2K 95 10 0.04 Aesthetic king · rewritten engine, 5× faster than v7
FLUX.2 [pro] Black Forest Labs closed (API) 2026-Q1 2K 93 4.5 0.04 Photoreal commercial: best skin, lighting, materials
FLUX.2 [dev] Black Forest Labs open weights 2026-Q1 2K 89 6 0 Top open-weights photoreal model
GPT Image 2 OpenAI closed 2026-Q1 1K+ 92 6 0.04 Best prompt adherence: complex composed scenes
Imagen 4 Ultra Google closed 2025-Q4 2K 94 5 0.04 Photoreal flagship; strong text rendering
Imagen 4 Fast Google closed 2026-03 1K 86 2 0.02 Cheapest fast quality at $0.02/img
Nano Banana 2 Google (Gemini 3.1 Flash Image) closed 2026-03 1K 82 1.5 0.02 Fastest end-to-end (~1–3s) · in-chat editing
Recraft V4 Recraft closed 2025-Q4 2K 88 5 0.04 Brand / design assets, vector-style outputs
Ideogram 3 Ideogram closed 2025-Q4 1K 84 5 0.03 Typography & in-image text rendering
Seedream 4.5 ByteDance closed 2026-Q1 2K 90 5 0.03 Asian aesthetics, character consistency
Stable Diffusion 4 Stability AI open 2026-Q1 2K 80 6 0 Open ecosystem · ControlNet / LoRA backbone
Hunyuan Image 3 Tencent open 2026-Q1 2K 83 5 0 Open Tencent flagship; bilingual prompts
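One way to read the Quality and $/img columns together is quality points per dollar. The sketch below uses only the closed, API-priced rows from the table; the open-weights rows are left out because a per-image price does not apply to them in the same way. The metric itself is an illustrative assumption, and since Quality is directional the ranking should be treated as a rough guide.

```python
# (model, quality score, $/img), copied from the closed-model rows above.
IMAGE_MODELS = [
    ("Midjourney v8", 95, 0.04),
    ("FLUX.2 [pro]", 93, 0.04),
    ("GPT Image 2", 92, 0.04),
    ("Imagen 4 Ultra", 94, 0.04),
    ("Imagen 4 Fast", 86, 0.02),
    ("Nano Banana 2", 82, 0.02),
    ("Recraft V4", 88, 0.04),
    ("Ideogram 3", 84, 0.03),
    ("Seedream 4.5", 90, 0.03),
]


def quality_per_dollar(quality: float, price_per_img: float) -> float:
    """Quality points bought per dollar; higher means better value."""
    if price_per_img <= 0:
        raise ValueError("price must be positive for this metric")
    return quality / price_per_img


# Best value first.
by_value = sorted(
    IMAGE_MODELS, key=lambda m: quality_per_dollar(m[1], m[2]), reverse=True
)
for name, quality, price in by_value:
    print(f"{name:<15} {quality_per_dollar(quality, price):7.0f} points/$")
```

A ratio like this favors the cheapest tier by construction, so it is best paired with a minimum quality threshold for the task at hand.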