Generative AI Model Ranking Matrix

Snapshot · April 2026

Side-by-side comparison of frontier language, video, and image generation models — benchmarks, capabilities, release dates, and price.

Frontier LLMs — coding & reasoning

Higher is better on benchmarks. Price is $/1M input tokens (lower is better).

Model	Vendor	License	Capabilities	Released	Context	SWE-Bench Verified	SWE-Bench Pro	Terminal-Bench	Reasoning*	$/1M input	Best for
Claude Opus 4.7	Anthropic	closed	VTRC	2026-04-16	1M	87.6	64.3	69.4	95	5	Leads SWE-Bench Verified (87.6) & Pro (64.3); new tokenizer (~35% more tokens)
Claude Opus 4.6	Anthropic	closed	VTRC	2026-02	1M	80.8	57.3	—	92	15	Large-codebase reasoning, multi-file refactors
Claude Sonnet 4.6	Anthropic	closed	VTRC	2025-12	1M	79.6	—	—	86	3	Default coding driver — 98% of Opus at ⅕ cost
Claude Haiku 4.5	Anthropic	closed	VTRC	2025-10-15	200k	73.3	—	—	78	1	4–5× faster than Sonnet 4.5; cheap multi-agent driver
GPT-5.5	OpenAI	closed	VATRC	2026-04-24	1M	87.6	58.6	82.7	95	5	SOTA Terminal-Bench 82.7; GDPval 84.9; OSWorld 78.7; ties Opus 4.7 on SWE-V
GPT-5.3 Codex	OpenAI	closed	VTRC	2026-03	400k	80.0	56.8	77.3	91	10	Full SDLC agent: debug, terminal, PRDs, tests
Gemini 3.1 Pro	Google	closed	VATRC	2026-03	2M	78.0	—	—	94	7	Adjustable Deep Think; agentic browsing (BrowseComp 85.9)
Grok 4.20 Beta 2	xAI	closed	VT	2026-03-03	256k	75	—	—	80	2	4-agent backbone · IFBench #1 (83); 8% of Opus cost
Qwen3.6-Max-Preview	Alibaba	closed	VTR	2026-04-20	1M	79.5	—	77.1	89	6	Agent loops — preserve_thinking across tool calls
DeepSeek V4-Pro	DeepSeek	open	TR	2026-04-27	128k	80.6	55.4	67.9	92	0.9	1.6T MoE / 49B active · matches Opus 4.6 at fraction of cost
DeepSeek V4-Flash	DeepSeek	open	TR	2026-04-27	128k	—	—	—	—	0.3	Fresh · 158B fast variant of V4
Kimi K2.6	Moonshot	open	VTR	2026-04-29	256k	80.2	58.6	66.7	94	0.6	1.1T MoE · ties GPT-5.5 on SWE-Pro at $0.6/M; HLE 54 leads all
GLM-5.1	Zhipu / Z.ai	open (MIT)	VTR	2026-04-07	128k	79.0	58.4	—	88	0.11	754B · 58.4 on SWE-Bench Pro, second among open weights in this table behind Kimi K2.6 (58.6)
MiniMax M2.7	MiniMax	open	TR	2026-04-20	205k	78	56.22	57.0	87	0.3	229B · self-evolving; matches Codex on SWE-Pro
Qwen3.5-397B-A17B	Alibaba	open	VTR	2026-04-24	256k	80.0	—	54.0	92	—	403B MoE / 17B active · GPQA 88.4, MMLU-Pro 87.8, SWE-V 80.0
Qwen3.6-35B-A3B	Alibaba	open	VTR	2026-04-24	256k	73.4	49.5	—	84	—	36B / 3B active · 73.4 SWE-V on tiny active params
MiMo-V2.5-Pro	Xiaomi	open	TR	2026-04-29	1M	—	57.2	—	89	—	1.02T MoE / 42B active · matches Opus 4.6 on SWE-Pro w/ 40% fewer tokens
Tencent Hy3-preview	Tencent	open (preview)	VTR	2026-04-24	128k	74.4	—	54.4	88	—	295B / 21B active · +40pts SWE-V vs Hy2; topped Tsinghua math PhD exam
Mistral Large 3	Mistral	open (Apache 2.0)	VT	2025-12-02	256k	—	—	—	50	2	675B MoE / 41B active · non-reasoning (AIME ~40, GPQA ~44) · Apache 2.0
Mistral Medium 3.5	Mistral	open weights	TR	2026-04-30	128k	77.6	—	—	80	0.6	128B · SWE-V 77.6 beats Devstral 2 + Qwen3.5; τ³-Telecom 91.4
Devstral-2-123B	Mistral	open weights	TC	2026-02-25	128k	—	—	—	—	0.6	125B · code-specialized variant
Gemma 4 31B-it	Google	open	VT	2026-04-29	256k	52	—	—	87	—	31B dense · AIME 89.2, GPQA 84.3, τ²-Retail 86.4 (12× jump from G3)
Nemotron-3-Nano-Omni	NVIDIA	open	VATR	2026-04-29	128k	—	—	—	—	—	Newly released · 30B-A3B any-to-any reasoning
Meta Avocado	Meta	rumored / closed	tbd	2026-Q2 (est.)	—	—	—	—	—	—	Llama successor, slipping to May/Jun 2026
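
Because the price column covers input tokens only, a rough input-side estimate is tokens / 1,000,000 × price. The sketch below also folds in the ~35% token inflation noted for Claude Opus 4.7's new tokenizer. The function names and the 2M-token workload are illustrative assumptions, and output-token pricing (not listed in this table) is ignored.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Input-side cost in USD for a given number of input tokens."""
    return tokens / 1_000_000 * price_per_million


def effective_price(price_per_million: float, token_inflation: float) -> float:
    """Price per 1M baseline tokens when a tokenizer emits more tokens for the same text.

    token_inflation=0.35 means roughly 35% more tokens for identical input.
    """
    return price_per_million * (1.0 + token_inflation)


# Prices are taken from the price column of the table above.
workload_tokens = 2_000_000  # hypothetical workload size (assumption)

print("Claude Opus 4.7 / GPT-5.5:", input_cost_usd(workload_tokens, 5))
print("DeepSeek V4-Pro:", input_cost_usd(workload_tokens, 0.9))
print("Opus 4.7, tokenizer-adjusted $/1M:", effective_price(5, 0.35))
```

Under these assumptions the tokenizer adjustment lifts Opus 4.7's effective input price from $5 to about $6.75 per 1M baseline tokens, which matters when comparing it against models priced on a different tokenizer.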

Capabilities key: V = Vision (image input) · A = Audio · T = Tools / function calling · R = Extended reasoning · C = Computer / browser use

*Reasoning is a composite score (GPQA Diamond / HLE / AIME / ARC-AGI-2, normalized 0–100, directional).
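
The matrix does not say how the Reasoning composite is weighted, so the following is only one plausible reading: an equal-weight mean over whichever 0–100 benchmark scores are available for a model. The function name, the equal weighting, and the skip-missing rule are all assumptions, and the result is not guaranteed to match the Reasoning column.

```python
from statistics import mean
from typing import Optional


def composite_score(scores: dict[str, Optional[float]]) -> Optional[float]:
    """Equal-weight mean of the available 0-100 benchmark scores.

    Assumption: equal weights, missing benchmarks skipped. The source
    does not state the actual weighting of its composite.
    """
    available = [value for value in scores.values() if value is not None]
    if not available:
        return None
    return round(mean(available), 2)


# Gemma 4 31B-it: only AIME and GPQA figures appear in the table above.
gemma_4_31b = {"GPQA": 84.3, "HLE": None, "AIME": 89.2, "ARC-AGI-2": None}
print(composite_score(gemma_4_31b))

# A model with no published scores yields no composite at all.
print(composite_score({"GPQA": None, "HLE": None}))
```

Skipping missing benchmarks keeps sparse rows comparable, but it also means two composites may rest on different benchmark subsets, which is one reason to treat the column as directional.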

Video generation models

Quality is the Artificial Analysis Image-to-Video (I2V) rank (1 = best).

Model	Vendor	License	Released	I2V Rank	Max len (s)	Resolution	Audio	$/gen	Best for
Kling 3.0	Kuaishou	closed	2026-02	1	15 (multi-shot)	1080p	synced dialogue + SFX	1.00	Cinematic shots; #1 general-purpose
Veo 3.1	Google	closed	2026-01	2	8	1080p	yes	0.75	High-fidelity realism, prompt adherence
Sora 2	OpenAI	closed	2025-10	3	12	1080p	yes	0.90	Imaginative T2V; ChatGPT-integrated
Seedance 2.0	ByteDance	closed	2026-04	4	15	1080p	yes	0.50	Product ads, e-comm, character consistency
LTX-2	Lightricks	open	2025-Q4	5	10	4K@50fps	native sync	0.20	Open-weights leader; 4K + audio
Wan 2.2	Alibaba	open	2025-Q3	6	6	720p	—	0.05	Runs on a 4070; novel MoE denoiser
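
Per-generation prices are hard to compare when maximum clip lengths differ, so one rough normalization is cost per second of output. The sketch below assumes the $/gen figure buys a clip of the listed maximum length, which the table does not state. The helper name is illustrative.

```python
# (model, $/gen, max length in seconds) copied from the table above.
VIDEO_MODELS = [
    ("Kling 3.0", 1.00, 15),
    ("Veo 3.1", 0.75, 8),
    ("Sora 2", 0.90, 12),
    ("Seedance 2.0", 0.50, 15),
    ("LTX-2", 0.20, 10),
    ("Wan 2.2", 0.05, 6),
]


def cost_per_second(price_per_gen: float, max_len_s: float) -> float:
    """USD per second of video, assuming one generation yields a max-length clip."""
    return price_per_gen / max_len_s


# Cheapest per second first.
ranked = sorted(VIDEO_MODELS, key=lambda row: cost_per_second(row[1], row[2]))
for name, price, length in ranked:
    print(f"{name:<14} ${cost_per_second(price, length):.3f}/s")
```

On this measure the ordering differs from the I2V rank: the open models come out cheapest per second, while Veo 3.1's 8-second cap makes it the most expensive per second despite a lower per-generation price than Kling 3.0 or Sora 2.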

Image generation models

Quality is a directional Elo-style score from public leaderboards. Speed is typical wall-clock time per image, in seconds.

Model	Vendor	License	Released	Max res	Quality*	Speed (s)	$/img	Best for
Midjourney v8	Midjourney	closed	2026-03	2K	95	10	0.04	Aesthetic king · rewritten engine, 5× faster than v7
FLUX.2 [pro]	Black Forest Labs	closed (API)	2026-Q1	2K	93	4.5	0.04	Photoreal commercial — best skin, lighting, materials
FLUX.2 [dev]	Black Forest Labs	open weights	2026-Q1	2K	89	6	0	Top open-weights photoreal model
GPT Image 2	OpenAI	closed	2026-Q1	1K+	92	6	0.04	Best prompt adherence — complex composed scenes
Imagen 4 Ultra	Google	closed	2025-Q4	2K	94	5	0.04	Photoreal flagship; strong text rendering
Imagen 4 Fast	Google	closed	2026-03	1K	86	2	0.02	Cheapest fast quality at $0.02/img
Nano Banana 2	Google (Gemini 3.1 Flash Image)	closed	2026-03	1K	82	1.5	0.02	Fastest end-to-end (~1–3s) · in-chat editing
Recraft V4	Recraft	closed	2025-Q4	2K	88	5	0.04	Brand / design assets, vector-style outputs
Ideogram 3	Ideogram	closed	2025-Q4	1K	84	5	0.03	Typography & in-image text rendering
Seedream 4.5	ByteDance	closed	2026-Q1	2K	90	5	0.03	Asian aesthetics, character consistency
Stable Diffusion 4	Stability AI	open	2026-Q1	2K	80	6	0	Open ecosystem · ControlNet / LoRA backbone
Hunyuan Image 3	Tencent	open	2026-Q1	2K	83	5	0	Open Tencent flagship; bilingual prompts
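
One way to use the image table is to fix a latency and price budget, then take the highest-quality model that fits. The sketch below does this for a subset of the closed models listed above. The `ImageModel` type and `best_under` helper are hypothetical names, and the Quality scores are directional, so near-ties should not be over-read.

```python
from typing import NamedTuple, Optional


class ImageModel(NamedTuple):
    name: str
    quality: int        # directional Elo-style score from the table
    speed_s: float      # typical wall-clock seconds per image
    usd_per_img: float  # price per image


# Subset of rows copied from the table above.
MODELS = [
    ImageModel("Midjourney v8", 95, 10, 0.04),
    ImageModel("Imagen 4 Ultra", 94, 5, 0.04),
    ImageModel("FLUX.2 [pro]", 93, 4.5, 0.04),
    ImageModel("Seedream 4.5", 90, 5, 0.03),
    ImageModel("Imagen 4 Fast", 86, 2, 0.02),
    ImageModel("Nano Banana 2", 82, 1.5, 0.02),
]


def best_under(models: list[ImageModel], max_speed_s: float,
               max_usd: float) -> Optional[ImageModel]:
    """Highest-quality model within the speed and price budget, or None."""
    eligible = [m for m in models
                if m.speed_s <= max_speed_s and m.usd_per_img <= max_usd]
    return max(eligible, key=lambda m: m.quality, default=None)


print(best_under(MODELS, max_speed_s=5, max_usd=0.04).name)
print(best_under(MODELS, max_speed_s=3, max_usd=0.02).name)
print(best_under(MODELS, max_speed_s=1, max_usd=0.01))
```

With these rows, the first query selects Imagen 4 Ultra, the second selects Imagen 4 Fast, and the third returns None because no listed model meets a 1-second, $0.01 budget.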