- DeepSeek released its V4 series on April 24, 2026: DeepSeek-V4-Pro (1.6T total / 49B active) and DeepSeek-V4-Flash (284B / 13B active), both with 1M-token context and MIT license.
- V4-Pro is now the largest open-weights model — larger than Kimi K2.6 (1.1T) and GLM-5.1 (754B), and more than twice the size of DeepSeek V3.2 (685B).
- Flash is priced at $0.14/M input and $0.28/M output, undercutting GPT-5.4 Nano, Gemini 3.1 Flash-Lite, and Claude Haiku 4.5.
- DeepSeek’s paper reports V4-Pro reaches 27% of V3.2’s per-token FLOPs and 10% of its KV cache size at 1M-token context; Flash reaches 10% of FLOPs and 7% of KV cache.
What Happened
Chinese AI lab DeepSeek released the first models in its V4 series on April 24, 2026: DeepSeek-V4-Pro and DeepSeek-V4-Flash, documented by Simon Willison the same day. Both are Mixture-of-Experts models with 1-million-token context windows, released under the MIT license. The release follows DeepSeek V3.2 from December 2025 and is the lab's first major model launch of 2026.
Why It Matters
DeepSeek-V4-Pro is now the largest open-weights model available — larger than Moonshot’s Kimi K2.6 (1.1 trillion parameters) and Z.ai’s GLM-5.1 (754 billion), and more than twice the size of DeepSeek V3.2 (685B). The pricing is more notable than the parameter count: V4-Flash undercuts GPT-5.4 Nano (the cheapest OpenAI option), Gemini 3.1 Flash-Lite, and Claude Haiku 4.5 on per-million-token rates. V4-Pro is the cheapest of the larger frontier-tier models. For any deployment where cost is the binding constraint and where open weights or self-hosting are options, the calculus changes immediately.
Technical Details
DeepSeek-V4-Pro has 1.6 trillion total parameters with 49 billion activated per token; DeepSeek-V4-Flash has 284 billion total with 13 billion activated per token. The Pro weights are 865 GB on Hugging Face; Flash's are 160 GB. DeepSeek's API pricing, alongside the comparators from DeepSeek's published table:

| Model | Input ($/M) | Output ($/M) |
|---|---|---|
| DeepSeek-V4-Flash | 0.14 | 0.28 |
| GPT-5.4 Nano | 0.20 | 1.25 |
| Gemini 3.1 Flash-Lite | 0.25 | 1.50 |
| Claude Haiku 4.5 | 1.00 | 5.00 |
| DeepSeek-V4-Pro | 1.74 | 3.48 |
| Claude Sonnet 4.6 | 3.00 | 15.00 |
| Claude Opus 4.7 | 5.00 | 25.00 |
| GPT-5.5 | 5.00 | 30.00 |
The efficiency claim from DeepSeek's accompanying paper: at 1-million-token context, V4-Pro attains 27% of V3.2's per-token FP8 FLOPs and 10% of V3.2's KV cache size. V4-Flash pushes further at the same context length, reaching 10% of V3.2's FLOPs and 7% of its KV cache. DeepSeek positions a "V4-Pro-Max" reasoning configuration as outperforming GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks while trailing GPT-5.4 and Gemini-3.1-Pro, a deficit the paper characterizes as a 3-to-6-month gap behind the state of the art.
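The paper's percentages read more intuitively as reduction factors. A minimal sketch that converts them; only the ratios themselves come from the source, and absolute FLOP and byte counts are not published here:

```python
# Relative efficiency at 1M-token context, normalized to V3.2 = 1.0,
# per the ratios reported in DeepSeek's paper.
RATIOS = {  # model: (per-token FLOPs, KV cache size) relative to V3.2
    "V4-Pro": (0.27, 0.10),
    "V4-Flash": (0.10, 0.07),
}

for model, (flops, kv) in RATIOS.items():
    # A 0.27 ratio means a 1/0.27 ≈ 3.7x reduction, and so on.
    print(f"{model}: {1 / flops:.1f}x FLOPs reduction, "
          f"{1 / kv:.1f}x KV cache reduction")
```

So the headline claim amounts to roughly a 3.7x compute and 10x memory reduction for Pro, and 10x and about 14x for Flash, relative to V3.2 at the same context length.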
Who’s Affected
Open-source deployment teams gain the largest available open-weights model and the cheapest commercial API rate at frontier-comparable quality. Inference providers — Together, Fireworks, Anyscale, Novita — gain a major new model to host. The closest competitive impact lands on Anthropic and OpenAI: at $1.74/$3.48 input/output, V4-Pro costs roughly a third as much as Claude Sonnet 4.6 and about 30% less than GPT-5.4, for users who can accept the paper's stated 3-to-6-month capability gap. Willison notes he intends to test a quantized V4-Flash on his 128 GB M5 MacBook Pro, a useful proxy for whether the model becomes practical for individual developers.
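The price multiples above depend on the input/output token mix of the workload. A quick sketch from the published rates; the 3:1 input-to-output mix is an illustrative assumption, not from the article:

```python
# Blended per-million-token cost from the article's published rates.
PRICES = {  # model: (input $/M, output $/M)
    "DeepSeek-V4-Pro": (1.74, 3.48),
    "Claude Sonnet 4.6": (3.00, 15.00),
}

def blended_cost(model: str, input_share: float = 0.75) -> float:
    """Cost per million tokens at a given input/output mix.

    input_share=0.75 assumes a 3:1 input:output split (an assumption;
    real workloads vary).
    """
    inp, out = PRICES[model]
    return input_share * inp + (1 - input_share) * out

ratio = blended_cost("Claude Sonnet 4.6") / blended_cost("DeepSeek-V4-Pro")
print(f"Sonnet 4.6 vs V4-Pro blended cost: {ratio:.1f}x")  # 2.8x at a 3:1 mix
```

At this mix the gap is about 2.8x; output-heavy workloads push it past 4x (3.48 vs 15.00 on output alone), while input-heavy ones pull it toward 1.7x.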
What’s Next
Quantized versions from Unsloth and similar groups are expected within days. Independent benchmarks will show whether V4-Pro's claimed 3-to-6-month frontier gap holds up outside DeepSeek's self-reported numbers. Watch for U.S. policy responses, given the ongoing scrutiny of Chinese-origin AI models in federal procurement, and for downstream pricing pressure on the cheapest tiers from OpenAI (GPT-5.4 Nano) and Google (Gemini 3.1 Flash-Lite).
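On the question of whether a quantized V4-Flash fits Willison's 128 GB machine, a back-of-envelope weight-size calculation helps. The bits-per-weight figures are assumed common quantization levels, not from the source, and this ignores KV cache and runtime overhead:

```python
# Rough weight footprint of a 284B-parameter model at assumed
# quantization levels (decimal GB, matching marketing figures).
PARAMS = 284e9
MEMORY_BUDGET_GB = 128  # Willison's M5 MacBook Pro unified memory

def weight_size_gb(bits_per_weight: float) -> float:
    """Weight-only size in GB at a given average bits per weight."""
    return PARAMS * bits_per_weight / 8 / 1e9

for bits in (8, 4, 3):
    size = weight_size_gb(bits)
    fits = "fits" if size < MEMORY_BUDGET_GB else "does not fit"
    print(f"{bits}-bit: {size:.0f} GB ({fits} in {MEMORY_BUDGET_GB} GB)")
```

By this estimate a 4-bit quantization (~142 GB) would still overflow 128 GB, while ~3-bit (~107 GB) would fit with headroom for the KV cache, which is why the sub-4-bit quantized releases are the ones to watch.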