- DeepSeek V4 is a 1-trillion-parameter multimodal model optimized for Huawei Ascend and Cambricon chips, not Nvidia GPUs.
- The model uses a sparse Mixture-of-Experts architecture that activates only 32 billion parameters per token, cutting compute costs sharply.
- DeepSeek denied Nvidia and AMD early access, giving Chinese chipmakers a multi-week optimization head start.
- Projected pricing of $0.10-$0.30 per million input tokens would undercut GPT-5.2 and Claude Opus by roughly 6x to 50x.
What Happened
Chinese AI lab DeepSeek released V4, a trillion-parameter multimodal model trained and optimized entirely on Chinese-made semiconductors, in early March 2026. The model runs on Huawei’s Ascend 910B series and Cambricon processors rather than Nvidia’s A100 or H100 GPUs, which have been restricted by U.S. export controls since late 2022.
The release was timed to coincide with China’s annual “Two Sessions” parliamentary meetings, a period when Chinese technology companies frequently showcase major achievements. DeepSeek published V4 under an Apache 2.0 open-source license, continuing the lab’s established pattern of releasing model weights publicly rather than restricting access behind commercial APIs.
According to reporting from multiple outlets, DeepSeek blocked Nvidia and AMD from receiving early access to V4, giving domestic chipmakers like Huawei a several-week head start on optimization work. This reversal of the typical dynamic, where U.S. companies receive preferential access, attracted attention from policymakers on both sides of the Pacific.
Why It Matters
V4 is the first frontier-class AI model built from the ground up for non-Nvidia hardware. Previous Chinese models, including DeepSeek’s own V3, were trained primarily on Nvidia chips acquired before export restrictions took full effect. V4 represents a fundamental shift: it validates that China’s domestic chip ecosystem can support training at the trillion-parameter scale without relying on Western semiconductor supply chains.
Huawei shipped approximately 1,900 Ascend 910B servers per month in Q4 2024 and has been scaling production throughout 2025 and into 2026. DeepSeek’s decision to optimize V4 specifically for Ascend hardware signals that the chips are now viable for frontier workloads, not just smaller inference tasks or fine-tuning runs.
The pricing implications are significant. DeepSeek projects V4 API access at $0.10 to $0.30 per million input tokens. For comparison, OpenAI’s GPT-5.2 costs $1.75 per million input tokens and Anthropic’s Claude Opus charges $5.00. If DeepSeek can maintain quality at these price points, it would put substantial pressure on Western AI providers’ margins.
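As a sanity check on the gap these prices imply, the figures above work out as follows (a quick calculation from the numbers reported here, not vendor-published ratios):

```python
# Prices cited in this article, in USD per million input tokens.
deepseek_low, deepseek_high = 0.10, 0.30
gpt52 = 1.75   # OpenAI GPT-5.2
opus = 5.00    # Anthropic Claude Opus

# Ratio of competitor price to DeepSeek's projected price, at each end
# of DeepSeek's projected range.
print(f"vs GPT-5.2:     {gpt52 / deepseek_high:.1f}x to {gpt52 / deepseek_low:.1f}x cheaper")
print(f"vs Claude Opus: {opus / deepseek_high:.1f}x to {opus / deepseek_low:.1f}x cheaper")
```

The spread runs from roughly 6x (high-end DeepSeek pricing against GPT-5.2) to 50x (low-end pricing against Opus).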
Technical Details
V4 uses a sparse Mixture-of-Experts architecture with 1 trillion total parameters but activates only about 32 billion per token. This keeps inference costs low while maintaining performance across multiple modalities, including text, image, video, and audio generation. The model supports a 1-million-token context window, among the longest of any publicly released model.
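The mechanics of that sparse activation can be sketched in a few lines: a router scores every expert for each token, only the top-k experts run, and the active parameter fraction is roughly k over the expert count. The parameter figures come from the article; the expert count and top-k value below are illustrative assumptions, not DeepSeek's published configuration.

```python
import numpy as np

def topk_route(router_logits: np.ndarray, k: int = 2):
    """Select the top-k experts per token and softmax-normalize their weights.

    router_logits: (num_tokens, num_experts) scores from the routing layer.
    Returns (indices, weights), each of shape (num_tokens, k).
    """
    topk_idx = np.argsort(router_logits, axis=-1)[:, -k:]
    topk_logits = np.take_along_axis(router_logits, topk_idx, axis=-1)
    w = np.exp(topk_logits - topk_logits.max(axis=-1, keepdims=True))
    return topk_idx, w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_experts, k = 64, 2                       # assumed values for illustration
logits = rng.normal(size=(4, n_experts))   # router scores for 4 tokens
idx, weights = topk_route(logits, k)

# Each token runs only k of n_experts expert FFNs, so roughly k / n_experts
# of the expert parameters are active per token -- the same order as the
# ~32B-of-1T (about 3%) ratio the article reports for V4.
print(f"active expert fraction per token: {k / n_experts:.3f}")
```

With 64 experts and top-2 routing, about 3% of expert parameters are active per token, in line with the 32B-of-1T ratio described above.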
The model introduces three architectural innovations. Engram Conditional Memory uses hash-based lookup for constant-time retrieval, meaning the computational cost of processing 1 million tokens is roughly the same as for 128,000 tokens. Manifold-Constrained Hyper-Connections cap signal amplification at 1.6x to prevent the attention-sink problems that degrade performance in long sequences. Dynamic Sparse Attention with a Lightning Indexer reduces compute overhead by roughly 50 percent compared to standard attention mechanisms.
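The 1.6x amplification cap can be illustrated with a toy residual-stream update: measure how much a layer's output would grow the signal, and rescale if the growth exceeds the cap. The 1.6x figure is from the article; the specific mechanism shown (norm-ratio rescaling) is an assumption for illustration, not DeepSeek's published method.

```python
import numpy as np

MAX_GAIN = 1.6  # amplification cap reported for Manifold-Constrained Hyper-Connections

def capped_residual(x: np.ndarray, layer_out: np.ndarray) -> np.ndarray:
    """Add a layer's output to the residual stream, rescaling the result so
    its norm never exceeds MAX_GAIN times the incoming signal's norm."""
    y = x + layer_out
    gain = np.linalg.norm(y) / (np.linalg.norm(x) + 1e-8)
    return y * (MAX_GAIN / gain) if gain > MAX_GAIN else y

rng = np.random.default_rng(0)
x = rng.normal(size=8)

small = capped_residual(x, 0.1 * x)   # mild update: passes through unchanged
big = capped_residual(x, 10.0 * x)    # runaway update: clamped to 1.6x

print(f"mild update gain: {np.linalg.norm(small) / np.linalg.norm(x):.2f}")
print(f"runaway update gain: {np.linalg.norm(big) / np.linalg.norm(x):.2f}")
```

The point of such a cap is that without it, repeated amplification across many layers can concentrate attention mass on a few positions (the "attention sink" effect the article mentions), which hurts long-sequence performance.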
Internal benchmarks suggest V4 outperforms both Claude and ChatGPT on long-context coding tasks, which DeepSeek has identified as a primary optimization target. Independent verification of these claims has not yet been published by third-party evaluators.
Who’s Affected
Nvidia faces a direct competitive challenge. If Chinese labs can train frontier models without Nvidia hardware, the strategic rationale behind U.S. chip export controls weakens considerably. Cloud providers offering Nvidia-based inference may face pricing pressure from DeepSeek’s substantially lower API costs, potentially forcing margin compression across the industry.
Developers and enterprises building on commercial AI APIs could benefit from lower costs if DeepSeek’s pricing holds at scale. Open-source researchers gain access to a trillion-parameter model with full weights under a permissive license, enabling fine-tuning and experimentation that would otherwise require millions of dollars in compute.
What’s Next
Independent benchmark results from third-party evaluators will determine whether V4’s performance claims hold up outside DeepSeek’s internal testing environment. The model’s actual availability and throughput on Ascend hardware at production scale remain unconfirmed. U.S. policymakers are likely to scrutinize whether existing export controls need further tightening in response to a frontier model that was explicitly designed to circumvent them.