
OpenAI Releases GPT-5.4 Mini to Free ChatGPT Users, Doubles Speed Over GPT-5 Mini

megaone_admin · Mar 22, 2026
Engine Score 8/10 — Important

This news significantly impacts millions of free ChatGPT users by providing access to a faster, newer GPT-5.4 mini model. The direct user benefit and the novelty of a model update make it highly relevant, despite the moderate source reliability.


OpenAI has made GPT-5.4 mini available to free-tier ChatGPT users, marking the first time the company’s newest model family has been accessible without a subscription. The model, announced on March 17 and rolled out to ChatGPT on March 18, runs more than twice as fast as its predecessor GPT-5 mini while matching or exceeding its performance on standard benchmarks.

GPT-5.4 mini is positioned as a lightweight, high-throughput model optimized for tasks where speed and cost matter more than maximum capability. OpenAI reports that the model scores 54.4 percent on SWE-Bench Pro, close to the full-size GPT-5.4's 57.7 percent — a narrow gap that makes the mini variant competitive for most coding and reasoning tasks at a fraction of the compute cost. Alongside the mini release, OpenAI also introduced GPT-5.4 nano, available through the API for applications requiring even lower latency.

The decision to offer GPT-5.4 mini to free users reflects a strategic shift. OpenAI has historically reserved its newest models for paying subscribers, using the performance gap as a conversion mechanism. By closing that gap earlier in the product cycle, the company appears to be prioritizing user acquisition and platform stickiness over immediate subscription revenue — a move that follows increased competition from Anthropic’s Claude, Google’s Gemini, and open-weight alternatives like Qwen and Llama.

For developers, the API pricing on GPT-5.4 mini is structured to compete directly with Claude 3.5 Haiku and Gemini Flash as the default choice for high-volume workloads, coding assistants, and multi-agent pipelines where subagent calls need to be fast and cheap. The model supports multimodal inputs including images and files, maintaining feature parity with its larger siblings.
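For developers weighing the switch, the integration surface is the standard Chat Completions endpoint. The sketch below shows how a high-volume pipeline might parameterize the model choice; note that the model identifiers `gpt-5.4-mini` and `gpt-5.4-nano` are assumptions based on OpenAI's usual naming and should be confirmed against the official model list before use.

```python
import os

# Assumed model identifiers — verify against OpenAI's published model list.
MINI = "gpt-5.4-mini"
NANO = "gpt-5.4-nano"

def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build keyword arguments for a chat completion call, so the same
    pipeline can swap between mini and nano per workload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Only attempt a live call when credentials are present.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI  # requires the `openai` package

    client = OpenAI()
    response = client.chat.completions.create(
        **build_request(MINI, "Summarize this diff in one sentence.")
    )
    print(response.choices[0].message.content)
```

Keeping the model name in one place makes it cheap to route latency-critical subagent calls to the nano variant while leaving heavier reasoning steps on mini.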

The nano variant targets use cases where response latency under 200 milliseconds is critical — real-time autocomplete, inline suggestions, and embedded assistants. OpenAI has not disclosed nano's benchmark scores, suggesting the model trades accuracy for speed in ways that make direct comparison with mini or full-size models misleading. Both models are available immediately through the OpenAI API and Codex.


MegaOne AI Editorial Team

MegaOne AI monitors 200+ sources daily to identify and score the most important AI developments. Every story is fact-checked, linked to primary sources, and rated using our six-factor Engine Score methodology.
