Gemini 3.5 Live Translate: Real-Time Speech Translation

Q: What happened?

Google DeepMind released Gemini 3.5 Live Translate, its latest audio model for live speech-to-speech translation, according to its official blog. The model “automatically detects 70+ languages and generates smooth, natural-sounding translated speech that preserves the speakers’ intonation, pacing and pitch.” DeepMind framed the release as the next step in a 20-year arc that began with Google Translate as one of the company’s early machine-learning experiments.

Q: What are the technical details?

The model handles more than 70 languages with automatic detection and, unlike turn-by-turn pipelines, generates continuous translated speech rather than waiting for a speaker to finish. By preserving intonation, pacing, and pitch, it aims to keep the speaker’s vocal character intact across languages. Google says its translation systems now process over a trillion words per month across its products.

Google DeepMind released Gemini 3.5 Live Translate, an audio model for live speech-to-speech translation.
It automatically detects 70+ languages and preserves the speaker’s intonation, pacing, and pitch.
Unlike turn-by-turn systems, it produces continuous, natural-sounding translated speech.
Google says it translates over a trillion words monthly across its products.

What Happened

Google DeepMind released Gemini 3.5 Live Translate, its latest audio model for live speech-to-speech translation, according to its official blog. The model “automatically detects 70+ languages and generates smooth, natural-sounding translated speech that preserves the speakers’ intonation, pacing and pitch.”

DeepMind framed the release as the next step in a 20-year arc that began with Google Translate as one of the company’s early machine-learning experiments.

Why It Matters

Live, voice-preserving translation moves machine translation from text utility toward real-time conversation. It also extends Google’s AI product push into a high-visibility consumer feature, against a backdrop of rapid frontier releases tracked in the GPT-5.6 and Claude model race.

Technical Details

The model handles more than 70 languages with automatic detection and, unlike turn-by-turn pipelines, generates continuous translated speech rather than waiting for a speaker to finish. By preserving intonation, pacing, and pitch, it aims to keep the speaker’s vocal character intact across languages. Google says its translation systems now process over a trillion words per month across its products.

Who’s Affected

Multilingual users, travelers, and businesses running cross-language communication are the direct beneficiaries. The feature also pressures dedicated translation apps, since it ships inside Google’s existing product surface.

What’s Next

DeepMind’s blog post describes the model’s capabilities; broad rollout details, latency figures, and on-device versus cloud processing were not fully specified. Independent testing across low-resource languages will determine how evenly the 70-plus-language claim holds.

Google Releases Gemini 3.5 Live Translate for Real-Time Speech-to-Speech

What Happened

Why It Matters

Technical Details

Who’s Affected

What’s Next

Enjoyed this story?

Google Releases Gemini 3.5 Live Translate for Real-Time Speech-to-Speech

What Happened

Why It Matters

Technical Details

Who’s Affected

What’s Next

Enjoyed this story?

White House Says Moonshot Used Banned Nvidia Chips to Build Kimi K3

OpenAI Details How AP, POLITICO and Axios Use Its Models in Newsrooms

Google Ships Gemini 3.6 Flash With 17% Fewer Output Tokens