Gemini 3.5 Live Translate is Google’s audio model for live speech-to-speech translation, offered to developers, in Google Meet, and in the Google Translate app. It detects more than 70 languages, processes speech as it is streamed, and generates natural-sounding translated audio that follows the speaker in near real time.
Google is rolling out the model across three product surfaces: developers can access it through the Gemini Live API and Google AI Studio, enterprises can use it in private preview in Google Meet, and consumers can use it in the Google Translate app on Android and iOS. The model is aimed at live interpretation scenarios where low latency, natural voice output, and multilingual conversation matter.
The model automatically detects 70+ languages and can translate live speech without manual language setup, which reduces friction in multilingual conversations.
Instead of waiting for a full turn to finish, the model translates continuously and stays only a few seconds behind the speaker, helping conversations sound more natural.
Google says the translated voice preserves the speaker’s intonation, pacing, and pitch, which helps the output sound closer to the original delivery.
The model is designed to handle streamed speech and noisy, unpredictable environments, making it suitable for live calls, meetings, lessons, and broadcasts.
Google says the model can be used through the Gemini Live API and Google AI Studio, and that partners such as Agora, Fishjam, LiveKit, Pipecat, and Vision Agents use those APIs to build voice translation apps.
Google says all generated audio is watermarked with SynthID so AI-generated speech remains detectable.
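As a rough illustration of why continuous translation keeps the listener closer to the speaker than waiting for a full turn, the toy model below compares the two approaches. It is a sketch, not a description of how the model works internally. The function names, the 2-second chunk length, the 1-second processing time, and the 30-second turn are all hypothetical values chosen for the example; Google only states that the model stays a few seconds behind the speaker.

```python
import math


def streaming_lag(spoken_at, chunk_seconds, processing_seconds):
    """Seconds between a word being spoken and its translation being ready
    when audio is translated chunk by chunk (hypothetical toy model)."""
    # The word is translated once the chunk that contains it has ended.
    chunk_end = (math.floor(spoken_at / chunk_seconds) + 1) * chunk_seconds
    return chunk_end + processing_seconds - spoken_at


def turn_based_lag(spoken_at, turn_seconds, processing_seconds):
    """Seconds between a word being spoken and its translation being ready
    when translation only starts after the whole turn has ended."""
    return turn_seconds + processing_seconds - spoken_at


# Assumed values for illustration only, not published figures.
CHUNK = 2.0        # seconds of audio per streamed chunk
PROCESSING = 1.0   # seconds to translate one chunk or one turn
TURN = 30.0        # length of one speaker turn in seconds

for t in (0.0, 10.0, 29.0):
    print(
        f"spoken at {t:>4.1f}s: "
        f"streaming lag {streaming_lag(t, CHUNK, PROCESSING):.1f}s, "
        f"turn-based lag {turn_based_lag(t, TURN, PROCESSING):.1f}s"
    )
```

Under these assumptions the streaming lag never exceeds the chunk length plus the processing time, while the turn-based lag grows with the length of the turn for anything said near its start.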
Use the model for live interpretation when two or more speakers need to keep talking naturally across language barriers, such as interviews, calls, or cross-border conversations.
Teams running meetings in Google Workspace can use the Meet integration for speech translation during business calls, with access described as private preview at launch.
Developers can build voice translation experiences in Google AI Studio or through the Gemini Live API, including apps that plug into real-time media infrastructure.
People using the Google Translate app on Android or iOS can use Live translate for more seamless translation on the go, either through connected headphones or, on Android, through a listening mode that plays translated audio from the phone’s earpiece.
The model is suitable for settings such as lessons, broadcasts, and noisy environments where streaming translation and low latency matter more than turn-based transcription.
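For the developer use case above, an app that streams speech to a live model typically cuts captured audio into small fixed-length frames before sending them. The sketch below shows only that framing step. It assumes 16 kHz, 16-bit mono PCM and 20 ms frames, which are common choices for speech streaming but are not taken from the launch post; the helper name pcm_frames is hypothetical, and the Gemini Live API documentation is the place to confirm the format the model actually expects.

```python
from typing import Iterator


def pcm_frames(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed: 16 kHz mono
    bytes_per_sample: int = 2,  # assumed: 16-bit samples
    frame_ms: int = 20,         # assumed: 20 ms per frame
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames from raw PCM audio.

    The final frame may be shorter if the audio does not divide evenly.
    """
    frame_bytes = sample_rate * bytes_per_sample * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of silence stands in for captured microphone audio.
one_second = bytes(16000 * 2)
frames = list(pcm_frames(one_second))
print(len(frames), "frames of", len(frames[0]), "bytes")
```

Smaller frames reduce the delay before audio reaches the model at the cost of more messages per second, which is the usual trade-off when tuning a real-time pipeline.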
Google says 3.5 Live Translate is rolling out to developers in public preview through the Gemini Live API and Google AI Studio, to enterprises in private preview in Google Meet, and to everyone in the Google Translate app on Android and iOS. Google presents these products and previews as the starting point of the rollout.
In Google Meet, speech translation will soon use 3.5 Live Translate. Google says the change will bring 70+ languages, support for over 2,000 language combinations in a single meeting, and an updated interface with instant access to speech translation.
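The "over 2,000 language combinations" figure is consistent with simple pair counting. Google does not say how it counts combinations, so the sketch below is only one plausible reading: it takes exactly 70 languages as a lower bound for "70+" and counts language pairs both without and with regard to translation direction. The function name is hypothetical.

```python
import math


def language_pair_counts(languages: int) -> tuple[int, int]:
    """Return (unordered pairs, ordered pairs) for a number of languages."""
    unordered = math.comb(languages, 2)  # pair A-B counted once
    ordered = math.perm(languages, 2)    # A-to-B and B-to-A counted separately
    return unordered, ordered


unordered, ordered = language_pair_counts(70)
print(f"unordered pairs: {unordered}")  # 70 * 69 / 2 = 2415
print(f"ordered pairs:   {ordered}")    # 70 * 69 = 4830
```

With 70 languages there are 2,415 unordered pairs, which fits "over 2,000"; counting each translation direction separately would give 4,830.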
In Google Translate, the Live translate feature uses the model globally on Android and iOS. Google says users can connect any pair of headphones for a more seamless experience, and Android users are also getting a listening mode that streams translated audio through the phone’s earpiece.
The launch post does not list standalone pricing. What it does state is availability by product surface and preview tier: public preview for developers, private preview for select Google Workspace business customers, and a global rollout in Google Translate.
Google notes that all audio generated by its models is watermarked with SynthID, and it points readers to the model card for safety and responsibility details.
Wallie is an open-source AI streamer that watches your screen, hears chat, and generates live commentary in a configurable persona. It runs locally on your machine with your own keys and is aimed at faceless content, autonomous streams, and real-time reactions.
Sanota is an app that turns spoken memories, reflections, and interviews into clear written stories. It supports personal storytelling, family history, and shared memories, with guided prompts and subscription pricing.
Carbon Voice is an asynchronous voice messaging app for teams and individuals, with transcripts, AI catch-up, and cross-device access. It helps people and agents communicate without needing a live call.
BeFreed is a personalized audio learning app that turns books and other knowledge sources into narrated listening experiences. It helps people learn on demand through interactive audio, voice selection, and built-in learning tools.
MagicSlides is an AI presentation generator that turns text, topics, documents, URLs, and videos into slide decks. It creates presentations in Google Slides by default and supports PowerPoint export, with multilingual output and AI-assisted editing.
Microsoft Translator is a Bing translation web app for translating short text between English and more than 100 languages. It also supports image capture translation and basic output actions like listen and copy.