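Gemini 3.5 Live Translate is Google's near-real-time speech translation model for developers, Google Meet, and the Google Translate app. It supports 70+ languages and is suited to natural-sounding speech translation in live conversations.
MAI-Voice-2 is Microsoft AI's text-to-speech model, offering natural, expressive speech for assistants, customer service, long-form narration, and accessibility scenarios. It is available in Microsoft Foundry and supports 15 languages/locales, emotion control, and custom voice creation from a short reference.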
Voiser AI Voiceover turns text into spoken audio for voiceovers, with multilingual voice options and style controls for different narration needs. It supports a web studio workflow and shows free, paid, and enterprise paths on the site.
Our Stories is a family storytelling web app for creating, reading, and listening to custom stories in multiple languages. It is designed for multilingual households and for families who want to share bedtime stories across distance.
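Wallie is an open-source AI streamer that watches your screen, listens to chat, and generates live stream commentary in real time using a configurable persona. It supports running locally with your own API keys and is suited to faceless streaming, autonomous streams, and real-time interaction.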
Reader Alive is an AI ebook reader for iPhone and iPad that supports EPUB, PDF, MOBI and AZW3 files. It adds translation, text-to-speech, summaries and book-aware chat for personal ebook libraries.
Selectable is a macOS OCR and text-capture utility for extracting text from anywhere on your screen, including images and videos. It supports copying, translation on newer macOS versions, text-to-speech, and output cleanup.
FlowSpeech is a context-aware text-to-speech studio that turns scripts and uploaded files into human-like audio. It offers multiple generation modes, pause and emotion control, and a free plan alongside paid tiers.
Gemini 3.1 Flash TTS is Google’s preview text-to-speech model for generating expressive AI speech with fine-grained control over style and delivery. It is available across the Gemini API, Google AI Studio, Vertex AI, and Google Vids.
Smallest.ai Lightning TTS is a text-to-speech API for generating spoken audio from text with low latency, multilingual support, and fast voice cloning. It is aimed at developers and product teams building voice agents, narrated content, and other production speech workflows.
Claude voice mode is a beta feature that enables spoken conversations with Claude on the web and in Claude Mobile for iOS and Android. It supports hands-free and push-to-talk interaction, voice selection, and switching between speech and text within the same chat.
Free online Quran recitation with audio and translation, plus word-by-word analysis in 18 languages, covering all 114 chapters.
Voxtral TTS is Mistral’s text-to-speech model for generating lifelike, multilingual speech for voice agents and enterprise voice workflows. It supports short-reference voice adaptation, low-latency output, and access through Mistral Studio, Le Chat, the API, and open weights on Hugging Face.
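Clipchamp's AI voiceover generator is an online text-to-speech feature that generates narration and voiceovers for videos. It supports multilingual voice selection, speaking speed and timbre adjustments, and can be used directly in the browser.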
TADA is Hume AI's open-source speech-language model for generating speech with one-to-one text-acoustic synchronization. It is aimed at developers and researchers who need fast, reliable text-to-speech that can also fit on-device or long-form use cases.
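Ondoku is an online text-to-speech tool that converts text, or text contained in images, into speech and supports download as .mp3. It offers a free allowance, tiered paid plans, and multilingual voice options, and is suited to personal, educational, and commercial use.
Xeder is a Chrome extension that reads your X/Twitter feed aloud so you can get updates without scrolling. The homepage lists it as a one-time $19 purchase, and it is suited to passive listening while working or doing other things.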
TADA is Hume AI’s open-source speech-language model for generating speech with one-to-one text-acoustic alignment. It is aimed at developers and researchers building faster, more reliable voice systems, including on-device and long-form speech applications.
Fish Audio S2 is an open-source text-to-speech model for expressive speech generation, multi-speaker dialogue, and low-latency voice applications. It includes API and SDK access for developers building narration, assistants, and voice-enabled products.
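魔音工坊 (Moying Gongfang) is an intelligent online text-to-speech (TTS) platform that converts written text into high-quality narration using realistic human voices and a variety of accents.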