Gemini 3.5 Live Translate is Google’s near real-time speech translation model for developers, Google Meet, and the Google Translate app. It supports 70+ languages and is designed to produce natural-sounding translated audio during live conversations.
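MAI-Voice-2 is Microsoft AI’s text-to-speech model for assistants, customer service, long-form narration, and accessibility scenarios. It is available in Microsoft Foundry and supports 15 languages/locales, emotion control, and custom voice creation from a short reference audio clip.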
Voiser AI Voiceover turns text into spoken audio for voiceovers, with multilingual voice options and style controls for different narration needs. It offers a web studio workflow, and the site lists free, paid, and enterprise options.
Our Stories is a family storytelling web app for creating, reading, and listening to custom stories in multiple languages. It is designed for multilingual households and for families who want to share bedtime stories across distance.
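Wallie is an open-source AI livestream assistant that watches your screen, listens to chat, and generates real-time commentary with a configurable persona. It can run locally with your own API keys and suits faceless content, automated streams, and real-time interaction.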
Reader Alive is an AI ebook reader for iPhone and iPad that supports EPUB, PDF, MOBI and AZW3 files. It adds translation, text-to-speech, summaries and book-aware chat for personal ebook libraries.
Selectable is a macOS OCR and text-capture utility for extracting text from anywhere on your screen, including images and videos. It supports copying, translation on newer macOS versions, text-to-speech, and output cleanup.
FlowSpeech is a context-aware text-to-speech studio that turns scripts and uploaded files into human-like audio. It offers multiple generation modes, pause and emotion control, and a free plan alongside paid tiers.
Gemini 3.1 Flash TTS is Google’s preview text-to-speech model for generating expressive AI speech with fine-grained control over style and delivery. It is available across the Gemini API, Google AI Studio, Vertex AI, and Google Vids.
Smallest.ai Lightning TTS is a text-to-speech API for generating spoken audio from text with low latency, multilingual support, and fast voice cloning. It is aimed at developers and product teams building voice agents, narrated content, and other production speech workflows.
Claude voice mode is a beta feature that enables spoken conversations with Claude on the web and in Claude Mobile for iOS and Android. It supports hands-free and push-to-talk interaction, voice selection, and switching between speech and text within the same chat.
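easyquran.ai lets you read the Quran online for free, with audio recitation and translations. It also provides word-by-word analysis in 18 languages for deeper understanding and study.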
Voxtral TTS is Mistral’s text-to-speech model for generating lifelike, multilingual speech for voice agents and enterprise voice workflows. It supports short-reference voice adaptation, low-latency output, and access through Mistral Studio, Le Chat, the API, and open weights on Hugging Face.
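Clipchamp’s AI voiceover generator is an online text-to-speech feature for creating narration and voiceovers for videos. It supports multilingual voice selection and adjustments to speaking speed and voice tone, and it works directly in the browser.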
TADA is Hume AI's open-source speech-language model for generating speech with one-to-one text-acoustic synchronization. It is aimed at developers and researchers who need fast, reliable text-to-speech that can also fit on-device or long-form use cases.
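Ondoku is an online text-to-speech tool that converts text, including text in images, into speech and supports downloading the result as .mp3. It offers a free allowance, tiered paid plans, and multilingual voice options for personal, educational, and commercial use.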
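Xeder is a Chrome extension that reads your X/Twitter feed aloud so you can listen to updates without scrolling. The official site positions it as a one-time $19 purchase, suited to passive listening while working or handling other tasks.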
TADA is Hume AI’s open-source speech-language model for generating speech with one-to-one text-acoustic alignment. It is aimed at developers and researchers building faster, more reliable voice systems, including on-device and long-form speech applications.
Fish Audio S2 is an open-source text-to-speech model for expressive speech generation, multi-speaker dialogue, and low-latency voice applications. It includes API and SDK access for developers building narration, assistants, and voice-enabled products.
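魔音工坊 (Moying Gongfang) is an intelligent online text-to-speech (TTS) platform that converts written text into high-quality voiceovers using realistic human voices and a variety of accents.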