speech-core is a C++17 on-device voice-agent pipeline engine for VAD, streaming and batch speech-to-text, diarization, and text-to-speech.
Voiser.ai is an AI text-to-speech and voiceover generator for fast narration, promotional content, and multilingual audio projects.
Podio: News Podcast Maker uses AI to turn your chosen topics and news interests into a personalized daily podcast stream for hands-free listening on iPhone and iPad.
Tico is an AI assistant for Windows that listens to your voice questions, understands your screen, and gives spoken guidance with click points.
Yeta AI translates and dubs public YouTube videos in real time with AI voices in 10+ languages. Start free: 15 min/month, no card.
Morph combines ebooks and audiobooks into a synced experience—read, listen, or both at once—plus an AI assistant for book questions.
FlowSpeech converts scripts to human-like AI text-to-speech with context-aware emotion, precise pause control, and 30+ voices in 70+ languages.
Grok Speech to Text and Text to Speech APIs by xAI convert audio and text with low-latency REST/WebSocket endpoints, multilingual support, and diarization.
Gemini 3.1 Flash TTS by Google is a text-to-speech model for natural, expressive AI speech with granular audio tags and SynthID watermarking.
Configurable safety and behavioral controls for ElevenAgents, guiding voice AI responses and blocking unsafe or off-policy outputs before users see them.
HeyGen Developers offers an API platform to generate, translate, and lipsync avatar videos with TTS models—built for scalable production workflows.
Lightning TTS v3 is Smallest.ai’s low-latency, multilingual text-to-speech API with voice cloning—made for voice agents and production audio.
Voxtral TTS is Mistral AI’s multilingual TTS model for natural, low-latency speech and adaptable speaker voices for voice agents.
Gemini 3.1 Flash Live is Google’s real-time audio and voice model for natural, reliable voice interactions across Google products and developer APIs.
Turn any article into a podcast episode. Paste a link to listen in your podcast app or subscribe to a daily topic-curated feed.
Voizematic is AI voice agent software for inbound/outbound phone automation, including unlimited calls, Google Calendar booking, and follow-ups in 25+ languages.
Clipchamp AI Voice Over Generator is an online text-to-speech tool to create realistic voiceovers for videos. Choose languages, speed & emotion.
Maestra is an AI media translation platform that generates transcripts, subtitles, and multilingual voiceovers for video and audio localization.
Inworld AI offers real-time text-to-speech, speech-to-text, and speech-to-speech APIs, plus a Router for failover across multiple LLM providers.
Fliki creates AI videos and voiceovers from text, ideas, PPTs, blogs, or product URLs—multilingual with AI avatars. Start free, no credit card.