AI Speech Synthesis

78 products

SKI

SKI is a desktop app for Mac and Windows that lets developers talk to AI coding agents and hear replies aloud. Local, offline voice loop with optional meeting transcription.

Chariot

Chariot is an AI text-to-speech tool for English, Hindi, and Hinglish with low-latency streaming and API access for natural-sounding voice apps.

gstack

gstack adds Garry Tan’s open-source specialist personas to live meetings as voice bots for Google Meet, Zoom, and Teams with a local bring-your-own-brain workflow.

PodcastorAI

PodcastorAI is an AI podcast studio that turns topics, documents, URLs, notes, or recordings into scripts, audio episodes, and video podcasts.

VocalVia

VocalVia turns PDFs, articles, notes, Markdown, and web sources into editable podcast drafts and audio, so you can review the outline and script first.

SpeechifyAI

SpeechifyAI is a voice AI platform for text-to-speech, voice cloning, and voice agents, with multilingual audio and calling workflows through one API.

Alvoff Inference

Alvoff Inference is an OpenAI-compatible API for speech-to-text, text-to-speech, embeddings, and chat/code generation.

speech-core

speech-core is a C++17 library for on-device speech orchestration, with VAD, streaming and batch STT, diarization, TTS, and a voice-agent pipeline.

Voiser AI

Voiser AI is a web studio that turns text into spoken audio, with multilingual voices and style controls for fast, natural voiceovers.

Tico

Tico is an AI assistant for Windows that follows your cursor, reads the screen, and guides you by voice. Free plan with daily limits and paid upgrades.

Yeta AI

Yeta AI is a browser-based tool that translates and dubs public YouTube videos in real time with AI voices for tutorials, lectures, and long-form videos in 10+ languages.

Morph

Morph is a web reading platform for public-domain classics with synced narration and AI help, so you can read, listen, and browse curated titles on one page.

FlowSpeech

FlowSpeech is a context-aware text-to-speech studio that turns scripts and uploaded files into human-like audio. Free plan and paid tiers available.

Grok Speech to Text and Text to Speech APIs

Grok Speech to Text and Text to Speech APIs from xAI add transcription and speech generation to apps via REST and WebSocket, with multilingual STT and expressive TTS.

Gemini 3.1 Flash TTS

Gemini 3.1 Flash TTS is Google’s preview text-to-speech model for expressive AI speech with fine-grained style and delivery control across Gemini API, Google AI Studio, Vertex AI, and Google Vids.

Guardrails 2.0

Guardrails 2.0 by ElevenLabs keeps AI voice agents on-topic, policy-aligned, and safer to deploy in support, sales, marketing, reception, and internal workflows.

HeyGen Developers

Official HeyGen API docs for AI avatar videos, video translation, lipsync, and interactive video-agent sessions via API, MCP, and CLI workflows.

Smallest.ai Lightning TTS

Smallest.ai Lightning TTS is a low-latency text-to-speech API with multilingual speech and fast voice cloning for voice agents and production audio workflows.

Voxtral TTS

Voxtral TTS is Mistral’s text-to-speech model for lifelike multilingual speech, voice agents, and enterprise voice workflows with low latency.

Gemini 3.1 Flash Live

Gemini 3.1 Flash Live is Google’s real-time audio and voice model for natural dialogue across developer, enterprise, and consumer surfaces.
