AI Speech Recognition

102 products

Stream

Stream is Sandbar’s private voice ring and conversational notes app for capturing ideas, dictating text, and managing spoken notes on iOS and Mac. It is designed for hands-free voice input with push-to-talk control and silent haptic confirmation.

Dictately

Dictately is a macOS voice dictation app that types cleaned-up speech directly into desktop apps. It is designed for people who want faster writing, context-aware output, and multilingual support without browser extensions or plugins.

QuickQuill

QuickQuill is a macOS dictation and transcription app that runs locally on the device. It helps users record meetings, transcribe audio, generate summaries, and export notes without using a cloud service.

Lispr

Lispr is a free Windows voice dictation app with built-in translation. It types at your cursor in any app, with no account required and support for roughly 99 spoken languages.

Universal-3.5 Pro

Universal-3.5 Pro is AssemblyAI’s async speech-to-text model for recorded audio and video. It adds native code-switching, speaker diarization, and contextual prompting, with a base price of $0.21 per hour of audio submitted.

VoicePad AI

VoicePad AI is an offline dictation app for Windows, Android, iOS, and macOS that uses on-device Whisper AI to turn speech into text. It is designed for private, cross-platform voice typing with clipboard or direct-insert workflows.

Alvoff Inference

Alvoff Inference is an OpenAI-compatible API for speech-to-text, text-to-speech, embeddings, and chat/code generation. It is built for developers who want to swap in a different base URL, use familiar SDKs, and pay per request.

VTT

VTT is a native macOS dictation app that keeps transcription on-device by default and supports optional cloud engines with your own API key. It’s built for private voice-to-text on Mac, with per-language routing and local transcript history.

EchoTranscribe

EchoTranscribe is an open-source desktop app for local audio transcription using Whisper models. It keeps files on-device, supports batch transcription, and exports transcripts as TXT, SRT, or JSON.

Synopsule

Synopsule is a Mac and iPhone app that records and transcribes conversations locally, with optional on-device summaries and exports for sharing notes outside the app. It is built for users who want speaker-labeled transcripts without sending audio to the cloud.

Lucidly

Lucidly is an AI communication coaching app for iPhone that helps users explain ideas clearly through short daily speaking exercises and targeted feedback.

Signal Recorder SR-7

Signal Recorder SR-7 is a voice recorder for Mac and iPhone that transcribes and summarizes audio locally, with Markdown export and optional iCloud sync. It is a one-time purchase with no accounts or subscription.

speech-core

speech-core is a C++17 library for on-device speech orchestration, including VAD, streaming and batch STT, diarization, TTS, and a voice-agent pipeline. It runs locally and uses optional ONNX Runtime or LiteRT backends for model inference.

Krisp Voice Translation API

Krisp Voice Translation API is a self-serve real-time speech translation API for developers. It translates live speech, returns translated audio and transcripts, and includes background voice cancellation plus custom vocabulary support.

Vox

Vox is an on-device AI dictation app for Mac and Windows. It lets you speak, clean up the output, and paste text locally without a cloud round-trip or account sign-up.

Wave

Wave is a native macOS dictation app that converts speech to text with a hold-to-talk shortcut, local Whisper support, and an optional Groq mode for faster transcription. It is free and open source, and the app itself does not require an account.

Daisy

Daisy is an open-source meeting recorder and push-to-talk dictation app for Mac. It records locally, transcribes on-device, and can expose transcripts through a local MCP server for Claude Desktop, Cursor, and other compatible clients.

LocalClicky

LocalClicky is a macOS voice assistant that runs locally and helps control apps, files, reminders, and browser actions by voice. It combines wake-word listening, local transcription, Ollama-based reasoning, and optional screen vision; its repository mentions no cloud APIs or subscriptions.

Sun

Sun is a realtime voice API for building collaborative voice interaction experiences. It is positioned for developers who need live voice capabilities beyond one-on-one chat.

Ringg AI

Ringg Parrot STT V1 is Ringg AI's real-time speech-to-text API for voice AI agents, contact centers, and transcription workflows. It supports Hindi, English, and code-mixed speech. A playground is available for evaluation, and production access requires approval from Ringg AI.