Type4Me
Type4Me is a macOS speech input tool with real-time transcription and optional LLM prompt-based processing, supporting both local offline and cloud streaming recognition.
What is Type4Me?
Type4Me is a macOS speech input tool that provides real-time speech-to-text recognition and optional LLM-based text processing. It supports both local (offline) and cloud recognition engines and is designed to run with credentials and recognition history stored locally.
Its core purpose is to help users convert spoken Chinese (and, with the available local models, bilingual Chinese-English) into text. Local recognition relies on fast on-device inference, while cloud-based models additionally enable configurable prompt-based workflows.
Key Features
- Local speech recognition (offline): Uses the SherpaOnnx engine (Paraformer/Zipformer) for on-device recognition without API keys, cloud account setup, or network dependency.
- Cloud streaming recognition: Connects to Volcengine (豆包, Doubao) streaming ASR to produce text while you speak. A performance mode can use double-channel recognition and then refine the result using the full recording.
- Multiple processing modes (including custom prompts): Built-in modes cover quick real-time typing, performance-oriented double-channel flow, English translation, prompt optimization, and a command mode where speech can instruct an LLM to act on selected text and clipboard contents; users can also write their own prompts.
- Prompt context variables: Prompt templates support variables such as {text} (recognized speech), {selected} (currently selected text at recording start), and {clipboard} (clipboard content at recording start), enabling “voice becomes command” workflows.
- Local data storage: Credentials are saved locally at ~/Library/Application Support/Type4Me/credentials.json (permission 0600), recognition history is stored in a local SQLite database, and history can be exported as CSV by date range.
- Vocabulary management for ASR: Adds hot words (e.g., proper nouns) to improve recognition accuracy and supports phrase replacement (e.g., speaking an email label and substituting the real address).
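The prompt context variables listed above can be pictured as simple placeholder substitution. The following is a minimal Python sketch, not Type4Me's actual implementation: the function name render_prompt and the single-pass replacement strategy are assumptions made for illustration. Only the three variable names come from the description above.

```python
import re

# Only the three documented variables are recognized; any other
# brace expression in the template is left untouched.
_PLACEHOLDER = re.compile(r"\{(text|selected|clipboard)\}")


def render_prompt(template: str, *, text: str,
                  selected: str = "", clipboard: str = "") -> str:
    """Fill {text}, {selected} and {clipboard} in one pass (hypothetical helper).

    A single regex pass means substituted values are never re-scanned,
    so speech that happens to contain "{clipboard}" is not expanded again.
    """
    values = {"text": text, "selected": selected, "clipboard": clipboard}
    return _PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    template = "Apply this instruction: {text}\n\nSelected text:\n{selected}"
    print(render_prompt(template,
                        text="translate the selected text",
                        selected="你好"))
```

In this sketch the values for {selected} and {clipboard} would be captured once, at recording start, and passed in alongside the recognized speech.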
How to Use Type4Me
- Install on macOS 14+: Download the DMG for Type4Me v1.2.0 and drag Type4Me.app into Applications. The first launch may show a standard macOS security warning for non–App Store apps; it can be resolved via System Settings or the terminal xattr command.
- Choose a recognition engine:
  - Cloud-only install: The DMG flow supports cloud recognition engines.
  - Local offline recognition (optional): If building from source, you can enable the local Paraformer engine and download ASR model files into ~/Library/Application Support/Type4Me/Models/.
- Configure cloud credentials (cloud engines only): In the first-run wizard, follow the repo’s setup guidance to enter the Volcengine App Key, Access Key, and Resource ID.
- Configure modes and shortcuts: In settings, select local/Paraformer or cloud engines, then use the built-in modes or custom prompts. Each mode can be bound to its own global shortcut and can use “press-and-hold to speak” or “press once to start/stop.”
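The two shortcut styles mentioned in the last step differ only in how key events map to recording state. The Python sketch below illustrates that difference under stated assumptions: the names Trigger and ShortcutRecorder are hypothetical, and real key handling (for example auto-repeated key-down events) is ignored.

```python
from enum import Enum


class Trigger(Enum):
    HOLD = "hold"      # record only while the shortcut is held down
    TOGGLE = "toggle"  # press once to start, press again to stop


class ShortcutRecorder:
    """Tracks whether a mode is recording (hypothetical illustration)."""

    def __init__(self, trigger: Trigger) -> None:
        self.trigger = trigger
        self.recording = False

    def key_down(self) -> bool:
        if self.trigger is Trigger.HOLD:
            self.recording = True
        else:
            # Toggle style: each press flips the state.
            self.recording = not self.recording
        return self.recording

    def key_up(self) -> bool:
        # Releasing the key only matters for press-and-hold.
        if self.trigger is Trigger.HOLD:
            self.recording = False
        return self.recording


if __name__ == "__main__":
    hold = ShortcutRecorder(Trigger.HOLD)
    print(hold.key_down(), hold.key_up())
    toggle = ShortcutRecorder(Trigger.TOGGLE)
    print(toggle.key_down(), toggle.key_up(), toggle.key_down())
```

Because each mode can have its own shortcut, one such state object per mode would be a natural fit in this sketch.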
Use Cases
- Offline dictation for environments without reliable network: Use the local Paraformer (SherpaOnnx) engine to transcribe speech fully on-device without API keys.
- Real-time typing with minimal delay: Use Quick mode to insert text as soon as the recognition result is ready.
- Bilingual output workflows: With a bilingual local model, dictate Chinese speech and output English translations using the English Translation mode.
- Voice commands that act on what you’re viewing: Select text in an editor, press the bound shortcut, say a command (e.g., “translate the selected text”), and let the prompt receive {selected} and {clipboard} context.
- Improving accuracy with domain-specific vocabulary: Add organization names, product names, or technical terms as ASR hot words, and use phrase replacement for repeatable sensitive formats like email addresses.
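Phrase replacement, as used in the last use case, amounts to rewriting known spoken labels in the recognized text. The Python sketch below shows one way this could work. The function name apply_replacements, the dictionary format, and the example addresses are all hypothetical, and matching here is case-sensitive for simplicity.

```python
import re


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Replace spoken labels with their configured values (hypothetical helper)."""
    if not replacements:
        return text
    # Try longer phrases first so "my email at work" wins over "my email".
    phrases = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases))
    return pattern.sub(lambda m: replacements[m.group(0)], text)


if __name__ == "__main__":
    rules = {
        "my email": "me@example.com",
        "my email at work": "me@corp.example.com",
    }
    print(apply_replacements("send the draft to my email at work", rules))
```

Running the replacement on the recognized text before it is typed or passed to a prompt keeps the real address out of what has to be spoken aloud.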
FAQ
- Why does macOS warn on first launch? macOS shows a security warning when opening apps that are not from the App Store. The repo provides two methods to allow opening (System Settings, recommended, or the terminal command xattr -d com.apple.quarantine).
- Do I need an API key for local recognition? No. When using the local SherpaOnnx-based engine, recognition runs on the device and does not require API keys or cloud accounts.
- Where are my credentials and recognition history stored? Credentials are saved locally to ~/Library/Application Support/Type4Me/credentials.json with permission 0600. Recognition history is stored in a local SQLite database and can be exported to CSV by date range.
- Can I customize how the recognized text is processed? Yes. Type4Me includes built-in modes and supports custom prompt templates. Prompt variables include {text}, {selected}, and {clipboard}.
- Is local recognition available in the prebuilt DMG? The repo notes that the DMG download flow supports cloud recognition engines. Local offline recognition requires building from source and downloading the relevant SherpaOnnx model files.
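The storage answer above names two mechanisms: a credentials file with permission 0600 and a SQLite history that can be exported to CSV by date range. The Python sketch below illustrates both ideas in general terms. It is not Type4Me's code: the JSON keys, the history table, and its created_at and text columns are assumptions, since the actual file layout and schema are not described here.

```python
import csv
import json
import os
import sqlite3
from pathlib import Path


def save_credentials(path: Path, creds: dict) -> None:
    """Write credentials as JSON readable only by the owner (0600)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(creds, f)
    os.chmod(path, 0o600)  # also tightens a file that already existed


def export_history(db_path: Path, csv_path: Path, start: str, end: str) -> int:
    """Export rows with start <= created_at < end to CSV; returns the row count.

    Assumes a hypothetical table history(created_at TEXT, text TEXT)
    whose timestamps are ISO 8601 strings, so string comparison
    orders them chronologically.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT created_at, text FROM history "
            "WHERE created_at >= ? AND created_at < ? "
            "ORDER BY created_at",
            (start, end),
        ).fetchall()
    finally:
        con.close()
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["created_at", "text"])
        writer.writerows(rows)
    return len(rows)
```

Creating the file with mode 0600 from the start, rather than writing it and restricting it afterwards, avoids a window in which other users on the machine could read the keys.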
Alternatives
- macOS built-in Dictation: A convenient native option for speech-to-text, but it typically offers limited ways to integrate prompt-based LLM processing or to choose an offline engine.
- Local/offline speech-to-text tools (ASR apps or CLIs): These can run without network like Type4Me’s local mode, but may not provide the same prompt-driven modes and shortcut/clipboard context workflow.
- Cloud-based transcription platforms with APIs: Useful when you want managed accuracy from a cloud model, but require network access and generally involve account/API key management unlike Type4Me’s local-first capability.
- Browser/desktop voice typing products: These focus on direct dictation inside apps; Type4Me’s distinguishing workflow is combining recognition with configurable prompt modes and local storage/export of recognition history.