VoiceScribe is a macOS menu bar app that provides local speech-to-text transcription using OpenAI's Whisper model. All processing happens on-device via WhisperKit, so no cloud services are required.
Features:

- Push-to-talk: Hold Fn key to record, release to transcribe
- Auto-insert: Transcribed text is automatically typed into the active application
- Privacy-focused: 100% local processing, no data leaves your Mac
- Multiple Whisper models: Choose between speed and accuracy
- Menu bar interface: Always accessible, minimal footprint
- Audio feedback: Sound cues for recording start and transcription complete

Requirements:

- macOS 14.0 (Sonoma) or later
- Apple Silicon Mac (M1/M2/M3)
- ~150 MB disk space for the Base model (more for larger models)

Installation:

- Clone the repository
- Run the build script: `./build-app.sh`

  This will:
  - Build the app in release mode
  - Create the app bundle
  - Code sign the app
  - Install to `/Applications`
- Launch from `/Applications/VoiceScribe.app`

Note: You'll need to update the code signing identity in `build-app.sh` to match your own Apple Developer certificate, or remove the signing step for local testing.

Usage:

- First launch: Grant the required permissions when prompted
- Download a model: Open Settings from the menu bar icon and download a Whisper model
- Start transcribing: Hold the Fn key while speaking, release when done
- The transcribed text will be automatically inserted at your cursor position
Hold Fn → Speak → Release Fn → Text appears
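
The hold, speak, release flow above can be read as a small state machine. The following is a minimal sketch of that idea only: the type names (`DictationState`, `DictationEvent`, `PushToTalk`) are hypothetical and are not taken from VoiceScribe's source, and the real app would start audio capture and run WhisperKit where this sketch merely changes state.

```swift
// Hypothetical push-to-talk state machine; all names are illustrative.
enum DictationState: Equatable {
    case idle          // waiting for the Fn key
    case recording     // Fn is held, audio is being captured
    case transcribing  // Fn was released, waiting for the transcript
}

enum DictationEvent {
    case fnPressed
    case fnReleased
    case transcriptReady(String)
}

struct PushToTalk {
    private(set) var state: DictationState = .idle
    private(set) var insertedText: [String] = []

    mutating func handle(_ event: DictationEvent) {
        switch (state, event) {
        case (.idle, .fnPressed):
            state = .recording
        case (.recording, .fnReleased):
            state = .transcribing
        case (.transcribing, .transcriptReady(let text)):
            // Stand-in for typing the text into the active application.
            insertedText.append(text)
            state = .idle
        default:
            break // ignore events that do not apply in the current state
        }
    }
}
```

Feeding `.fnPressed`, `.fnReleased`, then `.transcriptReady("hello")` into a fresh `PushToTalk` walks it through recording and transcribing and back to idle. Events that arrive out of order (for example a release while idle) are ignored rather than treated as errors, which seems a reasonable default for key events.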

Available models:

| Model | Size | Transcription time (10 s audio) | Description |
|---|---|---|---|
| Tiny | ~75 MB | ~0.1s | Fastest, basic accuracy |
| Base | ~145 MB | ~0.1s | Good balance (recommended) |
| Small | ~480 MB | ~0.2s | Better accuracy |
| Medium | ~1.5 GB | ~0.6s | High accuracy |
| Large v3 | ~3 GB | ~1.1s | Best accuracy |
Timings were measured on Apple Silicon. Every model transcribes faster than real time.
Models are downloaded on-demand and cached locally. You can download multiple models and switch between them in Settings.
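
To make the speed and size trade-off concrete, the sketch below encodes the figures from the table and computes a real-time factor (seconds of audio handled per second of processing). The `WhisperModel` type and `bestModel(fittingMB:)` helper are hypothetical names used for illustration; the numbers are the approximate values from the table, not new measurements.

```swift
// Figures copied from the model table; type and function names are hypothetical.
struct WhisperModel {
    let name: String
    let sizeMB: Double              // approximate download size
    let secondsFor10sAudio: Double  // approximate time to transcribe 10 s of audio

    // Real-time factor: seconds of audio processed per second of compute.
    var realTimeFactor: Double { 10.0 / secondsFor10sAudio }
}

// Ordered from least to most accurate, as in the table.
let models = [
    WhisperModel(name: "Tiny", sizeMB: 75, secondsFor10sAudio: 0.1),
    WhisperModel(name: "Base", sizeMB: 145, secondsFor10sAudio: 0.1),
    WhisperModel(name: "Small", sizeMB: 480, secondsFor10sAudio: 0.2),
    WhisperModel(name: "Medium", sizeMB: 1500, secondsFor10sAudio: 0.6),
    WhisperModel(name: "Large v3", sizeMB: 3000, secondsFor10sAudio: 1.1),
]

// Returns the most accurate model whose download fits the given disk budget,
// or nil if even the smallest model is too large.
func bestModel(fittingMB budget: Double) -> WhisperModel? {
    models.last(where: { $0.sizeMB <= budget })
}
```

With these figures, even Large v3 handles 10 s of audio in about 1.1 s, a real-time factor of roughly 9, and a 500 MB budget selects Small.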
VoiceScribe requires the following system permissions:
| Permission | Purpose | Required |
|---|---|---|
| Files & Folders | Store downloaded Whisper models in Documents | Yes |
| Microphone | Capture audio for transcription | Yes |
| Accessibility | Insert transcribed text into apps | Yes |
| Input Monitoring | Detect Fn key press/release | Yes |
Optional:
| Setting | Purpose |
|---|---|
| Launch at Login | Start VoiceScribe automatically when you log in |
On first launch, an onboarding screen will guide you through granting these permissions. You can also manage them in the app's Settings or in System Settings > Privacy & Security.
Note: Granting Input Monitoring permission will restart the app automatically.
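
For readers curious how an app can check these permissions, here is a minimal sketch using public macOS APIs. Whether VoiceScribe uses these exact calls is an assumption, and `permissionStatus()` is a hypothetical helper name. Files & Folders access is left out because macOS prompts for it when the folder is first accessed.

```swift
import AVFoundation          // AVCaptureDevice (Microphone)
import ApplicationServices   // AXIsProcessTrusted (Accessibility)
import IOKit.hid             // IOHIDCheckAccess (Input Monitoring)

// Hypothetical helper: reports whether each permission is currently granted.
// It only reads the current status and never triggers a system prompt.
func permissionStatus() -> [String: Bool] {
    [
        "Microphone": AVCaptureDevice.authorizationStatus(for: .audio) == .authorized,
        "Accessibility": AXIsProcessTrusted(),
        "Input Monitoring": IOHIDCheckAccess(kIOHIDRequestTypeListenEvent) == kIOHIDAccessTypeGranted,
    ]
}
```

An onboarding screen could poll a function like this and show a checkmark next to each granted permission.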

Tech stack:

- Swift / SwiftUI
- WhisperKit - OpenAI Whisper models optimized for Apple Silicon
- AVFoundation - Audio recording
- IOKit - Fn key monitoring
- Accessibility API - Text insertion

License: MIT