Releases: 3choff/dictate
v1.13.0
Internationalisation, Voice Control & Tray Improvements
This release brings major usability and accessibility improvements, with full UI localisation, multi-language voice commands, and a significantly refined system tray experience. Dictate is now more flexible, more international, and more seamless to use as a background tool.
✨ Highlights
- Multi-language voice commands with native support for 9 languages
- Full UI internationalisation (i18n) across the entire application
- Re-engineered system tray for instant, native-like performance
- Word correction system to fix common mis-transcriptions automatically
🆕 What’s New
- Multi-Language Voice Commands
  - Native voice control in Italian, Spanish, French, German, Dutch, Portuguese, Chinese, Japanese, and Russian
  - Language-specific command prefixes to avoid accidental triggers
  - Fully integrated with the existing transcription pipeline
- Complete UI Localisation
  - All UI elements translated, including Settings, Main window, and System Tray
  - App language selection in Settings → Interface
  - Clear separation between UI language and transcription language
- Word Correction System
  - Custom dictionary to automatically fix common mis-transcriptions
  - Fuzzy matching with adjustable threshold
  - Full management UI in Settings → Transcription
- Custom Rewrite Prompts
  - Edit and personalise rewrite prompts for each rewrite mode
  - Automatic switch to Custom mode when presets are modified
  - Language preservation to prevent unwanted translations
- Compact View Toggle
  - New toggle in Settings → Interface → Appearance
  - Fully synchronised with the global shortcut (Ctrl+Shift+V)
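To illustrate how a word correction dictionary with fuzzy matching and an adjustable threshold might behave, here is a minimal sketch. It assumes Levenshtein-based similarity; the release notes do not say which metric Dictate uses, and the names `levenshtein`, `similarity`, and `correct_words` are hypothetical.

```rust
use std::collections::HashMap;

/// Levenshtein edit distance between two strings, counted in chars.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between a[..i-1] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1).min(curr[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Similarity in [0.0, 1.0]; 1.0 means the strings are identical.
fn similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Replace each whitespace-separated word with the correction whose
/// dictionary key is most similar to it, if the score reaches `threshold`.
/// Assumption: keys are lowercase mis-transcriptions, values are the fixes.
fn correct_words(text: &str, dictionary: &HashMap<String, String>, threshold: f64) -> String {
    text.split_whitespace()
        .map(|word| {
            let lower = word.to_lowercase();
            let mut best: Option<(f64, &String)> = None;
            for (wrong, right) in dictionary {
                let score = similarity(&lower, wrong);
                if score >= threshold && best.map_or(true, |(s, _)| score > s) {
                    best = Some((score, right));
                }
            }
            best.map_or_else(|| word.to_string(), |(_, right)| right.clone())
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With a dictionary entry `"towery" → "Tauri"`, the input word `towary` scores about 0.83, so it is corrected at a threshold of 0.8 and left alone at 0.9. This sketch collapses runs of whitespace and does not strip punctuation attached to words, which a real implementation would likely need to handle.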
🎛 Interface & Tray Improvements
- Custom System Tray
  - Instant tray menu appearance via pre-created hidden window
  - Improved screen-bound positioning logic
  - Native-style exit behaviour and close-to-tray support
- Interface Settings Section
  - New dedicated section for appearance and behaviour settings
  - Start hidden, close to tray, launch on startup, theme, and language controls
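The screen-bound positioning mentioned for the custom tray menu can be pictured as a clamping step. The sketch below is an assumption about the general approach, not Dictate's actual code: `Rect` and `place_menu` are hypothetical names, and the preference for opening above and to the left of the cursor is a guess based on where the Windows tray usually sits.

```rust
/// A rectangle in screen coordinates (pixels). Hypothetical helper type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: i32,
    y: i32,
    width: i32,
    height: i32,
}

/// Return the top-left corner for a menu of size `menu` (width, height)
/// opened at `cursor`, kept fully inside `screen` whenever it fits.
fn place_menu(cursor: (i32, i32), menu: (i32, i32), screen: Rect) -> (i32, i32) {
    let (menu_w, menu_h) = menu;
    // Assumed preference: open above and to the left of the cursor.
    let x = cursor.0 - menu_w;
    let y = cursor.1 - menu_h;
    // Largest allowed top-left corner; never below the screen origin,
    // so `clamp` always receives min <= max.
    let max_x = (screen.x + screen.width - menu_w).max(screen.x);
    let max_y = (screen.y + screen.height - menu_h).max(screen.y);
    (x.clamp(screen.x, max_x), y.clamp(screen.y, max_y))
}
```

Passing the bounds of the monitor that contains the cursor, rather than the primary monitor, keeps the menu on the correct display in multi-monitor setups.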
🛠 Improvements & Fixes
- Improved clipboard and keyboard text insertion reliability
- Fixed missing trailing spaces in batch transcription
- Prevented settings persistence issues affecting window state
- Resolved Light theme styling inconsistencies
- Cleaned up legacy code and fixed streaming stability issues
- Improved settings synchronisation with external state changes
Full Changelog: v1.10.0 → v1.13.0
v1.10.0
v1.10.0 – System Tray Support, Theme System, and UX Enhancements
This release brings another major upgrade to Dictate with full system tray integration, a complete dark/light theme system, improved rewrite behaviour, and multiple UI refinements. Dictate is now easier to keep running in the background and more visually consistent than ever.
🚀 System Tray Integration
Dictate can now run quietly in the background, always ready but never in the way.
- New system tray icon with Show/Hide, Settings, and Quit options
- Left-click toggles visibility of the main window
- App no longer occupies the Windows taskbar while running
- Start Hidden option lets Dictate launch directly to the tray
- Launch on Startup feature (via Windows Registry)
- Smart auto-show during dictation:
  - If hidden, window appears automatically when dictation starts
  - If auto-shown, it hides again once dictation ends
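The smart auto-show rule above amounts to remembering whether the window was shown by the app or by the user. A minimal sketch of that bookkeeping follows; `WindowState` and its method names are hypothetical, and the real implementation presumably drives an actual Tauri window rather than a pair of booleans.

```rust
/// Tracks main-window visibility and whether the app itself showed it.
#[derive(Debug, Default)]
struct WindowState {
    visible: bool,
    auto_shown: bool,
}

impl WindowState {
    /// Dictation starts: show the window only if it was hidden,
    /// and remember that the app (not the user) showed it.
    fn on_dictation_start(&mut self) {
        if !self.visible {
            self.visible = true;
            self.auto_shown = true;
        }
    }

    /// Dictation ends: hide the window again only if it was auto-shown.
    fn on_dictation_end(&mut self) {
        if self.auto_shown {
            self.visible = false;
            self.auto_shown = false;
        }
    }
}
```

A window the user had already opened is left untouched by both transitions, which matches the behaviour described in the two sub-bullets.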
🎨 Theme System (Dark & Light Modes)
A fully implemented theme architecture brings polished, consistent styling across the entire UI.
- New Dark/Light theme modes with smooth animated transitions
- Theme toggle in settings with system preference detection
- Centralised theme variables (`theme.css`) for colours and states
- All UI elements made theme-aware (buttons, icons, tooltips, visualisers)
- All hardcoded colours removed in favour of CSS variables
✨ UI & UX Improvements
- “Customize” settings reorganised into clear groups: Input / Output / UI
- Shared tooltip component centralised for reuse across windows
- Polished visual elements: theme-aware sparkle icon, consistent hover/active states
- Audio visualiser now theme-responsive with animated colour changes
📝 Rewrite Improvements
- Ctrl + Shift + R now reliably selects all text before rewriting
- “Press rewrite” voice command updated to match this behaviour
- Works consistently across all input fields in both batch and streaming modes
⚙️ Technical Enhancements
- Added `tauri-plugin-autostart` for startup integration
- Added Tauri features: `tray-icon`, `image-ico`
- New settings keys: `start_hidden`, `autostart_enabled`
- Tooltip styling unified across main and settings windows
- Help button fixed in About section
Full Changelog: v1.8.0 → v1.10.0
v1.8.0
v1.8.0 – Push-to-Talk, Text Rewrite System & Major UI Enhancements
This release marks a major milestone for Dictate, bringing push-to-talk recording, a powerful text rewrite system, and a complete UI refactor for a smoother, faster, and more intuitive experience.
🎤 Push-to-Talk (PTT) Mode
- Hold a keyboard shortcut to record; release to stop and transcribe instantly.
- Works with all batch providers: Groq, Gemini, Mistral, SambaNova, Fireworks.
- Automatically disables when switching to streaming providers.
- No buffering delay — transcription starts immediately on key release.
- Warnings and notifications guide users when PTT is incompatible with selected providers.
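The push-to-talk behaviour described above (record while the key is held, transcribe on release, unavailable with streaming providers) can be sketched as a small state machine. All type and method names here are hypothetical, and ignoring repeated key-down events is an assumption about how OS key auto-repeat would be handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProviderKind {
    Batch,
    Streaming,
}

#[derive(Debug, PartialEq)]
enum PttAction {
    StartRecording,
    StopAndTranscribe,
    Ignore,
}

struct PushToTalk {
    enabled: bool,
    recording: bool,
}

impl PushToTalk {
    /// PTT is only available with batch providers.
    fn new(provider: ProviderKind) -> Self {
        Self { enabled: provider == ProviderKind::Batch, recording: false }
    }

    /// Switching to a streaming provider disables PTT and drops any
    /// recording in progress.
    fn set_provider(&mut self, provider: ProviderKind) {
        self.enabled = provider == ProviderKind::Batch;
        if !self.enabled {
            self.recording = false;
        }
    }

    /// Key pressed: start recording once; auto-repeat events are ignored.
    fn on_key_down(&mut self) -> PttAction {
        if !self.enabled || self.recording {
            return PttAction::Ignore;
        }
        self.recording = true;
        PttAction::StartRecording
    }

    /// Key released: stop and hand the audio to the transcriber.
    fn on_key_up(&mut self) -> PttAction {
        if !self.recording {
            return PttAction::Ignore;
        }
        self.recording = false;
        PttAction::StopAndTranscribe
    }
}
```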
🧠 Text Rewrite System
- Replaces simple grammar correction with a complete text rewriting toolkit.
- Five rewrite modes: Grammar, Professional, Polite, Casual, Structured.
- Multi-provider support and fully customisable prompts.
- Accessible via the new rewrite button, shortcut (Ctrl + Shift + R), or voice command (“press rewrite”).
- Dedicated Rewrite settings tab with provider and mode configuration.
🎨 UI & Settings Redesign
- Fully restructured settings window with tabbed navigation: General, Transcription, Rewrite, Grammar, and About.
- Sidebar navigation with icons, active state highlighting, and smooth transitions.
- New tooltip system for settings guidance with animations and smart positioning.
- Improved visual design: refreshed microphone icon, modern recording button, and new application logo.
🐞 Fixes & Polish
- Tooltip clipping and alignment issues resolved in compact mode.
- Corrected SVG rendering and styling consistency.
- Fixed trailing space issues in terminal and browser transcriptions.
- Settings now load and auto-save reliably across all UI components.
Full Changelog: v1.6.0 → v1.8.0
v1.6.0
v1.6.0 – Redesigned Settings Interface & Silero VAD Integration
This release delivers two major improvements: a complete UI refactor of the Settings window for better organisation and usability, and the integration of a new **Silero Voice Activity Detection (VAD)** system for more accurate speech segmentation.
✨ What's New
🎨 Redesigned Settings Interface
- Introduced tabbed navigation with four clear sections: General, Transcription, Grammar, and About.
- Added sidebar with icon-based navigation and smooth section transitions.
- Modularised settings architecture for improved maintainability and scalability.
- Enhanced About tab: now includes app version display, update checker, and quick access buttons (Help, GitHub, Ko-fi).
- Fully responsive design with scrollable content and improved layout handling.
🧠 Silero VAD (Voice Activity Detection)
- Integrated ML-based Silero VAD model for precise speech segmentation.
- Replaces older RMS-based system, significantly reducing background noise triggers and improving accuracy across microphones.
- Event-driven segmentation pipeline for all batch transcription providers (Groq, Gemini, Mistral, SambaNova, Fireworks).
- Optimised detection timing for faster and more natural response.
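Silero VAD produces a speech probability for each audio frame; turning those probabilities into start and end events is the segmentation step. The sketch below shows one common way to do this, using two thresholds plus a short hangover so brief pauses do not split an utterance. It is an illustration only: `Segmenter`, `VadEvent`, and the threshold values are hypothetical and not taken from Dictate's source.

```rust
#[derive(Debug, PartialEq)]
enum VadEvent {
    SpeechStart(usize),
    SpeechEnd(usize),
}

struct Segmenter {
    start_threshold: f32,   // probability needed to enter speech
    end_threshold: f32,     // probability below which a frame counts as silent
    hangover_frames: usize, // consecutive silent frames needed to end a segment
    in_speech: bool,
    silent_run: usize,
}

impl Segmenter {
    fn new(start_threshold: f32, end_threshold: f32, hangover_frames: usize) -> Self {
        Self { start_threshold, end_threshold, hangover_frames, in_speech: false, silent_run: 0 }
    }

    /// Feed one frame's speech probability; returns an event on a state change.
    fn push(&mut self, frame_index: usize, probability: f32) -> Option<VadEvent> {
        if !self.in_speech {
            if probability >= self.start_threshold {
                self.in_speech = true;
                self.silent_run = 0;
                return Some(VadEvent::SpeechStart(frame_index));
            }
            return None;
        }
        if probability < self.end_threshold {
            self.silent_run += 1;
            if self.silent_run >= self.hangover_frames {
                self.in_speech = false;
                return Some(VadEvent::SpeechEnd(frame_index));
            }
        } else {
            // Speech resumed before the hangover elapsed.
            self.silent_run = 0;
        }
        None
    }
}
```

In an event-driven pipeline, `SpeechEnd` would be the point at which the buffered audio for that segment is sent to a batch provider.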
🛠️ Technical Improvements
- New modular settings file structure: `sections/` for tab content and `fields/` for reusable UI components.
- Implemented smart dropdown positioning and z-index fixes to prevent clipping in the settings window.
- Rust backend enhancements with async VAD session management and efficient audio buffering.
🚀 Benefits
- More intuitive settings navigation and cleaner interface.
- Far more accurate and responsive speech detection.
- Easier to maintain modular codebase for both frontend and backend.
Full Changelog: v1.4.0...v1.6.0
v1.4.0
v1.4.0 – Unified Audio Pipeline & Major Refactor
This release marks a major architectural milestone for Dictate, introducing a unified audio and provider system that significantly improves maintainability, performance, and microphone compatibility.
✨ What's New
- Unified Audio Pipeline
  - A single, optimised audio processing system now powers all transcription providers.
  - Improved performance, reduced complexity, and consistent behaviour across all providers.
- Provider Abstraction Layer
  - Introduced a shared `BaseProvider` architecture with unified interfaces (`start`, `stop`, `getName`, `getType`).
  - Simplifies the addition of new transcription providers.
- Recording Session Manager
  - Centralised control of audio capture, provider lifecycle, and visualisation.
  - Enhanced reliability and resource handling for start/stop operations.
- Speech-Optimised Audio Visualiser
  - New frequency weighting for clearer and more balanced visuals.
  - Refined sensitivity and animation for quieter laptop microphones.
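The provider abstraction can be pictured as a single interface that every transcription backend implements. The sketch below is a rough Rust analogue of the four operations named above (`start`, `stop`, `getName`, `getType`). The camelCase names suggest the real `BaseProvider` may be JavaScript, so treat the trait, `MockProvider`, and `run_session` as hypothetical illustrations rather than the project's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProviderType {
    Batch,
    Streaming,
}

/// Common interface a session manager could rely on for any provider.
trait Provider {
    fn start(&mut self) -> Result<(), String>;
    fn stop(&mut self) -> Result<(), String>;
    fn name(&self) -> &str;
    fn provider_type(&self) -> ProviderType;
}

/// Stand-in provider used only to exercise the trait.
struct MockProvider {
    running: bool,
}

impl Provider for MockProvider {
    fn start(&mut self) -> Result<(), String> {
        if self.running {
            return Err("already running".to_string());
        }
        self.running = true;
        Ok(())
    }

    fn stop(&mut self) -> Result<(), String> {
        self.running = false;
        Ok(())
    }

    fn name(&self) -> &str {
        "mock"
    }

    fn provider_type(&self) -> ProviderType {
        ProviderType::Batch
    }
}

/// The caller needs no knowledge of the concrete provider.
fn run_session(provider: &mut dyn Provider) -> Result<String, String> {
    provider.start()?;
    provider.stop()?;
    Ok(format!("{} ({:?})", provider.name(), provider.provider_type()))
}
```

Adding a new provider then means implementing one trait, which is the kind of simplification the release notes describe.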
🧠 Technical & Performance Improvements
- Modernised codebase with 100+ lines of legacy code removed.
- Unified PCM16 pipeline at 16 kHz across all providers.
- Enhanced laptop microphone support with automatic gain control.
- Simplified Deepgram integration using direct PCM16 streaming.
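A PCM16 pipeline means every provider receives signed 16-bit samples. As a sketch of the conversion step, assuming the captured audio arrives as floating-point samples in the range -1.0 to 1.0 (as the Web Audio API delivers them), the following hypothetical helpers convert to PCM16 and serialise the result as little-endian bytes. Resampling to 16 kHz is not shown.

```rust
/// Convert floating-point samples in [-1.0, 1.0] to signed 16-bit PCM.
/// Out-of-range input is clamped rather than allowed to wrap around.
fn f32_to_pcm16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| {
            let clamped = s.clamp(-1.0, 1.0);
            (clamped * i16::MAX as f32).round() as i16
        })
        .collect()
}

/// Serialise PCM16 samples as little-endian bytes, the usual layout
/// for raw PCM streams.
fn pcm16_to_le_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}
```

At 16 kHz mono, one second of audio in this format is 16,000 samples, or 32,000 bytes.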
🚀 Benefits
- More reliable audio and transcription performance.
- Simpler architecture for easier future expansion.
- Faster and more efficient resource handling.
- Better UX with smoother visual feedback and wider mic compatibility.
Full Changelog: v1.2.0...v1.4.0
v1.2.0
v1.2.0 - UI Refinements and Feature Updates
This release introduces several UI enhancements, new features, and bug fixes to improve the user experience and application stability.
✨ What's New
- Version Display: The app version is now displayed in the settings footer for easy reference.
- Update Notifications: The app now automatically checks for new versions on GitHub and displays a clickable "New version available" notice in the settings footer.
- Modernized UI:
  - The main and settings windows now feature rounded corners.
  - The settings window shadow has been removed for a cleaner design.
  - The help and settings button icons have been updated.
- Bug Fixes:
  - The `Ctrl+Shift+G` grammar correction shortcut is now more reliable.
  - Fixed an issue that prevented the update notification from opening the releases page.
  - Prevented unnecessary scrollbars from appearing in the UI.
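An update check like the one described above has to compare version tags numerically, since a plain string comparison would rank `1.2.0` above `1.10.0`. The following is a minimal sketch of that comparison. `parse_version` and `is_newer` are hypothetical names, and fetching the latest tag from GitHub is left out.

```rust
/// Parse a tag such as "v1.10.0" or "1.10.0" into (major, minor, patch).
/// Returns None for anything that is not exactly three numeric parts.
fn parse_version(tag: &str) -> Option<(u32, u32, u32)> {
    let mut parts = tag.trim().trim_start_matches('v').split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    let patch = parts.next()?.parse::<u32>().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// True when `latest` is a strictly higher version than `current`.
/// Unparseable tags are treated as "no update" to avoid false notices.
fn is_newer(latest: &str, current: &str) -> bool {
    match (parse_version(latest), parse_version(current)) {
        // Tuples compare field by field: major, then minor, then patch.
        (Some(l), Some(c)) => l > c,
        _ => false,
    }
}
```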
Full Changelog: v1.0.0...v1.2.0
v1.0.0 - Dictate
🎉 Dictate v1.0.0 - Major Release
This is the first stable release of Dictate, a high-performance desktop dictation application for Windows built with Tauri and Rust. This release marks the complete migration from Electron to Tauri, delivering powerful speech-to-text capabilities with ~80% reduction in memory usage and significantly faster startup times.
📥 Download
Download Dictate_1.0.0.exe below and run it on any Windows system. No installation required - just launch and start dictating!
✨ Key Features
- 🎤 Multi-Provider Transcription Support
  - Groq (Whisper-Large-v3-Turbo) - Silence-based chunking
  - Deepgram (Nova-3) - Real-time streaming
  - Cartesia - Real-time streaming with PCM pipeline
  - Google Gemini (2.5 Flash Lite)
  - Mistral (Voxtral)
  - SambaNova (Whisper-Large-v3)
  - Fireworks Whisper
- ✍️ Grammar Correction
  - Multi-provider support (Groq, Gemini, Mistral, SambaNova, Fireworks)
  - Select text in any application and press `Ctrl+Shift+G`
  - Default provider: Groq GPT-OSS-120B
- 🎯 Seamless Text Insertion
  - Automatically inserts transcribed text into any active application
  - Two insertion modes: Simulated Typing or Clipboard paste
  - Real-time transcription with streaming providers
- ⌨️ Global Keyboard Shortcuts
  - `Ctrl+Shift+D` - Toggle Recording
  - `Ctrl+Shift+G` - Grammar Correction
  - `Ctrl+Shift+V` - Toggle Compact Mode
  - `Ctrl+Shift+S` - Toggle Settings
  - `Ctrl+Shift+X` - Exit Application
- 🗣️ Voice Commands
  - Punctuation: "period", "comma", "question mark", etc.
  - Navigation: "press enter", "backspace", "press tab"
  - Control: "press copy", "select all"
  - Advanced: "delete that", "correct grammar", "pause dictation"
- 🌍 Multilingual Support
  - Select transcription language in settings
  - Auto-detect mode for multilingual audio
  - Language hints passed to compatible providers
- 🔊 Audio Cues
  - Audible "beep" when recording starts
  - Audible "clack" when recording stops
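The punctuation voice commands listed above can be thought of as a post-processing pass over the transcript. The sketch below handles only the three commands the notes name explicitly ("period", "comma", "question mark"). The function names are hypothetical, and how Dictate actually detects commands is not documented here.

```rust
/// Map a spoken command to its punctuation mark.
fn punctuation_for(phrase: &str) -> Option<&'static str> {
    match phrase {
        "period" => Some("."),
        "comma" => Some(","),
        "question mark" => Some("?"),
        _ => None,
    }
}

/// Replace spoken punctuation commands with marks attached to the
/// preceding word; all other words are kept, separated by single spaces.
fn apply_punctuation(transcript: &str) -> String {
    let words: Vec<&str> = transcript.split_whitespace().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < words.len() {
        // Try two-word commands ("question mark") before single words.
        if i + 1 < words.len() {
            let pair = format!("{} {}", words[i], words[i + 1]).to_lowercase();
            if let Some(mark) = punctuation_for(&pair) {
                out.push_str(mark);
                i += 2;
                continue;
            }
        }
        if let Some(mark) = punctuation_for(&words[i].to_lowercase()) {
            out.push_str(mark);
        } else {
            if !out.is_empty() {
                out.push(' ');
            }
            out.push_str(words[i]);
        }
        i += 1;
    }
    out
}
```

For example, `apply_punctuation("hello comma how are you question mark")` yields `hello, how are you?`. A naive pass like this also converts the word "period" when it is meant literally, so a real implementation needs some way to tell commands from ordinary speech.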
🚀 Performance Improvements
Compared to the legacy Electron version (v0.6.7):
- ~80% reduction in memory usage
- Near-instant application launch
- Significantly smaller binary size
- Type-safe Rust backend catches many classes of errors at compile time
🛠️ Technical Stack
- Backend: Rust with Tauri 2.x framework
- Frontend: HTML, CSS, vanilla JavaScript
- Audio Processing: Web Audio API with streaming support
- Keyboard Injection: Native Windows API via the `enigo` crate
📖 Getting Started
- Download `Dictate_1.0.0.exe` from the assets below
- Run the executable (no installation required)
- Click the gear icon to configure your API keys
- Select your preferred transcription provider
- Press `Ctrl+Shift+D` to start dictating!
📚 Documentation
- Full documentation available in README.md
- Complete changelog in CHANGELOG.md
- Architecture details in TAURI_STRUCTURE.md
⚠️ Migration Notes
If you're coming from the Electron version (v0.6.7):
- The Electron version is preserved in the `electron/` directory
- Settings from Electron are not automatically migrated
- All functionality is replicated or improved in this Tauri version
🐛 Known Issues
None reported for this release.
📄 License
This project is licensed under the Apache License 2.0.
Full Changelog: https://github.com/3choff/dictate/blob/main/CHANGELOG.md