16 Dec 15:29

3choff

6130e4d

v1.13.0 Latest

Internationalisation, Voice Control & Tray Improvements

This release brings major usability and accessibility improvements, with full UI localisation, multi-language voice commands, and a significantly refined system tray experience. Dictate is now more flexible, more international, and more seamless to use as a background tool.

✨ Highlights

Multi-language voice commands with native support for 9 languages
Full UI internationalisation (i18n) across the entire application
Re-engineered system tray for instant, native-like performance
Word correction system to fix common mis-transcriptions automatically

🆕 What’s New

Multi-Language Voice Commands
- Native voice control in Italian, Spanish, French, German, Dutch, Portuguese, Chinese, Japanese, and Russian
- Language-specific command prefixes to avoid accidental triggers
- Fully integrated with the existing transcription pipeline
Complete UI Localisation
- All UI elements translated, including Settings, Main window, and System Tray
- App language selection in Settings → Interface
- Clear separation between UI language and transcription language
Word Correction System
- Custom dictionary to automatically fix common mis-transcriptions
- Fuzzy matching with adjustable threshold
- Full management UI in Settings → Transcription
Custom Rewrite Prompts
- Edit and personalise rewrite prompts for each rewrite mode
- Automatic switch to Custom mode when presets are modified
- Language preservation to prevent unwanted translations
Compact View Toggle
- New toggle in Settings → Interface → Appearance
- Fully synchronised with the global shortcut (Ctrl+Shift+V)
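The fuzzy matching behind the Word Correction System could work roughly as follows. This is a minimal sketch, not Dictate's actual implementation: the function names, the list-of-terms dictionary format, and the choice of normalised Levenshtein similarity are all assumptions made for illustration.

```javascript
// Hypothetical sketch: names, dictionary format and similarity measure are assumptions.

// Levenshtein edit distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i-1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity in [0, 1]; 1 means identical strings.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Replace each word that is close enough to a dictionary term with that term.
// Punctuation attached to a word is not handled in this sketch.
function correctWords(text, terms, threshold = 0.8) {
  return text
    .split(/(\s+)/) // the capture group keeps the whitespace tokens
    .map((token) => {
      if (/^\s*$/.test(token)) return token;
      let best = null;
      let bestScore = threshold;
      for (const term of terms) {
        const score = similarity(token.toLowerCase(), term.toLowerCase());
        if (score >= bestScore) {
          best = term;
          bestScore = score;
        }
      }
      return best === null ? token : best;
    })
    .join("");
}
```

Lowering the threshold catches more mis-transcriptions but also raises the chance of replacing a word that was already correct, which is presumably why the threshold is adjustable.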

🎛 Interface & Tray Improvements

Custom System Tray
- Instant tray menu appearance via pre-created hidden window
- Improved screen-bound positioning logic
- Native-style exit behaviour and close-to-tray support
Interface Settings Section
- New dedicated section for appearance and behaviour settings
- Start hidden, close to tray, launch on startup, theme, and language controls
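The screen-bound positioning mentioned for the custom tray menu can be pictured as a clamp against the screen's work area. The sketch below is an assumption about the general idea only: `positionTrayMenu` and its argument shapes are hypothetical and do not come from the Dictate source.

```javascript
// Hypothetical sketch: function name and argument shapes are assumptions.
// cursor:   { x, y }                 where the tray icon was clicked
// menuSize: { width, height }        size of the custom menu window
// workArea: { x, y, width, height }  usable screen area (excluding the taskbar)
function positionTrayMenu(cursor, menuSize, workArea) {
  // Prefer opening above and to the left of the click, which suits a
  // tray located in the bottom-right corner of the screen.
  let x = cursor.x - menuSize.width;
  let y = cursor.y - menuSize.height;

  // Clamp so the whole menu stays inside the work area.
  const maxX = workArea.x + workArea.width - menuSize.width;
  const maxY = workArea.y + workArea.height - menuSize.height;
  x = Math.min(Math.max(x, workArea.x), maxX);
  y = Math.min(Math.max(y, workArea.y), maxY);

  return { x, y };
}
```

With a 200×300 menu on a 1920×1040 work area, a click at (1900, 1060) yields (1700, 740): the horizontal position is already valid and the vertical position is pulled up so the menu does not cross the bottom edge.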

🛠 Improvements & Fixes

Improved clipboard and keyboard text insertion reliability
Fixed missing trailing spaces in batch transcription
Prevented settings persistence issues affecting window state
Resolved Light theme styling inconsistencies
Cleaned up legacy code and fixed streaming stability issues
Improved settings synchronisation with external state changes

Full Changelog: v1.10.0 → v1.13.0

11 Dec 18:11

3choff

v1.10.0

33b1242

v1.10.0 – System Tray Support, Theme System, and UX Enhancements

This release brings another major upgrade to Dictate with full system tray integration, a complete dark/light theme system, improved rewrite behaviour, and multiple UI refinements. Dictate is now easier to keep running in the background and more visually consistent than ever.

🚀 System Tray Integration

Dictate can now run quietly in the background, always ready but never in the way.

New system tray icon with Show/Hide, Settings, and Quit options
Left-click toggles visibility of the main window
App no longer occupies the Windows taskbar while running
Start Hidden option lets Dictate launch directly to the tray
Launch on Startup feature (via Windows Registry)
Smart auto-show during dictation:
- If hidden, window appears automatically when dictation starts
- If auto-shown, it hides again once dictation ends
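The smart auto-show rule above amounts to remembering why the window is visible. A minimal sketch, assuming a hypothetical `WindowVisibility` class that is not part of the published code:

```javascript
// Hypothetical sketch: class and method names are assumptions.
class WindowVisibility {
  constructor(visible) {
    this.visible = visible;
    this.autoShown = false; // true only while the app itself revealed the window
  }

  onDictationStart() {
    if (!this.visible) {
      this.visible = true;
      this.autoShown = true;
    }
  }

  onDictationEnd() {
    // Hide again only if the window was revealed automatically.
    if (this.autoShown) {
      this.visible = false;
      this.autoShown = false;
    }
  }

  // Manual toggle, e.g. a left-click on the tray icon.
  toggle() {
    this.visible = !this.visible;
    this.autoShown = false; // a manual choice cancels the pending auto-hide
  }
}
```

Tracking the `autoShown` flag separately is what keeps a window the user opened by hand from disappearing when dictation ends.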

🎨 Theme System (Dark & Light Modes)

A fully implemented theme architecture brings polished, consistent styling across the entire UI.

New Dark/Light theme modes with smooth animated transitions
Theme toggle in settings with system preference detection
Centralised theme variables (theme.css) for colours and states
All UI elements made theme-aware (buttons, icons, tooltips, visualisers)
All hardcoded colours removed in favour of CSS variables
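One way to combine the theme toggle with system preference detection is to resolve the stored setting to a concrete theme and expose it as an attribute that the CSS variables key off. The setting values (`"system"`, `"dark"`, `"light"`) and the `data-theme` attribute below are assumptions for illustration, not confirmed details of theme.css.

```javascript
// Hypothetical sketch: setting values and the data-theme attribute are assumptions.

// Resolve the user's setting to a concrete theme name.
function resolveTheme(setting, systemPrefersDark) {
  if (setting === "dark" || setting === "light") return setting;
  return systemPrefersDark ? "dark" : "light"; // "system" or anything unknown
}

// Mark the root element so CSS rules such as [data-theme="dark"] can
// override the colour variables.
function applyTheme(theme, root) {
  root.setAttribute("data-theme", theme);
}

// In a browser or webview the inputs would typically come from:
//   const prefersDark = window.matchMedia("(prefers-color-scheme: dark)").matches;
//   applyTheme(resolveTheme(savedSetting, prefersDark), document.documentElement);
```

Keeping `resolveTheme` free of DOM access makes the decision logic easy to check on its own.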

✨ UI & UX Improvements

“Customize” settings reorganised into clear groups: Input / Output / UI
Shared tooltip component centralised for reuse across windows
Polished visual elements: theme-aware sparkle icon, consistent hover/active states
Audio visualiser now theme-responsive with animated colour changes

📝 Rewrite Improvements

Ctrl + Shift + R now reliably selects all text before rewriting
“Press rewrite” voice command updated to match this behaviour
Works consistently across all input fields in both batch and streaming modes

⚙️ Technical Enhancements

Added tauri-plugin-autostart for startup integration
Added Tauri features: tray-icon, image-ico
New settings keys: start_hidden, autostart_enabled
Tooltip styling unified across main and settings windows
Help button fixed in About section

Full Changelog: v1.8.0 → v1.10.0

26 Oct 08:45

3choff

v1.8.0

cf445db

v1.8.0 – Push-to-Talk, Text Rewrite System & Major UI Enhancements

This release marks a major milestone for Dictate, bringing push-to-talk recording, a powerful text rewrite system, and a complete UI refactor for a smoother, faster, and more intuitive experience.

🎤 Push-to-Talk (PTT) Mode

Hold a keyboard shortcut to record; release to stop and transcribe instantly.
Works with all batch providers: Groq, Gemini, Mistral, SambaNova, Fireworks.
Automatically disables when switching to streaming providers.
No buffering delay — transcription starts immediately on key release.
Warnings and notifications guide users when PTT is incompatible with selected providers.
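The push-to-talk behaviour described above can be sketched as a small key handler. The provider identifiers are lowercase versions of the batch providers listed in these notes; the class, the handler names, and the warning text are assumptions.

```javascript
// Hypothetical sketch: identifiers, handler names and messages are assumptions.
const BATCH_PROVIDERS = new Set(["groq", "gemini", "mistral", "sambanova", "fireworks"]);

function isPttCompatible(providerId) {
  return BATCH_PROVIDERS.has(providerId);
}

class PushToTalk {
  // handlers: { startRecording, stopAndTranscribe, warn }
  constructor(providerId, handlers) {
    this.providerId = providerId;
    this.handlers = handlers;
    this.recording = false;
  }

  setProvider(providerId) {
    this.providerId = providerId;
  }

  onKeyDown() {
    if (!isPttCompatible(this.providerId)) {
      this.handlers.warn(`Push-to-talk is not available with ${this.providerId}`);
      return;
    }
    if (this.recording) return; // ignore keyboard auto-repeat while held
    this.recording = true;
    this.handlers.startRecording();
  }

  onKeyUp() {
    if (!this.recording) return;
    this.recording = false;
    this.handlers.stopAndTranscribe(); // transcription starts on release
  }
}
```

Guarding `onKeyDown` against repeats matters because most operating systems fire repeated key-down events while a key is held.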

🧠 Text Rewrite System

Replaces simple grammar correction with a complete text rewriting toolkit.
Five rewrite modes: Grammar, Professional, Polite, Casual, Structured.
Multi-provider support and fully customisable prompts.
Accessible via the new rewrite button, shortcut (Ctrl + Shift + R), or voice command (“press rewrite”).
Dedicated Rewrite settings tab with provider and mode configuration.

🎨 UI & Settings Redesign

Fully restructured settings window with tabbed navigation: General, Transcription, Rewrite, Grammar, and About.
Sidebar navigation with icons, active state highlighting, and smooth transitions.
New tooltip system for settings guidance with animations and smart positioning.
Improved visual design: refreshed microphone icon, modern recording button, and new application logo.

🐞 Fixes & Polish

Tooltip clipping and alignment issues resolved in compact mode.
Corrected SVG rendering and styling consistency.
Fixed trailing space issues in terminal and browser transcriptions.
Settings now load and auto-save reliably across all UI components.

Full Changelog: v1.6.0 → v1.8.0

19 Oct 22:08

3choff

v1.6.0

eb5bab4

v1.6.0 – Redesigned Settings Interface & Silero VAD Integration

This release delivers two major improvements: a complete UI refactor of the Settings window for better organisation and usability, and the integration of a new Silero Voice Activity Detection (VAD) system for more accurate speech segmentation.

✨ What's New

🎨 Redesigned Settings Interface

Introduced tabbed navigation with four clear sections: General, Transcription, Grammar, and About.
Added sidebar with icon-based navigation and smooth section transitions.
Modularised settings architecture for improved maintainability and scalability.
Enhanced About tab: now includes app version display, update checker, and quick access buttons (Help, GitHub, Ko-fi).
Fully responsive design with scrollable content and improved layout handling.

🧠 Silero VAD (Voice Activity Detection)

Integrated ML-based Silero VAD model for precise speech segmentation.
Replaces older RMS-based system, significantly reducing background noise triggers and improving accuracy across microphones.
Event-driven segmentation pipeline for all batch transcription providers (Groq, Gemini, Mistral, SambaNova, Fireworks).
Optimised detection timing for faster and more natural response.
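To illustrate what an event-driven segmentation pipeline can look like, the sketch below turns a stream of per-frame speech probabilities into segment events. It does not run the Silero model; it assumes the probabilities are already available, and the class name, default threshold, and silence window are hypothetical.

```javascript
// Hypothetical sketch: assumes a VAD model already produced one speech
// probability per audio frame. Names and defaults are assumptions.
class VadSegmenter {
  constructor({ threshold = 0.5, minSilenceFrames = 3, onSegment }) {
    this.threshold = threshold;
    this.minSilenceFrames = minSilenceFrames;
    this.onSegment = onSegment; // called with { start, end }, end exclusive
    this.frame = 0;
    this.start = -1; // -1 means "not inside a speech segment"
    this.silence = 0; // consecutive silent frames inside a segment
  }

  // Feed the probability for the next frame.
  push(probability) {
    const index = this.frame++;
    if (probability >= this.threshold) {
      if (this.start === -1) this.start = index;
      this.silence = 0;
    } else if (this.start !== -1) {
      this.silence += 1;
      if (this.silence >= this.minSilenceFrames) this.emit(index + 1);
    }
  }

  // Close any open segment when recording stops.
  flush() {
    if (this.start !== -1) this.emit(this.frame);
  }

  emit(framesSeen) {
    // Trailing silent frames are not part of the segment.
    this.onSegment({ start: this.start, end: framesSeen - this.silence });
    this.start = -1;
    this.silence = 0;
  }
}
```

Each emitted segment could then be handed to a batch provider as one transcription request. A short dip below the threshold (fewer than `minSilenceFrames` frames) does not split a segment.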

🛠️ Technical Improvements

New modular settings file structure: sections/ for tab content and fields/ for reusable UI components.
Implemented smart dropdown positioning and z-index fixes to prevent clipping in the settings window.
Rust backend enhancements with async VAD session management and efficient audio buffering.

🚀 Benefits

More intuitive settings navigation and cleaner interface.
Far more accurate and responsive speech detection.
Easier to maintain modular codebase for both frontend and backend.

Full Changelog: v1.4.0...v1.6.0

18 Oct 14:23

3choff

v1.4.0

2ae8d5d

v1.4.0 – Unified Audio Pipeline & Major Refactor

This release marks a major architectural milestone for Dictate, introducing a unified audio and provider system that significantly improves maintainability, performance, and microphone compatibility.

✨ What's New

Unified Audio Pipeline
- A single, optimised audio processing system now powers all transcription providers.
- Improved performance, reduced complexity, and consistent behaviour across all providers.
Provider Abstraction Layer
- Introduced a shared BaseProvider architecture with unified interfaces (start, stop, getName, getType).
- Simplifies the addition of new transcription providers.
Recording Session Manager
- Centralised control of audio capture, provider lifecycle, and visualisation.
- Enhanced reliability and resource handling for start/stop operations.
Speech-Optimised Audio Visualiser
- New frequency weighting for clearer and more balanced visuals.
- Refined sensitivity and animation for quieter laptop microphones.

🧠 Technical & Performance Improvements

Modernised codebase with 100+ lines of legacy code removed.
Unified PCM16 pipeline at 16 kHz across all providers.
Enhanced laptop microphone support with automatic gain control.
Simplified Deepgram integration using direct PCM16 streaming.
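A PCM16 pipeline needs the floating-point samples produced by the Web Audio API converted to 16-bit signed integers. The conversion below is the standard technique; the function name is hypothetical and the snippet is not taken from Dictate.

```javascript
// Hypothetical sketch of a Float32 -> PCM16 conversion step.
function floatToPcm16(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to the valid range, then scale. Negative values scale by 32768
    // and positive values by 32767 so both ends of the Int16 range are used.
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

Clamping before scaling prevents out-of-range samples from wrapping around, which would otherwise be heard as loud clicks. Resampling to 16 kHz is a separate step not shown here.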

🚀 Benefits

More reliable audio and transcription performance.
Simpler architecture for easier future expansion.
Faster and more efficient resource handling.
Better UX with smoother visual feedback and wider mic compatibility.

Full Changelog: v1.2.0...v1.4.0

12 Oct 16:42

3choff

v1.2.0

82dcfda

v1.2.0 - UI Refinements and Feature Updates

This release introduces several UI enhancements, new features, and bug fixes to improve the user experience and application stability.

✨ What's New

Version Display: The app version is now displayed in the settings footer for easy reference.
Update Notifications: The app now automatically checks for new versions on GitHub and displays a clickable "New version available" notice in the settings footer.
Modernized UI:
- The main and settings windows now feature rounded corners.
- The settings window shadow has been removed for a cleaner design.
- The help and settings button icons have been updated.
Bug Fixes:
- The Ctrl+Shift+G grammar correction shortcut is now more reliable.
- Fixed an issue that prevented the update notification from opening the releases page.
- Prevented unnecessary scrollbars from appearing in the UI.
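An update check like the one described above has to compare version tags numerically, because a plain string comparison would rank v1.8.0 above v1.10.0. A minimal sketch with hypothetical function names, assuming tags of the form `vMAJOR.MINOR.PATCH`:

```javascript
// Hypothetical sketch: function names are assumptions.

// "v1.10.0" -> [1, 10, 0]
function parseVersion(tag) {
  return tag
    .replace(/^v/, "")
    .split(".")
    .map((part) => Number.parseInt(part, 10) || 0);
}

// True when `latest` is a higher version than `current`.
function isNewerVersion(latest, current) {
  const a = parseVersion(latest);
  const b = parseVersion(current);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const diff = (a[i] || 0) - (b[i] || 0);
    if (diff !== 0) return diff > 0;
  }
  return false; // identical versions
}
```

The latest tag itself could be obtained from the `tag_name` field of GitHub's "latest release" REST endpoint, although how Dictate actually fetches it is not stated in these notes.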

Full Changelog: v1.0.0...v1.2.0

Assets 3

11 Oct 09:42

3choff

v1.0.0

93f2c51

v1.0.0 - Dictate

🎉 Dictate v1.0.0 - Major Release

This is the first stable release of Dictate, a high-performance desktop dictation application for Windows built with Tauri and Rust. It completes the migration from Electron to Tauri, delivering speech-to-text with an ~80% reduction in memory usage and significantly faster startup times.

📥 Download

Download Dictate_1.0.0.exe below and run it on any Windows system. No installation required - just launch and start dictating!

✨ Key Features

🎤 Multi-Provider Transcription Support
- Groq (Whisper-Large-v3-Turbo) - Silence-based chunking
- Deepgram (Nova-3) - Real-time streaming
- Cartesia - Real-time streaming with PCM pipeline
- Google Gemini (2.5 Flash Lite)
- Mistral (Voxtral)
- SambaNova (Whisper-Large-v3)
- Fireworks Whisper
✍️ Grammar Correction
- Multi-provider support (Groq, Gemini, Mistral, SambaNova, Fireworks)
- Select text in any application and press Ctrl+Shift+G
- Default provider: Groq GPT-OSS-120B
🎯 Seamless Text Insertion
- Automatically inserts transcribed text into any active application
- Two insertion modes: Simulated Typing or Clipboard paste
- Real-time transcription with streaming providers
⌨️ Global Keyboard Shortcuts
- Ctrl+Shift+D - Toggle Recording
- Ctrl+Shift+G - Grammar Correction
- Ctrl+Shift+V - Toggle Compact Mode
- Ctrl+Shift+S - Toggle Settings
- Ctrl+Shift+X - Exit Application
🗣️ Voice Commands
- Punctuation: "period", "comma", "question mark", etc.
- Navigation: "press enter", "backspace", "press tab"
- Control: "press copy", "select all"
- Advanced: "delete that", "correct grammar", "pause dictation"
🌍 Multilingual Support
- Select transcription language in settings
- Auto-detect mode for multilingual audio
- Language hints passed to compatible providers
🔊 Audio Cues
- Audible "beep" when recording starts
- Audible "clack" when recording stops
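The punctuation voice commands listed above can be pictured as a phrase-to-symbol substitution over the transcribed text. The mapping below covers only the three commands quoted in these notes, and the function and matching rules are assumptions rather than Dictate's real command parser.

```javascript
// Hypothetical sketch: the table and matching rules are assumptions.
const PUNCTUATION_COMMANDS = {
  "period": ".",
  "comma": ",",
  "question mark": "?",
};

// Replace spoken punctuation commands with their symbols, removing the
// space that precedes the command so the symbol attaches to the previous word.
function applyPunctuationCommands(text) {
  let result = text;
  for (const [phrase, symbol] of Object.entries(PUNCTUATION_COMMANDS)) {
    const pattern = new RegExp(`\\s*\\b${phrase}\\b`, "gi");
    result = result.replace(pattern, symbol);
  }
  return result;
}
```

For example, `applyPunctuationCommands("hello comma how are you question mark")` returns `"hello, how are you?"`. A naive table like this also rewrites ordinary uses of a word such as "period", which is the kind of accidental trigger a command prefix is meant to avoid.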

🚀 Performance Improvements

Compared to the legacy Electron version (v0.6.7):

~80% reduction in memory usage
Near-instant application launch
Significantly smaller binary size
Type-safe Rust backend catches many classes of errors at compile time instead of at runtime

🛠️ Technical Stack

Backend: Rust with Tauri 2.x framework
Frontend: HTML, CSS, vanilla JavaScript
Audio Processing: Web Audio API with streaming support
Keyboard Injection: Native Windows API via enigo crate

📖 Getting Started

Download Dictate_1.0.0.exe from the assets below
Run the executable (no installation required)
Click the gear icon to configure your API keys
Select your preferred transcription provider
Press Ctrl+Shift+D to start dictating!

📚 Documentation

Full documentation available in README.md
Complete changelog in CHANGELOG.md
Architecture details in TAURI_STRUCTURE.md

⚠️ Migration Notes

If you're coming from the Electron version (v0.6.7):

The Electron version is preserved in the electron/ directory
Settings from Electron are not automatically migrated
All functionality is replicated or improved in this Tauri version

🐛 Known Issues

None reported for this release.

📄 License

This project is licensed under the Apache License 2.0.

Full Changelog: https://github.com/3choff/dictate/blob/main/CHANGELOG.md

Releases: 3choff/dictate