Releases: 3choff/dictate

v1.13.0

16 Dec 15:29

Internationalisation, Voice Control & Tray Improvements

This release brings major usability and accessibility improvements, with full UI localisation, multi-language voice commands, and a significantly refined system tray experience. Dictate is now more flexible, more international, and more seamless to use as a background tool.

✨ Highlights

  • Multi-language voice commands with native support for 9 languages
  • Full UI internationalisation (i18n) across the entire application
  • Re-engineered system tray for instant, native-like performance
  • Word correction system to fix common mis-transcriptions automatically

🆕 What’s New

  • Multi-Language Voice Commands

    • Native voice control in Italian, Spanish, French, German, Dutch, Portuguese, Chinese, Japanese, and Russian
    • Language-specific command prefixes to avoid accidental triggers
    • Fully integrated with the existing transcription pipeline
  • Complete UI Localisation

    • All UI elements translated, including Settings, Main window, and System Tray
    • App language selection in Settings → Interface
    • Clear separation between UI language and transcription language
  • Word Correction System

    • Custom dictionary to automatically fix common mis-transcriptions
    • Fuzzy matching with adjustable threshold
    • Full management UI in Settings → Transcription
  • Custom Rewrite Prompts

    • Edit and personalise rewrite prompts for each rewrite mode
    • Automatic switch to Custom mode when presets are modified
    • Language preservation to prevent unwanted translations
  • Compact View Toggle

    • New toggle in Settings → Interface → Appearance
    • Fully synchronised with the global shortcut (Ctrl+Shift+V)
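
As a rough illustration of the Word Correction System listed above, the sketch below applies a custom dictionary with fuzzy matching and an adjustable threshold. It is written in Python for brevity and is not Dictate's actual implementation: the `correct_words` function, the dictionary shape (mis-transcribed form mapped to the desired form), and the use of `difflib` similarity are all assumptions.

```python
from difflib import SequenceMatcher


def correct_words(text, dictionary, threshold=0.8):
    """Replace words that closely match a dictionary key with its correction.

    `dictionary` maps a mis-transcribed form to the desired form (an assumed
    shape). `threshold` is the minimum similarity (0.0 to 1.0) for a match.
    """
    corrected = []
    for word in text.split():
        best_key, best_score = None, 0.0
        for wrong in dictionary:
            # ratio() returns 1.0 for identical strings, lower for weaker matches
            score = SequenceMatcher(None, word.lower(), wrong.lower()).ratio()
            if score > best_score:
                best_key, best_score = wrong, score
        if best_key is not None and best_score >= threshold:
            corrected.append(dictionary[best_key])
        else:
            corrected.append(word)
    return " ".join(corrected)


# Hypothetical dictionary entry: the transcriber tends to hear "Tauri" as "tory"
print(correct_words("built with torry", {"tory": "Tauri"}))
```

Raising the threshold towards 1.0 restricts corrections to exact matches, while lowering it catches more variants at the cost of more false positives. This sketch compares whole whitespace-separated words, so punctuation attached to a word lowers its score.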

🎛 Interface & Tray Improvements

  • Custom System Tray

    • Instant tray menu appearance via pre-created hidden window
    • Improved screen-bound positioning logic
    • Native-style exit behaviour and close-to-tray support
  • Interface Settings Section

    • New dedicated section for appearance and behaviour settings
    • Start hidden, close to tray, launch on startup, theme, and language controls
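
The screen-bound positioning mentioned under Custom System Tray can be pictured as clamping the menu rectangle to the visible screen area. The Python sketch below shows that general idea only. The function name, the parameters, and the choice to anchor the menu at the cursor are assumptions, not details taken from Dictate's code.

```python
def clamp_menu_position(cursor_x, cursor_y, menu_w, menu_h, screen):
    """Return a top-left (x, y) for the menu that keeps it fully on screen.

    `screen` is a (left, top, right, bottom) tuple in pixels (assumed layout).
    """
    left, top, right, bottom = screen
    # Push the menu back inside the right/bottom edges, then the left/top edges
    x = max(left, min(cursor_x, right - menu_w))
    y = max(top, min(cursor_y, bottom - menu_h))
    return x, y


# A click near the bottom-right corner of a 1920x1080 screen
print(clamp_menu_position(1900, 1070, 200, 300, (0, 0, 1920, 1080)))
```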

🛠 Improvements & Fixes

  • Improved clipboard and keyboard text insertion reliability
  • Fixed missing trailing spaces in batch transcription
  • Prevented settings persistence issues affecting window state
  • Resolved Light theme styling inconsistencies
  • Cleaned up legacy code and fixed streaming stability issues
  • Improved settings synchronisation with external state changes

Full Changelog: v1.10.0 → v1.13.0

v1.10.0

11 Dec 18:11

v1.10.0 – System Tray Support, Theme System, and UX Enhancements

This release brings another major upgrade to Dictate with full system tray integration, a complete dark/light theme system, improved rewrite behaviour, and multiple UI refinements. Dictate is now easier to keep running in the background and more visually consistent than ever.

🚀 System Tray Integration

Dictate can now run quietly in the background, always ready but never in the way.

  • New system tray icon with Show/Hide, Settings, and Quit options

  • Left-click toggles visibility of the main window

  • App no longer occupies the Windows taskbar while running

  • Start Hidden option lets Dictate launch directly to the tray

  • Launch on Startup feature (via Windows Registry)

  • Smart auto-show during dictation:

    • If hidden, window appears automatically when dictation starts
    • If auto-shown, it hides again once dictation ends
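
The smart auto-show rule above amounts to remembering whether the window was shown by the app or by the user. A minimal Python sketch of that state handling follows. The class and method names are hypothetical, and the real logic lives in the Tauri app rather than in code like this.

```python
class WindowVisibility:
    """Tracks whether the main window is visible and why (illustrative only)."""

    def __init__(self, visible):
        self.visible = visible
        self.auto_shown = False  # True only when dictation made the window appear

    def on_dictation_start(self):
        if not self.visible:
            self.visible = True
            self.auto_shown = True

    def on_dictation_end(self):
        # Hide again only if the window was auto-shown, not if the user opened it
        if self.auto_shown:
            self.visible = False
            self.auto_shown = False


window = WindowVisibility(visible=False)
window.on_dictation_start()
print(window.visible)  # window appears while dictating
window.on_dictation_end()
print(window.visible)  # and hides again afterwards
```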

🎨 Theme System (Dark & Light Modes)

A fully implemented theme architecture brings polished, consistent styling across the entire UI.

  • New Dark/Light theme modes with smooth animated transitions
  • Theme toggle in settings with system preference detection
  • Centralised theme variables (theme.css) for colours and states
  • All UI elements made theme-aware (buttons, icons, tooltips, visualisers)
  • All hardcoded colours removed in favour of CSS variables

✨ UI & UX Improvements

  • “Customize” settings reorganised into clear groups: Input / Output / UI
  • Shared tooltip component centralised for reuse across windows
  • Polished visual elements: theme-aware sparkle icon, consistent hover/active states
  • Audio visualiser now theme-responsive with animated colour changes

📝 Rewrite Improvements

  • Ctrl + Shift + R now reliably selects all text before rewriting
  • “Press rewrite” voice command updated to match this behaviour
  • Works consistently across all input fields in both batch and streaming modes

⚙️ Technical Enhancements

  • Added tauri-plugin-autostart for startup integration
  • Added Tauri features: tray-icon, image-ico
  • New settings keys: start_hidden, autostart_enabled
  • Tooltip styling unified across main and settings windows
  • Help button fixed in About section

Full Changelog: v1.8.0 → v1.10.0


v1.8.0

26 Oct 08:45

v1.8.0 – Push-to-Talk, Text Rewrite System & Major UI Enhancements

This release marks a major milestone for Dictate, bringing push-to-talk recording, a powerful text rewrite system, and a complete UI refactor for a smoother, faster, and more intuitive experience.

🎤 Push-to-Talk (PTT) Mode

  • Hold a keyboard shortcut to record; release to stop and transcribe instantly.
  • Works with all batch providers: Groq, Gemini, Mistral, SambaNova, Fireworks.
  • Automatically disables when switching to streaming providers.
  • No buffering delay — transcription starts immediately on key release.
  • Warnings and notifications guide users when PTT is incompatible with selected providers.
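
The interaction between push-to-talk and provider type described above can be sketched as follows. This is illustrative Python under stated assumptions: the `PushToTalk` class and its methods are hypothetical, and only the list of batch providers comes from these notes.

```python
# Batch providers named in these release notes; anything else is treated as streaming
BATCH_PROVIDERS = {"groq", "gemini", "mistral", "sambanova", "fireworks"}


class PushToTalk:
    def __init__(self, provider):
        self.provider = provider
        self.recording = False

    @property
    def enabled(self):
        # PTT is only available with batch providers
        return self.provider.lower() in BATCH_PROVIDERS

    def key_down(self):
        """Start recording while the shortcut is held; returns the recording state."""
        if self.enabled:
            self.recording = True
        return self.recording

    def key_up(self):
        """Stop recording; returns True if the captured audio should be transcribed."""
        should_transcribe = self.recording
        self.recording = False
        return should_transcribe


ptt = PushToTalk("Groq")
ptt.key_down()
print(ptt.key_up())  # audio is handed to transcription on release
```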

🧠 Text Rewrite System

  • Replaces simple grammar correction with a complete text rewriting toolkit.
  • Five rewrite modes: Grammar, Professional, Polite, Casual, Structured.
  • Multi-provider support and fully customisable prompts.
  • Accessible via the new rewrite button, shortcut (Ctrl + Shift + R), or voice command (“press rewrite”).
  • Dedicated Rewrite settings tab with provider and mode configuration.

🎨 UI & Settings Redesign

  • Fully restructured settings window with tabbed navigation: General, Transcription, Rewrite, Grammar, and About.
  • Sidebar navigation with icons, active state highlighting, and smooth transitions.
  • New tooltip system for settings guidance with animations and smart positioning.
  • Improved visual design: refreshed microphone icon, modern recording button, and new application logo.

🐞 Fixes & Polish

  • Tooltip clipping and alignment issues resolved in compact mode.
  • Corrected SVG rendering and styling consistency.
  • Fixed trailing space issues in terminal and browser transcriptions.
  • Settings now load and auto-save reliably across all UI components.

Full Changelog: v1.6.0 → v1.8.0


v1.6.0

19 Oct 22:08

v1.6.0 – Redesigned Settings Interface & Silero VAD Integration

This release delivers two major improvements: a complete UI refactor of the Settings window for better organisation and usability, and the integration of a new Silero Voice Activity Detection (VAD) system for more accurate speech segmentation.

✨ What's New

🎨 Redesigned Settings Interface

  • Introduced tabbed navigation with four clear sections: General, Transcription, Grammar, and About.
  • Added sidebar with icon-based navigation and smooth section transitions.
  • Modularised settings architecture for improved maintainability and scalability.
  • Enhanced About tab: now includes app version display, update checker, and quick access buttons (Help, GitHub, Ko-fi).
  • Fully responsive design with scrollable content and improved layout handling.

🧠 Silero VAD (Voice Activity Detection)

  • Integrated ML-based Silero VAD model for precise speech segmentation.
  • Replaces older RMS-based system, significantly reducing background noise triggers and improving accuracy across microphones.
  • Event-driven segmentation pipeline for all batch transcription providers (Groq, Gemini, Mistral, SambaNova, Fireworks).
  • Optimised detection timing for faster and more natural response.
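
For context, an energy (RMS) gate of the kind these notes say was replaced generally looks like the sketch below. This is illustrative Python, not Dictate's Rust code, and the function names and the 0.02 threshold are assumptions. Because such a gate only measures loudness, any sufficiently loud sound is flagged as speech, which is the weakness a model-based VAD such as Silero is meant to address.

```python
import math


def rms(frame):
    """Root-mean-square level of a frame of float samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def rms_speech_flags(frames, threshold=0.02):
    """Flag each frame as 'speech' purely by loudness (assumed threshold)."""
    return [rms(frame) >= threshold for frame in frames]


silence = [0.0, 0.0, 0.0, 0.0]
loud_noise = [0.5, -0.5, 0.5, -0.5]  # e.g. a keyboard clack, not speech
print(rms_speech_flags([silence, loud_noise]))
```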

🛠️ Technical Improvements

  • New modular settings file structure: sections/ for tab content and fields/ for reusable UI components.
  • Implemented smart dropdown positioning and z-index fixes to prevent clipping in the settings window.
  • Rust backend enhancements with async VAD session management and efficient audio buffering.

🚀 Benefits

  • More intuitive settings navigation and cleaner interface.
  • Far more accurate and responsive speech detection.
  • Easier to maintain modular codebase for both frontend and backend.

Full Changelog: v1.4.0...v1.6.0


v1.4.0

18 Oct 14:23

v1.4.0 – Unified Audio Pipeline & Major Refactor

This release marks a major architectural milestone for Dictate, introducing a unified audio and provider system that significantly improves maintainability, performance, and microphone compatibility.

✨ What's New

  • Unified Audio Pipeline

    • A single, optimised audio processing system now powers all transcription providers.
    • Improved performance, reduced complexity, and consistent behaviour across all providers.
  • Provider Abstraction Layer

    • Introduced a shared BaseProvider architecture with unified interfaces (start, stop, getName, getType).
    • Simplifies the addition of new transcription providers.
  • Recording Session Manager

    • Centralised control of audio capture, provider lifecycle, and visualisation.
    • Enhanced reliability and resource handling for start/stop operations.
  • Speech-Optimised Audio Visualiser

    • New frequency weighting for clearer and more balanced visuals.
    • Refined sensitivity and animation for quieter laptop microphones.
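
To make the Provider Abstraction Layer item above more concrete, here is a minimal sketch of a shared base class exposing the four operations the notes name (start, stop, getName, getType). It is written in Python with snake_case method names. The `EchoBatchProvider` subclass and the "batch" type string are hypothetical, and the app's real BaseProvider may differ.

```python
from abc import ABC, abstractmethod


class BaseProvider(ABC):
    """Common interface every transcription provider implements."""

    @abstractmethod
    def start(self):
        """Begin a transcription session."""

    @abstractmethod
    def stop(self):
        """End the session and release resources."""

    @abstractmethod
    def get_name(self):
        """Human-readable provider name."""

    @abstractmethod
    def get_type(self):
        """Provider kind, e.g. 'batch' or 'streaming' (assumed values)."""


class EchoBatchProvider(BaseProvider):
    """Hypothetical provider used only to show the interface in action."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def get_name(self):
        return "echo"

    def get_type(self):
        return "batch"


provider = EchoBatchProvider()
provider.start()
print(provider.get_name(), provider.get_type(), provider.running)
provider.stop()
```

With an interface like this, calling code such as a recording session manager can drive any provider through the same four calls, which is why adding a new provider becomes simpler.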

🧠 Technical & Performance Improvements

  • Modernised codebase with 100+ lines of legacy code removed.
  • Unified PCM16 pipeline at 16 kHz across all providers.
  • Enhanced laptop microphone support with automatic gain control.
  • Simplified Deepgram integration using direct PCM16 streaming.
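
As a point of reference for the PCM16 pipeline mentioned above, the sketch below converts floating-point samples into 16-bit little-endian PCM bytes. It is an illustrative Python example, not the app's code: the function name, the clamping, and the 32767 scaling factor are assumptions about one common way to do this conversion.

```python
import struct

SAMPLE_RATE_HZ = 16_000  # the rate stated in these notes


def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range samples
        ints.append(int(round(s * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)


one_second = float_to_pcm16([0.0] * SAMPLE_RATE_HZ)
print(len(one_second))  # 2 bytes per sample
```

At 16 kHz with 2 bytes per sample, one second of mono audio is 32,000 bytes, which keeps payloads small for both batch uploads and streaming.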

🚀 Benefits

  • More reliable audio and transcription performance.
  • Simpler architecture for easier future expansion.
  • Faster and more efficient resource handling.
  • Better UX with smoother visual feedback and wider mic compatibility.

Full Changelog: v1.2.0...v1.4.0


v1.2.0

12 Oct 16:42

v1.2.0 - UI Refinements and Feature Updates

This release introduces several UI enhancements, new features, and bug fixes to improve the user experience and application stability.

✨ What's New

  • Version Display: The app version is now displayed in the settings footer for easy reference.
  • Update Notifications: The app now automatically checks for new versions on GitHub and displays a clickable "New version available" notice in the settings footer.
  • Modernized UI:
    • The main and settings windows now feature rounded corners.
    • The settings window shadow has been removed for a cleaner design.
    • The help and settings button icons have been updated.
  • Bug Fixes:
    • The Ctrl+Shift+G grammar correction shortcut is now more reliable.
    • Fixed an issue that prevented the update notification from opening the releases page.
    • Prevented unnecessary scrollbars from appearing in the UI.

Full Changelog: v1.0.0...v1.2.0

v1.0.0 - Dictate

11 Oct 09:42

🎉 Dictate v1.0.0 - Major Release

This is the first stable release of Dictate, a high-performance desktop dictation application for Windows built with Tauri and Rust. This release marks the complete migration from Electron to Tauri, delivering powerful speech-to-text capabilities with an ~80% reduction in memory usage and significantly faster startup times.

📥 Download

Download Dictate_1.0.0.exe below and run it on any Windows system. No installation required - just launch and start dictating!

✨ Key Features

  • 🎤 Multi-Provider Transcription Support

    • Groq (Whisper-Large-v3-Turbo) - Silence-based chunking
    • Deepgram (Nova-3) - Real-time streaming
    • Cartesia - Real-time streaming with PCM pipeline
    • Google Gemini (2.5 Flash Lite)
    • Mistral (Voxtral)
    • SambaNova (Whisper-Large-v3)
    • Fireworks Whisper
  • ✍️ Grammar Correction

    • Multi-provider support (Groq, Gemini, Mistral, SambaNova, Fireworks)
    • Select text in any application and press Ctrl+Shift+G
    • Default provider: Groq GPT-OSS-120B
  • 🎯 Seamless Text Insertion

    • Automatically inserts transcribed text into any active application
    • Two insertion modes: Simulated Typing or Clipboard paste
    • Real-time transcription with streaming providers
  • ⌨️ Global Keyboard Shortcuts

    • Ctrl+Shift+D - Toggle Recording
    • Ctrl+Shift+G - Grammar Correction
    • Ctrl+Shift+V - Toggle Compact Mode
    • Ctrl+Shift+S - Toggle Settings
    • Ctrl+Shift+X - Exit Application
  • 🗣️ Voice Commands

    • Punctuation: "period", "comma", "question mark", etc.
    • Navigation: "press enter", "backspace", "press tab"
    • Control: "press copy", "select all"
    • Advanced: "delete that", "correct grammar", "pause dictation"
  • 🌍 Multilingual Support

    • Select transcription language in settings
    • Auto-detect mode for multilingual audio
    • Language hints passed to compatible providers
  • 🔊 Audio Cues

    • Audible "beep" when recording starts
    • Audible "clack" when recording stops
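
The punctuation voice commands listed under Voice Commands above can be thought of as a lookup from spoken phrases to symbols. The Python sketch below illustrates that idea using the three phrases the notes name. The `apply_punctuation` function and its matching rules are assumptions, and the app's real command handling is likely more involved.

```python
# Spoken phrases taken from these notes; the mapping itself is an assumption
PUNCTUATION = {"period": ".", "comma": ",", "question mark": "?"}


def apply_punctuation(text):
    """Replace spoken punctuation phrases with symbols attached to the previous word."""
    words = text.split()
    out = []
    i = 0
    while i < len(words):
        two = " ".join(words[i:i + 2]).lower()
        one = words[i].lower()
        if two in PUNCTUATION:  # try two-word phrases first ("question mark")
            symbol, step = PUNCTUATION[two], 2
        elif one in PUNCTUATION:
            symbol, step = PUNCTUATION[one], 1
        else:
            symbol, step = None, 1
        if symbol is None:
            out.append(words[i])
        elif out:
            out[-1] += symbol  # no space before punctuation
        else:
            out.append(symbol)
        i += step
    return " ".join(out)


print(apply_punctuation("hello comma how are you question mark"))
```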

🚀 Performance Improvements

Compared to the legacy Electron version (v0.6.7):

  • ~80% reduction in memory usage
  • Near-instant application launch
  • Significantly smaller binary size
  • Type-safe Rust backend catches many classes of errors at compile time rather than at runtime

🛠️ Technical Stack

  • Backend: Rust with Tauri 2.x framework
  • Frontend: HTML, CSS, vanilla JavaScript
  • Audio Processing: Web Audio API with streaming support
  • Keyboard Injection: Native Windows API via enigo crate

📖 Getting Started

  1. Download Dictate_1.0.0.exe from the assets below
  2. Run the executable (no installation required)
  3. Click the gear icon to configure your API keys
  4. Select your preferred transcription provider
  5. Press Ctrl+Shift+D to start dictating!

⚠️ Migration Notes

If you're coming from the Electron version (v0.6.7):

  • The Electron version is preserved in the electron/ directory
  • Settings from Electron are not automatically migrated
  • All functionality is replicated or improved in this Tauri version

🐛 Known Issues

None reported for this release.

📄 License

This project is licensed under the Apache License 2.0.


Full Changelog: https://github.com/3choff/dictate/blob/main/CHANGELOG.md