A lightweight audio transcription application for Hyprland that lets you quickly record and transcribe audio with a beautiful floating pill UI.
- 🎙️ Quick Voice Recording - Click to start/stop recording
- 🤖 AI Transcription - Automatic transcription using Whisper.cpp
- 📋 Auto-copy & Auto-paste - Transcribed text automatically copied to clipboard and pasted to your focused window
- 🎯 Smart Paste Detection - Automatically detects terminals vs apps and uses correct paste shortcut
- ✨ Beautiful UI - Animated wave visualization while recording
- ⚡ Fast - Toggle visibility with a global hotkey
- 🪟 Floating Window - Always accessible via Hyprland special workspace
- Hyprland - Wayland compositor
- PipeWire - Audio recording (pw-record command)
- Whisper.cpp - Speech-to-text model
  - Binary: $HOME/open-source-projects/whisper.cpp/build/bin/whisper-cli
  - Model: $HOME/open-source-projects/whisper.cpp/models/ggml-base.en.bin
- wtype - Keyboard input simulation for auto-paste (recommended)
  - Arch: sudo pacman -S wtype
  - Debian/Ubuntu: sudo apt install wtype
- Rust (latest stable)
- GPUI v0.2.0 dependencies
cd ~/projects/audio-flow
cargo build --release

If you haven't installed Whisper.cpp already:
cd ~/open-source-projects/
git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp
make
bash ./models/download-ggml-model.sh base.en

pw-record --help # Should show usage information

Add the following to your Hyprland config (~/.config/hypr/hyprland.conf):
# Audio Flow - configure as floating, centered window
windowrulev2 = float, class:^(audio-flow)$
windowrulev2 = size 400 100, class:^(audio-flow)$
windowrulev2 = center, class:^(audio-flow)$
windowrulev2 = noborder, class:^(audio-flow)$
windowrulev2 = noshadow, class:^(audio-flow)$
Add a keybind to launch the audio flow application:
# Launch audio transcription window
bind = SUPER, R, exec, ~/projects/audio-flow/target/release/audio-flow
# Optional: Auto-start recording on launch
bind = SUPER_SHIFT, R, exec, ~/projects/audio-flow/target/release/audio-flow --start-recording-on-launch
Note: The app enforces single-instance mode. Pressing Super+R when an instance is already running will close the existing instance and start a new one. This ensures you always get a fresh recording session.
- --start-recording-on-launch - Skip the Idle state and start recording immediately when the app launches
  - Use this for a faster workflow: launch → app starts recording → press key to stop and transcribe
  - Example: audio-flow --start-recording-on-launch
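As a rough illustration of how the flag could be detected at startup, here is a minimal sketch using only the standard library. The function name `should_auto_start` is hypothetical and is not taken from the project's source; the real binary may well use an argument-parsing crate instead.

```rust
/// Hypothetical helper: returns true when the auto-start flag is present.
/// `args` is expected to exclude the program name,
/// e.g. `std::env::args().skip(1)`.
pub fn should_auto_start<I, S>(args: I) -> bool
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    // Exact match only; unknown flags are ignored in this sketch.
    args.into_iter()
        .any(|arg| arg.as_ref() == "--start-recording-on-launch")
}
```

Called as `should_auto_start(std::env::args().skip(1))`, the result would decide whether the app opens in the Idle state or goes straight to Recording.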
Fastest workflow (using Super+R for everything):
- Press Super+R → App launches and starts recording immediately
- Speak your message
- Press Super+R → App stops recording, transcribes, copies to clipboard, and closes automatically
This gives you a seamless experience where Super+R handles both launching and completing the transcription.
Press Super+R to start the application. The window will appear centered on your screen.
Standard Mode:
- Click or press Space - Start recording (you'll see animated wave bars)
- Click or press Space again - Stop recording and begin transcription
- Wait - "Transcribing..." state appears briefly
- Done! - Transcribed text appears and is auto-copied to clipboard
Auto-Start Mode (with --start-recording-on-launch flag):
- App starts recording immediately (animated wave bars visible)
- Click or press Space - Stop recording and begin transcription
- Wait - "Transcribing..." state appears briefly
- Done! - Transcribed text appears and is auto-copied to clipboard
- Super+R - Stop recording and transcribe (when app is focused)
- Space - Start/stop recording (same as clicking)
- ESC - Close the window and quit immediately
To start a new recording, simply press Super+R again. The app will automatically close any existing instance and start a fresh recording session.
| State | Description | Visual |
|---|---|---|
| Idle | Ready to record | Blue microphone icon + "Tap to record" |
| Recording | Recording audio | Red dot + animated wave bars |
| Processing | Transcribing audio | Spinner + "Transcribing..." |
| Success | Text ready | Transcribed text + "✓ Copied to clipboard" |
| Error | Something went wrong | Red icon + error message |
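The states in the table can be modelled as a small state machine. The sketch below is an assumption about how `state.rs` might look, not a copy of it: the `Event` type and the `next` method are illustrative names, and events that are not valid in the current state are simply ignored.

```rust
/// The five UI states from the table above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Idle,
    Recording,
    Processing,
    Success(String), // transcribed text
    Error(String),   // error message
}

/// Hypothetical inputs that drive the state machine.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    Toggle,              // click or Space
    Transcribed(String), // transcription finished with this text
    Failed(String),      // recording or transcription failed
}

impl State {
    /// Returns the next state; events that do not apply leave the state unchanged.
    pub fn next(self, event: Event) -> State {
        match (self, event) {
            (State::Idle, Event::Toggle) => State::Recording,
            (State::Recording, Event::Toggle) => State::Processing,
            (State::Processing, Event::Transcribed(text)) => State::Success(text),
            (State::Recording | State::Processing, Event::Failed(msg)) => State::Error(msg),
            (state, _) => state,
        }
    }
}
```

With `--start-recording-on-launch`, the machine would start in `State::Recording` instead of `State::Idle`; everything after that point is the same.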
Audio Flow uses a configuration file located at ~/.config/audio-flow/config.toml. The config file is automatically created with default values on first run.
Edit ~/.config/audio-flow/config.toml to customize paths:
whisper_binary_path = "$HOME/open-source-projects/whisper.cpp/build/bin/whisper-cli"
whisper_model_path = "$HOME/open-source-projects/whisper.cpp/models/ggml-base.en.bin"
pipewire_binary_path = "/usr/bin/pw-record"
recording_output_dir = "$HOME/Music/recordings"
recording_filename = "temp.wav"
auto_paste = true

| Option | Description | Default |
|---|---|---|
| whisper_binary_path | Path to whisper-cli binary | $HOME/open-source-projects/whisper.cpp/build/bin/whisper-cli |
| whisper_model_path | Path to Whisper model file | $HOME/open-source-projects/whisper.cpp/models/ggml-base.en.bin |
| pipewire_binary_path | Path to pw-record binary | /usr/bin/pw-record |
| recording_output_dir | Directory for audio recordings | $HOME/Music/recordings |
| recording_filename | Filename for temporary recordings | temp.wav |
| auto_paste | Automatically paste transcribed text to focused window | true |
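Several defaults start with `$HOME`, which TOML itself does not expand, so the app presumably substitutes the home directory when it loads the file. The sketch below shows one way that could work; `expand_home` is a hypothetical helper, and the actual logic in `config.rs` may differ.

```rust
use std::path::PathBuf;

/// Hypothetical helper: replaces a leading `$HOME` in `raw` with `home`.
/// Only a `$HOME` followed by `/` (or by nothing) is expanded, so a
/// value such as `$HOMEWORK/x` is left untouched.
pub fn expand_home(raw: &str, home: &str) -> PathBuf {
    match raw.strip_prefix("$HOME") {
        Some(rest) if rest.is_empty() || rest.starts_with('/') => {
            PathBuf::from(format!("{}{}", home, rest))
        }
        // No leading $HOME: use the configured value as-is.
        _ => PathBuf::from(raw),
    }
}
```

In a real program `home` would typically come from `std::env::var("HOME")`; it is passed in explicitly here so the function stays easy to test.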
To use the multilingual base model (not English-only):
whisper_model_path = "$HOME/open-source-projects/whisper.cpp/models/ggml-base.bin"

To store recordings in a different location:
recording_output_dir = "/tmp/audio-flow-recordings"
recording_filename = "recording.wav"

To reset to defaults, simply delete the config file:
rm ~/.config/audio-flow/config.toml

The app will recreate it with default values on next launch.
If you see an error message about configuration validation:
Configuration validation failed: Whisper binary not found at: /path/to/whisper-cli

Check your config file at ~/.config/audio-flow/config.toml and verify all paths are correct:
# Check that the config file exists
ls -la ~/.config/audio-flow/config.toml
# Verify paths in config
cat ~/.config/audio-flow/config.toml
# Option 1: Fix the paths manually
nano ~/.config/audio-flow/config.toml
# Option 2: Reset to defaults
rm ~/.config/audio-flow/config.toml
audio-flow # Will recreate with defaults

# Try launching manually
~/projects/audio-flow/target/release/audio-flow
# Check Hyprland window rules
hyprctl clients | grep audio-flow
# Verify the binary exists
ls -la ~/projects/audio-flow/target/release/audio-flow

# Check PipeWire status
systemctl --user status pipewire
# Test recording manually
pw-record test.wav
# Press Ctrl+C after a few seconds
aplay test.wav

# Check Whisper.cpp binary
~/open-source-projects/whisper.cpp/build/bin/whisper-cli --help
# Check model file
ls -lh ~/open-source-projects/whisper.cpp/models/ggml-base.en.bin

Make sure you're running under Wayland (not X11):
echo $XDG_SESSION_TYPE # Should output "wayland"

Check if wtype is installed:
which wtype # Should show /usr/bin/wtype or similar

Install wtype if missing:
# Arch Linux
sudo pacman -S wtype
# Debian/Ubuntu
sudo apt install wtype

Disable auto-paste if needed:
Edit ~/.config/audio-flow/config.toml:
auto_paste = false

Terminal paste not working:
Auto-paste detects terminals and uses Ctrl+Shift+V automatically for:
- Ghostty, Kitty, Alacritty, WezTerm, Foot, Konsole
- GNOME Terminal, xterm, Terminator, Tilix
- And many others
If your terminal isn't detected, the app will use Ctrl+V (which may not work). Open an issue on GitHub with your terminal name.
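To make the detection rule concrete, here is a minimal sketch of mapping a window class to a paste shortcut. It is an assumption about how `paste.rs` might approach this, not its actual code: `paste_shortcut`, `PasteShortcut`, and `TERMINAL_HINTS` are illustrative names, the hint list only covers the terminals named above, and the window class is assumed to come from something like `hyprctl activewindow`.

```rust
/// Which key combination to send for pasting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PasteShortcut {
    CtrlV,
    CtrlShiftV,
}

/// Illustrative lowercase substrings of known terminal window classes.
const TERMINAL_HINTS: &[&str] = &[
    "ghostty", "kitty", "alacritty", "wezterm", "foot",
    "konsole", "gnome-terminal", "xterm", "terminator", "tilix",
];

/// Hypothetical helper: picks Ctrl+Shift+V for terminals, Ctrl+V otherwise.
pub fn paste_shortcut(window_class: &str) -> PasteShortcut {
    // Compare case-insensitively, since classes vary ("Alacritty", "kitty").
    let class = window_class.to_lowercase();
    if TERMINAL_HINTS.iter().any(|hint| class.contains(*hint)) {
        PasteShortcut::CtrlShiftV
    } else {
        PasteShortcut::CtrlV
    }
}
```

Substring matching is deliberately crude: it handles reverse-DNS class names such as `com.mitchellh.ghostty`, but it could also misclassify an unrelated app whose class happens to contain one of the hints.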
audio-flow/
├── src/
│ ├── main.rs # Application entry point
│ ├── app.rs # Main app logic and window management
│ ├── state.rs # State machine (Idle/Recording/Processing/Success/Error)
│ ├── ui/
│ │ ├── mod.rs # UI module exports
│ │ └── pill.rs # Pill UI component with animations
│ ├── audio/
│ │ ├── mod.rs # Audio module exports
│ │ ├── recorder.rs # PipeWire integration
│ │ └── transcriber.rs # Whisper.cpp integration
│ ├── clipboard.rs # Wayland clipboard integration
│ ├── paste.rs # Auto-paste with Hyprland window detection
│ ├── config.rs # Configuration file management
│ └── notifications.rs # Desktop notifications
├── CLAUDE.md # Development learnings and best practices
├── README.md # User documentation
└── Cargo.toml # Dependencies
cargo build # Debug build
cargo build --release # Release build (recommended)
cargo run # Run debug build

- GPUI Patterns: See CLAUDE.md for GPUI-specific learnings and best practices
- GPUI API: https://docs.rs/gpui/0.2.0/gpui/
- Hyprland: https://wiki.hypr.land/
[Your License Here]
- Built with GPUI from Zed
- Transcription powered by Whisper.cpp
- Audio recording via PipeWire