An experimental Linux-Voice-Assistant software for [Home Assistant](https://www.home-assistant.io/).

This project enables you to build a Linux-based voice assistant designed to use [Assist](https://www.home-assistant.io/voice_control/) for Home Assistant. It allows you to create your own smart speaker that runs on any x64 or ARM64 hardware capable of handling local audio processing (using PulseAudio).

Unlike simpler voice satellites that run on microcontrollers with very limited compute power, this setup can perform local wake word detection (OWW/MWW) and process some data on-device.

Because it runs on a full Linux system, this approach offers access to significantly more local computing resources for additional features and other integrations on the same satellite, and it provides greater flexibility for customization (such as experimenting with PipeWire).

- Prebuilt Docker image available on [GitHub Container Registry](https://github.com/OHF-Voice/linux-voice-assistant/pkgs/container/linux-voice-assistant)
- Prebuilt [Raspberry Pi image](https://github.com/florian-asche/PiCompose)

## Requirements

### Hardware
- Microphone: Device must support 16 kHz mono audio
- CPU: 1 GHz
- Memory: min. 512 MB
- Storage: The OS and software take around 4 GB
- OS: linux/amd64 or linux/aarch64

A more extensive list of possible compatible hardware can be found in the [PiCompose documentation](https://github.com/florian-asche/PiCompose). In theory, any microphone that works with [PipeWire (multimedia framework for Linux)](https://pipewire.org/) can be used for voice input with the prebuilt image from there, but you should preferably use a far-field microphone-array solution for better results. If you use your own USB microphone, **the microphone device must support 16 kHz mono audio** for optimal voice recognition performance.
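To check whether a recording actually matches this format, you can inspect a captured WAV file with a few lines of standard-library Python. The filename and the generated test tone below are placeholders; in practice you would check a clip recorded from your own microphone (e.g. with `arecord -f S16_LE -r 16000 -c 1 -d 2 test.wav`):

```python
import math
import struct
import wave

# Stand-in for a real recording: write a 1-second 440 Hz test tone as
# 16 kHz mono, 16-bit PCM. Replace this block with a clip captured from
# your actual microphone.
with wave.open("test.wav", "wb") as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 16-bit samples
    w.setframerate(16000)  # 16 kHz
    w.writeframes(b"".join(
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * 440 * n / 16000)))
        for n in range(16000)
    ))

# Verify the format the voice assistant expects.
with wave.open("test.wav", "rb") as w:
    is_16khz_mono = w.getnchannels() == 1 and w.getframerate() == 16000
    print("16 kHz mono:", is_16khz_mono)
```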

Two recommended solutions for setups today are:

- a Raspberry Pi Zero 2 W (a single-board computer with built-in WiFi) in combination with the [Satellite1 Hat Board](https://futureproofhomes.net/products/satellite1-top-microphone-board)
- at least a Raspberry Pi 3 with the [Respeaker Lite](https://wiki.seeedstudio.com/reSpeaker_usb_v3/). The Respeaker Lite currently has a problem with the Zero 2 W.

These mic boards have a microphone array designed for far-field voice capture, with the added benefit of an onboard XMOS DSP microcontroller running custom firmware that performs advanced audio pre-processing for microphone cleanup (Noise Suppression, Acoustic Echo Cancellation, Interference Cancellation, and Automatic Gain Control), resulting in very good voice recognition capabilities.

You can also install LVA on AMD64 devices, for example on your Linux desktop computer.
Alternatively, on a lower budget you could use other microphone-array boards, for example the [reSpeaker 2-Mics Pi HAT V2.0](https://wiki.seeedstudio.com/ReSpeaker_2_Mics_Pi_HAT/) (which uses a much more basic audio codec chip).

## Usage

### Installation

For Raspberry Pi users, we provide a prebuilt image that can be flashed to an SD card. See [PiCompose](https://github.com/florian-asche/PiCompose).

For all other users, we have different installation methods available (Docker, systemd), each with its own dedicated instructions. See [Linux-Voice-Assistant - Installation](docs/install.md).

### Parameter overview

💡 **Note:** There is an [environment variable](docs/install_application.md#environment-variables-reference) for each parameter if you use a Docker- or systemd-based setup.
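For example, in a Docker Compose setup the flags can be supplied as environment variables instead of command-line arguments. This is only a sketch: the variable names shown here (`NAME`, `WAKE_MODEL`) are assumptions, so check the environment variables reference above for the actual names.

```yaml
# Illustrative sketch only - consult the environment variables reference
# for the real variable names.
services:
  linux-voice-assistant:
    image: ghcr.io/ohf-voice/linux-voice-assistant:latest
    environment:
      NAME: "kitchen-satellite"   # corresponds to --name (assumed name)
      WAKE_MODEL: "okay_nabu"     # corresponds to --wake-model (assumed name)
    devices:
      - /dev/snd                  # expose sound devices to the container
    restart: unless-stopped
```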

```sh
usage: __main__.py [-h] [--name NAME] [--audio-input-device AUDIO_INPUT_DEVICE] [--list-input-devices] [--audio-input-block-size AUDIO_INPUT_BLOCK_SIZE] [--audio-output-device AUDIO_OUTPUT_DEVICE] [--list-output-devices] [--wake-word-dir WAKE_WORD_DIR] [--mic-auto-gain] [--mic-noise-suppression]
[--wake-model WAKE_MODEL] [--stop-model STOP_MODEL] [--download-dir DOWNLOAD_DIR] [--refractory-seconds REFRACTORY_SECONDS] [--wakeup-sound WAKEUP_SOUND] [--timer-finished-sound TIMER_FINISHED_SOUND] [--processing-sound PROCESSING_SOUND]
[--mute-sound MUTE_SOUND] [--unmute-sound UNMUTE_SOUND] [--preferences-file PREFERENCES_FILE] [--host HOST] [--network-interface NETWORK_INTERFACE] [--port PORT] [--enable-thinking-sound] [--debug]
```

| Parameter | Description | Default |
| ---------------------------- | --------------------------------------------------------------- | ----------------------------------- |
| `--name` | Name of the voice assistant device (required) | Autogenerated (`lva-MAC-ADDRESS`) |
| `--audio-input-device` | Soundcard name for input device | Autodetected |
| `--audio-input-block-size` | Audio input block size in samples | 1024 |
| `--audio-output-device` | mpv name for output device | Autodetected |
| `--mic-volume` | Control microphone volume | 1.0 |
| `--mic-auto-gain` | Add WebRTC Gain to Mic | 0 |
| `--mic-noise-suppression` | Add WebRTC Noise Suppression to Mic | 0 |
| `--wake-word-dir` | Directory with wake word models (.tflite) and configs (.json) | `wakewords/` |
| `--wake-model` | ID of active wake word model | `okay_nabu` |
| `--stop-model` | ID of stop model | `stop` |
| `--download-dir` | Directory to download custom wake word models, etc. | `local/` |
| `--refractory-seconds` | Seconds before wake word can be activated again | 2.0 |
| `--timer-max-ring-seconds` | Seconds after which the timer stops ringing | 900.0 |
| `--wakeup-sound` | Sound file played when wake word is detected | `sounds/wake_word_triggered.flac` |
| `--timer-finished-sound` | Sound file played when timer finishes | `sounds/timer_finished.flac` |
| `--processing-sound` | Sound played while assistant is processing | `sounds/processing.wav` |
| `--mute-sound` | Sound played when muting the assistant | `sounds/mute_switch_on.flac` |
| `--unmute-sound` | Sound played when unmuting the assistant | `sounds/mute_switch_off.flac` |
| `--preferences-file` | Path to preferences JSON file | `preferences.json` |
| `--host` | IP-Address for ESPHome server, use 0.0.0.0 for all | Autodetected |
| `--network-interface` | Network interface for ESPHome server | Autodetected |
| `--port` | Port for ESPHome server | 6053 |
| `--enable-thinking-sound` | Enable thinking sound on startup | False |
| `--debug` | Print DEBUG messages to console | False |
| `--output-only` | Enable output only mode | False |

💡 **Note:** There is a detailed explanation on the gain, noise suppression, and wake word sensitivity flags in the [audio options](docs/audio_options.md) file.
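As an illustration of how the parameters above fit together, a systemd unit might invoke the assistant like this. The unit name, installation path, module name, and flag values are assumptions for this sketch; see the installation instructions linked above for the supported setup.

```ini
# /etc/systemd/system/linux-voice-assistant.service (illustrative sketch,
# paths and values are examples - adjust to your installation)
[Unit]
Description=Linux Voice Assistant satellite
After=network-online.target sound.target

[Service]
ExecStart=/opt/linux-voice-assistant/.venv/bin/python -m linux_voice_assistant \
    --name kitchen-satellite \
    --wake-model okay_nabu \
    --port 6053
Restart=on-failure

[Install]
WantedBy=multi-user.target
```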

## Build Information

The documentation for the build process can be found in the GitHub Actions workflows.
### Code Quality Checks

The project uses the following tools to ensure code quality:

- **Black**: Code formatting (88 characters per line, PEP 8 compliant)
- **isort**: Import sorting compatible with Black
- **flake8**: Style and syntax checks

To use the development tools (linting, testing, etc.), you need to install the required dependencies:

```sh
./script/setup --dev
source .venv/bin/activate
```

### Linting Commands

#### Run all linting checks

```sh
./script/lint...
```

#### Individual linting commands (with auto-fix support)


| Script                 | Description                              | Auto-fix Available?    |
| ---------------------- | ---------------------------------------- | ---------------------- |
| `./script/lint_black`  | Checks Python code formatting with Black | Yes, use `--auto` flag |
| `./script/lint_flake8` | Runs style and syntax checks with flake8 | No                     |
| `./script/lint_isort`  | Checks import sorting with isort         | Yes, use `--auto` flag |
| `./script/lint_mypy`   | Runs static type analysis with mypy      | No                     |
| `./script/lint_pylint` | Runs code quality checks with pylint     | Yes, use `--auto` flag |

#### Examples

Run a specific lint check:

```sh
./script/lint_black
```

Auto-fix formatting issues (Black + isort):

```sh
./script/lint_black --auto
./script/lint_isort --auto
```
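If you want these auto-fixes to run before every commit, one option is a pre-commit hook configuration along the following lines. Note this is not part of the repository and the pinned revisions are placeholders; it is only a sketch of how Black and isort are commonly wired into pre-commit.

```yaml
# .pre-commit-config.yaml - optional, illustrative sketch (not shipped with the repo)
repos:
  - repo: https://github.com/psf/black
    rev: 24.8.0       # placeholder revision - pin to a real tag
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/isort
    rev: 5.13.2       # placeholder revision
    hooks:
      - id: isort
        args: ["--profile", "black"]   # keep isort compatible with Black
```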

### Testing

Run the test suite:

```sh
./script/test
```
