BoWW Server (Broadcast-On-Wakeword)

Hardware-aware audio streaming server.

This server manages audio ingestion from distributed clients via WebSockets.
It decouples machine hearing (VAD) from human listening (Recording) to ensure high-precision detection without compromising the dynamic range of the collected dataset.

🚀 Features

Zero-Conf Discovery: Automatically discoverable on the network via mDNS/Bonjour (_boww._tcp).
Group Arbitration: Handles multiple clients competing for the same audio channel using confidence scores and mutex locking.
Sidechain DSP Architecture:
- Path A (Detection): Aggressive AGC + Silero VAD (V5) for >99% speech detection accuracy.
- Path B (Recording): Clean, dynamic audio path with safety limiting (anti-clipping) for high-quality ASR datasets.
Hardware Efficient: Written in C++17, utilizing ONNX Runtime and WebSocket++ for low-latency performance on edge devices.

🛠️ Installation (Raspberry Pi)

1. Clone the Repository

git clone https://github.com/yourusername/boww_server.git
cd boww_server

2. Install Dependencies

We provide a helper script that installs the system libraries (ALSA, Boost, Avahi) and fetches the ARM64 binary for ONNX Runtime (v1.16.3).

chmod +x setup_env_pi.sh
./setup_env_pi.sh

3. Fetch AI Models

Download the Silero VAD V5 model required by the pipeline. This places silero_vad.onnx into the models/ directory.

python3 setup_resources.py

To run the server from the repository root with this model and debug output enabled:

./boww_server -c ./ -m ./models/silero_vad.onnx -d

🏗️ Build Instructions
The project uses CMake and links against the local ONNX Runtime found in libs/.

mkdir build
cd build
cmake ..
make -j2  # Use -j1 on Pi Zero if memory is tight

Run the Server

# Standard run
./boww_server

# Debug mode (view VAD probabilities, AGC gain levels, and mDNS logs)
./boww_server --debug

🧪 Testing (Python Client)
Included is test_client_discovery.py, a robust test harness that simulates a hardware client (like an ESP32 or another Pi).

Prerequisites
You need a 16 kHz mono WAV file named jfk-sil.wav in the same directory (or update the WAV_FILE variable in the script).

pip install websockets zeroconf
python3 test_client_discovery.py
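Before running the client, it can help to confirm that the input file really is 16 kHz mono. The following is a minimal sketch using only the Python standard library; the helper names check_wav and write_silence are illustrative and are not part of the repository.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return (ok, details) describing the WAV file at `path`."""
    with wave.open(path, "rb") as wf:
        details = {
            "rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
    ok = (details["rate"] == expected_rate
          and details["channels"] == expected_channels)
    return ok, details


def write_silence(path, seconds=1.0, rate=16000):
    """Write a mono 16-bit silent WAV, usable as a placeholder input."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
```

Calling check_wav("jfk-sil.wav") returns a flag plus the measured rate, channel count, sample width, and duration, so a mismatched file can be spotted before it is streamed.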

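To load snd-aloop (the ALSA loopback driver, usable as a Linux virtual mic):
sudo modprobe snd-aloop
To load it automatically on boot:
echo "snd-aloop" | sudo tee -a /etc/modules
Then list the playback devices to find the Loopback card:
aplay -l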
**** List of PLAYBACK Hardware Devices ****
card 0: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: vc4hdmi [vc4-hdmi], device 0: MAI PCM i2s-hifi-0 [MAI PCM i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: Loopback [Loopback], device 0: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 2: Loopback [Loopback], device 1: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7

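Play into device 0 of the Loopback card and the audio becomes available as a normal mic on device 1 of the same card. In the listing above Loopback is card 2, so that is plughw:2,0 for playback and hw:2,1 for capture; substitute your own card number.
Use this for streaming ASR engines that expect a mic input, or save to file via the clients.yaml settings.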

Test Workflow
Discovery: Scans mDNS for _boww._tcp.

Handshake: Connects and authenticates via clients.yaml.

Arbitration: Sends a confidence score ({"type": "confidence", "value": 1.0}).

Streaming: Streams audio in 64ms chunks upon winning the floor.

Auto-Stop: Server detects silence via VAD and sends a STOP command; client disconnects.
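The arbitration message and the 64 ms chunking in the workflow above can be sketched as follows. The sample rate comes from the WAV prerequisite; the 16-bit sample width and the helper names are assumptions for illustration, not taken from test_client_discovery.py.

```python
import json

SAMPLE_RATE_HZ = 16000  # from the 16 kHz mono WAV prerequisite
CHUNK_MS = 64           # from the Streaming step
BYTES_PER_SAMPLE = 2    # assumption: 16-bit PCM


def chunk_size_bytes(rate_hz=SAMPLE_RATE_HZ, chunk_ms=CHUNK_MS,
                     width=BYTES_PER_SAMPLE):
    """Bytes in one mono chunk of `chunk_ms` milliseconds."""
    return rate_hz * chunk_ms // 1000 * width


def confidence_message(value):
    """Serialize the arbitration message shown in the Arbitration step."""
    return json.dumps({"type": "confidence", "value": float(value)})


def iter_chunks(pcm, size):
    """Yield consecutive `size`-byte slices of `pcm`; the last may be shorter."""
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]
```

Under these assumptions one chunk is 1024 samples, or 2048 bytes. A client would send confidence_message(1.0) as a text frame and, once it wins the floor, send each slice from iter_chunks as a binary frame.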

⚙️ Process Architecture
The BoWW Server operates as a stateful pipeline designed to optimize both detection and recording quality simultaneously.

Discovery & Handshake
The server broadcasts availability via Avahi (mDNS).

Clients connect via persistent WebSocket.

Clients are authenticated against clients.yaml. Unknown clients are assigned a temp-ID for onboarding.
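The handshake rule above (known clients keep their ID, unknown clients get a temporary one) can be illustrated with a small sketch. The schema of clients.yaml is not shown here, so this version takes a plain collection of known IDs; the class name and the "temp-N" ID format are hypothetical.

```python
class ClientRegistry:
    """Resolve a presented client ID to a known ID or a temporary one."""

    def __init__(self, known_ids):
        # In the real server these would come from clients.yaml.
        self._known = set(known_ids)
        self._next_temp = 1

    def resolve(self, presented_id):
        """Return (client_id, is_known) for a connecting client."""
        if presented_id in self._known:
            return presented_id, True
        # Hypothetical temp-ID format, used only for this illustration.
        temp_id = f"temp-{self._next_temp}"
        self._next_temp += 1
        return temp_id, False
```

For example, a registry built from ["kitchen"] resolves "kitchen" to itself and hands any other ID a fresh temporary identifier for onboarding.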

The "Sidechain" Audio Pipeline
When a client streams audio, the signal is split into two parallel processing paths:

Path A: The VAD Sidechain (The Brain)

Input: Raw Audio

AGC: Applies aggressive gain (targeting -4dB) to normalize whispers or distant speech.

Inference: The boosted signal is fed to Silero VAD V5 via ONNX Runtime.

Result: High-precision Probability output (0.0 - 1.0).

Path B: The Audio Sink (The File)

Input: Raw Audio (Same source as A).

Processing: The AGC is bypassed to preserve natural dynamics.

Safety Limiter: The signal is multiplied by a fixed gain of 0.4 (about -8 dB) to leave headroom against hardware clipping.

Output: Written to disk (WAV) or Hardware Output (ALSA).
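The split between the two paths can be sketched on one chunk of float samples in the range -1.0 to 1.0. This is a simplified illustration, not the server's C++ implementation: a per-chunk peak normalizer stands in for the AGC, the -4 dB target is assumed to mean peak level relative to full scale, and MAX_AGC_GAIN is a hypothetical cap.

```python
AGC_TARGET_DB = -4.0   # from the text; assumed to be peak level in dBFS
RECORD_GAIN = 0.4      # fixed recording-path gain from the text
MAX_AGC_GAIN = 100.0   # hypothetical cap so near-silence is not boosted without bound


def agc_gain(samples, target_db=AGC_TARGET_DB, max_gain=MAX_AGC_GAIN):
    """Gain that brings the chunk's peak to the target level, capped at max_gain."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return 1.0  # pure silence: nothing to normalize
    target_peak = 10 ** (target_db / 20)
    return min(target_peak / peak, max_gain)


def split_sidechain(samples):
    """Return (vad_path, record_path) computed from the same raw chunk."""
    gain = agc_gain(samples)
    # Path A: boosted copy for the VAD, clamped to the valid sample range.
    vad_path = [max(-1.0, min(1.0, s * gain)) for s in samples]
    # Path B: AGC bypassed, only the fixed safety gain applied.
    record_path = [s * RECORD_GAIN for s in samples]
    return vad_path, record_path
```

The point of the design is visible in the return value: the VAD sees a normalized signal, while the recorded copy keeps the relative loudness of the original. A production AGC would normally smooth its gain over time rather than recompute it per chunk.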

State Management
Jitter Buffer: Smooths out network inconsistency before writing to disk.

VAD Logic: Maintains a "Speech State". If silence persists beyond vad_no_voice_ms (configurable), the server autonomously closes the file and terminates the stream.
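The silence timeout described under VAD Logic can be sketched as a small state machine fed with one VAD probability per chunk. The 0.5 speech threshold, the 64 ms chunk duration, and the class name are assumptions for illustration; only vad_no_voice_ms is named in the text.

```python
class SilenceTimeout:
    """Track speech state and signal when silence has lasted long enough."""

    def __init__(self, vad_no_voice_ms, chunk_ms=64, threshold=0.5):
        self.vad_no_voice_ms = vad_no_voice_ms
        self.chunk_ms = chunk_ms      # assumed duration of one chunk
        self.threshold = threshold    # hypothetical speech/non-speech cutoff
        self.in_speech = False
        self.silence_ms = 0

    def update(self, probability):
        """Feed one chunk's VAD probability; return True when the stream should stop."""
        if probability >= self.threshold:
            # Voice detected: enter speech state and reset the silence counter.
            self.in_speech = True
            self.silence_ms = 0
            return False
        self.silence_ms += self.chunk_ms
        if self.silence_ms >= self.vad_no_voice_ms:
            self.in_speech = False
            return True
        return False
```

Any chunk classified as speech resets the counter, so short pauses inside an utterance do not end the recording; only an uninterrupted run of silent chunks reaching vad_no_voice_ms does.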

======================================================
 BoWWServer - Edge Smart Speaker Master Node
======================================================
Description:
  The BoWWServer coordinates multiple edge clients on the local
  network. It handles 200ms network arbitration to seamlessly
  determine the closest smart speaker, buffers incoming audio,
  runs the Silero VAD engine to detect when the user stops
  speaking, and outputs clean WAV files ready for STT pipelines.

Usage: ./boww_server [OPTIONS]

Options:
  -c, --config   Path to config dir (default: ../)
  -m, --model    Path to Silero VAD model (default: ../models/silero_vad.onnx)
  -p, --port     WebSocket listener port (default: 9002)
  -d, --debug    Enable Debug Mode (Live VAD probabilities and peak volume)
  -h, --help     Show this help message and exit
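
The 200 ms arbitration mentioned in the description can be illustrated with a simplified, single-threaded sketch. It assumes the window opens at the first confidence message, that the highest score wins, and that ties go to the earliest arrival; the tuple layout and tie-break rule are assumptions, and the mutex locking used by the real server is omitted.

```python
def arbitrate(candidates, window_ms=200):
    """Pick the winning client ID from (client_id, confidence, arrival_ms) tuples.

    Only candidates arriving within `window_ms` of the first arrival compete.
    Returns None when there are no candidates.
    """
    if not candidates:
        return None
    start = min(arrival for _, _, arrival in candidates)
    in_window = [c for c in candidates if c[2] - start <= window_ms]
    # Highest confidence wins; on a tie, the earlier arrival wins.
    winner = max(in_window, key=lambda c: (c[1], -c[2]))
    return winner[0]
```

With candidates ("kitchen", 0.7, 0), ("lounge", 0.9, 120), and ("hall", 0.99, 350), the hall client arrives after the window has closed, so the lounge client wins the floor.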
