Skip to content

dfeen87/Acoustic-Projection-Microphone-System

Repository files navigation

Acoustic Projection Microphone (APM) System

Build Status C++20 Python React Version Status Platform Build

Production-grade implementation of an advanced acoustic projection microphone system with real-time translation capabilities.

Table of Contents

Features

  • Advanced Beamforming: Delay-and-sum, superdirective, and adaptive null-steering algorithms
  • Deep Noise Suppression: LSTM-based neural network for speech enhancement
  • Acoustic Echo Cancellation: NLMS adaptive filter with double-talk detection
  • Voice Activity Detection: Energy and zero-crossing rate based VAD with hangover mechanism
  • Real-time Translation: Local Whisper + NLLB pipeline (200+ languages, fully offline); TensorFlow Lite engine also supported
  • Directional Audio Projection: Phased array synthesis for targeted audio delivery
  • End-to-End Encryption: ChaCha20-Poly1305 + X25519 key exchange via libsodium; Argon2id password-based key derivation and file encryption
  • Push-to-Talk (PTT) Controller: Hardware/software PTT with keyboard, mouse, external pedal, and software-controlled modes
  • Call Signaling: UDP-based call setup, teardown, and session management with peer discovery
  • Real Audio I/O: PortAudio integration for live microphone capture and speaker playback (when PortAudio is installed)
  • WAV File I/O: Read and write WAV files for offline processing and testing
  • High Performance: FFTW-optimized FFT with STFT support, multi-threaded processing, SIMD-ready
  • Production Launcher: Enterprise-grade startup system with automatic health checks and monitoring
  • REST API with Global Node Access: FastAPI-based REST API for peer discovery and session management across all network nodes

New in Version 7.0.0

  • Auto-Calibration Mode: Streamlines initial setup by automatically tuning key system parameters.
  • Real-Time Monitoring Dashboard: Expands live visibility into system status and processing behavior.
  • Adaptive Feedback Suppression: Reduces acoustic feedback dynamically during operation.
  • Preset Profiles: Adds ready-to-use configuration profiles for common deployment scenarios.
  • Diagnostics: Improves troubleshooting with clearer runtime checks and diagnostic reporting.

🌍 Local Translation (100% Private)

APM System includes fully local speech recognition and translation using state-of-the-art AI models. Your conversations never leave your device.

Features

  • 🔒 100% Private - No cloud APIs, all processing on-device
  • 🌐 200+ Languages - Powered by Meta's NLLB translation model
  • 🎤 Accurate Speech Recognition - OpenAI Whisper for transcription
  • Real-time Performance - 2-4 seconds per sentence (GPU) or 5-8 seconds (CPU)
  • 🚫 No Internet Required - Works completely offline after initial setup

Quick Setup

# One-command setup
./scripts/setup.sh --full

# Or step-by-step: activate venv and run translation bridge
source venv/bin/activate
python3 scripts/translation_bridge.py audio.wav --source en --target es

Platform Notes

  • Supported on Linux and macOS (Intel & Apple Silicon)
  • Text translation fallback uses portable <cctype> classification (no locale-dependent behavior) for cross-platform correctness
  • CI and Docker validate Linux builds; macOS builds are verified locally

Supported Languages

English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, Arabic, Hindi, and 180+ more.

See docs/translation/ for complete translation documentation.


Architecture

┌─────────────────────────────────────────────────────────────┐
│                      APM System Pipeline                     │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────┐
│  Microphone     │───▶│   Beamforming    │───▶│    Echo     │
│  Array (4-16)   │    │   Engine         │    │ Cancellation│
└─────────────────┘    └──────────────────┘    └─────────────┘
                               │                       │
                               ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────┐
│   Directional   │◀───│   Translation    │◀───│    Noise    │
│   Projector     │    │   Engine         │    │ Suppression │
└─────────────────┘    └──────────────────┘    └─────────────┘
        │                                              │
        ▼                                              ▼
┌─────────────────┐                          ┌─────────────┐
│  Speaker Array  │                          │     VAD     │
│    (3-8)        │                          │   Engine    │
└─────────────────┘                          └─────────────┘

Architecture Notes

The codebase is intentionally split into two layers:

  • APMCore (src/apm_core.cpp): lightweight DSP + text fallback suitable for embedding
  • APMSystem (src/core/apm_system.cpp): full real-time pipeline (beamforming, NS, AEC, VAD, translation, projection)

Each component is defined exactly once to ensure clean builds across Linux and macOS.

🚀 Quick Start

Prerequisites

  • Node.js 14+ (for launcher) - Download
  • CMake 3.18+ - Download
  • C++20 Compiler - GCC 10+, Clang 11+, or MSVC 2019+
  • FFTW3 (optional) - sudo apt-get install libfftw3-dev (Linux) or brew install fftw (Mac)
  • libsodium (optional, for encryption) - sudo apt-get install libsodium-dev (Linux) or brew install libsodium (Mac)
  • PortAudio (optional, for live audio I/O) - sudo apt-get install portaudio19-dev (Linux) or brew install portaudio (Mac)

One-Command Launch

# Linux/Mac
./scripts/start-apm.sh

# Windows
scripts\start-apm.bat

That's it! The launcher will:

  • ✅ Validate your environment
  • ✅ Install Node.js dependencies (if needed)
  • ✅ Build the C++ backend (if needed)
  • ✅ Start the APM system
  • ✅ Open the dashboard in your browser

Manual Setup

If you prefer step-by-step control:

# 1. Install launcher dependencies
cd launcher
npm install

# 2. Build C++ backend
cd ..
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release

# 3. Start the system
cd launcher
npm start

The system starts on:

Custom Configuration

# Use different ports
APM_BACKEND_PORT=9000 APM_UI_PORT=5000 npm start

# Enable debug logging
DEBUG=1 npm start

# Combined
APM_BACKEND_PORT=9000 DEBUG=1 npm start

🌐 REST API with Global Node Access

The APM system includes a FastAPI-based REST API server that enables peer discovery and session management across all nodes in the network.

Quick Start - REST API Only

Start the REST API server independently:

# Linux/Mac
./scripts/start-api.sh

# Windows
scripts\start-api.bat

# With custom configuration
python3 backend/main.py --host 0.0.0.0 --port 8080

Global Node Access

By default, the API binds to 0.0.0.0, making it accessible from any node in the network. This enables:

  • Peer Discovery: Other nodes can query this node's peer list
  • Session Management: Remote nodes can initiate and manage sessions
  • Health Monitoring: Network-wide health checks

API Endpoints

Endpoint Method Description
/health GET Health check
/api/peers GET List all discovered peers
/api/peers/{peer_id} GET Get specific peer details
/api/status POST Update local peer status
/api/session POST Create new session
/api/session/{session_id} GET Get session status
/api/session/{session_id}/accept POST Accept session
/api/session/{session_id}/end POST End session

Interactive API Documentation

Once running, visit:

Configuration Options

# Environment variables
export APM_API_HOST=0.0.0.0      # Global access (default)
export APM_API_PORT=8080         # API port (default)

# Command-line arguments
python3 backend/main.py --host 0.0.0.0 --port 8080 --reload

For complete REST API documentation, see backend/README.md


📊 Production Launcher Features

The APM launcher is enterprise-grade with:

🔍 Pre-flight Validation

  • Checks for required executables and files
  • Validates port availability
  • Verifies build artifacts
  • Ensures proper file permissions

🏥 Health Monitoring

  • Automatic backend health checks every 300ms
  • 60-second timeout with informative error messages
  • Real-time process monitoring
  • Captures backend stdout/stderr for debugging

🛡️ Robust Error Handling

  • Graceful shutdown on SIGINT/SIGTERM
  • Force-kill after 5-second timeout
  • Port conflict detection
  • Detailed error messages with solutions

📝 Production Logging

[2025-01-15T10:30:45.123Z] [INFO] Validating environment...
[2025-01-15T10:30:45.456Z] [SUCCESS] Environment validation passed
[2025-01-15T10:30:45.789Z] [INFO] Starting C++ backend...
[2025-01-15T10:30:46.012Z] [Backend] Server listening on port 8080
[2025-01-15T10:30:47.345Z] [SUCCESS] Backend healthy after 3 checks (1234ms)
[2025-01-15T10:30:47.678Z] [SUCCESS] UI server listening on http://localhost:4173
[2025-01-15T10:30:48.901Z] [SUCCESS] APM System is fully operational! 🚀

🔐 Security

  • UI server binds to 127.0.0.1 only (localhost)
  • Security headers enabled (X-Frame-Options, X-XSS-Protection, X-Content-Type-Options)
  • No external file system access from UI server
  • 404 for all non-root paths

⚡ Performance

  • Fast startup: < 5 seconds typical
  • Minimal overhead: ~30MB RAM for launcher
  • Automatic process cleanup
  • Multi-platform support (Windows/Mac/Linux)

🧪 System Validation

Health Check Script

# Validate your entire setup
node scripts/healthcheck.js

Checks:

  • ✅ Node.js version (14+)
  • ✅ CMake installation
  • ✅ C++ compiler availability
  • ✅ File structure integrity
  • ✅ Backend binary exists
  • ✅ Dependencies installed
  • ✅ Runtime status (if running)
  • ✅ Port availability

Integration Tests

# Run full integration test suite
node tests/integration/integration.test.js

Tests include:

  • Backend health endpoint
  • Response time benchmarks
  • Concurrent request handling
  • UI server functionality
  • Security headers
  • Load testing (100 sequential, 50 concurrent requests)

📁 Project Structure Overview

Acoustic-Projection-Microphone-System/
├── backend/                     # Python FastAPI backend and signaling/control modules
├── cmake/                       # CMake package/config helpers
├── config/                      # Optional/runtime dependency configuration
├── docker/                      # Container build assets
├── docs/                        # Architecture, build, deployment, security, and user docs
├── examples/                    # C++ and Python usage examples
├── frontend/                    # Frontend adapters/hooks/components
├── include/apm/                 # Public C++ API headers
├── installers/                  # Installer assets (e.g., Inno Setup)
├── launcher/                    # Node.js launcher/orchestration entrypoint
├── scripts/                     # Setup, startup, and utility scripts
├── src/                         # Core C++ implementation (DSP, signaling, translation, tools)
├── tests/                       # C++ and Node integration tests
├── tools/                       # Utility tools and scripts
├── ui/                          # React + Vite web UI
├── CMakeLists.txt               # Root CMake build configuration
├── README.md                    # Project overview and usage
├── TESTING_AND_VALIDATION.md    # Validation matrix and test guidance
└── SYSTEM_MANIFEST.md           # Repository/system manifest

🐳 Docker Deployment

Quick Start

# Build the image
docker build -t apm-system .

# Run example
docker run --rm apm-system

# Development environment
docker run -it --rm -v $(pwd):/workspace/apm apm-system:development

Production Deployment

FROM node:18-alpine AS launcher
WORKDIR /app
COPY launcher/package*.json ./
RUN npm ci --production

FROM gcc:11 AS backend
WORKDIR /app
COPY . .
RUN cmake -B build -DCMAKE_BUILD_TYPE=Release && \
    cmake --build build --config Release

FROM node:18-alpine
WORKDIR /app
COPY --from=launcher /app/node_modules ./launcher/node_modules
COPY --from=backend /app/build/apm_backend ./apm_backend
COPY launcher/apm_launcher.js ./launcher/
COPY ui/apm-dashboard.html ./ui/

EXPOSE 8080 4173
CMD ["node", "launcher/apm_launcher.js"]

🛠️ Troubleshooting

Launcher Issues

Error: Backend executable not found

# Rebuild the backend
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release

Error: Backend port 8080 is already in use

# Find and kill the process
lsof -i :8080          # Linux/Mac
netstat -ano | findstr :8080  # Windows

# Or use a different port
APM_BACKEND_PORT=8081 npm start

Error: Backend health check timed out

# Run backend directly to see errors
./apm_backend  # or ./build/apm_backend

# Check for:
# - Firewall blocking localhost
# - Missing dependencies
# - Port conflicts

Error: UI file not found

# Verify file exists in parent directory
ls -la ../apm-dashboard.html

# File must be at: apm/apm-dashboard.html
# Launcher must be at: apm/launcher/apm_launcher.js

Build Issues

Q: Build fails with "fftw3.h not found"
A: Install FFTW:

sudo apt-get install libfftw3-dev  # Ubuntu/Debian
brew install fftw                   # macOS
vcpkg install fftw3                 # Windows

Q: Tests fail with "Segmentation fault"
A: Check audio frame sizes match across components. Ensure FFT size ≤ frame size.

Q: Poor beamforming performance
A: Verify microphone spacing matches speed of sound. Calibrate microphone positions.

Q: High CPU usage
A: Reduce sample rate from 48kHz to 16kHz for lower quality requirements.

Q: Echo cancellation not working
A: Ensure speaker reference signal is provided. Check for timing synchronization.

Getting Help

  1. Check logs: Enable debug mode with DEBUG=1 npm start
  2. Run health check: node scripts/healthcheck.js
  3. Verify prerequisites: Node.js 14+, CMake 3.18+, C++20 compiler
  4. Check ports: Ensure 8080 and 4173 are available

📊 Performance

Benchmarked on Intel i7-12700K, 32GB RAM, Ubuntu 22.04:

Component Processing Time (20ms frame) Throughput
Beamforming (4 mics) 0.8ms 25x real-time
Noise Suppression 2.1ms 9.5x real-time
Echo Cancellation 0.5ms 40x real-time
VAD 0.1ms 200x real-time
Full Pipeline 4.2ms 4.8x real-time
Launcher Overhead < 50ms N/A

Memory usage:

  • Backend: ~15MB (without TFLite models)
  • Launcher: ~30MB
  • Total: ~45MB baseline

💻 API Documentation

Public Facade

APMCore

Lightweight public facade for initializing the system and performing audio processing and text translation. Suitable for embedding.

APMCore core;
core.initialize(48000, 1);           // sample_rate, num_channels
core.set_source_language("en");
core.set_target_language("es");

std::vector<float> out = core.process(input_samples);

APMCore::TextTranslationResult r = core.translate_text("Hello");
// r.success, r.translated_text, r.processing_time_ms

Note: The C++ text fallback in APMCore supports only en→es and en→fr. Full 200+ language support requires the Python translation bridge (scripts/translation_bridge.py).

Core Classes

AudioFrame

Encapsulates audio data with metadata.

AudioFrame(size_t samples, int sample_rate, int channels);
std::span<float> samples();           // Access audio data
void compute_metadata();              // Calculate peak, RMS, clipping
std::vector<float> channel(int ch);   // Extract single channel

BeamformingEngine

Spatial filtering for directional audio capture.

BeamformingEngine(int num_mics, float spacing_m);

AudioFrame delay_and_sum(
    const std::vector<AudioFrame>& mic_array,
    float azimuth_rad,
    float elevation_rad
);

AudioFrame superdirective(
    const std::vector<AudioFrame>& mic_array,
    float azimuth_rad
);

NoiseSuppressionEngine

Deep learning-based noise reduction.

AudioFrame suppress(const AudioFrame& noisy);
void reset_state();  // Reset LSTM state

EchoCancellationEngine

Adaptive echo cancellation with NLMS.

EchoCancellationEngine(int filter_length = 2048);

AudioFrame cancel_echo(
    const AudioFrame& microphone,
    const AudioFrame& speaker_reference
);

bool detect_double_talk(const AudioFrame& mic, const AudioFrame& ref);
void reset();  // Reset adaptive filter weights and reference buffer

VoiceActivityDetector

Speech/non-speech classification.

struct VadResult {
    bool speech_detected;
    float confidence;      // 0.0 to 1.0
    float snr_db;
    float energy_db;
};

VadResult detect(const AudioFrame& frame);
void adapt_threshold(float ambient_noise_db);
void reset();  // Reset hangover counter

FFTProcessor

High-performance FFT using FFTW (only available when built with FFTW3; throws at runtime otherwise).

FFTProcessor(int size);

void forward(const std::vector<float>& input,
            std::vector<std::complex<float>>& output);

void inverse(const std::vector<std::complex<float>>& input,
            std::vector<float>& output);

static void apply_window(std::vector<float>& data, WindowType type);
// WindowType: Hann, Hamming, Blackman, Kaiser

APMSystem

Complete processing pipeline.

struct Config {
    int num_microphones = 4;
    float mic_spacing_m = 0.012f;
    int num_speakers = 3;
    float speaker_spacing_m = 0.015f;
    int sample_rate = 48000;
    std::string source_language = "en-US";
    std::string target_language = "es-ES";
};

APMSystem(const Config& config);
APMSystem();  // Default constructor (uses Config defaults)

std::vector<AudioFrame> process(
    const std::vector<AudioFrame>& microphone_array,
    const AudioFrame& speaker_reference,
    float target_direction_rad
);

std::future<std::vector<AudioFrame>> process_async(...);
void reset_all();  // Reset echo canceller, noise suppressor, and VAD state

🧪 Testing

# Run all tests
cd build && ctest --output-on-failure

# Run specific test suite
./apm_test --gtest_filter=BeamformingTest.*

# Run with detailed output
./apm_test --gtest_output=xml:test_results.xml

# Memory leak check
valgrind --leak-check=full ./apm_test

# Performance profiling
perf record ./apm_bench
perf report

# Integration tests
node tests/integration/integration.test.js

Test coverage: 87% (lines), 92% (functions)


⚙️ Configuration

Hardware Setup

Microphone Array:

  • Linear array: 4-8 microphones
  • Spacing: 10-15mm (λ/2 at 11kHz)
  • Recommended: omnidirectional electret or MEMS

Speaker Array:

  • Linear array: 3-6 speakers
  • Spacing: 15-20mm
  • Recommended: full-range drivers, 85dB+ SPL

Software Configuration

// Low-latency configuration
config.num_microphones = 4;
config.mic_spacing_m = 0.012f;

// High-quality configuration
config.num_microphones = 8;
config.mic_spacing_m = 0.010f;

// Language support
config.source_language = "en-US";  // English
config.target_language = "es-ES";  // Spanish
// APMSystem accepts any IETF language tag; 200+ languages available via Python bridge
// C++ text fallback (APMCore) supports: en→es, en→fr

Environment Variables

# Launcher configuration
export APM_BACKEND_PORT=8080      # Backend API port
export APM_UI_PORT=4173           # Dashboard UI port
export DEBUG=1                     # Enable debug logging

# Backend configuration
export APM_NUM_MICS=4             # Number of microphones
export APM_MIC_SPACING=0.012      # Microphone spacing (meters)
export APM_NUM_SPEAKERS=3         # Number of speakers

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Code Style

  • Follow C++20 Core Guidelines
  • Format with clang-format (Google style)
  • Add unit tests for new features
  • Update documentation
  • Run node scripts/healthcheck.js before submitting

📜 System Requirements

Minimum

  • CPU: 2 cores
  • RAM: 512MB
  • Disk: 100MB
  • Node.js: 14.0.0+
  • CMake: 3.18+

Recommended

  • CPU: 4+ cores with AVX2
  • RAM: 2GB+
  • Disk: 1GB+
  • Node.js: 18.0.0+
  • CMake: 3.20+
  • OS: Linux or macOS (Windows supported via MSYS2/vcpkg)

🎯 What's Next

Now that the full APM pipeline builds cleanly and launches reliably, the next phase begins. This document outlines the upcoming milestones that will take the system from a validated prototype to a production‑ready acoustic intelligence engine.

✅ Completed Milestones

  • Full DSP pipeline integrated (beamforming → echo cancellation → noise suppression → VAD → translation → projection)
  • APMSystem orchestrator implemented and validated
  • Clean Docker build with reproducible environment
  • CI pipeline green across build and lint stages
  • Production launcher with health monitoring
  • Automated testing and validation
  • Cross-platform startup scripts
  • End-to-end encryption (ChaCha20-Poly1305 + X25519 via libsodium)
  • Push-to-Talk (PTT) controller with keyboard/mouse/external/software modes
  • UDP call signaling with peer discovery and session management
  • PortAudio real audio I/O (live microphone and speaker support)
  • Local Whisper + NLLB translation (200+ languages, fully offline)

🚀 Next Milestones

See docs/ROADMAP.md for detailed roadmap including:

  1. Ring buffers & low-latency audio pipeline - Continuous real-time capture with PortAudio
  2. TTS integration - Complete ASR → NMT → TTS speech-to-speech chain
  3. DSP Optimization - SIMD acceleration, FFT-based beamforming
  4. Developer Experience - CLI tools, config presets, documentation

License

This project is available for non‑commercial use only under the terms of the included LICENSE file.
Commercial use requires a separate paid license — contact [email protected].


🙏 Acknowledgments

  • Author: Don Michael Feeney Jr.
  • Dedicated to: Marcel Krüger
  • Enhanced with: Claude (Anthropic), Google (Jules), Microsoft (Copilot)
  • FFT: FFTW library by Matteo Frigo and Steven G. Johnson
  • ML Framework: TensorFlow Lite by Google

I would like to acknowledge Microsoft Copilot, Anthropic Claude, Google Jules, and OpenAI ChatGPT for their meaningful assistance in refining concepts, improving clarity, and strengthening the overall quality of this code.


📚 References

  1. Van Trees, H. L. (2002). Optimum Array Processing. Wiley-Interscience.
  2. Benesty, J., et al. (2007). Springer Handbook of Speech Processing. Springer.
  3. Paliwal, K. K., et al. (2010). "Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Modulation Magnitude Estimator." Speech Communication.
  4. Valin, J. M. (2018). "A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement." IEEE MMSP.

📖 Citation

If you use this work in research, please cite:

@software{feeney2026apm,
  author = {Feeney, Don Michael Jr.},
  title = {Acoustic Projection Microphone System},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/dfeen87/Acoustic-Projection-Microphone-System}
}

📧 Support

  • Email: [email protected]
  • GitHub: Discussion Board
  • Health Check: node scripts/healthcheck.js
  • Logs: Enable with DEBUG=1 npm start

Status: Production Ready | Version: 7.0.0 | Last Updated: April 2026

About

Acoustic Projection Microphone (APM) platform combining real‑time translation, spatial audio projection, adaptive noise control, and secure on‑device processing. Backed by a robust peer‑discovery and session architecture with an intuitive UI, it’s built for the next generation of ambient, wearable, privacy‑centric voice interfaces.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors