A fully local, "Jarvis-style" voice assistant for macOS with a modern PyQt6 GUI. All processing happens on your machine - no cloud services required.
🌐 Website: https://jarvis-home-ai.netlify.app/

- 🎤 Push-to-talk voice input with visual feedback
- 🧠 Local LLM powered by Ollama (qwen2.5:0.5b)
- 🗣️ Speech-to-Text using Faster-Whisper (base model by default)
- Text-to-Speech using Piper (preferred) with macOS system voice fallback
- 💬 Conversation history with persistent storage
- 🎨 Modern dark UI with animated mic button and chat bubbles
- Configurable settings for model selection and TTS rate
- macOS Sequoia 15.x (tested on 2019 Intel MacBook Pro)
- Python 3.11 (recommended)
- 8GB+ RAM
- Homebrew (for dependencies)
- Download the latest `.dmg` from Releases.
- Open the DMG.
- Drag `Jarvis Assistant.app` to `Applications`.
- Launch from `Applications`.
If macOS blocks launch because the app is unsigned, run:

    xattr -dr com.apple.quarantine "/Applications/Jarvis Assistant.app"

Homebrew is required for system dependencies; Ollama is installed automatically if missing. To install Homebrew:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

- Clone or download the repository.
- Run the setup and launcher script:

      ./scripts/start.sh

The script will automatically:
- Install system dependencies (portaudio, ffmpeg, Ollama if needed)
- Set up Python virtual environment
- Install Python dependencies
- Start Ollama (if not already running)
- Pull the required Ollama model
- Launch the application
That's it! The app will open with a GUI.
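As a rough illustration of the dependency check the launcher performs, the sketch below looks for command-line tools on `PATH`. It is not taken from `scripts/start.sh`; the tool list and the function name `missing_tools` are assumptions made for this example, and portaudio is left out because it is a library rather than an executable.

```python
import shutil

# Command-line tools the launcher is described as needing.
# This list is an assumption for illustration, not read from start.sh.
REQUIRED_TOOLS = ("brew", "ffmpeg", "ollama")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Running this before `./scripts/start.sh` can help tell a missing dependency apart from a later setup failure.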
When packaging a release DMG:
- Use Python 3.11+.
- Verify before building:

      python --version

- Build with:

      ./scripts/build_release_dmg.sh v1.0.6 x86_64  # Intel host
      ./scripts/build_release_dmg.sh v1.0.6 arm64   # Apple Silicon host

- Start the app: Run `./scripts/start.sh`
- Talk: Click the circular mic button to start recording
- Stop: Click again to stop and process your speech
- Listen: The assistant will respond with voice and text
- Settings: Click the settings button to change models or TTS settings
- Close: Click the close button to exit
- Idle (cyan, slow pulse): Ready to listen
- Listening (bright cyan, fast pulse): Recording your voice
- Thinking (spinning): Processing your request
- Speaking (pulsing): Playing response
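The four mic-button states above can be modelled as a small state machine. The sketch below is one possible shape for it, not the code in `gui.py`; the `MicState` enum, the `TRANSITIONS` table, and the `advance` helper are hypothetical names, and the allowed transitions are an assumption about how one push-to-talk turn flows.

```python
from enum import Enum


class MicState(Enum):
    IDLE = "idle"            # cyan, slow pulse: ready to listen
    LISTENING = "listening"  # bright cyan, fast pulse: recording
    THINKING = "thinking"    # spinning: processing the request
    SPEAKING = "speaking"    # pulsing: playing the response


# Allowed moves for one push-to-talk turn (assumed, not taken from gui.py).
# Listening and Thinking may also drop back to Idle, e.g. on cancel or error.
TRANSITIONS = {
    MicState.IDLE: {MicState.LISTENING},
    MicState.LISTENING: {MicState.THINKING, MicState.IDLE},
    MicState.THINKING: {MicState.SPEAKING, MicState.IDLE},
    MicState.SPEAKING: {MicState.IDLE},
}


def advance(current, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

Keeping the transitions in one table makes it easy to reject impossible jumps, such as going from Idle straight to Speaking.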
Default settings are stored in jarvis_assistant/config.py:
- Whisper Model: `base` (can be: tiny, base, small, medium, large)
- Ollama Model: `qwen2.5:0.5b` (fast, lightweight)
- TTS Rate: 190 words/minute
- TTS Volume: 1.0 (max)
Settings can be changed via the GUI settings dialog and are persisted to:
~/Library/Application Support/Jarvis Assistant/settings.json
Sensitive values (ha_token, api_key, telegram_bot_token, telegram_chat_id) are stored in macOS Keychain and are removed from settings.json.
Environment variables still override stored secrets.
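To make the precedence concrete, here is a minimal sketch of splitting secrets out of a settings dict and letting an environment variable win over a stored value. It is not the app's implementation: the `JARVIS_` variable prefix and the helper names are hypothetical, and a plain dict stands in for the Keychain-backed store so the example stays self-contained.

```python
import os

# Secret keys named in this README.
SECRET_KEYS = ("ha_token", "api_key", "telegram_bot_token", "telegram_chat_id")


def split_settings(settings):
    """Split a settings dict into (public, secrets)."""
    public = {k: v for k, v in settings.items() if k not in SECRET_KEYS}
    secrets = {k: v for k, v in settings.items() if k in SECRET_KEYS}
    return public, secrets


def env_name(key):
    """Hypothetical naming scheme: ha_token -> JARVIS_HA_TOKEN."""
    return "JARVIS_" + key.upper()


def resolve_secret(key, stored, environ=None):
    """Return the environment value if set and non-empty, else the stored one.

    `stored` is a dict here; in the app it would be backed by the Keychain.
    """
    environ = os.environ if environ is None else environ
    return environ.get(env_name(key)) or stored.get(key)
```

For example, `resolve_secret("api_key", {"api_key": "stored"}, environ={"JARVIS_API_KEY": "from-env"})` returns `"from-env"`, while an empty environment falls back to `"stored"`.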
- Ollama endpoint is configurable in Settings -> Intelligence as a base URL (example: `http://100.74.176.49:11434`).
- You can click Test Connection to verify reachability and refresh model discovery from that endpoint.
- Ollama-specific endpoint controls are shown only when provider is set to `ollama`.
- For `opencode`, the model list is intentionally restricted to `big-pickle`.
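A connection test of this kind can be approximated with Ollama's model-listing endpoint, `GET /api/tags`, which returns a JSON object with a `models` array. The sketch below uses only the standard library; the function names are hypothetical and it is not necessarily how the app's Test Connection button is implemented.

```python
import json
import urllib.request


def tags_url(base_url):
    """Build the model-listing URL from a base URL such as http://host:11434."""
    return base_url.rstrip("/") + "/api/tags"


def parse_model_names(payload):
    """Extract model names from the JSON body returned by /api/tags."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def check_connection(base_url, timeout=5.0):
    """Return (True, model_names) on success, or (False, error_message)."""
    try:
        with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
            return True, parse_model_names(json.load(resp))
    except (OSError, ValueError) as exc:
        # URLError and timeouts are OSError subclasses; bad JSON is a ValueError.
        return False, str(exc)
```

Calling `check_connection("http://localhost:11434")` against a running Ollama instance should yield `True` plus the list of installed model names; an unreachable endpoint yields `False` and the error text.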
- Progress acknowledgements (e.g., "let me think" / "working on it") now start their 2.5s timer when voice recording stops, so the delay includes transcription time.
- Wake-word settings now apply immediately after pressing Save Changes (no extra mic-button press required).
- Model downloads support concurrent progress rows in Settings -> Intelligence (one progress bar per active model download).
- Quick Commands authoring is available in Settings -> Speech and uses manual phrase input (comma-separated) without hidden phrase expansion.
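The timing rule for progress acknowledgements can be sketched as a single check against the moment recording stopped. This is only an illustration under assumed names (`ACK_DELAY_S`, `ack_due`); the app's actual timer logic may differ.

```python
import time

ACK_DELAY_S = 2.5  # delay mentioned in the notes above


def ack_due(recording_stopped_at, now=None, response_ready=False):
    """Return True when a progress acknowledgement should be spoken.

    The clock starts when recording stops, so time spent on transcription
    counts toward the delay. No acknowledgement is needed once the response
    is ready.
    """
    if response_ready:
        return False
    now = time.monotonic() if now is None else now
    return (now - recording_stopped_at) >= ACK_DELAY_S
```

A caller would record `time.monotonic()` when the mic stops and poll `ack_due(...)` while transcription and the LLM request are in flight.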
Jarvis/
├── scripts/
│   ├── start.sh              # Source launcher script
│   └── build_release_dmg.sh  # DMG build helper
├── requirements.txt          # Python dependencies
├── README.md                 # This file
└── jarvis_assistant/
    ├── __init__.py
    ├── main.py               # Application entry point
    ├── config.py             # Configuration and settings
    ├── gui.py                # PyQt6 GUI components
    ├── audio_io.py           # Microphone recording
    ├── stt.py                # Speech-to-Text (Whisper)
    ├── llm_client.py         # LLM client (Ollama)
    ├── tts.py                # Text-to-Speech (pyttsx3)
    ├── conversation.py       # Conversation history management
    └── utils.py              # Logging utilities
If you encounter "llama runner process has terminated: signal: broken pipe", the Ollama model may be incompatible with your hardware. The app is configured to use qwen2.5:0.5b which works on Intel Macs. You can try other models via the settings dialog.
The warning `Populating font family aliases took 219 ms. Replace uses of missing font family "Segoe UI"` is harmless. The app will use system default fonts.
If Ollama fails to start automatically, you can start it manually:

    ollama serve

Then run `./scripts/start.sh` again.
- portaudio: Audio I/O library
- ffmpeg: Audio processing
- Ollama: Local LLM runtime
- PyQt6: GUI framework
- sounddevice: Audio recording
- numpy: Numerical operations
- faster-whisper: Speech recognition
- piper-tts: High-quality local text-to-speech
- pyttsx3: Non-macOS fallback text-to-speech
- requests: HTTP client for Ollama API
- keyring: Secure secret storage via macOS Keychain
Run from project root:

    python -m py_compile jarvis_assistant/*.py

Manual secret-storage check:
- Enter HA/API/Telegram credentials in the Settings dialog and save.
- Restart the app and confirm credentials are still available.
- Open `~/Library/Application Support/Jarvis Assistant/settings.json` and confirm secret keys are absent.
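The last step of the manual check can also be scripted. The sketch below reads the settings file and reports any secret keys still present; it assumes the keys would appear at the top level of the JSON object, which is a guess about the file layout, and `leaked_secret_keys` is a hypothetical helper name.

```python
import json
from pathlib import Path

# Secret keys named in this README.
SECRET_KEYS = ("ha_token", "api_key", "telegram_bot_token", "telegram_chat_id")

SETTINGS_PATH = (
    Path.home() / "Library" / "Application Support" / "Jarvis Assistant" / "settings.json"
)


def leaked_secret_keys(path=SETTINGS_PATH):
    """Return the secret key names found at the top level of the settings file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return sorted(k for k in SECRET_KEYS if k in data)
```

An empty list from `leaked_secret_keys()` means none of the listed keys remain in `settings.json`.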
On a 2019 Intel MacBook Pro (8-core i9, 16GB RAM):
- STT (Whisper base): ~2-3 seconds for 5-second audio
- LLM (qwen2.5:0.5b): ~1-2 seconds for short responses
- TTS: Real-time playback
All processing happens locally on your machine. No data is sent to external servers. Conversation history, memory, logs, and TTS models are stored under:
~/Library/Application Support/Jarvis Assistant/
See LICENSE.