🎉 Join our Discord Community! Connect with other users, get help, and stay updated on the latest features: https://discord.gg/4Q5YVrePzZ
Whisper ASR Box is a general-purpose speech recognition toolkit. Whisper models are trained on a large dataset of diverse audio and are multitask models that can perform multilingual speech recognition, speech translation, and language identification.
The current release (v1.9.1) supports the following Whisper engines: OpenAI Whisper, Faster Whisper, and WhisperX.

CPU:

docker run -d -p 9000:9000 \
-e ASR_MODEL=base \
-e ASR_ENGINE=openai_whisper \
onerahmet/openai-whisper-asr-webservice:latest

GPU:

docker run -d --gpus all -p 9000:9000 \
-e ASR_MODEL=base \
-e ASR_ENGINE=openai_whisper \
onerahmet/openai-whisper-asr-webservice:latest-gpu

To reduce container startup time by avoiding repeated downloads, you can persist the cache directory:

docker run -d -p 9000:9000 \
-v $PWD/cache:/root/.cache/ \
onerahmet/openai-whisper-asr-webservice:latest

Key features:
- Multiple ASR engines support (OpenAI Whisper, Faster Whisper, WhisperX)
- Multiple output formats (text, JSON, VTT, SRT, TSV)
- Word-level timestamps support
- Voice activity detection (VAD) filtering
- Speaker diarization (with WhisperX)
- FFmpeg integration for broad audio/video format support
- GPU acceleration support
- Configurable model loading/unloading
- REST API with Swagger documentation
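
The output formats and word-level timestamps listed above are typically selected per request. As a minimal sketch, assuming the service exposes a POST /asr endpoint that takes the upload in a multipart field named audio_file and accepts output, task, and word_timestamps as query parameters (confirm the exact names in the Swagger UI of your running instance), a pair of small helpers could look like this. The names asr_url and asr_transcribe and the ASR_BASE_URL variable are illustrative and not part of the project.

```shell
# Hypothetical helpers. The endpoint path, field name, and query parameters
# are assumptions; verify them in the Swagger UI before relying on them.
ASR_BASE_URL="${ASR_BASE_URL:-http://localhost:9000}"

# Build the request URL.
#   $1: output format (text, json, vtt, srt, tsv), default text
#   $2: task (transcribe or translate), default transcribe
#   $3: word-level timestamps (true or false), default false
asr_url() {
  output="${1:-text}"
  task="${2:-transcribe}"
  word_timestamps="${3:-false}"
  printf '%s/asr?output=%s&task=%s&word_timestamps=%s\n' \
    "$ASR_BASE_URL" "$output" "$task" "$word_timestamps"
}

# Upload an audio file and print the response body to stdout.
#   $1: path to the audio file; remaining arguments are passed to asr_url
asr_transcribe() {
  if [ "$#" -lt 1 ]; then
    echo "usage: asr_transcribe FILE [OUTPUT] [TASK] [WORD_TIMESTAMPS]" >&2
    return 2
  fi
  file="$1"
  shift
  curl -fsS -X POST -F "audio_file=@${file}" "$(asr_url "$@")"
}
```

For example, `asr_transcribe meeting.mp3 srt > meeting.srt` would request SRT subtitles for a local file named meeting.mp3 (a placeholder file name).
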
Key configuration options:
- ASR_ENGINE: Engine selection (openai_whisper, faster_whisper, whisperx)
- ASR_MODEL: Model selection (tiny, base, small, medium, large-v3, etc.)
- ASR_MODEL_PATH: Custom path to store/load models
- ASR_DEVICE: Device selection (cuda, cpu)
- MODEL_IDLE_TIMEOUT: Timeout for model unloading
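
To show how these options fit together, here is a minimal sketch that assembles a single docker run command from them. The helper name asr_run, the MODELS_DIR and DRY_RUN variables, the default values, the /data/whisper container path, and the assumption that MODEL_IDLE_TIMEOUT is given in seconds are all illustrative; consult the complete documentation for authoritative values.

```shell
# Hypothetical helper: start the container with the options listed above.
# Every default below is an example value, not a project recommendation.
asr_run() {
  engine="${ASR_ENGINE:-faster_whisper}"
  model="${ASR_MODEL:-small}"
  device="${ASR_DEVICE:-cpu}"
  idle="${MODEL_IDLE_TIMEOUT:-300}"              # assumed to be seconds
  model_path="${ASR_MODEL_PATH:-/data/whisper}"  # path inside the container
  models_dir="${MODELS_DIR:-$PWD/models}"        # host directory to mount
  image="onerahmet/openai-whisper-asr-webservice:latest"

  # Build the argument list in the positional parameters so that
  # values containing spaces stay intact.
  set -- -p 9000:9000 \
    -v "${models_dir}:${model_path}" \
    -e "ASR_ENGINE=${engine}" \
    -e "ASR_MODEL=${model}" \
    -e "ASR_DEVICE=${device}" \
    -e "ASR_MODEL_PATH=${model_path}" \
    -e "MODEL_IDLE_TIMEOUT=${idle}"

  # The GPU image and the --gpus flag are only used for cuda.
  if [ "$device" = "cuda" ]; then
    image="${image}-gpu"
    set -- --gpus all "$@"
  fi
  set -- docker run -d "$@" "$image"

  # With DRY_RUN set, print the command instead of executing it.
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 asr_run` prints the command for review; `ASR_DEVICE=cuda asr_run` would start the GPU image instead.
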
For complete documentation, visit: https://ahmetoner.github.io/whisper-asr-webservice
# Install poetry v2.X
pip3 install poetry
# Install dependencies for cpu
poetry install --extras cpu
# Install dependencies for cuda
poetry install --extras cuda
# Run service
poetry run whisper-asr-webservice --host 0.0.0.0 --port 9000

After starting the service, visit http://localhost:9000 or http://0.0.0.0:9000 in your browser to access the Swagger UI documentation and try out the API endpoints.
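
Because the first start may need to download a model, the service can take a while before it answers. The following is a minimal sketch of a readiness check that polls the address above with curl until it responds. The helper name wait_for_service and its default attempt count and delay are assumptions made for illustration, not part of the project.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
#   $1: URL to check, default http://localhost:9000
#   $2: maximum number of attempts, default 30
#   $3: delay between attempts in seconds, default 2
wait_for_service() {
  url="${1:-http://localhost:9000}"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f fails on HTTP errors, -L follows redirects, -s and -o keep it quiet.
    if curl -fsL -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "service is up at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "service did not respond at $url after $attempts attempts" >&2
  return 1
}
```

A typical use is `wait_for_service && echo "ready for requests"` right after starting the container or the poetry command.
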