A user-friendly web interface to download YouTube audio and transcribe it using OpenAI's Whisper.
- YouTube & File Input: Transcribe from YouTube URLs or local audio files.
- Organized Storage: Automatically saves audio and transcripts to a transcriptions/ folder, organized by video title.
- Multiple Whisper Models: Choose from tiny, base, small, medium, or large for a balance of speed and accuracy.
- Language Support: Auto-detect or manually select from dozens of languages.
- Translation to English: Translate any supported language into English.
- Multiple Export Formats: Get transcripts as TXT, SRT, VTT, and JSON.
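As an illustration of the SRT export, here is a minimal sketch of turning Whisper-style segments into subtitle text. It assumes each segment is a dict with start, end (in seconds), and text keys, which is the shape of the segments list Whisper returns. The function names are hypothetical and are not taken from this project's code.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with start, end, text) as numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

VTT output is close to this: it starts with a WEBVTT header line and uses a period instead of a comma before the milliseconds.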
- Conda or Miniconda
- FFmpeg (required for audio processing)
First, create the conda environment from the environment.yml file:
conda env create -f environment.yml
conda activate whisper
The PyTorch installation varies by platform. Choose the appropriate method for your system (CUDA/Intel/AMD).
Note: Visit PyTorch's official website for the latest installation commands and CUDA version compatibility.
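After installing PyTorch, a quick check like the sketch below can confirm whether a GPU backend is visible. It only covers the torch.cuda interface (which ROCm builds of PyTorch also report through) and does not probe Intel-specific backends. The function name is hypothetical.

```python
def describe_torch_device() -> str:
    """Report which device PyTorch would use, or that PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    # True when a usable GPU is visible through the torch.cuda interface.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(describe_torch_device())
```

If this prints cpu on a machine with a supported GPU, the installed PyTorch build probably does not match the system's driver or toolkit.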
FFmpeg is required for audio processing. Install it for your platform by following the instructions on the FFmpeg website (ffmpeg.org).
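To verify that FFmpeg is reachable before starting the app, something like the following sketch can help. It assumes the executable is named ffmpeg and is expected on PATH; the helper name is hypothetical.

```python
import shutil


def find_tool(name: str = "ffmpeg"):
    """Return the full path of an executable on PATH, or None if not found."""
    return shutil.which(name)


path = find_tool()
print(f"ffmpeg found at {path}" if path else "ffmpeg not found on PATH")
```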
Once the environment is created, you can start the app.
Method 1: Batch File (Easiest)
Simply double-click run_app.bat.
Method 2: Manual Start
conda activate whisper
python app.py
The application will open in your browser at http://127.0.0.1:7860.
- Paste a YouTube URL or upload an audio file.
- Select a Whisper Model (start with base for a good balance).
- Choose your Language (auto is usually fine).
- Select the Task (transcribe or translate).
- Choose any Export Formats you need.
- Click "🚀 Download & Transcribe" or "🚀 Transcribe".
Your files will be saved in the transcriptions folder.
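For readers curious what these steps do behind the interface, here is a rough sketch of a download-and-transcribe pipeline built on yt-dlp and Whisper. It is not the app's actual code: the function names, the MP3 conversion, and the mapping of the auto language choice to language=None are all assumptions made for illustration.

```python
from pathlib import Path


def build_ydl_opts(out_dir: str) -> dict:
    """yt-dlp options: best audio stream, converted to MP3 by FFmpeg."""
    return {
        "format": "bestaudio/best",
        "outtmpl": str(Path(out_dir) / "%(title)s.%(ext)s"),
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
    }


def download_and_transcribe(url, out_dir="transcriptions", model_name="base",
                            language=None, task="transcribe"):
    """Download a video's audio track and run Whisper on it."""
    # Imported lazily so build_ydl_opts works without these packages.
    import whisper
    import yt_dlp

    with yt_dlp.YoutubeDL(build_ydl_opts(out_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        # prepare_filename gives the pre-conversion name; swap in the MP3 suffix.
        audio_path = Path(ydl.prepare_filename(info)).with_suffix(".mp3")

    model = whisper.load_model(model_name)
    # language=None lets Whisper auto-detect; task is "transcribe" or "translate".
    return model.transcribe(str(audio_path), language=language, task=task)
```

The returned dictionary contains the full text under the text key and timed pieces under segments, which is what the export formats are built from.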
- Environment: The conda environment is managed by environment.yml.
- Downloader: Uses the yt-dlp Python package.
- File Structure: All output is saved in D:\ML\whisper\transcriptions\[Video Title]\.
- MAY THE AI LORD BE WITH YOU
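The per-video folder layout noted above could be produced along these lines. This is only a sketch: the sanitizing rule, the transcript base name, and both function names are hypothetical, and the app may name its files differently.

```python
import re
from pathlib import Path


def safe_folder_name(title: str) -> str:
    """Replace characters that Windows forbids in file and folder names."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", title).strip(" .")
    return cleaned or "untitled"


def output_paths(root, title, formats=("txt", "srt", "vtt", "json")) -> dict:
    """Map each export format to a file inside root/<sanitized title>/."""
    folder = Path(root) / safe_folder_name(title)
    # "transcript" is a placeholder base name chosen for this example.
    return {fmt: folder / f"transcript.{fmt}" for fmt in formats}
```

Sanitizing matters because video titles often contain characters such as ? or / that cannot appear in a Windows folder name.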
This project uses OpenAI's Whisper and the yt-dlp project.