Automatically censor and bleep out words in audio and video using AI. Free and built to self-host with Python and Docker.
Make someone sound naughty 😈 or make your content more ad-friendly.
It works by transcribing the audio of an mp4 with a transcription model (here Whisper), then using the extracted timestamps for your chosen word(s) to replace each occurrence with a bleep sound.
All processing is performed locally.
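The idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code used in this repo: the function names `find_bleep_intervals` and `apply_bleep` are hypothetical, the word dictionaries (`word`, `start`, `end`) are an assumed shape modeled on typical word-level transcription output, and the audio is assumed to be a mono list of float samples.

```python
import math
import string


def find_bleep_intervals(words, targets):
    """Return (start, end) pairs in seconds for each transcribed word in targets."""
    wanted = {t.strip().lower() for t in targets}
    intervals = []
    for w in words:
        # Transcribed tokens often carry leading spaces and punctuation.
        token = w["word"].strip().strip(string.punctuation).lower()
        if token in wanted:
            intervals.append((w["start"], w["end"]))
    return intervals


def apply_bleep(samples, sample_rate, intervals, freq=1000.0, volume=0.5):
    """Return a copy of mono float samples with a sine tone over each interval."""
    out = list(samples)
    for start, end in intervals:
        first = max(0, int(start * sample_rate))
        last = min(len(out), int(end * sample_rate))
        for i in range(first, last):
            out[i] = volume * math.sin(2 * math.pi * freq * i / sample_rate)
    return out


# Hypothetical transcript for one second of (silent) audio at 8 kHz.
words = [
    {"word": " I", "start": 0.0, "end": 0.2},
    {"word": " want", "start": 0.2, "end": 0.5},
    {"word": " cookie,", "start": 0.5, "end": 1.0},
]
intervals = find_bleep_intervals(words, ["cookie"])
bleeped = apply_bleep([0.0] * 8000, 8000, intervals)
```

Matching is done on whole, normalized words, so "cookie" would not match "cookies" in this sketch; a real pipeline may want looser matching.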
Some examples of the end product (make sure to turn the volume on; it's off by default).
bleep-that-sht-examples.mp4
Let's look more closely at the last example above. Below is a short clip from which we'll bleep out some words using the pipeline in this repo (make sure to turn on the audio; it's off by default).
bleep_test_og_cropped_low_res.mp4
Now the same clip with the words "treetz", "ice", "cream", "chocolate", "syrup", and "cookie" bleeped out:
bleep_test_processed_cropped_low_res.mp4
Use Docker to quickly spin up the app in an isolated container by typing the following at your terminal:
docker compose up
Then navigate to http://localhost:8501/ to use the app.
To get set up to run the notebook, bleep your own videos, or run the Streamlit demo, first install the requirements for this project by pasting the following in your terminal:
pip install -r requirements.txt
Then start the app server by typing the following at your terminal:
streamlit run /home/bleep_that_sht/app.py --server.port=8501 --server.address=0.0.0.0
Then navigate to http://localhost:8501/ to use the app in any browser.
Note: you will need ffmpeg installed on your machine as well.
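If you are unsure whether ffmpeg is available, a quick check like the one below can help. It is a minimal sketch using only the Python standard library; the helper name `ffmpeg_available` is hypothetical and not part of this repo.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


if ffmpeg_available():
    print("ffmpeg found at:", shutil.which("ffmpeg"))
else:
    print("ffmpeg not found; install it before running the app.")
```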
Once you have the app up and running and have navigated to `http://localhost:8501/`, there are three tabs you can choose from:
The first tab allows for local video upload and processing.
The second tab allows for YouTube URL download and processing.
The third tab has handy "about" information for convenience.
The app may take longer than usual during the initial processing of local videos or YouTube content because it needs to download the transcription model.
A quick walkthrough of both local video and YouTube processing is shown below.
See beep_that_sht_walkthrough.ipynb to experiment and see the nitty-gritty details.