FullFight.AI is an end-to-end AI pipeline for extracting, analyzing, and compiling fight scenes from anime episodes. It integrates machine learning, audio/video signal processing, NLP, and a full-stack web platform. Users upload episodes through a Flask-based interface and receive automatically generated fight highlight reels.
- Web Interface: Upload anime episodes and request fight scene extraction
- Feature Extraction (see the code sketch after this list):
- Audio RMS (librosa)
- Frame brightness (OpenCV)
- Motion (optical flow via OpenCV Farneback)
- Dialogue/emotion (anger detection using Whisper + transformer)
- Automated Labeling: Combines rule-based thresholds and ML models to label fight segments
- Visualization: Interactive Jupyter notebook for plotting and feature tuning
- Video Compilation: Clips and compiles fight scenes into highlight reels using ffmpeg
- ML Integration: Trains a `RandomForestClassifier` on self-collected and labeled data
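The first three extraction steps map directly onto standard library calls. Below is a minimal sketch, assuming one feature sample per second of video; the file path, column names, and sampling scheme are illustrative, not the project's exact code (decoding the episode's audio track with librosa relies on an ffmpeg-backed loader being installed):

```python
import cv2
import librosa
import numpy as np
import pandas as pd

VIDEO = "uploads/episode.mp4"  # hypothetical upload path

# --- Audio RMS (librosa), one value per second ---
y, sr = librosa.load(VIDEO, sr=None)
rms = librosa.feature.rms(y=y, frame_length=sr, hop_length=sr)[0]

# --- Brightness and motion (OpenCV), one sampled frame per second ---
cap = cv2.VideoCapture(VIDEO)
step = max(1, round(cap.get(cv2.CAP_PROP_FPS)))
brightness, flow_mag = [], []
prev_gray, i = None, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if i % step == 0:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        brightness.append(gray.mean())  # mean pixel intensity, 0-255
        if prev_gray is None:
            flow_mag.append(0.0)
        else:
            # Farneback dense optical flow; mean magnitude ~ motion intensity
            flow = cv2.calcOpticalFlowFarneback(
                prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
            flow_mag.append(float(mag.mean()))
        prev_gray = gray
    i += 1
cap.release()

# Align the per-second series and save, mirroring the repo's CSV outputs
n = min(len(rms), len(brightness))
pd.DataFrame({"second": np.arange(n), "rms": rms[:n],
              "brightness": brightness[:n], "flow": flow_mag[:n]}
             ).to_csv("features.csv", index=False)
```

Note that `librosa.feature.rms` returns linear amplitude; the dB threshold in the labeling rules below would apply after a conversion such as `librosa.amplitude_to_db`.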
- Flask – Web server and API endpoints
- Python libraries:
  - `ffmpeg-python` – video/audio processing
  - `librosa` – audio RMS extraction
  - `opencv-python` – brightness and optical flow
  - `whisper` – speech transcription
  - `transformers`, `torch` – emotion classification (paired with Whisper in the sketch below)
  - `pandas`, `numpy` – data manipulation
- HTML5/CSS3 – Responsive UI (`templates/index.html`, `static/style.css`)
- JavaScript – File uploads, API calls, and dynamic UI updates (`static/upload.js`)
- Jupyter Notebook – Feature extraction, merging, labeling, visualization, and training
- scikit-learn – `RandomForestClassifier` model
- pandas, matplotlib, seaborn – Data wrangling and visualization
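Since the stack pairs `whisper` transcription with a transformer emotion classifier, here is a minimal sketch of that wiring. The classifier name and the 0.5 anger threshold come from this README; the Whisper model size, file path, and per-segment loop are assumptions:

```python
import whisper
from transformers import pipeline

asr = whisper.load_model("base")  # model size is an assumption
clf = pipeline("text-classification",
               model="cardiffnlp/twitter-roberta-base-emotion",
               top_k=None)  # return scores for all emotion labels

result = asr.transcribe("uploads/episode.mp4")  # hypothetical upload path
for seg in result["segments"]:  # Whisper yields {start, end, text} per segment
    scores = clf(seg["text"])
    if isinstance(scores[0], list):  # some transformers versions nest the batch
        scores = scores[0]
    by_label = {s["label"]: s["score"] for s in scores}
    if by_label.get("anger", 0.0) > 0.5:  # threshold from the labeling rules
        print(f"{seg['start']:7.1f}s-{seg['end']:7.1f}s  {seg['text']!r}")
```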
```
FullFight/
│
├── app.py                      # Flask backend (upload, processing, endpoints)
├── requirements.txt            # Project dependencies
├── templates/
│   └── index.html              # Web UI
├── static/
│   ├── style.css               # Frontend styles
│   └── upload.js               # Frontend logic
├── uploads/                    # Uploaded video files
├── output/                     # Generated highlight clips
├── fullflight.ipynb            # Notebook for extraction, analysis, labeling, modeling
├── fullflight2.ip              # Functions used by full.py
├── full.py                     # Full pipeline: uses the trained model and custom data-collection functions
├── audio_rms.csv               # Extracted audio features
├── frame_brightness.csv        # Extracted brightness features
├── optical_flow.csv            # Extracted motion features
├── angry_sections.csv          # Extracted emotion features
├── normalized_merged_data.csv  # Combined feature set
└── rf_fight_scene_model.mkl    # Trained model
```
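For orientation, here is a minimal sketch of the kind of upload endpoint `app.py` could expose. The route name, form field, and `process_episode` helper are hypothetical, not the project's actual API; only the `uploads/` and `output/` directories come from the tree above:

```python
import os
from flask import Flask, request, jsonify

app = Flask(__name__)
UPLOAD_DIR = "uploads"  # matches the uploads/ directory in the repo tree
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/upload", methods=["POST"])  # hypothetical route name
def upload():
    f = request.files["video"]  # hypothetical form field set by static/upload.js
    path = os.path.join(UPLOAD_DIR, f.filename)
    f.save(path)
    # A process_episode(path) helper (hypothetical) would run the pipeline in
    # full.py and write highlight clips to output/.
    return jsonify({"status": "uploaded", "path": path})

if __name__ == "__main__":
    app.run(debug=True)
```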
- Upload Video – Users upload anime episodes via the web UI
- Feature Extraction – Notebook extracts audio RMS, brightness, motion, emotion
- Merge & Label – Combined CSV is labeled using a mix of thresholds and manual annotation
- Visualization – Features are plotted, inspected, and thresholds are refined
- Model Training – `RandomForestClassifier` is trained on the labeled data
- Video Compilation – Clips for detected fight scenes are extracted and joined with ffmpeg (sketched below)
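A minimal sketch of the compilation step with `ffmpeg-python`; the segment list, paths, and clip naming are assumptions, and the project may cut or join clips differently:

```python
import ffmpeg

fight_segments = [(120, 148), (410, 476)]  # hypothetical (start_s, end_s) windows

# 1. Cut each detected segment out of the source episode
clip_paths = []
for i, (start, end) in enumerate(fight_segments):
    path = f"output/clip_{i}.mp4"
    (
        ffmpeg
        .input("uploads/episode.mp4", ss=start, t=end - start)
        .output(path, c="copy")  # stream copy: fast, but cuts snap to keyframes
        .overwrite_output()
        .run()
    )
    clip_paths.append(path)

# 2. Join video+audio of every clip into one highlight reel
streams = []
for path in clip_paths:
    clip = ffmpeg.input(path)
    streams += [clip.video, clip.audio]
joined = ffmpeg.concat(*streams, v=1, a=1).node
(
    ffmpeg
    .output(joined[0], joined[1], "output/highlights.mp4")
    .overwrite_output()
    .run()
)
```

Stream copy keeps the cutting fast but snaps to keyframes; the concat filter at the end re-encodes, so clips with mismatched encoding parameters still join cleanly.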
- Data Collection: We manually reviewed and labeled each scene based on emotion, brightness, motion, and audio levels
- Emotion Detection: Uses `cardiffnlp/twitter-roberta-base-emotion` on Whisper transcripts
- Motion Estimation: Utilizes OpenCV Farneback for optical flow magnitude
- Modeling: `RandomForestClassifier` trained on the features [RMS, brightness, flow, emotion]
- Labeling Rules: A scene is flagged as “fight” if it satisfies at least one of the following (sketched in code after this list):
- Anger score > 0.5
- Brightness > 150
- RMS > –20 dB
- Optical flow above an empirically tuned threshold
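A minimal sketch of the OR-of-thresholds rule feeding the model fit; the CSV column names, the placeholder flow threshold, and the forest hyperparameters are assumptions (the README says only that the flow threshold was tuned empirically):

```python
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Columns assumed: rms_db, brightness, flow, anger
df = pd.read_csv("normalized_merged_data.csv")

FLOW_THRESHOLD = 2.0  # placeholder; the real value was tuned empirically

# A scene counts as "fight" if ANY of the rules above fires
df["fight"] = (
    (df["anger"] > 0.5)
    | (df["brightness"] > 150)
    | (df["rms_db"] > -20)
    | (df["flow"] > FLOW_THRESHOLD)
).astype(int)

features = ["rms_db", "brightness", "flow", "anger"]
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(df[features], df["fight"])

joblib.dump(model, "rf_fight_scene_model.mkl")  # filename as in the repo tree
```

Per the Data Collection note above, the rule output is only a seed: labels were also manually reviewed before training.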
- Aaryav Lal
- Dhyan Soni
- Aditya Srivastava