A production-ready machine learning system that predicts NBA game outcomes. It features a full pipeline: Data Collection -> Training (HistGradientBoosting) -> Live Inference -> React Frontend -> Daily Automation.
- Advanced Modeling: Uses HistGradientBoostingClassifier with a "Matchup Merge" architecture (comparing Team Form vs Opponent Form).
- Live Predictions: Fetches today's games via nba_api, processes stats in real time, and predicts winners with confidence scores.
- Automated Pipeline: A GitHub Action (.github/workflows/daily_prediction.yml) runs every morning at 6 AM ET to generate new predictions.
- React Frontend: A clean UI to view today's games and the model's picks.
- Model: HistGradientBoostingClassifier
- Key Features: 50 selected predictors including Rolling Advanced Stats (orb%, drtg, etc.) and "Matchup" differentials.
- ML/Backend: Python, scikit-learn, pandas, numpy, nba_api
- Frontend: React.js, CSS Modules
- CI/CD: GitHub Actions (Daily Cron Job)
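The "Matchup Merge" and "Matchup" differentials mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes one row of rolling stats per team per game, and the column names (game_id, orb_pct_roll, drtg_roll) and the helper matchup_merge are hypothetical.

```python
import pandas as pd

# Hypothetical per-team rolling form: one row per team per game.
# Column names are illustrative, not the project's actual schema.
team_form = pd.DataFrame({
    "game_id": [1, 1, 2, 2],
    "team": ["BOS", "LAL", "DEN", "MIA"],
    "orb_pct_roll": [27.5, 24.0, 30.1, 25.3],
    "drtg_roll": [108.2, 112.9, 110.4, 111.0],
})


def matchup_merge(form: pd.DataFrame) -> pd.DataFrame:
    """Pair each team row with its opponent's row from the same game."""
    # Self-merge on game_id; opponent columns get the "_opp" suffix.
    merged = form.merge(form, on="game_id", suffixes=("", "_opp"))
    # The self-merge also pairs each team with itself; drop those rows.
    merged = merged[merged["team"] != merged["team_opp"]].reset_index(drop=True)
    # Differential features: team form minus opponent form.
    for col in ("orb_pct_roll", "drtg_roll"):
        merged[f"{col}_diff"] = merged[col] - merged[f"{col}_opp"]
    return merged


matchups = matchup_merge(team_form)
print(matchups[["game_id", "team", "team_opp", "orb_pct_roll_diff", "drtg_roll_diff"]])
```

Each game yields two rows, one from each team's perspective, so the model sees both a team's own form and its opponent's form on the same row.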
nba-match-predictor/
├── predictors/
│ └── predictor_v5.ipynb # Main training notebook (Analysis & Retraining)
├── models/
│ └── hist_gbm_v5/ # Serialized Model, Scaler, and Predictor list
├── scripts/
│ └── predict_v5.py # PRODUCTION SCRIPT: Generates today's predictions
├── frontend/ # React Application
│ ├── public/data/ # Contains schedule and generated predictions.json
│ └── src/ # Frontend source
├── data/ # Raw training data (gitignored)
└── .github/workflows/ # Automation configuration
To run the prediction system locally:
pip install -r requirements.txt
python scripts/predict_v5.py
This will fetch today's games and save the results to frontend/public/data/predictions.json.
cd frontend
npm install
npm start
Open http://localhost:3000 to see the dashboard.
Open predictors/predictor_v5.ipynb in Jupyter. This notebook contains the full pipeline to:
- Load data/nba_games_raw.csv
- Clean and compute rolling averages
- Train the HistGradientBoosting model
- Save artifacts to models/
The project is configured to run automatically via GitHub Actions.
- Schedule: Every day at 11:00 UTC (6:00 AM Eastern during standard time, 7:00 AM during daylight saving time).
- Action: Runs predict_v5.py, commits the new predictions.json, and pushes to the repo.
- Deploy: Vercel (linked to the repo) automatically deploys the updated frontend.
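GitHub Actions cron schedules are evaluated in UTC, so the local Eastern time of the daily run shifts by an hour with daylight saving. A small check using only the standard library (the eastern_hour helper is hypothetical and not part of the project):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def eastern_hour(year, month, day, utc_hour=11):
    """Return the local Eastern hour for a run at utc_hour:00 UTC."""
    run = datetime(year, month, day, utc_hour, tzinfo=timezone.utc)
    return run.astimezone(EASTERN).hour


print(eastern_hour(2024, 1, 15))  # winter (EST, UTC-5): prints 6
print(eastern_hour(2024, 7, 15))  # summer (EDT, UTC-4): prints 7
```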
MIT License.