Box Office Analyser

This model predicts a film's worldwide gross revenue ($) and was developed and trained on a comprehensive film statistics dataset. It served as my final submission for CAI2C08 Machine Learning. The model reaches an adjusted R² of approximately 67%, a reasonable fit given the complexity, depth, and imbalance of the dataset. It is hosted as a Streamlit application, which can be accessed here.
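
For readers unfamiliar with the metric, adjusted R² penalises plain R² for the number of predictors used. The sketch below shows the standard formula only; the function name and the example numbers are hypothetical and are not taken from this project's training run.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Standard formula: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("n_samples must be greater than n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical values for illustration only (not this project's data).
example = adjusted_r2(r2=0.8, n_samples=51, n_features=10)
print(round(example, 4))
```

With the same R², adding more features lowers the adjusted value, which is why it is a stricter measure than R² for models with many inputs.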

Through my report analysis of the dataset, I gained valuable insight into the film industry by approaching it from an analytical, data-driven perspective, in particular by recognising the inherent challenges and unpredictability of box office performance.

Features

The Box Office Analyser Repository includes several key features designed to provide insights into movie performance and revenue predictions:

Predictive Model: Uses trained machine learning algorithms to predict worldwide gross revenue from various input features.
Data Visualisation: Interactive charts and graphs to visualise relationships between impactful features.
Feature Importance Analysis: Identifies which features most significantly impact revenue predictions, helping users understand the driving factors behind box office success.
User-Friendly Interface: A Streamlit-based application that enables users to input movie data and receive instant predictions and insights.
Exploratory Data Analysis (EDA): Provides visualisations and statistics to explore the dataset, helping users identify trends and patterns in movie performance.
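
The README does not say how feature importance is computed. As one common, model-agnostic option, the sketch below illustrates permutation importance: shuffle one feature at a time and measure how much the prediction error grows. Everything here is an assumption for illustration, including `toy_predict` (a stand-in for a trained model), the synthetic rows, and the function names; scikit-learn offers a production version as `sklearn.inspection.permutation_importance`.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Return the increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        score = mean_squared_error(targets, [predict(r) for r in shuffled])
        importances.append(score - baseline)
    return importances


def toy_predict(row):
    """Hypothetical model: strong use of feature 0, weak use of 1, none of 2."""
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [toy_predict(r) for r in rows]

importances = permutation_importance(toy_predict, rows, targets, n_features=3)
print(importances)
```

A larger value means the model relies more on that feature; a feature the model ignores scores zero because shuffling it leaves the predictions unchanged.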

Comparison with Deployment Extract

Actual Gross Revenue ($)	Predicted Gross Revenue ($)	Percentage Difference (%)
17,475,475	22,152,180.67	26.8
53,191,101	58,329,482.08	9.6
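
The percentage difference column can be reproduced with a short calculation. This is a minimal sketch that assumes the difference is measured relative to the actual revenue; the function name is hypothetical, and the rounding convention used for the table is not stated in the source.

```python
def percentage_difference(actual: float, predicted: float) -> float:
    """Signed difference of the prediction relative to the actual value, in percent."""
    if actual == 0:
        raise ValueError("actual must be non-zero")
    return (predicted - actual) / actual * 100.0


# First row of the comparison table.
first_row = percentage_difference(17_475_475, 22_152_180.67)
print(f"{first_row:.1f}")  # 26.8
```

A positive result means the model over-predicted the revenue; a negative result means it under-predicted.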

Packages

pandas: For data manipulation and analysis.
numpy: For numerical operations and handling arrays.
seaborn: For data visualisation and statistical graphics.
matplotlib: For creating static, animated, and interactive visualisations.
scikit-learn: For implementing machine learning algorithms and model evaluation.
optuna: For hyperparameter optimisation.
joblib: For saving and loading models.
catboost, xgboost, lightgbm: For gradient boosting on regression tasks; together these cover my tree-based algorithm choices.

Setup Instructions

pip install -r requirements.txt

Running Locally

streamlit run application.py

