PROJECT OBJECTIVE
This project analyzes Olympic data. After a detailed exploratory analysis, the data is cleaned and prepared for a prediction model. A classification model then predicts whether an athlete will win a medal.
DATA INFORMATION
Two datasets sourced from Kaggle are merged for the analysis. The first contains detailed information about Olympic athletes: a unique identifier, personal details (name, gender, age, height, and weight), and the country they represent. It also records participation details: the edition and year of the Olympics, the host city, the sport and specific event, and the type of medal won, if any. The second provides information about National Olympic Committees (NOCs) and their regions.
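The merge described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the tiny in-memory frames stand in for the two Kaggle files, and the column names (`ID`, `Name`, `NOC`, `Medal`, `region`) and the derived `won_medal` target are assumptions that may not match the real data.

```python
import pandas as pd

# Hypothetical stand-ins for the two Kaggle files; real column names may differ.
athletes = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Athlete A", "Athlete B", "Athlete C"],
    "NOC": ["USA", "FRA", "XXX"],   # "XXX" has no entry in the NOC table
    "Medal": ["Gold", None, None],  # missing value means no medal
})
noc_regions = pd.DataFrame({
    "NOC": ["USA", "FRA"],
    "region": ["USA", "France"],
})

# A left join keeps every athlete row, even when the NOC code has no region.
merged = athletes.merge(noc_regions, on="NOC", how="left", indicator=True)

# List NOC codes that found no match, so they can be reviewed during cleaning.
unmatched = sorted(set(merged.loc[merged["_merge"] == "left_only", "NOC"]))
merged = merged.drop(columns="_merge")

# Assumed binary target for the classifier: 1 if any medal was won, else 0.
merged["won_medal"] = merged["Medal"].notna().astype(int)

print(merged)
print("NOC codes without a region:", unmatched)
```

A left join is used here so that athletes are never dropped silently; unmatched codes surface as missing regions instead.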
TECHNOLOGIES USED
- Python
- Pandas for data exploration
- Matplotlib and Seaborn for data visualization
- scikit-learn (sklearn) and XGBoost for modeling
- Streamlit for deploying the web app
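
As a rough illustration of how the medal classifier could be set up with scikit-learn, the sketch below trains a pipeline on synthetic data. Everything in it is an assumption made for the example: the random toy data, the feature names (`Age`, `Height`, `Weight`, `Sex`), the `won_medal` target, and the choice of a random forest. It does not reproduce the project's actual features, model, or results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic toy data (random labels), used only so the sketch runs end to end.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(16, 40, n).astype(float),
    "Height": rng.normal(175, 10, n),
    "Weight": rng.normal(70, 12, n),
    "Sex": rng.choice(["M", "F"], n),
    "won_medal": rng.integers(0, 2, n),
})
df.loc[::25, "Height"] = np.nan  # simulate missing measurements

numeric = ["Age", "Height", "Weight"]
categorical = ["Sex"]

# Impute missing numeric values and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# class_weight="balanced" is one way to handle medal winners being a minority.
model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(
        n_estimators=50, class_weight="balanced", random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[numeric + categorical], df["won_medal"],
    test_size=0.25, stratify=df["won_medal"], random_state=0,
)
model.fit(X_train, y_train)
pred = model.predict(X_test)  # 1 = predicted medal, 0 = no medal
```

Wrapping preprocessing and the classifier in one `Pipeline` keeps imputation and encoding fitted on the training split only, and the same object could be loaded by a Streamlit app to score new inputs.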