This repository is part of my 'Data Storytelling and Visualization' curriculum. The project involves web scraping, cleaning, and analyzing a Tour de France dataset. The data source is the Tour de France section of Cycling Archives, a website that maintains a database of cycling events and their winners. This project was performed using:
- Requests and Beautiful Soup for web scraping (a brief sketch follows this list),
- NumPy and Pandas for data normalization and analysis,
- Seaborn for visualization, and
- Tableau for creating dashboards.
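
A minimal sketch of how such a scrape can look with Requests and Beautiful Soup, assuming a hypothetical results-page URL and table layout; the actual URLs and selectors live in Tour-De-France.ipynb:

```python
import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "https://www.cyclingarchives.com/tour-de-france"  # hypothetical URL

response = requests.get(URL, timeout=30)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")

# Collect each row of the results table into a list of dictionaries.
rows = []
for tr in soup.select("table tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) >= 3:  # assumed columns: year, winner, team
        rows.append({"year": cells[0], "winner": cells[1], "team": cells[2]})

pd.DataFrame(rows).to_csv("tour_de_france_raw.csv", index=False)
```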
There are four Jupyter notebooks used here:
- Tour-De-France.ipynb contains the web scraping used to gather data from Cycling Archives,
- cleaning_cyclists_data.ipynb contains the data cleaning performed with NumPy and Pandas,
- Data_Preprocessing.ipynb contains the data preprocessing steps, and
- Data Visualization Steps.ipynb contains the analysis and visualization performed with NumPy, Pandas, and Seaborn (a brief cleaning-and-plotting sketch follows this list).
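
A minimal sketch of the kind of cleaning and plotting done in those notebooks, assuming hypothetical column names (year, winner, team) and the CSV produced by the scraping sketch above; the notebooks contain the full pipeline:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("tour_de_france_raw.csv")

# Normalize the scraped text: strip whitespace, coerce the year to a
# number, and drop rows that could not be parsed.
df["winner"] = df["winner"].str.strip()
df["year"] = pd.to_numeric(df["year"], errors="coerce")
df = df.dropna(subset=["year", "winner"]).astype({"year": int})

# Count victories per rider and plot the ten most successful winners.
top_winners = df["winner"].value_counts().head(10)
sns.barplot(x=top_winners.values, y=top_winners.index)
plt.xlabel("Tour de France wins")
plt.tight_layout()
plt.show()
```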