In this scenario, we are going to work through a real-world data science/analysis project with Python.
We will be using the following Python libraries:
- Pandas
- Pandas Profiling
- AutoViz
- Plotly
After loading the dataset, we will do some initial exploratory data analysis to get a feel for the data. I am going to show you some very useful pandas functions that you can apply to any kind of dataset you might deal with.
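As a rough idea of what such a first look can involve, here is a minimal sketch using standard pandas calls. The DataFrame and its column names (`city`, `units`, `price`) are made up for illustration and are not the project's dataset.

```python
import pandas as pd

# Hypothetical sample data; the real project loads its own dataset instead.
df = pd.DataFrame(
    {
        "city": ["Berlin", "Paris", "Berlin", "Rome", None],
        "units": [3, 5, 2, 8, 4],
        "price": [9.99, 14.50, 9.99, 20.00, 12.25],
    }
)

# Size of the dataset as (rows, columns)
print(df.shape)

# First rows, to see what the records look like
print(df.head())

# Column names, dtypes, and non-null counts
df.info()

# Summary statistics for the numeric columns
print(df.describe())

# Number of missing values per column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Frequency of each value in a categorical column (missing values are skipped)
city_counts = df["city"].value_counts()
print(city_counts)

# Number of fully duplicated rows
n_duplicates = df.duplicated().sum()
print(n_duplicates)
```

None of these calls depend on the content of the data, which is why they transfer to almost any tabular dataset.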
Nowadays, however, many libraries make exploratory data analysis much easier. I will show you my two favorite libraries, which generate automated reports in just a few lines of code. Those reports are a great starting point before we move on to answering real-world business questions.
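To give a sense of how few lines an automated report takes, the sketch below wraps the two report libraries from the list above in small helper functions. The function names, the report title, and the output path are hypothetical, and the calls assume the pinned versions `pandas-profiling==2.9.0` and `autoviz==0.0.81`; newer releases may differ.

```python
import pandas as pd


def build_profile_report(df: pd.DataFrame, output_path: str = "profile_report.html") -> str:
    """Write a Pandas Profiling HTML report for df and return the file path."""
    # Imported inside the function so the library is only needed
    # when a report is actually requested.
    from pandas_profiling import ProfileReport

    report = ProfileReport(df, title="Dataset Overview")
    report.to_file(output_path)
    return output_path


def build_autoviz_charts(df: pd.DataFrame):
    """Let AutoViz pick and draw charts for df; returns the frame AutoViz analyzed."""
    from autoviz.AutoViz_Class import AutoViz_Class

    av = AutoViz_Class()
    # filename stays empty because the data is passed in memory through dfte;
    # depVar is empty because no target column is assumed here.
    return av.AutoViz(filename="", dfte=df, depVar="", verbose=0)
```

Calling `build_profile_report(df)` on a loaded DataFrame should produce an HTML file that can be opened in a browser, while `build_autoviz_charts(df)` renders its charts directly.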
While answering those questions, we will cover a wide range of pandas functions. We will also write our own Python helper function, which we will use in the deep-dive and visualization section. All the charts we create will be interactive and have a clean design.
We will cover the following chart types:
- Histogram
- Box Plot
- Bar Chart
- Scatter Plot
- Line Chart
The project uses the following package versions:
autoviz==0.0.81
numpy==1.19.3
openpyxl==3.0.5
pandas==1.2.0
pandas-profiling==2.9.0
plotly==4.14.1
plotly-express==0.4.1
xlrd==2.0.1