Title: Data Cleaning and Exploratory Data Analysis (EDA) on Titanic Dataset

OVERVIEW: This task performs a comprehensive analysis of the Titanic dataset, which consists of three files: train, test, and gender submission. The primary focus is on gaining insight into the relationships between variables by applying data cleaning and exploratory data analysis (EDA) techniques.

TASKS IMPLEMENTED:
1] Libraries: Uses the Python libraries pandas, NumPy, matplotlib, and seaborn.
2] Extracting information from the dataset.
3] Data Cleaning: Checking for missing values. Handling missing values with median imputation. Dropping unnecessary columns.
4] Exploratory Data Analysis (EDA): Explored survival rates by class, age, gender, and port of embarkation, along with the fare distribution. Visualized relationships between variables using countplots, scatterplots, boxplots, pairplots, and histograms.
5] Plots based on Different Factors: Created a set of plots, each focusing on a different aspect of the dataset (e.g. age distribution, fare distribution). Analyzed these plots to discover additional insights.
6] Correlation Matrix Analysis: Calculated and visualized correlation matrices for training and testing data. Created a heatmap to visualize the correlation between different features.
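
EXAMPLE SKETCH: The cleaning, survival-rate, and correlation steps listed above could look roughly like the following. This is a minimal illustration, not the project's actual code. It assumes the standard Kaggle Titanic column names (Survived, Pclass, Sex, Age, Fare, Cabin, PassengerId), uses a handful of made-up rows instead of the real train file, and guesses that Cabin and PassengerId are among the "unnecessary" columns, which the task list does not specify.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows with Kaggle-style column names.
# These are NOT real passenger records; they only make the sketch runnable.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Survived": [0, 1, 1, 0, 0, 1],
    "Pclass": [3, 1, 3, 1, 2, 3],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, np.nan, 54.0, np.nan, 30.0],
    "Fare": [7.25, 71.28, 7.92, 51.86, 13.00, 8.05],
    "Cabin": [np.nan, "C85", np.nan, "E46", np.nan, np.nan],
})

# Data cleaning: count missing values in each column.
missing = df.isnull().sum()

# Data cleaning: fill missing ages with the median of the observed ages.
df["Age"] = df["Age"].fillna(df["Age"].median())

# Data cleaning: drop columns assumed to be unnecessary (an assumption here).
df = df.drop(columns=["Cabin", "PassengerId"])

# EDA: survival rate per gender (mean of the 0/1 Survived flag).
survival_by_sex = df.groupby("Sex")["Survived"].mean()

# Correlation matrix over the numeric columns only.
corr = df.select_dtypes(include="number").corr()

print(missing)
print(survival_by_sex)
print(corr.round(2))
```

With the real data, df would instead come from pd.read_csv on the train file, and the corr table could be passed to seaborn.heatmap (for example with annot=True) to draw the heatmap mentioned in task 6.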