"Analyzing Amazon Sales Data" is a project focused on leveraging ETL techniques to extract, transform, and load Amazon sales datasets. It aims to uncover sales trends on a month-wise, year-wise, and yearly-month-wise basis.
Sales management has gained importance as companies face increasing competition and need better methods of distribution to reduce costs and increase profits. Sales management today is the most important function in a commercial and business enterprise. The task is to perform ETL (Extract, Transform, Load) on an Amazon sales dataset and find the sales trend month-wise, year-wise, and yearly-month-wise. Find key metrics and factors, and show the meaningful relationships between attributes.
Find the dataset in the given link.
Technologies: Data Science
Domain: E-commerce
Tools: Tableau Public
Language: Python
The main goal of the project is to find key metrics and factors and then show meaningful relationships between them based on different features available in the dataset.
Data Collection : Imported the datasets available in the project using the Pandas library.
Data Cleaning : Removed missing values and created new features as suggested by the insights.
Data Preprocessing : Restructured the data to make it easier to understand and more suitable for statistical analysis.
Data Analysis : Analyzed the dataset using Pandas, NumPy, Matplotlib and Seaborn.
Data Visualization : Plotted graphs to get insights about dependent and independent variables. Tableau was also used for data visualization.
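The collection, cleaning, and preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline sample stands in for the real CSV, the `Order Date` column name and date format are assumptions, and `load_and_prepare` is a hypothetical helper.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the Amazon sales CSV.
# The "Order Date" column name, the date format, and all values are assumptions.
RAW_CSV = """Order Date,Units Sold,Total Revenue,Total Profit
5/28/2010,100,1000.0,400.0
7/5/2013,50,1000.0,600.0
2/8/2017,,500.0,200.0
"""

def load_and_prepare(source) -> pd.DataFrame:
    """Extract the CSV, drop incomplete rows, and derive date features."""
    df = pd.read_csv(source)                 # extract
    df = df.dropna().copy()                  # clean: remove rows with missing values
    df["Order Date"] = pd.to_datetime(df["Order Date"], format="%m/%d/%Y")
    # transform: new features used later for trend analysis
    df["Year"] = df["Order Date"].dt.year
    df["Month"] = df["Order Date"].dt.month
    df["Year-Month"] = df["Order Date"].dt.strftime("%Y-%m")
    return df

sales = load_and_prepare(io.StringIO(RAW_CSV))
print(sales[["Year", "Month", "Year-Month", "Total Profit"]])
```

With the real dataset, a file path would be passed to `load_and_prepare` in place of the in-memory sample.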
Amazon_sales_data_analysis.ipynb
- Total Revenue and Total Profit have a high Pearson correlation coefficient, indicating a strong relationship between the two variables. Higher revenue goes hand in hand with higher profit, and vice versa.
- The correlation coefficient between Units Sold and Unit Cost is negative, which suggests an inverse relationship between product quantity and cost. The same holds for Units Sold and Unit Price: the more expensive a product is, the fewer units of it are sold.
- From the heatmap, we can deduce that there is a substantial correlation between Total Cost, Unit Price, Unit Cost, and Total Profit.
- Units Sold is largely independent of Unit Price and Unit Cost. The quantity of a product sold is barely affected by the price per unit, just as the cost of a unit is barely affected by the quantity of units sold.
- The relationship between Total Revenue and Unit Cost, Unit Price, and Total Cost is essentially nonexistent.
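The correlation findings above come from a Pearson correlation matrix. A minimal sketch of how such a matrix can be computed with pandas is shown below; the rows are made-up toy values chosen only to make the calculation visible, and only the column names are taken from the write-up.

```python
import pandas as pd

# Toy rows with made-up values; they do not reproduce the project's results.
df = pd.DataFrame({
    "Units Sold":    [100, 200, 300, 400],
    "Unit Price":    [50.0, 40.0, 30.0, 20.0],
    "Total Revenue": [5000.0, 8000.0, 9000.0, 8000.0],
    "Total Profit":  [1000.0, 1700.0, 1800.0, 1500.0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would draw a heatmap of this matrix.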
Tableau Link : Click here
As we can see, total profit and total revenue rise and fall together. Both profit and revenue were highest in 2012 and lowest in 2011.
Tableau Link : Click here
Both profit and revenue were highest in the month of February and lowest in the month of August.
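The year-wise, month-wise, and yearly-month-wise trends described here can be obtained by grouping on parts of the order date. The sketch below assumes an `Order Date` column (the name is an assumption) and uses invented values, so its output says nothing about the real dataset.

```python
import pandas as pd

# Invented rows for illustration only.
df = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2011-02-10", "2011-08-03", "2012-02-21", "2012-02-25"]
    ),
    "Total Revenue": [100.0, 50.0, 300.0, 200.0],
    "Total Profit": [30.0, 10.0, 90.0, 70.0],
})
metrics = ["Total Revenue", "Total Profit"]

# Year-wise, month-wise (across all years), and yearly-month-wise totals.
by_year = df.groupby(df["Order Date"].dt.year)[metrics].sum()
by_month = df.groupby(df["Order Date"].dt.month)[metrics].sum()
by_year_month = df.groupby(df["Order Date"].dt.to_period("M"))[metrics].sum()

# Periods with the highest and lowest total profit.
best_year = by_year["Total Profit"].idxmax()
worst_year = by_year["Total Profit"].idxmin()
print(by_year_month)
print("best year:", best_year, "worst year:", worst_year)
```

Grouping on `dt.to_period("M")` keeps each year-month pair distinct, whereas grouping on `dt.month` pools the same calendar month across years.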
Tableau Dashboard : Click here
The Sub-Saharan Africa region had the highest profit and revenue. The North America region had the lowest profit and revenue.
Tableau Dashboard : Click here
- Units Sold vs. Total Profit : As we can see, the highest profits were generated when the number of units sold was between 5000 and 10000, i.e. the more units sold, the more profit is generated.
- Total Revenue vs. Total Profit : The scatter plot suggests that total profit and total revenue increase together.
- Unit Cost vs. Units Sold : Here, the two variables 'Units Sold' and 'Unit Cost' are inversely related to some extent. Products that sell in larger quantities tend to have a lower unit cost, and vice versa.
- Total Cost vs. Units Sold : As we can see, the highest total costs occurred when the number of units sold was between 7000 and 9000. The more units of a product are sold, the higher the total cost associated with it.
Tableau Dashboard : Click here
- The most units were sold for the items 'Cosmetics' and 'Clothes', closely followed by 'Beverages'.
- Even though more units of 'Clothes' were sold, they generated less profit and less revenue.
- The most revenue was generated by the items 'Office Supplies' and 'Cosmetics', closely followed by 'Household'.
- The highest net profit margin was achieved by 'Clothes' and 'Cosmetics'.
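The item-wise comparison above can be sketched with a single group-by. The write-up does not define net profit margin, so the sketch assumes the usual definition, total profit divided by total revenue; the `Item Type` column name and all values are likewise assumptions.

```python
import pandas as pd

# Made-up rows; "Item Type" is an assumed column name.
df = pd.DataFrame({
    "Item Type": ["Clothes", "Clothes", "Cosmetics", "Office Supplies"],
    "Units Sold": [500, 300, 400, 100],
    "Total Revenue": [1000.0, 600.0, 2000.0, 4000.0],
    "Total Profit": [600.0, 360.0, 800.0, 800.0],
})

# Sum units, revenue, and profit per item type.
by_item = df.groupby("Item Type")[
    ["Units Sold", "Total Revenue", "Total Profit"]
].sum()

# Assumed definition: net profit margin = total profit / total revenue.
by_item["Net Profit Margin"] = by_item["Total Profit"] / by_item["Total Revenue"]
print(by_item.sort_values("Net Profit Margin", ascending=False))
```

Computing the margin on the summed totals, rather than averaging per-order margins, weights each order by its revenue.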
Tableau Link : Click here
The highest net profit margin was achieved in the year 2012 and the lowest in the year 2011.
Tableau Link : Click here
The highest net profit margin was achieved in the month of July and the lowest in the month of December.
The most units were sold in the year 2012 and in the month of July. The fewest units were sold in the year 2016 and in the month of March.
Profit
Tableau Dashboard : Click here
Tableau Link : Click here
The highest profits were achieved on 5th July, 2013 and 20th July, 2013.
Tableau Link : Click here
Sales revenue was highest on 8th February, 2017 and 16th January, 2015.
Tableau Link : Click here
The most units were sold on 28th May, 2010 and 30th June, 2010.
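Date-wise peaks like the ones listed above can be found by summing a metric per day and taking the largest values. The following is a small sketch with invented rows and an assumed `Order Date` column; the same pattern applies to revenue and units sold.

```python
import pandas as pd

# Invented rows; two orders share a date to show the per-day sum.
df = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2013-07-05", "2013-07-05", "2013-07-20", "2015-01-16"]
    ),
    "Total Profit": [400.0, 300.0, 650.0, 100.0],
})

# Total profit per calendar day, then the two most profitable days.
daily_profit = df.groupby("Order Date")["Total Profit"].sum()
top_two = daily_profit.nlargest(2)
print(top_two)
```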
- Total profit and total revenue increase together.
- Total profit and units sold increase together.
- The two variables 'Units Sold' and 'Unit Cost' are inversely related to some extent. Products that sell in larger quantities tend to have a lower unit cost, and vice versa.
- The highest net profit margin was achieved by the items 'Clothes' and 'Cosmetics'.