"Analyzing Amazon Sales Data" is a project focused on leveraging ETL techniques to extract, transform, and load Amazon sales datasets. It aims to uncover sales trends on a month-wise, year-wise, and yearly-month-wise basis.

Swagatika-Meher/Analyzing-Amazon-Sales-data

Analyzing-Amazon-Sales-data

Problem Statement:

Sales management has gained importance as businesses face increasing competition and need better distribution methods to reduce costs and increase profits. Today it is one of the most important functions in a commercial enterprise. The task: perform ETL (Extract, Transform, Load) on an Amazon sales dataset and find the sales trends month-wise, year-wise, and year-month-wise. Identify key metrics and factors, and show the meaningful relationships between attributes.

Dataset:

Find the dataset in the given link.

Technologies: Data Science

Domain: E-commerce

Tools: Tableau Public

Language: Python

Approach

The main goal of the project is to find key metrics and factors and then show meaningful relationships between them based on different features available in the dataset.

Data Collection : Imported the project's datasets using the Pandas library.

Data Cleaning : Removed missing values and created new features based on initial insights.

Data Preprocessing : Restructured the data to make it easier to understand and suitable for statistical analysis.

Data Analysis : Analyzed the dataset using Pandas, NumPy, Matplotlib, and Seaborn.

Data Visualization : Plotted graphs to gain insights into dependent and independent variables. Tableau was also used for data visualization.
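The collection, cleaning, and preprocessing steps above could look roughly like the sketch below. It is not the notebook's actual code: the inline CSV, the `load_and_clean` helper, the `Order Date` column name, and the date format are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real CSV; column names are assumed.
RAW_CSV = """Region,Item Type,Order Date,Units Sold,Unit Price,Unit Cost,Total Revenue,Total Cost,Total Profit
Europe,Cosmetics,2/1/2012,100,437.20,263.33,43720.00,26333.00,17387.00
Asia,Clothes,7/5/2013,50,109.28,35.84,5464.00,1792.00,3672.00
Europe,Beverages,,20,47.45,31.79,949.00,635.80,313.20
"""


def load_and_clean(source):
    """Extract rows from a CSV source, drop incomplete rows, add date features."""
    df = pd.read_csv(source)
    # Transform: parse dates; missing or unparseable values become NaT.
    df["Order Date"] = pd.to_datetime(
        df["Order Date"], format="%m/%d/%Y", errors="coerce"
    )
    # Cleaning: remove rows that contain any missing value.
    df = df.dropna().reset_index(drop=True)
    # Feature creation: calendar fields used later for trend analysis.
    df["Year"] = df["Order Date"].dt.year
    df["Month"] = df["Order Date"].dt.month
    df["Year-Month"] = df["Order Date"].dt.to_period("M").astype(str)
    return df


sales = load_and_clean(io.StringIO(RAW_CSV))
print(sales[["Order Date", "Year", "Month", "Year-Month"]])
```

With a real file, a path would be passed to `load_and_clean` in place of the in-memory buffer.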

Data Collection, Data Cleaning, and Data analysis:

Amazon_sales_data_analysis.ipynb

Data Visualization:

Correlation between dataset attributes

[Figure: correlation heatmap]

  1. Total Revenue and Total Profit have a high Pearson correlation coefficient, indicating a strong positive linear relationship: higher revenue goes with higher profit, and vice versa.
  2. The correlation coefficient between Units Sold and Unit Cost is negative, which suggests an inverse relationship between product quantity and cost. The same holds for Units Sold and Unit Price: the more expensive a product is, the fewer units of it are sold.
  3. The heatmap shows a substantial correlation between Total Cost, Unit Price, Unit Cost, and Total Profit.
  4. That negative correlation is weak, however: Units Sold is largely independent of Unit Price and Unit Cost. The quantity of a product sold is barely affected by its price per unit, and the cost of a unit is barely affected by the quantity sold.
  5. The relationship between Total Revenue and Unit Cost, Unit Price, and Total Cost is essentially nonexistent.
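A correlation matrix like the one behind these observations can be computed with `DataFrame.corr`. The sketch below uses made-up numbers, so its coefficients will not match the project's heatmap. Only the column names are taken from the text above.

```python
import pandas as pd

# Toy data (hypothetical values); revenue, cost and profit are derived columns.
df = pd.DataFrame({
    "Units Sold": [100, 250, 400, 800, 1200],
    "Unit Price": [650.0, 440.0, 255.0, 110.0, 9.5],
    "Unit Cost": [500.0, 260.0, 160.0, 36.0, 7.0],
})
df["Total Revenue"] = df["Units Sold"] * df["Unit Price"]
df["Total Cost"] = df["Units Sold"] * df["Unit Cost"]
df["Total Profit"] = df["Total Revenue"] - df["Total Cost"]

# Pairwise Pearson correlation coefficients between all numeric columns.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would render the matrix as an annotated heatmap.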

Yearly Analysis

Tableau Link : Click here

Sale_2

Total profit and total revenue rise and fall together across years. Both peaked in 2012, and both were lowest in 2011.
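A year-wise roll-up of this kind can be sketched with a pandas `groupby`. The figures below are invented for illustration and the `Order Date` column name is an assumption, so only the shape of the computation carries over to the real dataset.

```python
import pandas as pd

# Hypothetical orders; values are made up for illustration.
sales = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2011-03-01", "2011-09-15", "2012-02-10", "2012-07-05", "2013-01-20"]
    ),
    "Total Revenue": [1000.0, 1500.0, 4000.0, 3500.0, 3000.0],
    "Total Profit": [300.0, 400.0, 1500.0, 1200.0, 900.0],
})

# Sum revenue and profit per calendar year.
yearly = (
    sales.groupby(sales["Order Date"].dt.year)[["Total Revenue", "Total Profit"]]
    .sum()
    .rename_axis("Year")
)

# Years with the highest and lowest total profit.
best_year = yearly["Total Profit"].idxmax()
worst_year = yearly["Total Profit"].idxmin()
print(yearly)
print("best:", best_year, "worst:", worst_year)
```

Grouping by `dt.month` instead of `dt.year` gives the corresponding month-wise totals.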

Monthly Analysis

Tableau Link : Click here

Sale_4

Profit and revenue were both highest in February and lowest in August.

Region-wise Analysis

Tableau Dashboard : Click here

Dashboard 2

Profit and revenue were both highest in the Sub-Saharan Africa region and lowest in the North America region.

Analogy

Tableau Dashboard : Click here

Dashboard 5

  1. Units Sold vs. Total Profit : The highest profits were generated when the number of units sold was between 5000 and 10000. In general, the more units sold, the more profit generated.
  2. Total Revenue vs. Total Profit : The scatter plot suggests that total profit increases steadily with total revenue.
  3. Unit Cost vs. Units Sold : 'Units Sold' and 'Unit Cost' are inversely related to some extent. As more units of a product are sold, its unit cost tends to be lower, and vice versa.
  4. Total Cost vs. Units Sold : The highest total costs occurred when the number of units sold was between 7000 and 9000. The more units of a product sold, the higher its total cost.
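One way to check range statements like "between 5000 and 10000 units" numerically is to bin Units Sold with `pd.cut` and total the profit per band. This is a sketch on invented data. The band edges and labels are arbitrary choices for the example, not values from the dashboards.

```python
import pandas as pd

# Hypothetical orders; values are made up for illustration.
sales = pd.DataFrame({
    "Units Sold": [800, 2500, 4900, 5200, 7600, 9900],
    "Total Profit": [100.0, 300.0, 600.0, 900.0, 1200.0, 1500.0],
})

# Right-closed bands: (0, 2500], (2500, 5000], (5000, 7500], (7500, 10000].
edges = [0, 2500, 5000, 7500, 10000]
labels = ["0-2500", "2500-5000", "5000-7500", "7500-10000"]
sales["Units Band"] = pd.cut(sales["Units Sold"], bins=edges, labels=labels)

# Total profit per band, and the band where profit is concentrated.
band_profit = sales.groupby("Units Band", observed=True)["Total Profit"].sum()
top_band = band_profit.idxmax()
print(band_profit)
print("highest-profit band:", top_band)
```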

Selling item type

Tableau Dashboard : Click here

Dashboard 4

  • The most units were sold for 'Cosmetics' and 'Clothes', closely followed by 'Beverages'.
  • Even though more units of 'Clothes' were sold, they generated less profit and less revenue.
  • The most revenue came from 'Office Supplies' and 'Cosmetics', closely followed by 'Household'.
  • The highest net profit margins were achieved by 'Clothes' and 'Cosmetics'.
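The item-type comparison can be sketched as a per-category aggregation. The README does not define net profit margin, so the sketch assumes it is Total Profit divided by Total Revenue. The rows are invented, so the resulting ranking is illustrative only.

```python
import pandas as pd

# Hypothetical orders; values are made up for illustration.
sales = pd.DataFrame({
    "Item Type": ["Cosmetics", "Clothes", "Cosmetics", "Office Supplies", "Clothes"],
    "Units Sold": [100, 300, 200, 50, 100],
    "Total Revenue": [40000.0, 30000.0, 80000.0, 30000.0, 10000.0],
    "Total Profit": [16000.0, 20000.0, 32000.0, 6000.0, 7000.0],
})

# Totals per item type.
by_item = sales.groupby("Item Type")[
    ["Units Sold", "Total Revenue", "Total Profit"]
].sum()

# Assumed definition: net profit margin = total profit / total revenue.
by_item["Net Profit Margin"] = by_item["Total Profit"] / by_item["Total Revenue"]

ranked = by_item.sort_values("Net Profit Margin", ascending=False)
print(ranked)
```

Computing the margin from the summed totals, rather than averaging per-order margins, weights each order by its revenue.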

Net Profit margin yearly

Tableau Link : Click here

Sale_14

The highest net profit margin was achieved in 2012 and the lowest in 2011.

Net Profit margin monthly

Tableau Link : Click here

Sale_15

The highest net profit margin was achieved in July and the lowest in December.

Year-month wise

  • Units Sold

    Tableau Dashboard : Click here

    Tableau Dashboard : Click here

Dashboard 1

The most units were sold in the year 2012 and in the month of July. The fewest units were sold in the year 2016 and in the month of March.

  • Profit

    Tableau Dashboard : Click here
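A year-month breakdown like the ones in these dashboards can be approximated in pandas by grouping on both year and month. The sketch below uses invented orders and an assumed `Order Date` column. Swapping `Units Sold` for `Total Profit` gives the profit view.

```python
import pandas as pd

# Hypothetical orders; values are made up for illustration.
sales = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2012-07-03", "2012-07-21", "2012-03-09", "2016-03-14", "2016-07-30"]
    ),
    "Units Sold": [5000, 4000, 1000, 500, 2000],
})
sales["Year"] = sales["Order Date"].dt.year
sales["Month"] = sales["Order Date"].dt.month

# Units sold per (year, month) pair.
units = sales.groupby(["Year", "Month"])["Units Sold"].sum()

# Year x month grid; combinations with no orders are filled with 0.
grid = units.unstack(fill_value=0)

# The (year, month) pair with the most units sold.
peak_year, peak_month = units.idxmax()
print(grid)
print("peak:", peak_year, peak_month)
```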

Top 10 days when profits are highest

Tableau Link : Click here

Sale_16

The highest profits were achieved on 5th July 2013 and 20th July 2013.

Top 10 days when Sale Revenue are highest

Tableau Link : Click here

Sale_17

Sale revenues were highest on 8th February 2017 and 16th January 2015.

Top 10 days when Sale Units are highest

Tableau Link : Click here

Sale_18

The most units were sold on 28th May 2010 and 30th June 2010.
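The three "top 10 days" rankings share one pattern: total a metric per day, then keep the largest values. A minimal sketch follows, with a hypothetical `top_days` helper, an assumed `Order Date` column, and invented rows.

```python
import pandas as pd


def top_days(df, metric, n=10):
    """Sum `metric` per calendar day and return the n largest daily totals."""
    daily = df.groupby("Order Date")[metric].sum()
    return daily.nlargest(n)


# Hypothetical orders; two of them fall on the same day.
sales = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2013-07-05", "2013-07-05", "2013-07-20", "2010-05-28"]
    ),
    "Total Profit": [500.0, 700.0, 1000.0, 200.0],
    "Units Sold": [10, 20, 15, 90],
})

top_profit = top_days(sales, "Total Profit", n=2)
top_units = top_days(sales, "Units Sold", n=1)
print(top_profit)
print(top_units)
```

Calling `top_days(sales, "Total Revenue")` on the full dataset would give the revenue ranking in the same way.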

Conclusion

  • Total profit and total revenue are strongly positively correlated.
  • Total profit tends to increase with units sold.
  • 'Units Sold' and 'Unit Cost' are inversely related to some extent. As more units of a product are sold, its unit cost tends to be lower, and vice versa.
  • The highest net profit margins were achieved by the items 'Clothes' and 'Cosmetics'.
