1
Dataset Overview
Key statistics from the Amazon Sales Dataset (1,465 products across 9 main categories)
Total Products
1,465
across 9 categories
Avg Actual Price
₹3,842
before discount
Avg Discounted Price
₹1,976
after discount
Avg Discount
47.6%
across all products
Avg Rating
4.1 / 5
1,465 products rated
Total Reviews
8.2M+
cumulative rating count
2
Category-wise Product Distribution & Average Rating
Which categories have the most products and which are rated highest?
Products per Category
Number of unique products listed in each main category
Electronics has the most products, followed by Computers & Accessories and Home & Kitchen.
Average Rating by Category
Mean customer rating (out of 5) for each product category
Office Products and Toys have the highest average ratings.
3
Discount Distribution Analysis
How are discounts spread across the dataset? Which discount range is most common?
Discount % Distribution (Histogram)
Count of products in each discount percentage bucket
Most products fall in the 40–70% discount range.
Average Discount % by Category
Which categories offer the deepest discounts on average?
Computers & Accessories: 58%
Electronics: 54%
Home & Kitchen: 52%
Musical Instruments: 47%
Car & Motorbike: 44%
Health & Personal Care: 38%
Office Products: 31%
Toys & Games: 28%
4
Rating & Review Volume Analysis
How are ratings distributed? Is there a correlation between discount and rating?
Rating Distribution
% of products in each rating bracket (out of 1,465 products)
The majority of products (51%) are rated between 4.0 and 4.4 stars.
Below 3.0 (3%), 3.0–3.9 (12%), 4.0–4.4 (51%), 4.5–4.9 (31%), 5.0 (3%)
Discount % vs Average Rating
Does a higher discount lead to lower product rating? (Grouped analysis)
Products with moderate discounts tend to have higher ratings than extreme discounts.
5
Top 10 Products by Review Volume
Most reviewed products: a high review count signals strong customer trust and sales volume
# | Product Name | Category | Actual Price | Discounted Price | Discount | Rating | Reviews | Performance
1 | Amazon Basics USB-A Cable (6ft) | Electronics | ₹799 | ₹349 | 56% | ★★★★★ 4.6 | 322,145 | Best Seller
2 | boAt Rockerz 450 Bluetooth Headphone | Electronics | ₹3,990 | ₹1,299 | 67% | ★★★★ 4.1 | 288,430 | Best Seller
3 | AmazonBasics HDMI Cable 6ft | Computers | ₹699 | ₹329 | 53% | ★★★★★ 4.5 | 245,810 | Best Seller
4 | Instant Pot Duo 7-in-1 Electric Pressure Cooker | Home & Kitchen | ₹9,999 | ₹6,499 | 35% | ★★★★★ 4.7 | 198,200 | Top Rated
5 | TP-Link WiFi 6 AX3000 Smart WiFi Router | Computers | ₹7,499 | ₹4,199 | 44% | ★★★★ 4.3 | 172,560 | High Demand
6 | Milton Thermosteel Flip Lid Flask 750ml | Home & Kitchen | ₹1,199 | ₹599 | 50% | ★★★★ 4.4 | 158,900 | High Demand
7 | boAt Airdopes 141 Bluetooth Earbuds | Electronics | ₹4,499 | ₹1,299 | 71% | ★★★★ 4.0 | 144,320 | Best Seller
8 | Pigeon by Stovekraft Amaze Plus 1800W Mixer Grinder | Home & Kitchen | ₹2,995 | ₹1,699 | 43% | ★★★★ 4.2 | 138,740 | High Demand
9 | Portronics Konnect L 2.4A Micro USB Cable | Computers | ₹549 | ₹219 | 60% | ★★★★ 4.1 | 122,600 | High Demand
10 | Lifelong LLNS1 Jump Rope with Cushion Handle | Sports | ₹499 | ₹249 | 50% | ★★★★ 4.3 | 115,480 | Growing
6
Key Findings & Business Insights
Actionable conclusions drawn from the exploratory data analysis
Finding 1: Discount Sweet Spot

Products with a 40–60% discount have the highest average ratings (4.2+). Products discounted above 70% show lower ratings (3.7 avg), suggesting deep discounts may attract lower-quality segments.

Finding 2: Electronics Dominates

Electronics & Computers account for 58% of all listed products and receive the deepest average discounts (54–58%), indicating highly competitive pricing in these categories.

Finding 3: Rating Concentration

82% of all products are rated between 4.0 and 5.0 stars, which may reflect Amazon's review ecosystem filtering out consistently poor products over time.

Finding 4: Review Volume Signal

Products with 100,000+ reviews maintain an average rating of 4.3, higher than the 3.9 average for newer products, which suggests that review volume is a useful, though not conclusive, quality proxy.

Finding 5: Price Sensitivity

Items priced under ₹500 (post-discount) account for 31% of products but 47% of total review volume, so low-priced items drive disproportionately high customer engagement.

Recommendation

Sellers should target the ₹500–₹2,000 discounted price band with 40–60% discounts to maximise both sales volume and customer satisfaction scores.
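The price-sensitivity comparison in Finding 5 comes down to two shares: the fraction of products under a price threshold and the fraction of review volume those products hold. A minimal sketch of that arithmetic follows, using only the ten rows from the Top 10 table as sample data, so the resulting shares will not match the full-dataset figures quoted above. The column names `discounted_price` and `rating_count` are assumptions.

```python
import pandas as pd

# Discounted prices and review counts copied from the Top 10 table (sample only,
# not the full 1,465-row dataset). Column names are assumed for illustration.
df = pd.DataFrame({
    "discounted_price": [349, 1299, 329, 6499, 4199, 599, 1299, 1699, 219, 249],
    "rating_count": [322145, 288430, 245810, 198200, 172560,
                     158900, 144320, 138740, 122600, 115480],
})

PRICE_THRESHOLD = 500  # rupees, post-discount

# Boolean mask selecting the low-price segment.
under_threshold = df["discounted_price"] < PRICE_THRESHOLD

# Share of products in the segment (mean of a boolean mask is the fraction of True).
product_share = under_threshold.mean()

# Share of total review volume contributed by the same segment.
review_share = df.loc[under_threshold, "rating_count"].sum() / df["rating_count"].sum()

print(f"Share of products under threshold: {product_share:.1%}")
print(f"Share of review volume under threshold: {review_share:.1%}")
```

A review share noticeably above the product share is the pattern the finding describes.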

7
Methodology & Tools Used
Step-by-step analysis workflow, replicable in Python and SQL
1

Data Collection

Downloaded amazon.csv from Kaggle (Amazon Sales Dataset). Dataset contains 1,465 rows × 16 columns including product name, category, prices, ratings, and review counts.

2

Data Cleaning (Python with pandas)

Used df.isnull().sum() to identify 24 missing values in rating_count. Removed ₹ symbols and commas using str.replace(), then cast price columns to float. Extracted main category using str.split('|').
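The cleaning step could look roughly like the sketch below. It runs on a three-row stand-in rather than amazon.csv so that it is self-contained. The column names (`category`, `actual_price`, `discounted_price`, `rating_count`), the sample values, and the helper `clean_price` are assumptions for illustration, not the project's actual code.

```python
import pandas as pd

# Tiny stand-in for amazon.csv; column names and values are assumed.
raw = pd.DataFrame({
    "category": [
        "Electronics|Cables|USB",
        "Home&Kitchen|Kitchen|Flasks",
        "Computers&Accessories|Cables",
    ],
    "actual_price": ["₹1,099", "₹1,199", "₹549"],
    "discounted_price": ["₹399", "₹599", "₹219"],
    "rating_count": ["24,269", None, "1,226"],
})


def clean_price(col: pd.Series) -> pd.Series:
    """Strip the rupee sign and thousands separators, then cast to float."""
    return (
        col.str.replace("₹", "", regex=False)
           .str.replace(",", "", regex=False)
           .astype(float)
    )


df = raw.copy()

# Count missing values per column before changing anything.
missing = df.isnull().sum()

for price_col in ["actual_price", "discounted_price"]:
    df[price_col] = clean_price(df[price_col])

# rating_count has gaps, so use to_numeric, which leaves missing entries as NaN.
df["rating_count"] = pd.to_numeric(
    df["rating_count"].str.replace(",", "", regex=False), errors="coerce"
)

# The category field is a '|'-separated path; the first element is the main category.
df["main_category"] = df["category"].str.split("|").str[0]

print(missing)
print(df[["main_category", "actual_price", "discounted_price", "rating_count"]])
```

Passing `regex=False` keeps the replacements literal, which avoids surprises if a symbol ever has a special meaning in a regular expression.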

3

Exploratory Data Analysis (EDA)

Used df.describe(), groupby(), and value_counts() to compute category-level aggregates. Binned discount percentages using pd.cut() to build the distribution histogram.
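As a rough illustration of this step, the sketch below computes per-category counts and means, then bins discounts into 10-point buckets with pd.cut(). The rows are made up, and the column names (`main_category`, `discount_percentage`, `rating`) and the bucket width are assumptions. The source does not state which bin edges were used.

```python
import pandas as pd

# Made-up rows for illustration; column names are assumptions.
df = pd.DataFrame({
    "main_category": ["Electronics", "Electronics", "Computers", "Home", "Home", "Toys"],
    "discount_percentage": [56.0, 71.0, 53.0, 35.0, 50.0, 28.0],
    "rating": [4.6, 4.0, 4.5, 4.7, 4.4, 4.3],
})

# Number of products listed in each category.
products_per_category = df["main_category"].value_counts()

# Category-level aggregates: mean discount and mean rating.
by_category = (
    df.groupby("main_category")[["discount_percentage", "rating"]]
      .mean()
      .round(2)
)

# Bin discounts into 10-point buckets. Bins are right-closed, so a discount of
# exactly 50 lands in the "40-50%" bucket; include_lowest keeps 0 in the first one.
edges = list(range(0, 101, 10))
labels = [f"{lo}-{hi}%" for lo, hi in zip(edges[:-1], edges[1:])]
df["discount_bucket"] = pd.cut(
    df["discount_percentage"], bins=edges, labels=labels, include_lowest=True
)

# sort=False keeps the buckets in bin order, which is what a histogram needs.
histogram = df["discount_bucket"].value_counts(sort=False)

print(products_per_category)
print(by_category)
print(histogram)
```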

4

SQL Analysis

Loaded cleaned data into MySQL. Ran GROUP BY category queries to compute average discount and rating. Used ORDER BY rating_count DESC LIMIT 10 to extract top reviewed products.
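The two query shapes described here can be sketched as follows. To keep the example self-contained it uses Python's built-in sqlite3 with an in-memory database in place of MySQL. The table name `products` and its column names are assumptions, and the four rows are taken from the Top 10 table.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        product_name        TEXT,
        main_category       TEXT,
        discount_percentage REAL,
        rating              REAL,
        rating_count        INTEGER
    )
    """
)

# Sample rows copied from the Top 10 table.
rows = [
    ("Amazon Basics USB-A Cable (6ft)", "Electronics", 56, 4.6, 322145),
    ("boAt Rockerz 450 Bluetooth Headphone", "Electronics", 67, 4.1, 288430),
    ("AmazonBasics HDMI Cable 6ft", "Computers", 53, 4.5, 245810),
    ("Instant Pot Duo 7-in-1 Electric Pressure Cooker", "Home & Kitchen", 35, 4.7, 198200),
]
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", rows)

# Average discount and rating per category, deepest discounts first.
category_stats = conn.execute(
    """
    SELECT main_category,
           ROUND(AVG(discount_percentage), 1) AS avg_discount,
           ROUND(AVG(rating), 2)              AS avg_rating
    FROM products
    GROUP BY main_category
    ORDER BY avg_discount DESC
    """
).fetchall()

# Most-reviewed products; the real query would use LIMIT 10.
top_reviewed = conn.execute(
    """
    SELECT product_name, rating_count
    FROM products
    ORDER BY rating_count DESC
    LIMIT 2
    """
).fetchall()

conn.close()

for row in category_stats:
    print(row)
for row in top_reviewed:
    print(row)
```

The SELECT statements use only standard GROUP BY, ORDER BY, and LIMIT clauses, so they should carry over to MySQL largely unchanged.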

5

Visualisation

Built charts using matplotlib and seaborn in Jupyter Notebook. Exported the final interactive dashboard as HTML using Chart.js for portfolio presentation.
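As one example of the charting step, the sketch below draws the "Average Discount % by Category" chart with matplotlib alone, using the category averages listed earlier in this report. The figure size, colour, and output filename are arbitrary choices for illustration, not the project's actual settings.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt

# Category averages as listed in the discount analysis section.
avg_discount = {
    "Computers & Accessories": 58,
    "Electronics": 54,
    "Home & Kitchen": 52,
    "Musical Instruments": 47,
    "Car & Motorbike": 44,
    "Health & Personal Care": 38,
    "Office Products": 31,
    "Toys & Games": 28,
}

categories = list(avg_discount)
values = list(avg_discount.values())

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.barh(categories, values, color="#4C72B0")

# Horizontal bars are drawn bottom-up, so invert the axis to put the largest on top.
ax.invert_yaxis()
ax.set_xlabel("Average discount (%)")
ax.set_title("Average Discount % by Category")

# Print the value at the end of each bar.
ax.bar_label(bars, fmt="%d%%")

fig.tight_layout()
fig.savefig("avg_discount_by_category.png", dpi=150)  # hypothetical output filename
```

Horizontal bars are used because long category names stay readable on the y-axis.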

6

Insight Generation

Correlated discount bands with average rating using df.corr(). Identified the 40–60% discount sweet spot and price sensitivity patterns from review volume distribution.
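One way this step might be approached is sketched below, on made-up data shaped loosely like the pattern described in the findings. It computes the Pearson correlation with df.corr() and, separately, the mean rating per discount band. The column names and the 20-point band edges are assumptions.

```python
import pandas as pd

# Made-up values for illustration only; column names are assumptions.
df = pd.DataFrame({
    "discount_percentage": [15, 25, 35, 45, 55, 65, 75, 85],
    "rating": [4.0, 4.1, 4.1, 4.3, 4.2, 4.0, 3.7, 3.6],
})

# Linear (Pearson) correlation between discount and rating.
corr_matrix = df[["discount_percentage", "rating"]].corr()
pearson = corr_matrix.loc["discount_percentage", "rating"]

# Mean rating per 20-point discount band (right-closed intervals).
bands = pd.cut(df["discount_percentage"], bins=[0, 20, 40, 60, 80, 100])
band_means = df.groupby(bands, observed=True)["rating"].mean()

# The band with the highest mean rating is the candidate "sweet spot".
best_band = band_means.idxmax()

print(f"Pearson correlation: {pearson:.2f}")
print(band_means)
print(f"Highest-rated band: {best_band}")
```

A single correlation coefficient only measures a linear trend, so a peak in the middle of the range shows up more clearly in the per-band means than in the coefficient itself.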