🧠 Amazon Product Review Analyzer

NLP pipeline for sentiment analysis and topic discovery on Amazon product reviews

📘 Overview

The Amazon Product Review Analyzer is a Natural Language Processing (NLP) project that extracts insights from thousands of Amazon reviews.
It combines BERT for sentiment classification with LDA and BERTopic for topic modeling to uncover recurring complaints and praise themes.
Results are visualized through interactive Plotly charts for easy interpretation.

🚀 Features

🔍 Sentiment Analysis: Fine-tuned BERT model (via HuggingFace Transformers) to classify reviews as positive, neutral, or negative.
🧩 Topic Modeling: LDA and BERTopic used to uncover common complaint categories and product feedback themes.
🧹 Data Processing: Cleaned and preprocessed 10k+ Kaggle Amazon reviews using pandas.
📊 Visualization: Interactive dashboards built with Plotly for sentiment and topic exploration.
⚙️ Pipeline Design: Modular Python scripts for data prep, training, evaluation, and topic discovery.

🧰 Tech Stack

Technology	Purpose
Python	Core language for data processing, modeling, and visualization
pandas	Data cleaning, manipulation, and preprocessing
HuggingFace Transformers	BERT model fine-tuning for sentiment classification
scikit-learn (LDA)	Classical topic modeling using Latent Dirichlet Allocation
BERTopic	Transformer-based topic discovery that clusters reviews by contextual embeddings
Plotly	Interactive visualizations and charts for results presentation

📊 Results

92% F1 score on held-out test data using fine-tuned BERT for sentiment classification.
BERTopic/LDA extracted clear complaint categories such as packaging issues, delivery delays, and product defects.
The charts make customer pain points easy to spot and point to potential product improvements.

🧪 How It Works

Data Preparation:
- Load dataset (reviews.csv) using pandas.
- Clean and normalize text (remove punctuation, stopwords, URLs).
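
The cleaning step above could look roughly like the sketch below. It is not the project's actual code: the `clean_text` helper, the tiny `STOPWORDS` set, and the URL pattern are all assumptions made for illustration, and a real pipeline would likely use a fuller stopword list.

```python
import re
import string

# Illustrative subset only; the project's real stopword list is not shown in this README.
STOPWORDS = frozenset({"a", "an", "the", "and", "or", "is", "it", "this", "to", "of", "in", "was"})

# Matches http(s) links and bare www. links (an assumed definition of "URL").
URL_RE = re.compile(r"https?://\S+|www\.\S+")

# Maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text: str) -> str:
    """Lowercase, drop URLs, replace punctuation with spaces, remove stopwords."""
    text = URL_RE.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)
```

With pandas, a function like this could be applied to the review column with something like `df["Text"].astype(str).map(clean_text)`, where the column name is whichever one the CSV provides.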
Sentiment Analysis (BERT):
- Fine-tune a pre-trained BERT model from HuggingFace on labeled review data.
- Evaluate using F1, precision, and recall metrics.
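
To make the evaluation metrics concrete, here is a minimal pure-Python sketch of precision, recall, and F1 for the three sentiment classes. The README does not say how the scores are averaged across classes, so macro averaging is an assumption here, and `macro_prf` is a hypothetical helper name.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def macro_prf(y_true, y_pred, labels=LABELS):
    """Return macro-averaged precision, recall, and F1 over the given labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

In practice, scikit-learn's `precision_recall_fscore_support` with `average="macro"` should give the same numbers; the hand-rolled version is only meant to show what is being measured.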
Topic Modeling (LDA & BERTopic):
- Apply LDA for initial topic discovery.
- Use BERTopic for transformer-based, semantically richer topics.
Visualization:
- Generate Plotly charts for sentiment distribution and top complaint themes.
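
A sentiment distribution chart could be built along these lines. This is a sketch under assumptions: `sentiment_counts` and `plot_sentiment_distribution` are hypothetical helpers, and the label order and axis titles are illustrative choices rather than the dashboard's actual ones.

```python
from collections import Counter

SENTIMENT_ORDER = ("negative", "neutral", "positive")


def sentiment_counts(labels, order=SENTIMENT_ORDER):
    """Count predicted labels in a fixed order, filling missing classes with 0."""
    counts = Counter(labels)
    return {label: counts.get(label, 0) for label in order}


def plot_sentiment_distribution(labels, title="Sentiment distribution"):
    """Return an interactive Plotly bar chart of the label counts."""
    # Imported here so the counting helper stays usable without Plotly installed.
    import plotly.express as px

    counts = sentiment_counts(labels)
    return px.bar(
        x=list(counts.keys()),
        y=list(counts.values()),
        labels={"x": "Sentiment", "y": "Number of reviews"},
        title=title,
    )
```

The returned figure can be displayed with `fig.show()` or saved with `fig.write_html("sentiment.html")`.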

⚡ Quickstart

1️⃣ Create a virtual environment & install dependencies

    python -m venv .venv
    source .venv/bin/activate  # Windows: .venv\Scripts\activate
    pip install -r requirements.txt

2️⃣ Add the dataset

Place your Amazon reviews CSV at data/raw/reviews.csv. Any CSV with a Text or review_text column works.

3️⃣ Preprocess the data

    python -m src.prepare_data

4️⃣ Run sentiment analysis (BERT/VADER)

    python -m src.train_sentiment

5️⃣ Run topic modeling (LDA or BERTopic)

    python -m src.topic_model

6️⃣ Visualize results (optional Streamlit dashboard)

    streamlit run app/streamlit_app.py

Then go to http://localhost:8501 in your browser.
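
Step 2 says any CSV with a Text or review_text column works. One way a loader might pick the column is sketched below. How `src.prepare_data` actually does it is not shown in this README, so `resolve_text_column` and its case-sensitive matching are assumptions.

```python
TEXT_COLUMN_CANDIDATES = ("Text", "review_text")


def resolve_text_column(columns, candidates=TEXT_COLUMN_CANDIDATES):
    """Return the first candidate column name present in `columns`.

    Matching is case-sensitive (an assumption). Raises ValueError if no
    candidate column is found.
    """
    available = set(columns)
    for name in candidates:
        if name in available:
            return name
    raise ValueError(
        f"expected one of {list(candidates)} in CSV columns, got {sorted(available)}"
    )
```

With pandas, this could be called as `resolve_text_column(df.columns)` right after loading the CSV.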
