NLP: Amazon Product Reviews Sentiment Analysis in Python

Unlock the insights hidden in Amazon product reviews with this sentiment analysis project. Using machine learning and natural language processing (NLP), it classifies each review as positive or negative to give a clear picture of customer sentiment.

🚀 Project Overview

This repository provides a step-by-step guide to performing sentiment analysis on Amazon product reviews. The project uses a Logistic Regression model trained on pre-processed review text to predict whether a review is positive or negative.
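As a rough illustration of that approach, the sketch below trains a Logistic Regression classifier on a handful of made-up reviews using scikit-learn. The TF-IDF features, the toy data, and the variable names are assumptions for illustration only; the notebook may vectorize the text differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data, made up for illustration: 1 = positive, 0 = negative.
train_texts = [
    "great product works perfectly",
    "love it excellent quality",
    "terrible waste of money",
    "broke after one day awful",
]
train_labels = [1, 1, 0, 0]

# TF-IDF (an assumed choice) turns each review into a weighted
# bag-of-words vector; Logistic Regression learns a weight per word.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

new_reviews = ["excellent quality, love it", "awful, a waste of money"]
predictions = model.predict(new_reviews)
print(list(zip(new_reviews, predictions)))
```

Wrapping the vectorizer and the classifier in one pipeline means new text goes through exactly the same transformation as the training text.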

Key Features:

  • Data Preprocessing: Clean and prepare raw review text for analysis using Python libraries like nltk and pandas.
  • Model Training: Train a Logistic Regression model to classify the sentiment of reviews.
  • Visualization: Generate word clouds and confusion matrices to visualize the distribution of sentiments and model performance.
  • Evaluation: Assess model performance with metrics such as the accuracy score and the confusion matrix.
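The preprocessing step can be sketched roughly as follows. This is a minimal example rather than the notebook's actual code: the function name clean_review and the tiny stop-word list are hypothetical, and nltk's English stop-word corpus would be a fuller alternative once its data has been downloaded.

```python
import re

# Small hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "i", "to", "of"}


def clean_review(text: str) -> str:
    """Lowercase, strip HTML tags and non-letters, and drop stop words."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags such as <br />
    text = re.sub(r"[^a-z\s]", " ", text)  # keep letters and whitespace only
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_review("This is GREAT!!! <br /> I loved it, 10/10."))
# great loved
```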

📂 Repository Structure

  • az_senti_analysis.ipynb: The Jupyter Notebook that contains the full workflow, from data preprocessing to model evaluation.
  • data/: Directory to store the Amazon review dataset.
  • requirements.txt: List of Python libraries required to run the project.

🛠️ Installation

Prerequisites

Make sure you have Python 3.7+ installed. Clone this repository and navigate to its directory:

git clone https://github.com/TravelXML/Amazon-Product-Reviews-Sentiment-Analysis-in-Python.git
cd Amazon-Product-Reviews-Sentiment-Analysis-in-Python

Install Dependencies

Use pip to install the necessary Python libraries:

pip install -r requirements.txt

📊 Usage

  1. Download the Dataset: Ensure the Amazon product reviews dataset is placed in the data/ directory. The dataset should be in CSV format.
  2. Run the Notebook: Open and execute az_senti_analysis.ipynb in Jupyter Notebook or JupyterLab to perform sentiment analysis.
  3. Visualize Results: Explore the generated visualizations to understand the sentiment distribution across the dataset.
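For orientation, loading a CSV and deriving sentiment labels might look like the sketch below. The column names review_text and rating, and the rule that maps star ratings to labels, are assumptions made for this example; check your dataset's actual columns and the notebook for the real logic.

```python
import io

import pandas as pd

# In-memory stand-in for a CSV file placed in data/; the column
# names "review_text" and "rating" are hypothetical.
csv_data = io.StringIO(
    "review_text,rating\n"
    "Works great,5\n"
    "Stopped working after a week,1\n"
    "It is okay,3\n"
    "Good value,4\n"
)
df = pd.read_csv(csv_data)

# Assumed labeling rule: 4-5 stars = positive (1), 1-2 stars = negative (0),
# and 3-star reviews are dropped as neutral.
df = df[df["rating"] != 3].copy()
df["sentiment"] = (df["rating"] >= 4).astype(int)
print(df[["review_text", "sentiment"]])
```

With a real file, the io.StringIO object would be replaced by the path to the CSV inside data/.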

🎯 Example Outputs

Word Cloud

Visualize the most frequent words in positive and negative reviews:

[Image: Word Cloud]
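A word cloud sizes each word by how often it appears, so the numbers behind it are plain word counts. The sketch below computes those counts with the standard library; the sample reviews and the helper word_frequencies are made up for illustration, and the drawing itself is left to whatever plotting library the notebook uses.

```python
from collections import Counter

# Made-up sample reviews, already lowercased and cleaned.
positive_reviews = ["great product great price", "love this product"]
negative_reviews = ["terrible product", "terrible battery terrible support"]


def word_frequencies(reviews):
    """Count how often each word appears across a list of reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(review.lower().split())
    return counts


positive_counts = word_frequencies(positive_reviews)
negative_counts = word_frequencies(negative_reviews)
print(positive_counts.most_common(2))  # [('great', 2), ('product', 2)]
print(negative_counts.most_common(1))  # [('terrible', 3)]
```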

Confusion Matrix

Evaluate model performance with a confusion matrix:

[Image: Confusion Matrix]
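To show how the accuracy score and the confusion matrix relate, here is a small sketch using scikit-learn's metrics on hand-written labels. The label lists are invented for the example and are not results from this project.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Invented labels for illustration: 1 = positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
# Rows are true classes, columns are predicted classes, ordered [0, 1].
cm = confusion_matrix(y_true, y_pred)

print(accuracy)  # 0.75
print(cm)
# [[3 1]
#  [1 3]]
```

The diagonal holds the correct predictions (3 negative, 3 positive), so accuracy is 6 correct out of 8 reviews.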

🤝 Contributing

Contributions are welcome! Whether it's fixing bugs, improving the documentation, or adding new features, feel free to open a pull request or submit an issue.

📧 Contact

For questions or collaborations, reach out via LinkedIn.

Happy Coding!