Predictive-Analysis-of-Online-Discussion-Dynamics-using-Text-Mining

Sentiment Analysis on Reddit Comments Project

Overview

This repository contains the code for a sentiment analysis project on Reddit comments, conducted by Sai Sankeerth Thallapally (U00861167) & Rahul Penchala (U00869508). The project aims to classify the sentiment of comments as either positive or negative using machine learning techniques. It includes three Jupyter notebooks for different phases of the project and links to the required datasets.

Files in the Repository

EXPERIMENT-1.ipynb: Jupyter notebook for the first experiment.
EXPERIMENT-2.ipynb: Jupyter notebook for the second experimental setup.
Final_code.ipynb: Jupyter notebook containing the final sentiment analysis model.

Data Files

The large dataset and Word2Vec model files are not included in this repository, but can be downloaded from the following links:

Reddit Comments Dataset (May 2015):
- Description: Contains Reddit comments from May 2015.
- Size: Approximately 21 GB.
- Download Link: Reddit Comments May 2015 - Kaggle
GoogleNews-vectors-negative300.bin:
- Description: Pre-trained Word2Vec model from Google News.
- Size: About 3 GB.
- Download Link: GoogleNews Vectors Negative300 - Kaggle

After downloading, place the database.sqlite (extracted from the Reddit Comments Dataset) and GoogleNews-vectors-negative300.bin files in the project directory alongside the Jupyter notebooks.
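Before running a full notebook against the 21 GB database, it can help to confirm that the file opens and to see which tables it holds. The following is a minimal sketch, not the notebooks' own loading code: the helper names `list_tables`, `preview_rows`, and `load_word_vectors` are hypothetical, and the table names are discovered at run time rather than assumed. The last helper uses gensim's `KeyedVectors.load_word2vec_format`; the `limit` value in the usage note is an arbitrary choice to reduce memory use.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    # mode=ro opens the file read-only, so a wrong path raises an error
    # instead of silently creating a new, empty database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def preview_rows(db_path, table, limit=5):
    """Return (column_names, rows) for the first `limit` rows of `table`."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
        columns = [col[0] for col in cursor.description]
        return columns, cursor.fetchall()


def load_word_vectors(bin_path, limit=None):
    """Load binary Word2Vec vectors; `limit` caps how many words are read."""
    # Imported here so the SQLite helpers work even without gensim installed.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(bin_path, binary=True, limit=limit)
```

For example, `list_tables("database.sqlite")` shows the available tables, and `load_word_vectors("GoogleNews-vectors-negative300.bin", limit=200000)` loads only the first 200,000 vectors, which is often enough for a quick test.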

Important Note on Running the Code

To keep the file structure and relative paths intact, download the entire project repository as a ZIP file and extract it into a single directory.
Then follow the instructions under the "How to Run the Code" section. Make sure that all prerequisite software and libraries are installed, and that the data files (database.sqlite and GoogleNews-vectors-negative300.bin) are placed in the same directory as the Jupyter notebooks.
With these steps in place, the notebooks should run without file path errors and reproduce the project's results.

How to Run the Code

Prerequisites

Python 3.x
Jupyter Notebook
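Libraries: pandas, nltk, textblob, gensim, keras, scikit-learn (imported as sklearn), imbalanced-learn (imported as imblearn). sqlite3 is part of the Python standard library and needs no separate install.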

Installation

Install Python: Download from Python's official website.
Install Jupyter Notebook:

pip install notebook

Install Libraries:

pip install pandas nltk textblob gensim keras scikit-learn imbalanced-learn
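To confirm the installation worked, the sketch below checks whether each library can be found by Python without actually importing it. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here; the mapping pairs each pip package name with the module name used in `import` statements, since the two differ for scikit-learn (`sklearn`) and imbalanced-learn (`imblearn`).

```python
import importlib.util

# pip package name -> module name used in `import` statements
REQUIRED = {
    "pandas": "pandas",
    "nltk": "nltk",
    "textblob": "textblob",
    "gensim": "gensim",
    "keras": "keras",
    "scikit-learn": "sklearn",
    "imbalanced-learn": "imblearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None
    ]
```

Calling `missing_packages()` in a notebook cell returns an empty list when everything is in place; any names it does return can be passed straight to `pip install`.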

Running the Notebooks

Open Terminal (macOS/Linux) or PowerShell (Windows).
Navigate to the project directory with the .ipynb files.
Launch Jupyter Notebook:

jupyter notebook

Open the desired notebook from the Jupyter browser interface.
Run the notebook cells by pressing Shift + Enter, or use "Run All" in the toolbar.

Troubleshooting

Verify all libraries are installed.
Restart the Jupyter notebook kernel if necessary.
Check the file paths to database.sqlite and GoogleNews-vectors-negative300.bin.
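The file path check above can be done from a notebook cell. This is a small sketch under the assumption that the notebooks look for both data files in their own directory, as described earlier; `DATA_FILES` and `missing_data_files` are hypothetical names, not part of the project code.

```python
from pathlib import Path

# Data files the notebooks expect next to the .ipynb files.
DATA_FILES = ("database.sqlite", "GoogleNews-vectors-negative300.bin")


def missing_data_files(project_dir=".", names=DATA_FILES):
    """Return the expected data files that are absent from `project_dir`."""
    root = Path(project_dir)
    return [name for name in names if not (root / name).is_file()]
```

Running `missing_data_files()` from the project directory returns an empty list when both files are present; otherwise it names the files that still need to be downloaded or moved.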

Contact

For any queries related to this project, please contact:

Sai Sankeerth Thallapally (sthllpll@memphis.edu)
Rahul Penchala (rpnchala@memphis.edu)
