The Complete Text Analysis App is a Streamlit-based web application that provides various text analysis functionalities. The app includes features for spam detection, sentiment analysis, stress detection, hate and offensive content detection, and sarcasm detection. It leverages Natural Language Processing (NLP) techniques and machine learning models to analyze and classify text inputs.
App URL (deployed on Hugging Face): https://huggingface.co/spaces/shubham5027/Text_Analysis_NLP
Streamlit Cloud: https://complete-text-analysis-using-natural-language-processing-3abm4.streamlit.app/
- Introduction
- Features
- Installation
- Usage
- Dependencies
- Configuration
- Documentation
- Examples
- Troubleshooting
- Contributors
- License
- Spam or Ham Detection
  - Classifies text as spam or ham.
- Sentiment Analysis
  - Detects the sentiment of the text (positive or negative).
- Stress Detection
  - Determines the level of stress in the text.
- Hate and Offensive Content Detection
  - Identifies the level of hate and offensive content in the text.
- Sarcasm Detection
  - Detects whether the text is sarcastic or not.
To install and run the Complete Text Analysis App, follow these steps:
- Clone the repository and change into the directory it creates:

      git clone https://github.com/yourusername/complete-text-analysis-Using-Natural-Language-Processing.git
      cd complete-text-analysis-Using-Natural-Language-Processing

- Install the required dependencies:

      pip install -r requirements.txt

- Download the NLTK data (run these lines in a Python interpreter):

      import nltk
      nltk.download('punkt')
      nltk.download('stopwords')

- Run the Streamlit app:

      streamlit run app.py

- Navigate to the app in your web browser.
- Use the sidebar to select the desired analysis option:
  - Home
  - Spam or Ham Detection
  - Sentiment Analysis
  - Stress Detection
  - Hate and Offensive Content Detection
  - Sarcasm Detection
- Enter the text you want to analyze in the provided text area.
- Click the "Predict" button to get the analysis results.
- Python 3.x
- Streamlit
- NumPy
- Pandas
- NLTK
- Scikit-learn
Ensure you have the following CSV files in the same directory as app.py:
- Spam Detection.csv
- Sentiment Analysis.csv
- Stress Detection.csv
- Hate Content Detection.csv
- Sarcasm Detection.csv
The CSV files should have the following columns:
- Spam Detection.csv: Label, Text
- Sentiment Analysis.csv: Text, Label
- Stress Detection.csv: Text, Sentiment, Stress Level
- Hate Content Detection.csv: Hate Level, Offensive Level, Class Level, Text
- Sarcasm Detection.csv: Text, Label
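
The file and column requirements above can be checked before launching the app. The following is a minimal sketch, not part of the app itself: the helper name `check_csv_files` is hypothetical, and it assumes the files are UTF-8 encoded with the column names in the first row.

```python
import csv
from pathlib import Path

# Expected header columns, copied from the list above.
EXPECTED_COLUMNS = {
    "Spam Detection.csv": ["Label", "Text"],
    "Sentiment Analysis.csv": ["Text", "Label"],
    "Stress Detection.csv": ["Text", "Sentiment", "Stress Level"],
    "Hate Content Detection.csv": [
        "Hate Level", "Offensive Level", "Class Level", "Text",
    ],
    "Sarcasm Detection.csv": ["Text", "Label"],
}


def check_csv_files(directory="."):
    """Return a dict mapping each problematic file name to a short message."""
    problems = {}
    for name, expected in EXPECTED_COLUMNS.items():
        path = Path(directory) / name
        if not path.is_file():
            problems[name] = "file not found"
            continue
        # Assumption: UTF-8 text; "utf-8-sig" also strips a byte-order mark.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            header = [col.strip() for col in next(csv.reader(handle), [])]
        missing = [col for col in expected if col not in header]
        if missing:
            problems[name] = "missing columns: " + ", ".join(missing)
    return problems


if __name__ == "__main__":
    # Run from the directory that contains app.py.
    for name, message in check_csv_files().items():
        print(f"{name}: {message}")
```

An empty result means all five files were found with the expected columns. Column order is not checked, since the list above uses different orders for different files.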
The transform_text function cleans and preprocesses the text input by:
- Converting to lowercase
- Tokenizing words
- Removing stopwords and punctuation
- Stemming the words
Each analysis option trains its own model from the corresponding CSV file. Every model pairs a Scikit-learn estimator with a TF-IDF vectorizer:
- Spam or Ham Detection: Logistic Regression and TF-IDF Vectorizer.
- Sentiment Analysis: Logistic Regression and TF-IDF Vectorizer.
- Stress Detection: Decision Tree Regressor and TF-IDF Vectorizer.
- Hate and Offensive Content Detection: Random Forest Classifier and TF-IDF Vectorizer.
- Sarcasm Detection: Logistic Regression and TF-IDF Vectorizer.
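
As a rough illustration of how a vectorizer and an estimator fit together, the sketch below chains them in a Scikit-learn pipeline. The helper `build_model`, the toy sentences, and the numeric stress encoding are all invented for this example; the app trains on the CSV files instead, and its exact training code and hyperparameters may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeRegressor


def build_model(estimator):
    """Chain a TF-IDF vectorizer with the given Scikit-learn estimator."""
    return make_pipeline(TfidfVectorizer(), estimator)


# Toy data, invented for this example only.
spam_texts = [
    "win a free prize now",
    "claim your free cash reward",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
]
spam_labels = ["spam", "spam", "ham", "ham"]

spam_model = build_model(LogisticRegression(max_iter=1000))
spam_model.fit(spam_texts, spam_labels)
spam_prediction = spam_model.predict(["free prize waiting for you"])[0]
print(spam_prediction)

# The same helper accepts a regressor when the target is numeric.
stress_texts = [
    "i feel calm and rested",
    "deadlines are piling up and i cannot sleep",
    "a quiet relaxing weekend",
    "i am overwhelmed and anxious about everything",
]
stress_levels = [0.0, 1.0, 0.0, 1.0]  # hypothetical numeric encoding

stress_model = build_model(DecisionTreeRegressor(random_state=0))
stress_model.fit(stress_texts, stress_levels)
stress_prediction = stress_model.predict(["so anxious about deadlines"])[0]
print(stress_prediction)
```

Passing `RandomForestClassifier()` from `sklearn.ensemble` to the same helper would give the combination listed for hate and offensive content detection.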
Here are some examples of how to use the app:
- Enter a text message to check if it's spam or ham.
- Analyze the sentiment of a social media post.
- Detect stress levels in a piece of text.
- Identify offensive content in user comments.
- Check if a statement is sarcastic.
- Ensure all CSV files are correctly formatted and placed in the same directory as app.py.
- Verify that all dependencies are installed.
- Check for any errors in the console for more details on issues.
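
One way to verify the dependencies is to ask Python whether each package can be located. The sketch below uses only the standard library; the helper name `find_missing` is hypothetical, and the module list is the assumed set of import names for the packages this README lists.

```python
import importlib.util

# Import names for the listed dependencies.
# Note that Scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["streamlit", "numpy", "pandas", "nltk", "sklearn"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Any name reported as missing can usually be installed by rerunning `pip install -r requirements.txt` in the same environment that runs the app.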
This project is licensed under the MIT License. See the LICENSE file for more details.