https://github.com/hemanitekwani/ml-nlp_projects/blob/main/Breast_cancer.py
This repository contains code for a machine learning project focused on breast cancer diagnosis. The goal of this project is to predict whether a breast tumor is malignant or benign from its measured features using machine learning techniques.
The breast cancer dataset used in this project is loaded from a CSV file named breast-cancer.csv. The dataset contains information about various features related to breast cancer tumors.
The features used for prediction are extracted from the dataset by dropping the 'diagnosis' column, which represents the target variable. The remaining features include information about the characteristics of the tumors.
Before training the machine learning model, the dataset is preprocessed:
The dataset is loaded using the Pandas library.
Basic information about the dataset is displayed, including its shape and summary statistics.
Any missing values in the dataset are identified and handled if necessary.
The dataset is split into training and testing sets using a test size of 20% and a random state of 42.
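The preprocessing steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the repository's script: breast-cancer.csv is not reproduced here, so the hypothetical helper load_data falls back to the breast cancer data bundled with scikit-learn and adds a 'diagnosis' column with assumed 'M'/'B' labels so the example runs on its own. Only the 'diagnosis' column name, the 20% test size, and the random state of 42 come from the description above.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split


def load_data(csv_path=None):
    """Hypothetical helper: read the CSV if a path is given, else use a stand-in."""
    if csv_path is not None:
        return pd.read_csv(csv_path)
    # Stand-in (assumption): scikit-learn's bundled breast cancer data.
    bunch = load_breast_cancer(as_frame=True)
    df = bunch.data.copy()
    # In the bundled data, target 0 is malignant and 1 is benign.
    df["diagnosis"] = bunch.target.map({0: "M", 1: "B"})
    return df


df = load_data()  # pass "breast-cancer.csv" to use the project's file instead

# Basic information: shape, summary statistics, and missing-value count.
print(df.shape)
print(df.describe())
print(df.isnull().sum().sum())

# Features are everything except the target column.
X = df.drop(columns="diagnosis")
y = df["diagnosis"]

# 80/20 split with a fixed seed, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```

With the bundled stand-in data (569 rows, 30 features), the split yields 455 training rows and 114 test rows. The project's own CSV may contain additional columns that would need handling before training.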
A Logistic Regression model is used for prediction:
An instance of the LogisticRegression model is created.
The model is trained on the training data.
The accuracy of the model on both training and testing data is calculated and displayed.
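A minimal sketch of the training and evaluation steps might look like the following. It assumes scikit-learn's bundled breast cancer data as a stand-in for breast-cancer.csv, and the max_iter value is an assumption added here to help the solver converge on unscaled features; the original script may use different settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): the bundled dataset, not the project's CSV.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter is raised from the default as an assumption; the features are unscaled.
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Accuracy on the data the model saw and on the held-out data.
train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"Training accuracy: {train_acc:.3f}")
print(f"Testing accuracy:  {test_acc:.3f}")
```

Comparing the two numbers gives a rough check for overfitting: a training accuracy far above the testing accuracy suggests the model has memorized the training set.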
Dependencies: NumPy, Pandas, scikit-learn (sklearn), joblib
https://github.com/hemanitekwani/ml-nlp_projects/blob/main/facemesh.py
This repository contains code for a real-time face mesh detection application using OpenCV and MediaPipe. The application captures video from the default camera, processes each frame, and detects facial landmarks using the MediaPipe FaceMesh module. Detected landmarks are drawn on the video frame in real time.
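The capture-and-detect loop described above could be sketched as below. This is a hedged illustration, not the contents of facemesh.py: the function names run_face_mesh and landmark_to_pixel are hypothetical, the sketch assumes MediaPipe's legacy solutions API (mp.solutions.face_mesh) is available in the installed version, and it draws each landmark as a dot rather than a connected mesh.

```python
def landmark_to_pixel(x_norm, y_norm, width, height):
    """Map normalized landmark coordinates to pixel coordinates inside the frame."""
    # Landmarks are normalized to the frame size and can fall slightly outside [0, 1].
    x = min(int(x_norm * width), width - 1)
    y = min(int(y_norm * height), height - 1)
    return max(x, 0), max(y, 0)


def run_face_mesh(camera_index=0):
    """Hypothetical entry point: show the default camera feed with landmarks."""
    # Imported here so the helper above can be used without a camera stack.
    import cv2
    import mediapipe as mp

    capture = cv2.VideoCapture(camera_index)
    with mp.solutions.face_mesh.FaceMesh(max_num_faces=1) as face_mesh:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            # OpenCV delivers BGR frames; MediaPipe expects RGB.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = face_mesh.process(rgb)
            height, width = frame.shape[:2]
            # multi_face_landmarks is None when no face is detected.
            for face in results.multi_face_landmarks or []:
                for lm in face.landmark:
                    point = landmark_to_pixel(lm.x, lm.y, width, height)
                    cv2.circle(frame, point, 1, (0, 255, 0), -1)
            cv2.imshow("Face Mesh", frame)
            # Exit when 'q' is pressed.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    capture.release()
    cv2.destroyAllWindows()
```

Calling run_face_mesh() starts the loop; it needs a connected camera and a display, so it is not invoked automatically in the sketch.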
To use the face mesh detection application:
Install the required dependencies (OpenCV, MediaPipe).
Run the provided Python script.
Press 'q' to exit the application.
Fake News Detection using Natural Language Processing
This repository contains code for a fake news detection project using NLP techniques. The project classifies news articles as real or fake based on their content. Data preprocessing, TF-IDF vectorization, and training a Logistic Regression model are included.
The project uses a subset of a news dataset loaded from a CSV file named train.csv.
Data Loading: The dataset is loaded using Pandas.
Feature Creation: The 'author' and 'title' columns are combined into a single text feature.
Text Preprocessing: The text is converted to lowercase, tokenized, stemmed, and stripped of stopwords.
TF-IDF Vectorization: The processed text is converted to numerical features.
Data Splitting: The data is split into training and testing sets.
Model Initialization and Training: A Logistic Regression model is trained.
Accuracy Calculation: Model accuracy on the training and testing data is calculated.
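The steps above can be sketched end to end as follows. This is a minimal sketch under several assumptions, not the project's script: the ten rows are made-up stand-ins for train.csv, the 'label' column name (1 for fake, 0 for real) is assumed, the stopword set is a tiny hand-written stand-in for a full stopword list, NLTK's PorterStemmer is assumed as the stemmer, and the split parameters are illustrative because the description does not state them.

```python
import re

import pandas as pd
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Tiny stand-in stopword set (assumption); a real run would use a full list.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "on", "and", "for"}
stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, tokenize on non-letters, drop stopwords, and stem each token."""
    tokens = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(t) for t in tokens if t not in STOPWORDS)


# Made-up rows standing in for train.csv; 'label' is an assumed column name.
news = pd.DataFrame({
    "author": ["Alice Smith", "Bob Jones", None, "Carol Lee", "Dan Wu",
               "Eve Park", "Frank Hall", "Grace Kim", "Hal Ortiz", "Ivy Chen"],
    "title": ["Council approves budget for city schools",
              "Shocking miracle cure doctors are hiding",
              "Secret plot revealed by anonymous insider",
              "Local library extends weekend opening hours",
              "Aliens endorse candidate in shocking video",
              "University publishes annual research report",
              "Miracle diet melts fat overnight",
              "Transit agency announces new bus routes",
              "Insider reveals shocking celebrity secret",
              "Hospital opens new pediatric wing"],
    "label": [0, 1, 1, 0, 1, 0, 1, 0, 1, 0],
})

# Feature creation: combine author and title, treating missing values as empty.
news = news.fillna("")
news["content"] = (news["author"] + " " + news["title"]).apply(preprocess)

# TF-IDF vectorization of the processed text.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(news["content"])
y = news["label"]

# Split parameters are assumptions; the description does not specify them.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=2
)

model = LogisticRegression()
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"Training accuracy: {train_acc:.3f}")
print(f"Testing accuracy:  {test_acc:.3f}")
```

The sketch fits the vectorizer before splitting to follow the order of the steps listed above. Fitting it on the training set only and then transforming the test set would avoid leaking test vocabulary statistics into training.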