This repository contains a simple neural network project implemented with the PyTorch library. Its main goal is to build and train a fully connected neural network to predict a binary target variable from a dataset with seven features (f1, f2, f3, f4, f5, f6, f7).
- Project Overview
- Preprocessing
- Neural Network Architecture
- Part I: Building a Standard Convolutional Neural Network
- Part II: Optimizing the Neural Network
This project implements neural networks for binary classification. It focuses on building and training a standard feedforward network with two hidden layers and one output layer.
The dataset used for training and testing contains 767 data entries. It includes seven numeric feature variables (f1, f2, f3, f4, f5, f6, f7) and a binary target variable (target).
The preprocessing phase involved data cleaning and normalization: non-numeric entries were removed from the dataset, and the feature variables were standardized with StandardScaler() to ensure consistent data scaling.
- Python's pandas library was used to remove non-numeric entries.
- Data standardization and scaling were achieved using the StandardScaler() function.
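As a minimal sketch of the preprocessing described above (the column names f1–f7 and `target`, and the helper itself, are assumptions based on the dataset description, not the project's exact code):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed feature/target column names from the dataset description.
FEATURES = ["f1", "f2", "f3", "f4", "f5", "f6", "f7"]

def preprocess(df: pd.DataFrame):
    """Drop rows with non-numeric feature entries and standardize the features."""
    # Coerce features to numeric; non-numeric entries become NaN.
    df[FEATURES] = df[FEATURES].apply(pd.to_numeric, errors="coerce")
    df = df.dropna(subset=FEATURES)
    # Standardize each feature to zero mean and unit variance.
    X = StandardScaler().fit_transform(df[FEATURES])
    y = df["target"].to_numpy()
    return X, y
```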
The standard model consists of two hidden layers and one output layer. The nn.Linear() module applies linear transformations to incoming data, and ReLU activation functions are used in the hidden layers. The model was trained with a batch size of 64, a learning rate of 0.001, and 100 epochs.
- Input Layer: Seven feature variables (f1, f2, f3, f4, f5, f6, f7)
- Two Hidden Layers with ReLU activation
- Output Layer
- Batch Size: 64
- Learning Rate: 0.001
- Number of Epochs: 100
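The architecture and training setup above can be sketched as follows (the hidden-layer widths, the optimizer, and the loss function are assumptions; the report may use different choices):

```python
import torch
import torch.nn as nn

class BinaryNet(nn.Module):
    """Fully connected network: 7 inputs -> two hidden layers -> 1 output logit."""
    def __init__(self, hidden1: int = 64, hidden2: int = 32):  # assumed widths
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(7, hidden1), nn.ReLU(),
            nn.Linear(hidden1, hidden2), nn.ReLU(),
            nn.Linear(hidden2, 1),  # single logit for the binary target
        )

    def forward(self, x):
        return self.net(x)

model = BinaryNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.BCEWithLogitsLoss()
# Training would then iterate for 100 epochs over DataLoader batches of size 64.
```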
The standard model reached a maximum training accuracy of 90.63% and a maximum test accuracy of 76.32%.
In the second part of the project, several techniques were applied to improve the model's accuracy.
- Dropout: a dropout rate of 0.2 was applied.
- Learning Rate Scheduler: StepLR with a step size of 100 and a gamma of 0.1.
- k-fold Cross-Validation: k = 5.
- Early Stopping: a patience of 5; the best epoch observed was 0.
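These techniques can be sketched roughly as follows (the layer widths and the EarlyStopper helper are assumptions for illustration, not the project's exact code):

```python
import torch
import torch.nn as nn

class DropoutNet(nn.Module):
    """Same two-hidden-layer structure, with dropout (p=0.2) after each ReLU."""
    def __init__(self, hidden1: int = 64, hidden2: int = 32, p: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(7, hidden1), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden1, hidden2), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden2, 1),
        )

    def forward(self, x):
        return self.net(x)

model = DropoutNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# StepLR: multiply the learning rate by gamma=0.1 every 100 scheduler steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)

class EarlyStopper:
    """Stop training when validation loss fails to improve for `patience` epochs."""
    def __init__(self, patience: int = 5):
        self.patience, self.best, self.wait = patience, float("inf"), 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best:
            self.best, self.wait = val_loss, 0
        else:
            self.wait += 1
        return self.wait >= self.patience
```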
The optimization techniques were evaluated with different setups, and their impacts on accuracy were analyzed.
The maximum training accuracy with dropout is 96.875%, and the maximum test accuracy is 76.97%. Dropout had a positive effect on the model without overfitting.
The maximum training accuracy with the learning rate scheduler is 100.0%, but there is overfitting, with a significant difference between training and testing accuracy.
The accuracy with k-fold cross-validation is 75.824%, lower than that of the standard model; k-fold cross-validation did not improve accuracy.
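A rough sketch of the 5-fold evaluation loop, using scikit-learn's KFold (the `evaluate` callback is a placeholder assumption standing in for training a fresh model per fold):

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(X: np.ndarray, y: np.ndarray, evaluate, k: int = 5) -> float:
    """Average a fold-level accuracy function over k train/test splits."""
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    scores = []
    for train_idx, test_idx in kf.split(X):
        # `evaluate` stands in for: train a fresh model on the train split,
        # then return its accuracy on the held-out split.
        scores.append(evaluate(X[train_idx], y[train_idx], X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```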
Early stopping resulted in a maximum training accuracy of 100.0% and a maximum test accuracy of 76.97%, with overfitting observed.
- Dropout improved accuracy without overfitting.
- Learning rate scheduler increased accuracy but led to overfitting.
- k-fold cross-validation did not improve accuracy.
- Early stopping did not have a positive effect and caused overfitting.
In conclusion, implementing dropout is recommended: it improved model accuracy while avoiding overfitting.
This README provides an overview of the project, including the dataset, preprocessing steps, neural network architecture, and the results of implementing different optimization techniques. For more detailed information, refer to the complete report in the project directory.
Feel free to clone this repository and explore the code and dataset to understand the project in more detail.