Melanoma Detection using CNN

Problem Statement

This project aims to build a CNN-based model that accurately detects melanoma, a type of skin cancer that can be deadly if not detected early. Melanoma accounts for 75% of skin cancer deaths. The goal is a solution that evaluates images and alerts dermatologists to the presence of melanoma, potentially reducing the manual effort needed in diagnosis.

Dataset

The dataset consists of 2357 images of malignant and benign oncological diseases, sourced from the International Skin Imaging Collaboration (ISIC). It includes images of nine different skin conditions:

Actinic keratosis
Basal cell carcinoma
Dermatofibroma
Melanoma
Nevus
Pigmented benign keratosis
Seborrheic keratosis
Squamous cell carcinoma
Vascular lesion

The dataset is imbalanced, with melanomas and moles (nevi) slightly over-represented.

Project Pipeline

Data Reading/Understanding:
- Define paths to the train and test image directories and list their contents using os.listdir().
- Understand the structure of the dataset and print out the number of images in each category.
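
As a minimal sketch of the counting step, the helper below walks a dataset directory and counts the images in each class subfolder. It assumes one subfolder per class, which is also what image_dataset_from_directory() expects. The function name and the list of image extensions are illustrative and not taken from the notebook.

```python
import os

# File extensions treated as images; an assumption, adjust to the dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(data_dir):
    """Return {class_name: image_count} for each subfolder of data_dir."""
    counts = {}
    for class_name in sorted(os.listdir(data_dir)):
        class_path = os.path.join(data_dir, class_name)
        if not os.path.isdir(class_path):
            continue  # skip stray files next to the class folders
        counts[class_name] = sum(
            1
            for file_name in os.listdir(class_path)
            if file_name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images_per_class(train_dir)` on the training directory then gives the per-category totals to print.
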
Dataset Creation:
- Create train & validation datasets using tf.keras.preprocessing.image_dataset_from_directory().
- Set batch size to 32 and image size to 180x180 pixels.
- Split the data into 80% training and 20% validation.
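
The settings above could be wired together roughly as follows. This sketch assumes a TensorFlow release that exposes the loader as tf.keras.utils.image_dataset_from_directory (the tf.keras.preprocessing name used above takes the same arguments). The seed value and the function name are assumptions, not values from the notebook.

```python
import tensorflow as tf

BATCH_SIZE = 32
IMG_SIZE = (180, 180)   # (height, width)
VALIDATION_SPLIT = 0.2  # 80% training, 20% validation
SEED = 123              # assumed value; any fixed seed works


def make_datasets(train_dir):
    """Build the training and validation datasets from one directory."""
    common = dict(
        validation_split=VALIDATION_SPLIT,
        seed=SEED,  # the same seed in both calls keeps the subsets disjoint
        image_size=IMG_SIZE,
        batch_size=BATCH_SIZE,
    )
    train_ds = tf.keras.utils.image_dataset_from_directory(
        train_dir, subset="training", **common
    )
    val_ds = tf.keras.utils.image_dataset_from_directory(
        train_dir, subset="validation", **common
    )
    return train_ds, val_ds
```

Both calls read the same directory; the shared seed and the subset argument are what keep the two splits from overlapping.
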
Dataset Visualization:
- Use matplotlib to visualize one instance of each class in the dataset.
- Display a 3x3 grid of images with their corresponding labels.
Model Building & Training:
- Create a CNN model using tf.keras.Sequential().
- Use layers: Conv2D, MaxPooling2D, Dropout, Flatten, and Dense.
- Apply data augmentation using tf.keras.Sequential() with RandomFlip, RandomRotation, and RandomZoom.
- Compile the model with Adam optimizer and SparseCategoricalCrossentropy loss.
- Train for 20 epochs using model.fit().
- Plot training and validation accuracy/loss curves.
Data Augmentation:
- Implement data augmentation to address overfitting.
- Use techniques like random flipping, rotation, and zooming.
Model Building & Training on Augmented Data:
- Create a new model with the same architecture but including data augmentation layers.
- Train for 20 epochs and analyze results to see if overfitting is reduced.
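
One simple way to check whether overfitting was reduced is to compare the gap between training and validation accuracy at the end of each run. The sketch below is an illustration rather than the notebook's own analysis. It assumes the model was compiled with metrics=["accuracy"], so that the History.history dict returned by model.fit() has "accuracy" and "val_accuracy" keys. The function names are hypothetical.

```python
def final_generalization_gap(history):
    """Return training accuracy minus validation accuracy at the last epoch.

    `history` is the dict stored in `History.history` by `model.fit()`,
    assuming the model was compiled with metrics=["accuracy"].
    """
    return history["accuracy"][-1] - history["val_accuracy"][-1]


def overfitting_reduced(baseline_history, augmented_history):
    """True if the augmented run ends with a smaller train/validation gap."""
    return (final_generalization_gap(augmented_history)
            < final_generalization_gap(baseline_history))
```

A smaller gap alone does not mean a better model, so it is worth reading this number alongside the validation accuracy itself.
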
Class Distribution Analysis:
- Use os.listdir() to count the number of images in each class.
- Visualize the class distribution using a bar plot.
- Identify the class with the least samples and the dominant classes.
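
Given a mapping from class name to image count, the least-represented and dominant classes can be read off with a few lines of plain Python. This is a sketch under the assumption that the counts are already available as a dict; the function name and the shape of the returned summary are illustrative.

```python
def summarize_distribution(counts):
    """Summarize a {class_name: image_count} mapping.

    Returns the smallest class, the classes ranked by size (largest first),
    and each class's share of the whole dataset.
    """
    if not counts:
        raise ValueError("counts must contain at least one class")
    total = sum(counts.values())
    ranked = sorted(counts, key=counts.get, reverse=True)
    shares = {name: counts[name] / total for name in ranked}
    return {"least": ranked[-1], "ranked": ranked, "shares": shares}
```

The ranked list and shares can be passed directly to a Matplotlib or Seaborn bar plot.
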
Handling Class Imbalances:
- Use the Augmentor library to generate additional samples for underrepresented classes.
- Aim to have at least 500 samples for each class.
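
A possible shape for this step is sketched below. It tops each class up to the 500-sample target, which is one reading of the goal above; the notebook may instead add a fixed number of samples per class. The rotation settings and function names are illustrative assumptions. The Augmentor calls used are Pipeline(), rotate(), and sample().

```python
TARGET_PER_CLASS = 500  # the minimum class size this step aims for


def samples_to_generate(counts, target=TARGET_PER_CLASS):
    """Return {class_name: n_extra} for classes below the target size."""
    return {
        name: target - n
        for name, n in counts.items()
        if n < target
    }


def augment_class(class_dir, n_samples):
    """Write n_samples augmented images for one class with Augmentor."""
    import Augmentor  # imported lazily so the helper above has no dependency

    pipeline = Augmentor.Pipeline(class_dir)
    # Rotation settings are illustrative, not taken from the notebook.
    pipeline.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
    pipeline.sample(n_samples)  # saved to an "output" subfolder of class_dir
```

Because Augmentor writes to an "output" subfolder by default, the generated files need to be included when the balanced dataset is loaded.
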
Final Model Building & Training:
- Create a CNN model using the balanced dataset.
- Include data augmentation layers in the model.
- Train for 30 epochs.
- Plot and analyze final accuracy and loss curves.

Tools and Libraries Used

Python 3.x
TensorFlow 2.x / Keras (with GPU acceleration)
Augmentor
Matplotlib and Seaborn (for visualization)
NumPy (for numerical operations)
Pandas (for data manipulation)
PIL (Python Imaging Library)
Google Colab (for GPU-accelerated training)

Data Augmentation Techniques

Random Flipping (horizontal)
Random Rotation (up to 20 degrees)
Random Zooming (up to 20%)
Rescaling: Normalize pixel values to range [0,1]
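
These techniques map onto Keras preprocessing layers roughly as shown below. The sketch assumes a TensorFlow 2.x release that provides RandomFlip, RandomRotation, RandomZoom, and Rescaling under tf.keras.layers. RandomRotation expresses its limit as a fraction of a full turn, so the 20-degree limit becomes 20/360. The variable names are illustrative.

```python
import tensorflow as tf

# RandomRotation takes its factor as a fraction of a full turn (2*pi),
# so a limit of 20 degrees corresponds to 20/360.
ROTATION_FACTOR = 20 / 360
ZOOM_FACTOR = 0.2  # zoom in or out by up to 20%

data_augmentation = tf.keras.Sequential(
    [
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(ROTATION_FACTOR),
        tf.keras.layers.RandomZoom(ZOOM_FACTOR),
    ],
    name="data_augmentation",
)

# Rescaling maps 8-bit pixel values from [0, 255] to [0, 1].
rescale = tf.keras.layers.Rescaling(1.0 / 255)
```

The random layers only transform images when called with training=True (which model.fit() does automatically), so validation and inference see the unmodified input.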

Model Architecture

The CNN model consists of:

Multiple Conv2D layers (starting with 16 filters and increasing)
MaxPooling2D layers
Dropout layers for regularization
Flatten layer
Dense layers (128 units with ReLU activation)
Output Dense layer with 9 units (softmax activation)
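
A minimal sketch of a network matching this layer list is shown below. The 16, 32, 64 filter progression, the 3x3 kernels, the "same" padding, the position of the Dropout layer, and its rate are all assumptions; the list above only states that the filters start at 16 and increase. The function name is hypothetical.

```python
import tensorflow as tf

NUM_CLASSES = 9
IMG_SHAPE = (180, 180, 3)


def build_model(dropout_rate=0.25):
    """Small CNN following the layer list above.

    The 16 -> 32 -> 64 filter progression, 3x3 kernels, and dropout rate
    are assumptions; the text only says the filters start at 16 and grow.
    """
    layers = tf.keras.layers
    model = tf.keras.Sequential([
        tf.keras.Input(shape=IMG_SHAPE),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Dropout(dropout_rate),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        # The output layer already applies softmax, so the loss must
        # treat predictions as probabilities rather than logits.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
        metrics=["accuracy"],
    )
    return model
```

Training would then follow the pipeline, for example `model.fit(train_ds, validation_data=val_ds, epochs=20)`, with `train_ds` and `val_ds` being the datasets created earlier.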

How to Use

Clone the repository:

```
git clone [repository-url]
cd [repository-name]
```

Install the required dependencies:
```
pip install -r requirements.txt
```
Open and run the Jupyter notebook MelanomaDetecting.ipynb:
```
jupyter notebook MelanomaDetecting.ipynb
```
Or upload and run it in Google Colab for GPU acceleration.
Follow the notebook cells to execute each step of the pipeline.

Results and Findings

Preliminary findings:

The initial model showed signs of overfitting (high training accuracy, lower validation accuracy).
Data augmentation reduced overfitting and improved generalization.
Class imbalance was identified, with some classes having significantly fewer samples.
The final model, trained on the balanced dataset, showed improved accuracy across all classes.

Model Performance:

High accuracy (96.85%).
Low loss (0.1252).
Balanced class performance.
Robust validation results, with validation accuracy over 85% and loss under 0.2.

Contributors

Chand Rayee

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

International Skin Imaging Collaboration (ISIC) for providing the dataset
The developers of TensorFlow, Keras, and Augmentor for their excellent tools
Google Colab for providing GPU resources for training

Note

This project is part of an assignment to demonstrate the application of CNNs in medical image classification. The model and findings presented here should not be used for actual medical diagnosis without further validation and approval from medical professionals. The purpose is educational and serves as a starting point for understanding the potential of AI in dermatology.

For any questions, suggestions, or if you'd like to contribute to this project, please open an issue in the repository or contact the contributors. We welcome feedback and collaboration to improve this model and its applications in the field of dermatology.

Connect With Us 🌐

Feel free to reach out to us through any of the following platforms:

Telegram: @chand_rayee
LinkedIn: Mr. Chandrayee
GitHub: mrchandrayee
Kaggle: mrchandrayee
Instagram: @chandrayee
YouTube: Chand Rayee
Discord: AI & ML Chand Rayee
