IoT Attack Detection with GAN-based Data Augmentation 🔐📶

This project detects intrusions in IoT network traffic with machine learning and deep learning classifiers, and uses GAN-generated synthetic data to rebalance under-represented attack classes.
This repository contains a Google Colab notebook and a concise report-style README that summarizes the theory and the implementation steps.

📚 Background (Short Theory)

Internet of Things (IoT) deployments are exposed to a wide spectrum of attacks (e.g., port scans, DoS, brute-force, botnet traffic). Signature-based IDS struggles with novel or rare patterns, while ML/DL classifiers can generalize better, provided the training data is representative.
However, many IoT datasets are imbalanced: some attack classes are under-represented, which hurts recall on exactly those classes. Generative Adversarial Networks (GANs), here CTGAN for tabular data, can synthesize additional samples for the minority classes to balance the training set, with the aim of improving minority-class recall and F1.
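
To make the imbalance concrete, the sketch below counts how many synthetic rows each minority class would need in order to match the largest class. The helper name `synthetic_samples_needed` and the label counts are made up for illustration; they are not taken from the notebook or from IoT-23.

```python
from collections import Counter

def synthetic_samples_needed(labels, target=None):
    """Return {class: number of extra samples needed to reach `target`}.

    `target` defaults to the size of the largest class (full balancing).
    Classes already at or above the target are left out of the result.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    return {cls: target - n for cls, n in counts.items() if n < target}

# Hypothetical label distribution, for illustration only.
labels = ["benign"] * 900 + ["ddos"] * 80 + ["port_scan"] * 20
deficits = synthetic_samples_needed(labels)
print(deficits)  # {'ddos': 820, 'port_scan': 880}
```

Full balancing is only one option; passing a smaller `target` tops up the rare classes without generating more synthetic rows than real ones.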

🗄️ Dataset

Based on the IoT-23 dataset (Stratosphere Laboratory, CTU in Prague) or a cleaned derivative.
Expected input: CSV files with network-flow features, a label column (normal or one of several attack types), and optionally predefined train/test splits.

Dataset files are not included in this repo. Place them under data/ when running locally or mount from Drive in Colab.
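
One possible way to handle both locations is to look for CSVs in a Drive folder first and fall back to the local data/ directory. This is a sketch: the Drive path and the helper name `find_csv_files` are assumptions, so adjust them to your own layout.

```python
from pathlib import Path

# Assumed locations; change them to wherever your copy of the dataset lives.
LOCAL_DATA_DIR = Path("data")
DRIVE_DATA_DIR = Path("/content/drive/MyDrive/iot-attack-detection/data")  # hypothetical Drive folder

def find_csv_files(candidates=(DRIVE_DATA_DIR, LOCAL_DATA_DIR)):
    """Return the sorted CSV paths from the first candidate directory that has any."""
    for directory in candidates:
        csv_files = sorted(Path(directory).glob("*.csv"))
        if csv_files:
            return csv_files
    raise FileNotFoundError(
        "No CSV files found in: " + ", ".join(str(c) for c in candidates)
    )
```

In Colab, Drive has to be mounted before the Drive path exists (`from google.colab import drive; drive.mount("/content/drive")`). The returned paths can then be read with `pandas.read_csv` and concatenated.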

🧩 Methodology

Preprocessing
- Load the CSV file(s), drop irrelevant columns, handle missing values, encode categorical features, and scale numeric features.
Baseline Model
- Train a neural network (Keras MLP) or a classic ML model as a baseline; record metrics (Accuracy, Precision/Recall/F1, Confusion Matrix).
Synthetic Data with CTGAN
- Train CTGAN on the training split only, focusing on the minority classes, and generate synthetic samples for them. The test split is never shown to the generator.
Retrain with Augmented Data
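- Concatenate the real and synthetic training data, then retrain the classifier (e.g., RandomForest or an improved MLP). Keep the test split purely real so the comparison with the baseline stays fair.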
Evaluation
- Compare baseline vs. augmented: class-wise precision/recall/F1, macro-F1, ROC-AUC (if applicable), and visualize the confusion matrix.
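
The CTGAN and retraining steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the helper names, the `make_synthesizer` parameter, and the `epochs` value are assumptions, and fitting one generator per minority class is just one reasonable design. The only library calls relied on are `CTGAN(...)`, `fit(data, discrete_columns)` and `sample(n)` from the `ctgan` package.

```python
import pandas as pd

def default_synthesizer():
    """Build a CTGAN model (imported lazily; requires `pip install ctgan`)."""
    from ctgan import CTGAN
    return CTGAN(epochs=100)  # illustrative value, not the notebook's setting

def augment_minority_classes(train_df, label_col, discrete_cols,
                             make_synthesizer=default_synthesizer):
    """Top up every class below the majority size with synthetic rows.

    One synthesizer is fitted per minority class, using training rows only,
    so no test-set information reaches the generator.
    """
    counts = train_df[label_col].value_counts()
    target = counts.max()
    parts = [train_df]
    for cls, n in counts.items():
        if n >= target:
            continue  # majority-sized classes are left as they are
        real = train_df[train_df[label_col] == cls].drop(columns=[label_col])
        synth = make_synthesizer()
        synth.fit(real, [c for c in discrete_cols if c in real.columns])
        fake = synth.sample(int(target - n))
        fake[label_col] = cls  # re-attach the class label to the synthetic rows
        parts.append(fake)
    return pd.concat(parts, ignore_index=True)
```

Calling `augment_minority_classes(train_df, "label", discrete_cols)` returns the real rows followed by the synthetic ones, where `"label"` and `discrete_cols` stand in for your own column names. Classes with very few real rows may give the generator too little to learn from, so it is worth inspecting the synthetic samples before trusting them.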

🛠️ Implementation Steps (Notebook)

Environment setup (Colab): install libs, mount Google Drive (optional).
Load & preprocess: read CSV(s), encode & scale, split into train/test.
Train baseline: fit model, log metrics, save artifacts to results/.
CTGAN training: fit on minority classes, generate N samples per class.
Augmented training: mix real + synthetic, refit model, log metrics.
Evaluation & plots: classification report, confusion matrix, and (optionally) feature importance for tree-based models.
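
As a rough sketch of the preprocess, train, and evaluate steps, the scikit-learn pipeline below imputes, encodes, and scales the features, fits a RandomForest, and returns the metrics listed above. The function names, column lists, and hyperparameters are illustrative assumptions; the notebook may use a Keras MLP instead.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import classification_report, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def stratified_split(df, label_col, test_size=0.2, seed=42):
    """Split into X_train, X_test, y_train, y_test, preserving class ratios."""
    X, y = df.drop(columns=[label_col]), df[label_col]
    return train_test_split(X, y, test_size=test_size, stratify=y, random_state=seed)

def build_pipeline(numeric_cols, categorical_cols, seed=42):
    """Preprocessing (impute, scale, one-hot encode) followed by a RandomForest."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    return Pipeline([("preprocess", preprocess), ("model", model)])

def evaluate(pipeline, X_train, y_train, X_test, y_test):
    """Fit on the training data and report metrics on the held-out test data."""
    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    return {
        "macro_f1": f1_score(y_test, y_pred, average="macro"),
        "report": classification_report(y_test, y_pred, zero_division=0),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
```

Running `evaluate` twice against the same real test split, once with the real training data and once with the augmented training data, gives the baseline-versus-augmented comparison. Because the scaler and encoder live inside the pipeline, they are fitted on training rows only.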

▶️ How to Run (Google Colab)

Open Google Colab and upload notebooks/Untitled10.ipynb (or open from GitHub).
Prepare data:
- Upload your CSV(s) to Colab, or
- Mount Google Drive and point the notebook to your data folder.
Run the notebook cells in order (setup → preprocessing → baseline → CTGAN → retrain → evaluation).
Results (figures, CSVs, models) can be saved under results/.
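
A small helper along these lines could write each run's metrics to results/ so the baseline and augmented runs are easy to compare later. The function name `save_metrics` and the file naming scheme are assumptions, not something the notebook is known to define.

```python
import json
from pathlib import Path

def save_metrics(metrics, run_name, results_dir="results"):
    """Write a metrics dict to <results_dir>/<run_name>_metrics.json and return the path."""
    out_dir = Path(results_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create results/ on first use
    out_path = out_dir / f"{run_name}_metrics.json"
    out_path.write_text(json.dumps(metrics, indent=2, sort_keys=True))
    return out_path
```

For example, `save_metrics({"macro_f1": score}, "baseline")` writes results/baseline_metrics.json, where `score` is whatever value your evaluation produced. The dict must be JSON-serializable, so NumPy arrays and scalars need converting first (for instance with `.tolist()` or `float(...)`).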

📂 Repository Structure

iot-attack-detection/
├─ notebooks/
│  └─ Untitled10.ipynb      # main Colab notebook
├─ results/                 # generated plots/reports (ignored by Git)
├─ requirements.txt         # Python dependencies
├─ .gitignore
└─ README.md

🔖 Recommended Topics

iot, intrusion-detection, cybersecurity, machine-learning, deep-learning, gan, ctgan, tabular-data

📝 Notes

Replace Untitled10.ipynb with a clearer name (e.g., iot23_gan_augmentation.ipynb) once you finalize it.
If you need to reproduce the results on a CPU-only machine, consider using RandomForest for both the baseline and the augmented retraining; it trains quickly and is a strong choice for tabular data.
Keep large datasets outside the repo (data/ is ignored via .gitignore).
