Classification on a modified MNIST dataset. Our LocNet achieves an accuracy of 99.90%.
- NumPy
- Pillow
- Matplotlib
- scikit-learn
- PyTorch==0.4.0
- Traditional Methods
  - Naive Bayes (18.81%)
  - Decision Tree (50.94%)
  - Random Forest (87.61%)
  - K-Nearest Neighbors (88.63%)
  - Support Vector Machine (87.07%)
- Deep Learning Methods
  - FC Baseline (90.11%)
  - CNN Baseline (99.47%)
  - PointNet (91.02%)
  - SegNet (99.40%)
  - LocNet (99.90%)
Download the datasets from jbox and move them to the mnist/ folder. The folder structure should look like this:
- mnist/
  - mnist_train/
  - mnist_test/
cd traditional_methods/NaiveBayes/
python Bayes.py
cd traditional_methods/DecisionTree/
python Tree.py
cd traditional_methods/RandomForest/
python ForestBestN.py
These commands will output the performance of the random forest with different numbers of decision trees, demonstrated by the following two figures.
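Below is a minimal sketch of what such a sweep looks like (not the project's ForestBestN.py); scikit-learn's small load_digits set stands in for the mnist/ folders, so the numbers will differ from those reported here.

```python
# Hypothetical sweep over the number of trees; load_digits stands in for mnist/.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

n_trees = [10, 20, 50, 100, 200]
accs = []
for n in n_trees:
    clf = RandomForestClassifier(n_estimators=n, random_state=0)
    clf.fit(X_train, y_train)
    accs.append(clf.score(X_test, y_test))

plt.plot(n_trees, accs, marker="o")
plt.xlabel("number of decision trees")
plt.ylabel("test accuracy")
plt.show()
```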
cd traditional_methods/KNN/
python KNNBestK.py
These commands will output the performance of KNN with different values of K, demonstrated by the following figure.
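A similar sketch for the K sweep, again with load_digits standing in for the project data:

```python
# Hypothetical K sweep; load_digits stands in for mnist/.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

ks = [1, 3, 5, 7, 9, 15]
accs = []
for k in ks:
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(X_train, y_train)
    accs.append(clf.score(X_test, y_test))

plt.plot(ks, accs, marker="o")
plt.xlabel("K")
plt.ylabel("test accuracy")
plt.show()
```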
cd traditional_methods/SVM/
python SVMBestDim.py
These commands will output the performance of a linear SVM on data reduced by PCA to different dimensions, demonstrated by the following two figures.
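A sketch of the PCA-dimension sweep with a linear SVM, on the same stand-in data:

```python
# Hypothetical PCA-dimension sweep; load_digits stands in for mnist/.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

dims = [5, 10, 20, 30, 40]
accs = []
for d in dims:
    pca = PCA(n_components=d).fit(X_train)
    clf = LinearSVC(max_iter=5000)
    clf.fit(pca.transform(X_train), y_train)
    accs.append(clf.score(pca.transform(X_test), y_test))

plt.plot(dims, accs, marker="o")
plt.xlabel("PCA dimension")
plt.ylabel("test accuracy")
plt.show()
```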
cd traditional_methods/SVM/
python SVMBestKernel.py
These commands will output the performance of SVM with different kernels.
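And a short sketch of comparing kernels with scikit-learn's SVC, again on the stand-in data:

```python
# Hypothetical kernel comparison; load_digits stands in for mnist/.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel)
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))
```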
For the five traditional models above, running the corresponding *Preprocess.py script in each model's directory reproduces the results shown in the following table.
Preprocessing | Naive Bayes | Decision Tree | Random Forest | K-Nearest Neighbors | SVM |
---|---|---|---|---|---|
Target dataset | 18.81% | 50.94% | 87.61% | 88.63% | 87.07% |
Keep largest CC | 19.73% | 55.54% | 89.07% | 88.73% | 88.29% |
Shift CC to center | 75.90% | 92.69% | 98.46% | 97.55% | 96.85% |
note: "CC" stands for connected components.
In this section, we implement FC and CNN baselines for classification. Three methods are proposed to improve the performance:
- PointNet
- SegNet
- LocNet
Each model is in a separate folder in deep_learning_methods/. To train a model, go into the corresponding folder and run train_xxx.py. For example, to train the baseline CNN model:
cd deep_learning_methods/CNN_baseline/
python train_cnn.py
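The actual architectures and training loops live in the respective folders; the block below is only a minimal sketch of a comparable CNN classifier and a single training step, assuming 28x28 single-channel inputs (the modified dataset's image size may differ).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    """Hypothetical baseline: two conv blocks followed by two fully connected layers."""
    def __init__(self, num_classes=10):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # 14x14 -> 7x7
        x = x.view(x.size(0), -1)
        return self.fc2(F.relu(self.fc1(x)))

model = SimpleCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch (replace with a DataLoader over mnist_train/).
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,), dtype=torch.long)
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```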
Model | Baseline | Largest CC | CC Centralization | SegNet | LocNet |
---|---|---|---|---|---|
FC | 90.11% | 92.46% | 99.03% | 92.78% | 99.28% |
CNN | 99.47% | 99.31% | 99.88% | 99.40% | 99.90% |
PointNet | 91.02% | - | - | - | - |
note: "CC" stands for connected components.
We propose SegNet to automatically denoise the original images using neural networks. The improvement over the FC baseline is significant.
note: "i-th Epoch" stands for the visualization results after training for i epochs.
We propose LocNet to automatically localize digits in the original images with tight bounding boxes (BBox) using neural networks. The improvements over both the FC and CNN baselines are significant.
note: green boxes are ground truth and red boxes are predictions.
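Likewise, the real LocNet lives in its own folder; the sketch below only illustrates regressing a normalized bounding box with a smooth-L1 loss. The (cx, cy, w, h) parameterization, layer sizes, and 28x28 input are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLocNet(nn.Module):
    """Hypothetical regressor: predicts a normalized (cx, cy, w, h) box for the digit."""
    def __init__(self):
        super(TinyLocNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.regressor = nn.Linear(32 * 7 * 7, 4)  # assumes 28x28 inputs

    def forward(self, x):
        feats = self.features(x).view(x.size(0), -1)
        return torch.sigmoid(self.regressor(feats))  # boxes normalized to [0, 1]

model = TinyLocNet()
images = torch.randn(8, 1, 28, 28)
gt_boxes = torch.rand(8, 4)  # dummy ground-truth boxes, normalized to [0, 1]
loss = F.smooth_l1_loss(model(images), gt_boxes)
```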