In this project, we implement three classification techniques:
- Linear Classification
- SVM
- KNN
In each technique, we drop some columns, such as Name, that are not usable, so we used 3 feature columns in each model.
- Firstly, we drop some columns from train.csv and test.csv, the same as we did in Assignment_03.
- Then we apply cross-validation (K-Fold) using 3 columns from train.csv and test.csv.
- Finally, we fit the models and obtain accuracy scores and predicted values.
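The steps above can be sketched as follows. This is a minimal sketch: a synthetic DataFrame with hypothetical column names (Name, f1-f3, label) stands in for train.csv, and KNN stands in for whichever model is fitted.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for train.csv: a non-numeric Name column that gets
# dropped, three usable feature columns, and a label.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Name": [f"row{i}" for i in range(200)],
    "f1": rng.normal(size=200),
    "f2": rng.normal(size=200),
    "f3": rng.normal(size=200),
})
df["label"] = (df["f1"] + df["f2"] > 0).astype(int)

# Step 1: drop the unusable column, keeping the 3 feature columns
X = df.drop(columns=["Name", "label"])
y = df["label"]

# Step 2: hold out 20% of the data for testing (the 80/20 split variant)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Step 3: fit a model and get accuracy and predicted values
model = KNeighborsClassifier(n_neighbors=7)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
predictions = model.predict(X_test)
```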
- In this model, the nearest neighbors are found using a K-value that is odd, obtained after taking the square root of the length of yTest (from CV).
- After cross-validation on train.csv and test.csv, the data is split into 20% or 30% test and 80% or 70% train (we varied the split).
- Applying the KNN model, we achieved a score of 0.842.
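The odd-K rule described above can be sketched like this; synthetic data stands in for train.csv, using the 80/20 split variant:

```python
import math
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# K = square root of the test-set size, rounded down to an odd number
k = int(math.sqrt(len(y_test)))
if k % 2 == 0:
    k -= 1

knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
```

Here the test set has 100 rows, so K comes out as 9 (sqrt(100) = 10, stepped down to the nearest odd number).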
- This model is different from the others because it does not learn characteristics from the training data in the way the other models do.
- After cross-validation on train.csv and test.csv, the data is split into 20% or 30% test and 80% or 70% train (we varied the split).
- Applying the SVM model, we achieved a score of 0.78.
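A minimal SVM sketch under the same setup, with synthetic data in place of train.csv and 5-fold cross-validation standing in for the project's exact split:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# 5-fold cross-validation returns one accuracy score per fold
svm = SVC(kernel="rbf")
scores = cross_val_score(svm, X, y, cv=5)
mean_accuracy = scores.mean()
```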
- This model is used to minimize the sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation.
- We are using Logistic Regression.
- After cross-validation on train.csv and test.csv, the data is split into 20% or 30% test and 80% or 70% train (we varied the split).
- Applying the LC model, we achieved a score of 0.80.
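The Logistic Regression step can be sketched like this (synthetic data, using the 70/30 split variant):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```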
- In this part, we apply 5x5, 7x7, and 9x9 convolution filters to our 42000 samples; this helps to predict and obtain the filtered image/label.
- To explain how it works: first, we break our 784 columns into a 28x28 2D array, then iterate over the array, applying the filter at each position.
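The reshape-and-filter idea can be sketched in plain NumPy. One hypothetical 784-pixel row stands in for a data row, and a 5x5 averaging kernel stands in for the real filter; 7x7 and 9x9 work the same way:

```python
import numpy as np

# One 784-pixel row broken into a 28x28 2D array
row = np.arange(784, dtype=float)
img = row.reshape(28, 28)

k = 5                                # filter size (5x5)
kernel = np.ones((k, k)) / (k * k)   # simple averaging filter

# Slide the filter over every valid position ("valid" output: 24x24)
out = 28 - k + 1
filtered = np.empty((out, out))
for i in range(out):
    for j in range(out):
        filtered[i, j] = np.sum(img[i:i + k, j:j + k] * kernel)
```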
- We implemented three techniques, and for each of them we applied cross-validation to separate the training and testing data in order to get the best score.
- In our view, for this phase we achieved the best score with KNN.
It finds the nearest neighbors based on the K-value; this K-value is odd, obtained after taking the square root of the length of yTest (from cross-validation).
1. It takes K=7 (in our model) and also a p value: p=1 means Manhattan distance and p=2 means Euclidean distance.
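In scikit-learn's `KNeighborsClassifier`, p=1 selects Manhattan distance and p=2 Euclidean. A quick sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# p=1 -> Manhattan distance, p=2 -> Euclidean distance
knn_manhattan = KNeighborsClassifier(n_neighbors=7, p=1).fit(X, y)
knn_euclidean = KNeighborsClassifier(n_neighbors=7, p=2).fit(X, y)
preds = knn_euclidean.predict(X[:5])
```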
It is used to minimize the sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation.
1. It takes a max_iter attribute, the maximum number of iterations.
2. In this model we can try a range of values for it.
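Trying max_iter over a range of values can be sketched like this; the data is synthetic and the specific values 100/500/1000 are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# Cross-validated accuracy for each max_iter setting in the range
results = {}
for max_iter in (100, 500, 1000):
    clf = LogisticRegression(max_iter=max_iter)
    results[max_iter] = cross_val_score(clf, X, y, cv=5).mean()
```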
- It finds the characteristics that separate one class from the others.
- In this model, we have the advantage that we do not need to note down all the data points; instead we note down only the support vectors.
1. It takes the 'C' value, which is the regularization parameter; the greater the value of C, the weaker the regularization and the more closely the model fits the training data.
2. It takes gamma values, which control the kernel width.
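Tuning C and gamma together can be sketched with a grid search; the data is synthetic and the candidate values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# Larger C = weaker regularization; gamma sets the RBF kernel width
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1]}
grid = GridSearchCV(SVC(), param_grid, cv=3)
grid.fit(X, y)
best_params = grid.best_params_
```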