A computer vision project, based on the CImg library and SVM training, that classifies handwritten numbers.
- A photo of a sheet of A4 paper lying on a desk, with handwritten numbers on it
- A string of the numbers written on the paper, like "70026548853"
- Rectify the paper in the photo into a standard A4 sheet
- Segment the numbers, dividing them into single digits
- Use Adaboost or SVM to train a classifier for the handwritten digits
- Classify the handwritten digits with the trained classifier
- Windows 10 + VS2015
- C++
- CImg library: http://www.cimg.eu/
- OpenCV (for extracting image features)
- libsvm (for training, testing, and predicting): http://www.csie.ntu.edu.tw/~cjlin/libsvm/
- Find the 4 vertices of the paper
- Rectify the paper into a standard A4 sheet
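The project does not spell out how the rectification is computed. One common approach, sketched here purely as an assumption, is a projective transform (homography) estimated from the four detected vertices. The names `Pt`, `solveHomography`, and `applyHomography` are hypothetical and do not come from the project code.

```cpp
#include <array>
#include <cmath>
#include <utility>

// Hypothetical helper type: a 2D point in pixel (or millimetre) coordinates.
struct Pt { double x, y; };

// Solve the 8 homography parameters h so that each src[i] maps onto dst[i]:
//   u = (h0*x + h1*y + h2) / w,  v = (h3*x + h4*y + h5) / w,
//   w = h6*x + h7*y + 1.
// Returns false if the 8x8 linear system is singular (degenerate corners).
bool solveHomography(const std::array<Pt, 4>& src, const std::array<Pt, 4>& dst,
                     std::array<double, 8>& h) {
    double a[8][9];  // augmented matrix, two equations per correspondence
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, u};
        const double r1[9] = {0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, v};
        for (int j = 0; j < 9; ++j) {
            a[2 * i][j] = r0[j];
            a[2 * i + 1][j] = r1[j];
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        if (std::fabs(a[pivot][col]) < 1e-12) return false;
        for (int j = 0; j < 9; ++j) std::swap(a[col][j], a[pivot][j]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int j = col; j < 9; ++j) a[r][j] -= f * a[col][j];
        }
    }
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    return true;
}

// Map one point through the homography.
Pt applyHomography(const std::array<double, 8>& h, Pt p) {
    const double w = h[6] * p.x + h[7] * p.y + 1.0;
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

A typical use would solve the mapping from the A4 rectangle (for example 210 x 297 units) to the four photo vertices, then look up each output pixel in the photo through `applyHomography`.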
- Segment the numbers, in this order:
  - Convert the image into a binary image
  - Use the vertical histogram to divide the source image into sub-images (each sub-image contains one line of numbers)
  - Use the horizontal histogram to divide each line sub-image into several row sub-images
  - For each sub-image, apply dilation to thicken the numbers (and join the broken ones)
  - For each sub-image, use the connected-component labeling algorithm to separate the single digits: https://en.wikipedia.org/wiki/Connected-component_labeling
  - For each sub-image, save all single-digit images and a list of their names in a .txt file
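The histogram split and the labeling step can be sketched as follows. This is a minimal illustration, not the project code: it assumes a binary image stored as nested vectors with 1 for ink and 0 for background, that the "vertical histogram" is the per-row ink count, and that components are 8-connected. `Image`, `splitRows`, and `labelComponents` are hypothetical names.

```cpp
#include <queue>
#include <utility>
#include <vector>

using Image = std::vector<std::vector<int>>;  // assumption: 1 = ink, 0 = background

// Split the image into bands of consecutive non-blank rows using the per-row
// ink count (a projection histogram). Returns inclusive [first, last] row pairs.
std::vector<std::pair<int, int>> splitRows(const Image& img) {
    std::vector<std::pair<int, int>> bands;
    const int rows = static_cast<int>(img.size());
    int start = -1;  // first row of the band being built, or -1 if none
    for (int r = 0; r < rows; ++r) {
        int ink = 0;
        for (int v : img[r]) ink += v;
        if (ink > 0 && start < 0) start = r;
        if (ink == 0 && start >= 0) {
            bands.push_back({start, r - 1});
            start = -1;
        }
    }
    if (start >= 0) bands.push_back({start, rows - 1});
    return bands;
}

// Label 8-connected ink components with a breadth-first flood fill.
// Labels start at 1 (0 = background). Returns the number of components.
int labelComponents(const Image& img, Image& labels) {
    const int rows = static_cast<int>(img.size());
    const int cols = rows ? static_cast<int>(img[0].size()) : 0;
    labels.assign(rows, std::vector<int>(cols, 0));
    int count = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (img[r][c] == 0 || labels[r][c] != 0) continue;
            labels[r][c] = ++count;
            std::queue<std::pair<int, int>> q;
            q.push({r, c});
            while (!q.empty()) {
                const std::pair<int, int> p = q.front();
                q.pop();
                for (int dy = -1; dy <= 1; ++dy) {
                    for (int dx = -1; dx <= 1; ++dx) {
                        const int ny = p.first + dy, nx = p.second + dx;
                        if (ny < 0 || ny >= rows || nx < 0 || nx >= cols) continue;
                        if (img[ny][nx] == 0 || labels[ny][nx] != 0) continue;
                        labels[ny][nx] = count;
                        q.push({ny, nx});
                    }
                }
            }
        }
    }
    return count;
}
```

The same projection idea applies to columns by summing ink per column instead of per row. To keep the digits in reading order, the components can then be sorted by their leftmost column.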
- Use libsvm to train and test a model, and finally predict the numbers
  - Data preparation:
    - Convert the MNIST (http://yann.lecun.com/exdb/mnist/) binary data into .jpg images along with their labels
  - Model training:
    - Extract the HOG features of each image and write them in svm format (in **.txt)
    - Scale the features (in **.txt) with `svm-scale.exe` (found in the windows/ folder)
    - Train the model with `svm-train.exe` to get **.model
    - (Optional) Test the data with `svm-predict.exe` and check the accuracy (tune the training parameters to get the highest accuracy)
  - Number prediction:
    - Read the digit images segmented earlier and predict them with the trained model
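For reference, each line of a libsvm data file has the form `<label> <index>:<value> ...`, with feature indices starting at 1. The sketch below shows one way a feature vector could be serialized into such a line. `toLibsvmLine` is a hypothetical helper, and the feature vector is assumed to be whatever the HOG extraction step produced for one image.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Serialize one sample as a libsvm text line: "<label> 1:<v1> 2:<v2> ...".
// Hypothetical helper; `features` stands in for the HOG descriptor of one image.
std::string toLibsvmLine(int label, const std::vector<float>& features) {
    std::ostringstream out;
    out << label;
    for (std::size_t i = 0; i < features.size(); ++i) {
        out << ' ' << (i + 1) << ':' << features[i];  // indices are 1-based
    }
    return out.str();
}
```

Writing one such line per image to a .txt file gives the input that the scaling and training tools expect. For the segmented digits, whose true label is unknown, any placeholder label can be written.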
- Data preparation:
  - 4 vertices of the paper & A4 paper rectification
- Segmentation of the numbers:
  - Binary image with dilation, divided image, and circled single digits
  - Single digits divided in order, plus an image list in .txt
- Prediction:
- Comparison (before joining & after joining)
- Implementation: use two filters, filterA and filterB, for the dilation step of the number segmentation
  - Dilate with filterA twice first, then with filterB once.
  - filterB: at a white pixel, check one pixel up/down and one pixel left/right. If any of them is black, set the current pixel to black.
  - filterA: at a white pixel, check one pixel up/down and two pixels left/right. Count the black pixels with a coefficient of 1 or -1 (-1 means a black pixel on the left/right side subtracts one from the count). Set the current pixel to black only if the total is greater than 0.
  - filterB thickens the number in all directions, with the drawback that the holes in 0, 6, 8, and 9 get filled. So I propose another simple but useful filter, filterA, to deal with this problem. With filterA, black horizontal neighbors count against turning a white pixel black, which prevents hole filling to some extent. It works well in my experiments.
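The two filters described above can be sketched as follows. This is an illustration under stated assumptions rather than the project code: the image is a nested vector with 1 for black and 0 for white, pixels outside the image count as white, up/down neighbors carry the coefficient +1 in filterA, and each pass reads from the input and writes to a copy. All names are hypothetical.

```cpp
#include <vector>

using Image = std::vector<std::vector<int>>;  // assumption: 1 = black, 0 = white

// Read a pixel; anything outside the image is treated as white.
int pixelAt(const Image& img, int r, int c) {
    if (r < 0 || r >= static_cast<int>(img.size())) return 0;
    if (c < 0 || c >= static_cast<int>(img[r].size())) return 0;
    return img[r][c];
}

// filterB: a white pixel turns black if any 4-neighbor is black.
Image dilateB(const Image& src) {
    Image dst = src;
    for (int r = 0; r < static_cast<int>(src.size()); ++r) {
        for (int c = 0; c < static_cast<int>(src[r].size()); ++c) {
            if (src[r][c] == 1) continue;
            if (pixelAt(src, r - 1, c) || pixelAt(src, r + 1, c) ||
                pixelAt(src, r, c - 1) || pixelAt(src, r, c + 1)) {
                dst[r][c] = 1;
            }
        }
    }
    return dst;
}

// filterA: vertical neighbors (1 pixel) add, horizontal neighbors (up to
// 2 pixels) subtract; a white pixel turns black only if the total is > 0.
Image dilateA(const Image& src) {
    Image dst = src;
    for (int r = 0; r < static_cast<int>(src.size()); ++r) {
        for (int c = 0; c < static_cast<int>(src[r].size()); ++c) {
            if (src[r][c] == 1) continue;
            const int score = pixelAt(src, r - 1, c) + pixelAt(src, r + 1, c)
                            - pixelAt(src, r, c - 1) - pixelAt(src, r, c + 1)
                            - pixelAt(src, r, c - 2) - pixelAt(src, r, c + 2);
            if (score > 0) dst[r][c] = 1;
        }
    }
    return dst;
}

// filterA twice, then filterB once, as described in the list above.
Image joinBroken(const Image& src) {
    return dilateB(dilateA(dilateA(src)));
}
```

On a one-pixel hole surrounded by black, filterA scores 2 - 2 = 0 and leaves the hole white, while filterB fills it. On a vertical stroke with a one-pixel gap, filterA closes the gap without widening the stroke.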
- Segmentation of the numbers & joining of broken numbers: http://blog.csdn.net/qq_33000225/article/details/73123880 (Chinese version)
- Prediction: (Waiting...)
- Some numbers touch each other and are segmented together into one image.
- In the prediction results above, most of the 7s and 9s are misclassified as 1.