
Vehicle Detection

by Hanbyul Yang, Oct 12, 2017

Overview

This is a project for Udacity's Self-Driving Car Nanodegree Program.

The goal of this project is to detect vehicles in images or videos captured from a driving car. The detailed goals and steps are as follows:

  • Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a linear SVM classifier.
  • Additionally, apply a color transform and append binned color features, as well as histograms of color, to the HOG feature vector.
  • Implement a sliding-window technique and use the trained classifier to search for vehicles in images.
  • Run the pipeline on a video stream and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.
  • Estimate a bounding box for each detected vehicle.

For the processing pipeline and code, check P5.ipynb. I wrote this writeup in the order given by the rubric.

Writeup / README

This file, writeup.md, is the project writeup. README.md briefly describes the repository contents (files and folders).

Histogram of Oriented Gradients (HOG)

1. Explain how (and identify where in your code) you extracted HOG features from the training images.

First of all, I leveraged the code provided in the lessons, which is in ./helper_function.py.

There are 8792 vehicle images and 8968 non-vehicle images. Below is an example of each of the vehicle and non-vehicle classes (see the 3rd and 4th cells of the Jupyter notebook ./P5.ipynb).

alt text

alt text

Then I explored different color spaces and different skimage.hog() parameters (orientations, pixels_per_cell, and cells_per_block). I also used binned color and histograms of color as features. The code is located in the function extract_features() at line 55 of ./helper_function.py.

I chose random images from each of the two classes and displayed them to get a feel for what the HOG features look like. Below is an example using the YCrCb color space and HOG parameters of orientations=9, pixels_per_cell=(8, 8) and cells_per_block=(2, 2):

alt text
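For reference, here is a minimal sketch of how these features could be extracted with skimage and OpenCV using the parameters above. The project's actual implementation is extract_features() in ./helper_function.py; the helper names and exact arguments below are illustrative.

```python
import cv2
import numpy as np
from skimage.feature import hog

def get_hog_features(channel, orient=9, pix_per_cell=8, cell_per_block=2):
    """Extract a HOG feature vector from a single image channel."""
    return hog(channel,
               orientations=orient,
               pixels_per_cell=(pix_per_cell, pix_per_cell),
               cells_per_block=(cell_per_block, cell_per_block),
               feature_vector=True)

def single_img_features(img, spatial_size=(32, 32), hist_bins=32):
    """Convert a 64x64 RGB training image to YCrCb and build the combined
    feature vector: binned color + color histograms + HOG from all channels."""
    feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)
    spatial_features = cv2.resize(feature_image, spatial_size).ravel()
    hist_features = np.hstack([np.histogram(feature_image[:, :, ch],
                                            bins=hist_bins, range=(0, 256))[0]
                               for ch in range(3)])
    hog_features = np.hstack([get_hog_features(feature_image[:, :, ch])
                              for ch in range(3)])
    return np.concatenate([spatial_features, hist_features, hog_features])
```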

2. Explain how you settled on your final choice of HOG parameters.

I tried various combinations of parameters. For convenience, I stuck with a linear SVM classifier and the YCrCb color space. The ranges I explored were:

  • Orientations: 8 ~ 12
  • Pixels per cell: 8 ~ 16
  • HOG channels: each single channel and all channels
  • spatial_size: 16, 32
  • hist_bins: 32, 64

The final choice is in the 2nd cell of the Jupyter notebook.

Parameter        Final choice
---------------  ------------
color_space      'YCrCb'
orient           9
pix_per_cell     8
cell_per_block   2
hog_channel      "ALL"
spatial_size     (32, 32)
hist_bins        32
spatial_feat     True
hist_feat        True
hog_feat         True
y_start_stop     [400, 690]

3. Describe how (and identify where in your code) you trained a classifier using your selected HOG features (and color features if you used them).

Feature normalization is performed, and sklearn.cross_validation.train_test_split() is used to shuffle and split the data into training and test sets. A test accuracy of 99.07% was achieved. The 5th cell of the Jupyter notebook shows this process.
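A minimal sketch of this training step is shown below. Variable names are illustrative, and newer scikit-learn versions expose train_test_split from sklearn.model_selection rather than sklearn.cross_validation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in older versions

# car_features / notcar_features: lists of feature vectors from extract_features()
X = np.vstack((car_features, notcar_features)).astype(np.float64)
y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))

# Fit a per-column scaler on the feature matrix, then normalize
X_scaler = StandardScaler().fit(X)
scaled_X = X_scaler.transform(X)

# Shuffle and split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    scaled_X, y, test_size=0.2, random_state=42)

# Train the linear SVM and report test accuracy (about 99% in this project)
svc = LinearSVC()
svc.fit(X_train, y_train)
print('Test accuracy:', round(svc.score(X_test, y_test), 4))
```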

Sliding Window Search

1. Describe how (and identify where in your code) you implemented a sliding window search. How did you decide what scales to search and how much to overlap windows?

I also leveraged the find_cars() function from the lesson. It extracts HOG features from the whole image at once rather than per window. The code is located in the 7th cell of the notebook. find_cars() uses a 64 x 64 window size at a scale of 1.0; a different scale enlarges or shrinks the sliding-window search region by dividing the image size by the scale value.

At first I used two scales (1 and 2), but it performed poorly, so I increased the number of scales. Using four scales (1, 1.5, 1.75, 2) for searching vehicles worked best, with a window overlap of 75%. Here is an example; each color represents a different scale. A sketch of the multi-scale search follows the image.

alt text
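A rough sketch of how this multi-scale search could be driven, assuming the lesson-style find_cars() call used in the notebook returns the list of positive windows for one scale (the exact argument list here is an assumption):

```python
# Assumed driver loop: svc, X_scaler and the feature parameters come from the
# training step; y_start_stop = [400, 690] restricts the search to the road area.
scales = [1.0, 1.5, 1.75, 2.0]
ystart, ystop = 400, 690

box_list = []
for scale in scales:
    # find_cars() computes HOG once over the cropped, rescaled region and then
    # slides a 64x64 window across the HOG grid (2-cell step, ~75% overlap).
    boxes = find_cars(img, ystart, ystop, scale, svc, X_scaler,
                      orient, pix_per_cell, cell_per_block,
                      spatial_size, hist_bins)
    box_list.extend(boxes)
```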

However, there are false positives in the image, so I built a heatmap from the detected bounding boxes. A high value means overlapping detections.

alt text

Thresholding the heatmap with a value of 1 is used to remove false positives. The code is in the 11th and 12th cells of the Jupyter notebook.

alt text
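The heatmap step can be sketched roughly as follows, assuming each box is an ((x1, y1), (x2, y2)) corner pair; helper names are illustrative and the actual code is in the 11th and 12th notebook cells.

```python
import numpy as np
from scipy.ndimage import label

def add_heat(heatmap, box_list):
    # Add 1 inside every detected box; overlapping detections accumulate
    for ((x1, y1), (x2, y2)) in box_list:
        heatmap[y1:y2, x1:x2] += 1
    return heatmap

def apply_threshold(heatmap, threshold=1):
    # Zero out pixels with too few overlapping detections
    heatmap[heatmap <= threshold] = 0
    return heatmap

heat = np.zeros_like(img[:, :, 0]).astype(np.float64)
heat = apply_threshold(add_heat(heat, box_list), threshold=1)

# Each connected region in the thresholded heatmap becomes one final bounding box
labels = label(heat)
```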

2. Show some examples of test images to demonstrate how your pipeline is working. What did you do to optimize the performance of your classifier?

I tested with the given test images in test_images/ until the pipeline worked on all of them. The optimization techniques I used were adjusting the window size, using multiple window sizes (scales), and adjusting the heatmap threshold.

Below are the results on the test images; the 14th and 15th cells of the Jupyter notebook show the code.

alt text alt text alt text alt text alt text alt text

Video Implementation

1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (somewhat wobbly or unstable bounding boxes are ok as long as you are identifying the vehicles most of the time with minimal false positives.)

Here's a link to my video result

2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

I used two temporal (inter-frame) thresholding methods to remove false positives.

The first one accumulates all detected bounding boxes from the most recent 3 frames and then applies heatmap thresholding with a value of 3. This relieves temporal jitter.
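A minimal sketch of this frame-history accumulation, assuming a deque of per-frame box lists and the add_heat()/apply_threshold() helpers sketched above (names are illustrative):

```python
from collections import deque
import numpy as np

recent_boxes = deque(maxlen=3)  # bounding boxes from the 3 most recent frames

def temporal_heatmap(frame, frame_boxes, threshold=3):
    """Accumulate detections over the last 3 frames and threshold the combined heatmap."""
    recent_boxes.append(frame_boxes)
    heat = np.zeros_like(frame[:, :, 0]).astype(np.float64)
    for boxes in recent_boxes:
        heat = add_heat(heat, boxes)
    return apply_threshold(heat, threshold)
```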

The other one thresholds the change of each detection's centroid compared to the previous frame. The assumption of this method is that a detected car only moves within some bound between frames. A threshold of 32 pixels is used. It removes most of the difficult false positives.
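A rough sketch of the centroid-change check, again assuming ((x1, y1), (x2, y2)) boxes and a list of centroids kept from the previous frame (all names here are illustrative, and newly appearing cars would need extra handling):

```python
import numpy as np

MAX_CENTROID_SHIFT = 32  # pixels a detection is allowed to move between frames

def box_centroid(box):
    ((x1, y1), (x2, y2)) = box
    return np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])

def filter_by_centroid(current_boxes, prev_centroids):
    """Keep only boxes whose centroid stays close to a centroid from the previous frame."""
    kept = []
    for box in current_boxes:
        c = box_centroid(box)
        if any(np.linalg.norm(c - p) <= MAX_CENTROID_SHIFT for p in prev_centroids):
            kept.append(box)
    return kept
```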

Here are the detected bounding-box images of three frames and their corresponding heatmaps.

alt text alt text alt text

And here are the results after applying the two temporal thresholding methods.

alt text alt text alt text

The code for the processing pipeline is located in the 13th cell of the Jupyter notebook.

Discussion

1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?

Even though I used a linear SVM classifier with very high accuracy (99.07%) and heatmap thresholding, the detection results were only good on clean, clear images. If the road is dirty, meaning its color is not uniform, there are usually false positives. The most difficult case is when tree shadows fall on the road.

Because of the sliding window algorithm I used, my pipeline didn't work when a car was at the rightmost edge of the image; some of that area is not covered by the search windows. Here's an example.

alt text alt text

I think a more robust method could be one based on a convolutional neural network, such as YOLO. It may have fewer false positives.