Skip to content

Latest commit

 

History

History
38 lines (38 loc) · 1.92 KB

Robust Hand Detection in Vehicles.md

File metadata and controls

38 lines (38 loc) · 1.92 KB

♥ DL , CV , RCNN, Object Detection

  • Modified Faster RCNN
    • Multiple scale Faster-RCNN
    • Weight normalization
    • Add new layer
  • Change 1 : Multiple scale Faster-RCNN
    • Combine both global and local features --> enhance hand detecting in an image
    • Collect features not only conv5, but also conv3 and conv4, then incorporate them
    • Implementation:
      • For conv3, conv4, conv5, each conv is only followed with ReLU, remove Max-pooling layer.
      • Take their output as the input of 3 corresponding ROI pooling layers and normalization layers
      • Concat and shrink normalization layers as input of fc layers
      • roi pooling in fc layers: make prediction of class and position
  • Change 2: Weight normalization
    • Features in shallower layers: larger-scaled values
    • Features in deeper layers: smaller-scaled values
    • To combine the features of 3 conv layers, we need to normalize them
    • Implementation:
      • Put each feature into normalization layer(see the equations)
      • Each pixel xi is normalized, then multiply scaling factor ri
      • Use backpropagation to get ri in training step, we need to build loop here
      • After normalization, concate the layers
  • Change 3: Add new layer
    • Each RPN needs a normalization layer
    • Add two more ROI pooling layers in detector part
    • Each ROI pooling layer needs a normalization layer
    • After each concatenation(2 positions in total), we need a 1*1 conv layer
  • Some details
    • For RPN:
      • normalize each to_be_normalized layer
      • concat 3 normalized layers
      • change the dimension using 1 * 1 conv
    • For ROI pooling:
      • put each conv output into its ROI pooling (so there should be 3 ROI pooling layers)
      • normalize each layer
      • concat them
      • change the dimension using 1 * 1 conv