Welcome to the pytorch-auto-drive model zoo

Lane detection performance:

| method | backbone | resolution | mixed precision? | dataset | metric | average | best | training time (2080 Ti) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline | VGG16 | 360 x 640 | yes | TuSimple | Accuracy | 93.79% | 93.94% | 1.5h |
| Baseline | ResNet18 | 360 x 640 | yes | TuSimple | Accuracy | 94.18% | 94.25% | 0.7h |
| Baseline | ResNet34 | 360 x 640 | yes | TuSimple | Accuracy | 95.23% | 95.31% | 1.1h |
| Baseline | ResNet50 | 360 x 640 | yes | TuSimple | Accuracy | 95.07% | 95.12% | 1.5h |
| Baseline | ResNet101 | 360 x 640 | yes | TuSimple | Accuracy | 95.15% | 95.19% | 2.6h |
| Baseline | ERFNet | 360 x 640 | yes | TuSimple | Accuracy | 96.02% | 96.04% | 0.8h |
| Baseline | ENet# | 360 x 640 | yes | TuSimple | Accuracy | 95.55% | 95.61% | 1h+ |
| SCNN | VGG16 | 360 x 640 | yes | TuSimple | Accuracy | 95.01% | 95.17% | 2h |
| SCNN | ResNet18 | 360 x 640 | yes | TuSimple | Accuracy | 94.69% | 94.77% | 1.2h |
| SCNN | ResNet34 | 360 x 640 | yes | TuSimple | Accuracy | 95.19% | 95.25% | 1.6h |
| SCNN | ResNet50 | 360 x 640 | yes | TuSimple | Accuracy | 95.43% | 95.56% | 2.4h |
| SCNN | ResNet101 | 360 x 640 | yes | TuSimple | Accuracy | 95.56% | 95.69% | 3.5h |
| SCNN | ERFNet | 360 x 640 | yes | TuSimple | Accuracy | 96.18% | 96.29% | 1.6h |
| Baseline | VGG16 | 288 x 800 | yes | CULane | F measure | 65.93 | 66.09 | 9.3h |
| Baseline | ResNet18 | 288 x 800 | yes | CULane | F measure | 65.19 | 65.30 | 5.3h |
| Baseline | ResNet34 | 288 x 800 | yes | CULane | F measure | 69.82 | 69.92 | 7.3h |
| Baseline | ResNet50 | 288 x 800 | yes | CULane | F measure | 68.31 | 68.48 | 12.4h |
| Baseline | ResNet101 | 288 x 800 | yes | CULane | F measure | 71.29 | 71.37 | 20.0h |
| Baseline | ERFNet | 288 x 800 | yes | CULane | F measure | 73.40 | 73.49 | 6h |
| Baseline | ENet# | 288 x 800 | yes | CULane | F measure | 69.39 | 69.90 | 6.4h+ |
| SCNN | VGG16 | 288 x 800 | yes | CULane | F measure | 74.02 | 74.29 | 12.8h |
| SCNN | ResNet18 | 288 x 800 | yes | CULane | F measure | 71.94 | 72.19 | 8.0h |
| SCNN | ResNet34 | 288 x 800 | yes | CULane | F measure | 72.44 | 72.70 | 10.7h |
| SCNN | ResNet50 | 288 x 800 | yes | CULane | F measure | 72.95 | 73.03 | 17.9h |
| SCNN | ResNet101 | 288 x 800 | yes | CULane | F measure | 73.29 | 73.58 | 25.7h |
| SCNN | ERFNet | 288 x 800 | yes | CULane | F measure | 73.85 | 74.03 | 11.3h |

All results use ImageNet pre-training and are reported as the average and best of 3 training runs on the test set.

+ Training time measured on a single GTX 1080 Ti.

# Trained without ImageNet pre-training.
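
The "average" and "best" columns can be reproduced from per-run scores with a few lines of Python. This is only a minimal sketch: the helper name `summarize_runs` and the example scores are hypothetical and are not taken from this repository or from the tables above.

```python
from statistics import mean


def summarize_runs(scores):
    """Return (average, best) over repeated training runs, rounded to 2 decimals.

    `scores` holds one test-set metric per run, e.g. three accuracies in percent.
    """
    if not scores:
        raise ValueError("need at least one run")
    return round(mean(scores), 2), round(max(scores), 2)


# Hypothetical scores from three runs (illustrative only, not real results).
average, best = summarize_runs([90.0, 92.0, 94.0])
print(f"average={average:.2f} best={best:.2f}")
```

Reporting both numbers shows how much a result depends on the random seed: a small gap between average and best suggests a stable training recipe.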

TuSimple detailed performance (best):

| method | backbone | accuracy | FP | FN | |
| --- | --- | --- | --- | --- | --- |
| Baseline | VGG16 | 93.94% | 0.0998 | 0.1021 | model \| shell |
| Baseline | ResNet18 | 94.25% | 0.0881 | 0.0894 | model \| shell |
| Baseline | ResNet34 | 95.31% | 0.0640 | 0.0622 | model \| shell |
| Baseline | ResNet50 | 95.12% | 0.0649 | 0.0653 | model \| shell |
| Baseline | ResNet101 | 95.19% | 0.0619 | 0.0620 | model \| shell |
| Baseline | ERFNet | 96.04% | 0.0591 | 0.0365 | model \| shell |
| Baseline | ENet | 95.61% | 0.0655 | 0.0503 | model \| shell |
| SCNN | VGG16 | 95.17% | 0.0637 | 0.0622 | model \| shell |
| SCNN | ResNet18 | 94.77% | 0.0753 | 0.0737 | model \| shell |
| SCNN | ResNet34 | 95.25% | 0.0627 | 0.0634 | model \| shell |
| SCNN | ResNet50 | 95.56% | 0.0561 | 0.0556 | model \| shell |
| SCNN | ResNet101 | 95.69% | 0.0519 | 0.0504 | model \| shell |
| SCNN | ERFNet | 96.29% | 0.0470 | 0.0318 | model \| shell |

CULane detailed performance (best):

| method | backbone | normal | crowded | night | no line | shadow | arrow | dazzle light | curve | crossroad | total | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline | VGG16 | 85.51 | 64.05 | 61.14 | 35.96 | 59.76 | 78.43 | 53.25 | 62.16 | 2224 | 66.09 | model \| shell |
| Baseline | ResNet18 | 85.45 | 62.63 | 61.04 | 33.88 | 51.72 | 78.15 | 53.05 | 59.70 | 1915 | 65.30 | model \| shell |
| Baseline | ResNet34 | 89.46 | 66.66 | 65.38 | 40.43 | 62.17 | 83.18 | 58.51 | 63.00 | 1713 | 69.92 | model \| shell |
| Baseline | ResNet50 | 88.15 | 65.73 | 63.74 | 37.96 | 62.59 | 81.68 | 59.47 | 64.01 | 2046 | 68.48 | model \| shell |
| Baseline | ResNet101 | 90.11 | 67.89 | 67.01 | 43.10 | 70.56 | 85.09 | 61.77 | 65.47 | 1883 | 71.37 | model \| shell |
| Baseline | ERFNet | 91.48 | 71.27 | 68.09 | 46.76 | 74.47 | 86.09 | 64.18 | 66.89 | 2102 | 73.49 | model \| shell |
| Baseline | ENet | 89.26 | 68.15 | 62.99 | 42.43 | 68.59 | 83.10 | 58.49 | 63.23 | 2464 | 69.90 | model \| shell |
| SCNN | VGG16 | 92.02 | 72.31 | 69.13 | 46.01 | 76.37 | 87.71 | 64.68 | 68.96 | 1924 | 74.29 | model \| shell |
| SCNN | ResNet18 | 90.98 | 70.17 | 66.54 | 43.12 | 66.31 | 85.62 | 62.20 | 65.58 | 1808 | 72.19 | model \| shell |
| SCNN | ResNet34 | 91.06 | 70.41 | 67.75 | 44.64 | 68.98 | 86.50 | 61.57 | 65.75 | 2017 | 72.70 | model \| shell |
| SCNN | ResNet50 | 91.38 | 70.60 | 67.62 | 45.02 | 71.24 | 86.90 | 66.03 | 66.17 | 1958 | 73.03 | model \| shell |
| SCNN | ResNet101 | 91.10 | 71.43 | 68.53 | 46.39 | 72.61 | 86.87 | 61.95 | 67.01 | 1720 | 73.58 | model \| shell |
| SCNN | ERFNet | 91.82 | 72.13 | 69.49 | 46.68 | 70.59 | 87.40 | 64.18 | 68.30 | 2236 | 74.03 | model \| shell |
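
For readers unfamiliar with the F measure reported for CULane, the sketch below computes the standard F1 score from lane-level true positives, false positives and false negatives. It assumes lanes have already been matched to ground truth by the benchmark's own evaluation tool; the function name `f_measure` and the counts are hypothetical and are not part of this repository. The crossroad column holds integer counts rather than percentages, presumably because the benchmark reports a raw false-positive count for that category.

```python
def f_measure(tp, fp, fn):
    """Standard F1 score in percent from matched-lane counts.

    tp: predicted lanes matched to a ground-truth lane
    fp: predicted lanes with no match
    fn: ground-truth lanes with no matching prediction
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical counts (illustrative only, not real results).
print(f"F measure: {f_measure(tp=80, fp=20, fn=20):.2f}")
```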

Semantic segmentation performance:

| model | resolution | mixed precision? | dataset | average | best | training time (2080 Ti) | best model link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FCN | 321 x 321 | yes | PASCAL VOC 2012 | 70.72 | 70.83 | 3.3h | model \| shell |
| FCN | 321 x 321 | no | PASCAL VOC 2012 | 70.91 | 71.55 | 6.3h | model \| shell |
| DeeplabV2 | 321 x 321 | yes | PASCAL VOC 2012 | 74.59 | 74.74 | 3.3h | model \| shell |
| DeeplabV3 | 321 x 321 | yes | PASCAL VOC 2012 | 78.11 | 78.17 | 7h | model \| shell |
| FCN | 256 x 512 | yes | Cityscapes | 68.05 | 68.20 | 2.2h | model \| shell |
| DeeplabV2 | 256 x 512 | yes | Cityscapes | 68.65 | 68.90 | 2.2h | model \| shell |
| DeeplabV3 | 256 x 512 | yes | Cityscapes | 69.87 | 70.37 | 4.5h | model \| shell |
| DeeplabV2 | 256 x 512 | no | Cityscapes | 68.45 | 68.89 | 4h | model \| shell |
| ERFNet | 512 x 1024 | yes | Cityscapes | 71.99 | 72.47 | 5h | model \| shell |
| ENet | 512 x 1024 | yes | Cityscapes | 65.54 | 65.74 | 10.6h | model \| shell |
| DeeplabV2 | 512 x 1024 | yes | Cityscapes | 71.78 | 72.12 | 9h | model \| shell |
| DeeplabV3 | 512 x 1024 | yes | Cityscapes | 74.64 | 74.67 | 20.1h | model \| shell |
| DeeplabV2 | 512 x 1024 | yes | GTAV | 32.90 | 33.88 | 13.8h | model \| shell |
| DeeplabV2 | 512 x 1024 | yes | SYNTHIA* | 33.89 | 34.86 | 10.4h | model \| shell |

All results use ImageNet pre-training and are reported as the average and best mIoU (%) of 3 training runs on the val set.

* mIoU-16, i.e. mIoU computed over 16 classes.
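
As a reminder of what the mIoU numbers mean, here is a minimal sketch that derives mIoU (%) from a class confusion matrix accumulated over a validation set. The function name `mean_iou` and the toy matrix are hypothetical, and skipping classes that appear in neither the labels nor the predictions is an assumption of this sketch, not necessarily what this repository's evaluation code does.

```python
def mean_iou(confusion):
    """Mean intersection-over-union in percent.

    confusion[i][j] = number of pixels with ground-truth class i
    that were predicted as class j.
    """
    ious = []
    for c in range(len(confusion)):
        intersection = confusion[c][c]
        gt_pixels = sum(confusion[c])                  # row sum
        pred_pixels = sum(row[c] for row in confusion)  # column sum
        union = gt_pixels + pred_pixels - intersection
        if union > 0:  # assumption: ignore classes absent from both
            ious.append(intersection / union)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy 2-class confusion matrix (illustrative only, not real results).
print(f"mIoU: {mean_iou([[3, 1], [1, 3]]):.2f}")
```

A restricted variant such as mIoU-16 would average the per-class IoU over the chosen class subset only.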