method | backbone | resolution | mixed precision? | dataset | metric | average | best | training time (2080 Ti) |
---|---|---|---|---|---|---|---|---|
Baseline | VGG16 | 360 x 640 | yes | TuSimple | Accuracy | 93.79% | 93.94% | 1.5h |
Baseline | ResNet18 | 360 x 640 | yes | TuSimple | Accuracy | 94.18% | 94.25% | 0.7h |
Baseline | ResNet34 | 360 x 640 | yes | TuSimple | Accuracy | 95.23% | 95.31% | 1.1h |
Baseline | ResNet50 | 360 x 640 | yes | TuSimple | Accuracy | 95.07% | 95.12% | 1.5h |
Baseline | ResNet101 | 360 x 640 | yes | TuSimple | Accuracy | 95.15% | 95.19% | 2.6h |
Baseline | ERFNet | 360 x 640 | yes | TuSimple | Accuracy | 96.02% | 96.04% | 0.8h |
Baseline | ENet# | 360 x 640 | yes | TuSimple | Accuracy | 95.55% | 95.61% | 1h+ |
SCNN | VGG16 | 360 x 640 | yes | TuSimple | Accuracy | 95.01% | 95.17% | 2h |
SCNN | ResNet18 | 360 x 640 | yes | TuSimple | Accuracy | 94.69% | 94.77% | 1.2h |
SCNN | ResNet34 | 360 x 640 | yes | TuSimple | Accuracy | 95.19% | 95.25% | 1.6h |
SCNN | ResNet50 | 360 x 640 | yes | TuSimple | Accuracy | 95.43% | 95.56% | 2.4h |
SCNN | ResNet101 | 360 x 640 | yes | TuSimple | Accuracy | 95.56% | 95.69% | 3.5h |
SCNN | ERFNet | 360 x 640 | yes | TuSimple | Accuracy | 96.18% | 96.29% | 1.6h |
Baseline | VGG16 | 288 x 800 | yes | CULane | F measure | 65.93 | 66.09 | 9.3h |
Baseline | ResNet18 | 288 x 800 | yes | CULane | F measure | 65.19 | 65.30 | 5.3h |
Baseline | ResNet34 | 288 x 800 | yes | CULane | F measure | 69.82 | 69.92 | 7.3h |
Baseline | ResNet50 | 288 x 800 | yes | CULane | F measure | 68.31 | 68.48 | 12.4h |
Baseline | ResNet101 | 288 x 800 | yes | CULane | F measure | 71.29 | 71.37 | 20.0h |
Baseline | ERFNet | 288 x 800 | yes | CULane | F measure | 73.40 | 73.49 | 6h |
Baseline | ENet# | 288 x 800 | yes | CULane | F measure | 69.39 | 69.90 | 6.4h+ |
SCNN | VGG16 | 288 x 800 | yes | CULane | F measure | 74.02 | 74.29 | 12.8h |
SCNN | ResNet18 | 288 x 800 | yes | CULane | F measure | 71.94 | 72.19 | 8.0h |
SCNN | ResNet34 | 288 x 800 | yes | CULane | F measure | 72.44 | 72.70 | 10.7h |
SCNN | ResNet50 | 288 x 800 | yes | CULane | F measure | 72.95 | 73.03 | 17.9h |
SCNN | ResNet101 | 288 x 800 | yes | CULane | F measure | 73.29 | 73.58 | 25.7h |
SCNN | ERFNet | 288 x 800 | yes | CULane | F measure | 73.85 | 74.03 | 11.3h |

All performance is measured with ImageNet pre-training and reported as the average/best of 3 runs on the test set.

\+ Measured on a single GTX 1080 Ti.

\# No pre-training.


method | backbone | accuracy | FP | FN | | |
---|---|---|---|---|---|---|
Baseline | VGG16 | 93.94% | 0.0998 | 0.1021 | model | shell |
Baseline | ResNet18 | 94.25% | 0.0881 | 0.0894 | model | shell |
Baseline | ResNet34 | 95.31% | 0.0640 | 0.0622 | model | shell |
Baseline | ResNet50 | 95.12% | 0.0649 | 0.0653 | model | shell |
Baseline | ResNet101 | 95.19% | 0.0619 | 0.0620 | model | shell |
Baseline | ERFNet | 96.04% | 0.0591 | 0.0365 | model | shell |
Baseline | ENet | 95.61% | 0.0655 | 0.0503 | model | shell |
SCNN | VGG16 | 95.17% | 0.0637 | 0.0622 | model | shell |
SCNN | ResNet18 | 94.77% | 0.0753 | 0.0737 | model | shell |
SCNN | ResNet34 | 95.25% | 0.0627 | 0.0634 | model | shell |
SCNN | ResNet50 | 95.56% | 0.0561 | 0.0556 | model | shell |
SCNN | ResNet101 | 95.69% | 0.0519 | 0.0504 | model | shell |
SCNN | ERFNet | 96.29% | 0.0470 | 0.0318 | model | shell |

method | backbone | normal | crowded | night | no line | shadow | arrow | dazzle light | curve | crossroad | total | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Baseline | VGG16 | 85.51 | 64.05 | 61.14 | 35.96 | 59.76 | 78.43 | 53.25 | 62.16 | 2224 | 66.09 | model | shell |
Baseline | ResNet18 | 85.45 | 62.63 | 61.04 | 33.88 | 51.72 | 78.15 | 53.05 | 59.70 | 1915 | 65.30 | model | shell |
Baseline | ResNet34 | 89.46 | 66.66 | 65.38 | 40.43 | 62.17 | 83.18 | 58.51 | 63.00 | 1713 | 69.92 | model | shell |
Baseline | ResNet50 | 88.15 | 65.73 | 63.74 | 37.96 | 62.59 | 81.68 | 59.47 | 64.01 | 2046 | 68.48 | model | shell |
Baseline | ResNet101 | 90.11 | 67.89 | 67.01 | 43.10 | 70.56 | 85.09 | 61.77 | 65.47 | 1883 | 71.37 | model | shell |
Baseline | ERFNet | 91.48 | 71.27 | 68.09 | 46.76 | 74.47 | 86.09 | 64.18 | 66.89 | 2102 | 73.49 | model | shell |
Baseline | ENet | 89.26 | 68.15 | 62.99 | 42.43 | 68.59 | 83.10 | 58.49 | 63.23 | 2464 | 69.90 | model | shell |
SCNN | VGG16 | 92.02 | 72.31 | 69.13 | 46.01 | 76.37 | 87.71 | 64.68 | 68.96 | 1924 | 74.29 | model | shell |
SCNN | ResNet18 | 90.98 | 70.17 | 66.54 | 43.12 | 66.31 | 85.62 | 62.20 | 65.58 | 1808 | 72.19 | model | shell |
SCNN | ResNet34 | 91.06 | 70.41 | 67.75 | 44.64 | 68.98 | 86.50 | 61.57 | 65.75 | 2017 | 72.70 | model | shell |
SCNN | ResNet50 | 91.38 | 70.60 | 67.62 | 45.02 | 71.24 | 86.90 | 66.03 | 66.17 | 1958 | 73.03 | model | shell |
SCNN | ResNet101 | 91.10 | 71.43 | 68.53 | 46.39 | 72.61 | 86.87 | 61.95 | 67.01 | 1720 | 73.58 | model | shell |
SCNN | ERFNet | 91.82 | 72.13 | 69.49 | 46.68 | 70.59 | 87.40 | 64.18 | 68.30 | 2236 | 74.03 | model | shell |

model | resolution | mixed precision? | dataset | average | best | training time (2080 Ti) | best model link | |
---|---|---|---|---|---|---|---|---|
FCN | 321 x 321 | yes | PASCAL VOC 2012 | 70.72 | 70.83 | 3.3h | model | shell |
FCN | 321 x 321 | no | PASCAL VOC 2012 | 70.91 | 71.55 | 6.3h | model | shell |
DeeplabV2 | 321 x 321 | yes | PASCAL VOC 2012 | 74.59 | 74.74 | 3.3h | model | shell |
DeeplabV3 | 321 x 321 | yes | PASCAL VOC 2012 | 78.11 | 78.17 | 7h | model | shell |
FCN | 256 x 512 | yes | Cityscapes | 68.05 | 68.20 | 2.2h | model | shell |
DeeplabV2 | 256 x 512 | yes | Cityscapes | 68.65 | 68.90 | 2.2h | model | shell |
DeeplabV3 | 256 x 512 | yes | Cityscapes | 69.87 | 70.37 | 4.5h | model | shell |
DeeplabV2 | 256 x 512 | no | Cityscapes | 68.45 | 68.89 | 4h | model | shell |
ERFNet | 512 x 1024 | yes | Cityscapes | 71.99 | 72.47 | 5h | model | shell |
ENet | 512 x 1024 | yes | Cityscapes | 65.54 | 65.74 | 10.6h | model | shell |
DeeplabV2 | 512 x 1024 | yes | Cityscapes | 71.78 | 72.12 | 9h | model | shell |
DeeplabV3 | 512 x 1024 | yes | Cityscapes | 74.64 | 74.67 | 20.1h | model | shell |
DeeplabV2 | 512 x 1024 | yes | GTAV | 32.90 | 33.88 | 13.8h | model | shell |
DeeplabV2 | 512 x 1024 | yes | SYNTHIA* | 33.89 | 34.86 | 10.4h | model | shell |

All performance is measured with ImageNet pre-training and reported as the average/best mIoU (%) of 3 runs on the val set.

\* mIoU-16 (mIoU computed over 16 classes).
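
The "average" and "best" columns in the tables above summarize three training runs per configuration. As a minimal sketch of that aggregation, assuming the per-run scores are available as plain numbers (the function name `summarize_runs` and the sample values are hypothetical, and the individual run scores are not listed in this document):

```python
from statistics import mean


def summarize_runs(scores):
    """Return (average, best) over per-run scores, rounded to 2 decimals."""
    if not scores:
        raise ValueError("need at least one run")
    return round(mean(scores), 2), round(max(scores), 2)


if __name__ == "__main__":
    # Hypothetical scores from three runs (illustrative only, not real results).
    average, best = summarize_runs([70.0, 71.0, 72.0])
    print(f"average={average:.2f} best={best:.2f}")
```

The same helper applies to any of the metrics reported here (accuracy, F measure, or mIoU), since all of them are higher-is-better.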