Inconsistent splits between COCO 2014 and COCO 2017? #5751
Comments
Yes, it seems that YOLOv4 will show worse AP on val2017 but better AP on test-dev if we use COCO 2017 with crowd=0. http://cocodataset.org/#detection-2019
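One plausible reading of "COCO 2017 with crowd=0" is training only on iscrowd=0 annotations. A minimal sketch of filtering the annotation file that way is shown below; the file paths are assumptions about a standard COCO layout, not the exact setup used for YOLOv4.

```python
import json

# Assumed standard COCO 2017 layout; adjust paths to your setup.
SRC = "annotations/instances_train2017.json"
DST = "annotations/instances_train2017_nocrowd.json"

with open(SRC) as f:
    data = json.load(f)

# Keep only non-crowd object annotations (iscrowd=0).
data["annotations"] = [a for a in data["annotations"] if a.get("iscrowd", 0) == 0]

with open(DST, "w") as f:
    json.dump(data, f)
```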
We trained YOLOv4 only on the train dataset (without val), while the COCO 2017 Task Guidelines recommend training on train+val. So we used:
Yes, using train+5k and COCO 2017 instead of 2014 may increase YOLOv4 accuracy.
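For illustration only, here is a minimal sketch of assembling a darknet-style image list for a "train + 5k val" setup on COCO 2017 as discussed above; the directory layout and the output file name trainval2017.txt are assumptions, not the scripts actually used for YOLOv4.

```python
import glob

# Hypothetical layout of COCO 2017 images; adjust to your local paths.
train_imgs = sorted(glob.glob("coco/images/train2017/*.jpg"))
val_imgs = sorted(glob.glob("coco/images/val2017/*.jpg"))  # the 5k val split

# darknet expects a plain text file with one image path per line.
with open("trainval2017.txt", "w") as f:
    f.write("\n".join(train_imgs + val_imgs) + "\n")

print(f"{len(train_imgs)} train + {len(val_imgs)} val images written")
```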
Ross Girshick used train+val for training Faster R-CNN and tested it on the test-dev dataset, without train/val splitting, so the splitting is not important: https://arxiv.org/pdf/1506.01497v3.pdf It seems that most other networks were trained by using
Thanks for the elaboration, but it seems the issue was not made clear in the beginning.
From Mask R-CNN:
From RetinaNet:
BTW, there is little concern about the crowd setting here, since the COCO evaluation seems to ignore crowd annotations for a fair comparison:
Yes, YOLOv4 can be compared on test-dev, as we did in the paper. I don't know why Ross Girshick used this strange splitting when he was a co-author of YOLOv1. Yes, maybe we should re-train YOLOv4 on COCO 2017. Yes, the COCO evaluation ignores crowd=1 detections/ground truths. But crowd=1 can occupy part of the network capacity; I don't know how much it affects the results. Also, MS COCO seems poorly annotated for persons. #4085
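To gauge how much of the data crowd regions represent (the network-capacity concern above), here is a minimal sketch that counts iscrowd=1 annotations in a COCO annotation file; the path is an assumed standard layout. For reference, pycocotools' COCOeval treats iscrowd=1 ground truths as ignore regions during evaluation, so detections matched to them count as neither true nor false positives.

```python
import json

# Assumed standard COCO layout; point this at train2017 or val2017 annotations.
with open("annotations/instances_train2017.json") as f:
    gt = json.load(f)

anns = gt["annotations"]
crowd = [a for a in anns if a.get("iscrowd", 0) == 1]
imgs_with_crowd = {a["image_id"] for a in crowd}

print(f"{len(crowd)} / {len(anns)} annotations are iscrowd=1")
print(f"{len(imgs_with_crowd)} images contain at least one crowd region")
```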
Sounds all good and looking forward to YOLOv5!
@WongKinYiu Hi,
@AlexeyAB OK, after finishing the current training, I will train all of the new models with COCO 2017.
@WongKinYiu Will it use CSP+SAM+Mish, or just CSP for the neck?
I will train two models: 1) CSP+Leaky for quick evaluation, and 2) CSP+SAM+Mish, since it is the best known combination. Currently I think MiWRC is not stable enough; sorting by top-1 accuracy gives: new-MiWRC-per_channel-relu > new-MiWRC-per_feature-softmax > new-MiWRC-per_feature-relu > new-MiWRC-per_channel-softmax. I cannot find any pattern in the performance, and I think the results may be random.
TL;DR
The custom splits (trainvalno5k.txt and 5k.txt) for COCO 2014 are supposed to be the same as the default splits for COCO 2017, but they are not. This could explain why YOLOv4 does not produce the same validation results on both, yet a significantly better mAP on val2017. It also implies that YOLO may have been trained on a different training set than other object detectors, in which case a direct comparison might not be fair. Any clarifications?

According to the COCO website, both releases contain the same images and detection annotations. The only difference is the splits, where COCO 2017 adopts as its default the long-standing convention established on COCO 2014 in early object detection work. The detectron repo explicitly describes the COCO Minival Annotations (5k) as follows:
Therefore, COCO_2017_train = COCO_2014_train + valminusminival and COCO_2017_val = minival, where valminusminival and minival are the conventional custom splits taken from COCO 2014 val.
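Assuming the COCO 2014/2017 annotation files and the detectron minival/valminusminival JSONs are available locally (the file names below are assumptions based on the commonly distributed names), here is a minimal sketch to check this relation via image ids:

```python
import json

def image_ids(path):
    """Return the set of image ids in a COCO-style annotation file."""
    with open(path) as f:
        return {img["id"] for img in json.load(f)["images"]}

train2014 = image_ids("annotations/instances_train2014.json")
valminusminival = image_ids("annotations/instances_valminusminival2014.json")
minival = image_ids("annotations/instances_minival2014.json")
train2017 = image_ids("annotations/instances_train2017.json")
val2017 = image_ids("annotations/instances_val2017.json")

print("val2017 == minival:", val2017 == minival)
print("train2017 == train2014 + valminusminival:",
      train2017 == train2014 | valminusminival)
```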
Now, looking into the annotations in instances_minival2014.json (provided by the original author), the last 5 files are listed as follows:

The above are the same as the last 5 files in val2017:

However, the 5k.txt split used by YOLO for COCO 2014 lists the following, which is apparently different from the conventional custom splits:

It is surprising that YOLO does not follow the same convention, given that Ross Girshick is also one of the authors of the YOLOv1 paper. Any clarifications to address the confusion would be welcome.
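To reproduce the comparison, here is a minimal sketch that extracts the numeric image ids from the paths listed in darknet's 5k.txt and checks them against val2017; the annotation path and the file-name pattern are assumptions about the usual COCO layout.

```python
import json
import re

# Parse darknet's 5k.txt: one image path per line, with the numeric COCO
# image id embedded at the end of each file name.
yolo_5k_ids = set()
with open("5k.txt") as f:
    for line in f:
        m = re.search(r"(\d+)\.jpg", line.strip())
        if m:
            yolo_5k_ids.add(int(m.group(1)))

# Assumed path to the COCO 2017 val annotations.
with open("annotations/instances_val2017.json") as f:
    val2017_ids = {img["id"] for img in json.load(f)["images"]}

overlap = yolo_5k_ids & val2017_ids
print(f"5k.txt: {len(yolo_5k_ids)} ids, val2017: {len(val2017_ids)} ids, "
      f"overlap: {len(overlap)}")
print("identical split:", yolo_5k_ids == val2017_ids)
```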