Add Keypoint R-CNN #69
Conversation
Do you know if the difference is in training, inference, or both? I.e., have you loaded trained C2 Detectron keypoint models and verified that inference results are within epsilon?
@rbgirshick the difference is in training: I have loaded a pre-trained C2 model, and I reproduced exactly the accuracies reported in the C2 model zoo.
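The kind of inference-parity check described here can be sketched as follows (a hypothetical helper, not code from this thread; the real verification would run both models on the same images and compare boxes, scores, and keypoints):

```python
def outputs_match(reference, candidate, eps=1e-4):
    """Check that two flat lists of predictions agree within epsilon.

    Illustrative helper for the 'within epsilon' check mentioned above.
    """
    if len(reference) != len(candidate):
        return False
    return all(abs(r - c) <= eps for r, c in zip(reference, candidate))
```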
Have you fixed the mAP differences?
@Darknesszlx I haven't had the chance to look further into it again, but @wat3rBro used it for his modified backbones and it gave the same mAPs.
@fmassa Thank you for your reply! But sorry, I can't find keypoint-related code in @wat3rBro's repository: https://github.com/wat3rBro/maskrcnn-benchmark
@Darknesszlx sorry, the code is not published.
Can you provide the difference between your implementation and this PR? We are eager for the modification. What did you do to improve the performance?
@ChangErgou It uses mobile models as the backbone and heads, and it also replaces the upsampling part of the keypoint head. I haven't investigated the reason for the accuracy drop with the original model, but I suspect the difference might come from the latter part.
Looking at the implementation of mask_head, I think the BETWEEN_THRESHOLDS in the neg_inds computation should be replaced by BELOW_LOW_THRESHOLD. This code is used to remove the negative RoIs whose IoU with the targets is below the low threshold. But I have limited GPUs to run the validation; I will report the results later. Maybe you can try this modification.
Hi @ChangErgou, thanks for the comment! I'm not sure that line has an effect, because no negative proposals are selected for keypoints, and my understanding is that all boxes that are not person will be removed by the visibility check. But I might be wrong here, and I should definitely try this change out!
After replacing BETWEEN_THRESHOLDS with BELOW_LOW_THRESHOLD, I trained the model on COCO 2017 train and validated on COCO 2017 val. The result is: Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.516
@ChangErgou this is awesome! This might indeed be the issue. Did you change anything else in the implementation? I'll try making this change and launch some trainings. It might be the remaining ingredient to get the PR merged, thanks a lot!
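A minimal sketch of the label-assignment fix being discussed (plain Python rather than the repository's torch code; the sentinel values mirror maskrcnn-benchmark's Matcher convention, where BELOW_LOW_THRESHOLD = -1 and BETWEEN_THRESHOLDS = -2, but treat the exact constants as an assumption):

```python
BELOW_LOW_THRESHOLD = -1  # proposal IoU below the low matcher threshold
BETWEEN_THRESHOLDS = -2   # proposal IoU between the low and high thresholds

def label_proposals(matched_idxs, labels):
    """Zero the label of negative proposals.

    The fix proposed above: mask on BELOW_LOW_THRESHOLD instead of
    BETWEEN_THRESHOLDS, so proposals whose IoU falls below the low
    threshold are marked as background (label 0) rather than keeping
    a stale label.
    """
    return [0 if m == BELOW_LOW_THRESHOLD else l
            for m, l in zip(matched_idxs, labels)]

# matched_idxs: 3 and 0 are real matches; -1 and -2 are unmatched
print(label_proposals([3, -1, -2, 0], [1, 1, 1, 1]))  # → [1, 0, 1, 1]
```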
For information, I've kicked off a few trainings with the fix from @ChangErgou.
@ChangErgou quick question: Did you change anything else in the code? I've done the change you mentioned, and while I got better results than before, it didn't match your keypoints numbers. Here is what I got
Honestly, I know nothing about keypoint detection, so I didn't change anything special. First, is the dataset defined in "DatasetCatalog" correct? I see something like "person_keypoints_train2017_valminusminival_mod" alongside "coco2014". You did something to those datasets and named them with "mod"; I used the original downloaded COCO 2017 dataset. Maybe something is wrong with the dataset on your server. Second, I only have one GPU. I set the learning rate to 0.001 and trained the network for 720k iterations with 2 images per GPU. In short, I did nothing special, because I am new to keypoint detection.
Thanks for the information. So you effectively used a smaller learning rate (it should have been 0.0025 for 1 GPU, I believe).
Yes. The learning rate schedule and training steps are adjusted according to the learning rate. |
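The adjustment described here follows the usual linear scaling rule. A sketch, assuming maskrcnn-benchmark's 8-GPU defaults of lr=0.02, steps=(60000, 80000), max_iter=90000 for 16 images per batch:

```python
def scale_schedule(ims_per_batch, base_lr=0.02, base_ims_per_batch=16,
                   base_steps=(60000, 80000), base_max_iter=90000):
    """Scale the learning rate down, and the schedule up, linearly
    with the effective batch size."""
    factor = base_ims_per_batch // ims_per_batch
    return {
        "lr": base_lr / factor,
        "steps": tuple(s * factor for s in base_steps),
        "max_iter": base_max_iter * factor,
    }

# 1 GPU x 2 images: lr=0.0025, steps=(480000, 640000), max_iter=720000,
# matching the 720k-iteration schedule described in this thread.
print(scale_schedule(ims_per_batch=2))
```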
Hi @fmassa. The previous version, `neg_inds = matched_idxs == Matcher.BETWEEN_THRESHOLDS; labels_per_image[neg_inds] = 0`, leaves proposals that are not matched well (below the low threshold) with a chance of being chosen as positive samples.
Hi @ChangErgou, how did you set SOLVER.STEPS and SOLVER.MAX_ITER when lr=0.001?
I think your box and keypoint results don't match, and the keypoint mAP is even higher than X-101-64x4d-FPN (66.0) in Detectron. I would be very grateful if you could provide more details. When I run with the config (R-50-FPN, lr=0.001, schedule=(480000, 640000), max_iter=720000, num_GPUs=1, batch_size=2, train_data=COCO 2017), I get:
BTW, one difference between this implementation and the Caffe2 one is in the handling of invalid minibatches (batches with <20 total keypoints). Detectron will retry minibatch loading until a valid batch is found [1], whereas in this diff we just zero the keypoint loss (but presumably still backprop all the other losses) [2]. Not sure how much of a difference this makes, though. [1] https://github.com/facebookresearch/Detectron/blob/cbb0236dfdc17790658c146837215d2728e6fadd/detectron/roi_data/loader.py#L130-L135
@jonmorton precisely, this is one difference that I was aware of when I implemented Keypoints. I printed the number of times that we zeroed the keypoint loss but not the other losses (compared to just fetching a new batch), and over 90k iterations we zeroed the loss 2k times, which I'm not sure would account for all the remaining difference (but who knows?). Making the batch be retried is kind of tricky and doesn't quite fit with the PyTorch DataLoader.
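The two strategies can be contrasted in a small sketch (function and variable names here are illustrative, not taken from either codebase):

```python
MIN_KEYPOINT_COUNT = 20  # Detectron's threshold for a "valid" minibatch

def keypoint_loss_or_zero(num_visible_keypoints, raw_loss):
    """This PR's strategy: keep the batch, but zero the keypoint loss
    (the other losses are still backpropagated)."""
    return raw_loss if num_visible_keypoints >= MIN_KEYPOINT_COUNT else 0.0

def next_valid_batch(batch_iter, count_keypoints):
    """Detectron's strategy: retry loading until a valid batch appears."""
    for batch in batch_iter:
        if count_keypoints(batch) >= MIN_KEYPOINT_COUNT:
            return batch
    raise RuntimeError("exhausted loader without finding a valid batch")
```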
EDIT: I realize now that I'm seeing a larger drop rate because I'm using the single-GPU, single-image-per-batch setup. With two images per GPU, the drop rate is around what you saw. It's a bit unfortunate that Detectron had such sensitivity to batch size, though.
I think for this particular case what I might end up doing is to just run for a few more iterations than we had in Detectron (like 95k instead of 90k), and slightly adapt the learning rate schedules to take this into account. This might not exactly reproduce the results of Detectron as the other models do, but it will still be significantly faster and will already add value.
Another small difference between this PR and c2: you have initialized the ConvTranspose2d using kaiming_normal, but in c2 the weights of the transposed conv are initialized from a Gaussian distribution with std=0.1 (unlike the convs): https://github.com/facebookresearch/Detectron/blob/master/detectron/modeling/keypoint_rcnn_heads.py#L62
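The difference can be illustrated with a short PyTorch snippet (the channel sizes are illustrative; the std=0.1 value is taken from the Detectron reference above):

```python
import torch
from torch import nn

# Illustrative shapes: 512 input channels, 17 COCO keypoint classes
deconv = nn.ConvTranspose2d(512, 17, kernel_size=4, stride=2, padding=1)

# c2-style init for the transposed conv: Gaussian with std=0.1
nn.init.normal_(deconv.weight, std=0.1)
nn.init.constant_(deconv.bias, 0)

# ...whereas this PR initialized it like the ordinary convs, roughly:
# nn.init.kaiming_normal_(deconv.weight, mode="fan_out", nonlinearity="relu")
```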
@jonmorton thanks for the messages and for looking into the potential reasons for the performance differences! Yes, maybe filtering out the images with fewer than X keypoints would be a cleaner solution, even though I think it's not exactly equivalent to counting the keypoints visible inside the proposals. The reason it might not be equivalent is that the proposals are sampled dynamically via the RPN, and they might be bad and include few keypoints. But maybe just skipping those images has the same intended behavior, and is much simpler! About the initialization, I unfortunately think that we don't actually use that configuration in our models, because
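The image-level filtering alternative discussed here might look like the following sketch over COCO-style annotations, where keypoints are stored as a flat [x, y, v, ...] list and v > 0 means the keypoint is labeled (the threshold of 10 keypoints per image is an assumption):

```python
def count_visible_keypoints(annotations):
    """Count labeled keypoints across all objects in one image.

    COCO stores keypoints as a flat [x1, y1, v1, x2, y2, v2, ...] list;
    every third entry is the visibility flag.
    """
    return sum(1 for obj in annotations
               for v in obj["keypoints"][2::3] if v > 0)

def has_enough_keypoints(annotations, min_keypoints=10):
    """Skip images whose annotations carry too few labeled keypoints."""
    return count_visible_keypoints(annotations) >= min_keypoints
```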
With the help of @jonmorton, I discovered what was wrong with my Keypoints experiments that made me unable to reproduce Detectron's results: I had a problem with my training data and was using ~15k fewer images during training! Now that I've reproduced the results, I'll clean up the code a bit and (finally!) merge this PR. Thank you all for your help identifying the issue.
For completeness, here are the results I got:
hi @fmassa |
@zimenglan-sysu-512 I was just cleaning up the code a bit, rebasing against master, and running some more trainings to verify that I didn't break anything. It all looks good now, so I'll be merging it early this week.
* [WIP] Keypoints inference on C2 models work
* Training seems to work; still gives slightly worse results
* e2e training works but gives 3 and 5 mAP less
* Add modification proposed by @ChangErgou; improves mAP by 1.5 points, to 0.514 and 0.609
* Keypoints reproduce expected results
* Clean coco.py
* Linter + remove unnecessary code
* Merge criteria for empty bboxes in has_valid_annotation
* Remove trailing print
* Add demo support for keypoints; still needs further cleanups and improvements, like adding fields support for the other ops in Keypoints
* More cleanups and misc improvements
* Fixes after rebase
* Add information to the readme
* Fix md formatting
This is a work-in-progress PR and requires cleanup and a few improvements.
This PR implements Keypoint R-CNN and adds the necessary building blocks for it to work. It includes a `PersonKeypoint` class.
Notable changes:
* Changed `INPUT.MIN_SIZE_TRAIN` to be a tuple instead of a number. I need to adapt all the other configs to this change before merging the PR.
Differences wrt Detectron
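As a sketch, a config using the new tuple form of `INPUT.MIN_SIZE_TRAIN` might look like the following (the specific scale values here are illustrative, not taken from this PR):

```yaml
INPUT:
  # MIN_SIZE_TRAIN now accepts a tuple of scales; one is sampled per image
  MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
  MAX_SIZE_TRAIN: 1333
```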
Here are the statistics:
We are still a few mAP points worse than the Caffe2 Detectron implementation, but we use less memory and are significantly faster.
I'll keep this PR open until I fix the mAP differences.