
Add Keypoint R-CNN #69

Merged
merged 14 commits into from
Feb 12, 2019

Conversation

fmassa
Contributor

@fmassa fmassa commented Oct 30, 2018

This is a work-in-progress PR and requires cleanup and a few improvements.

This PR implements Keypoint R-CNN and adds the necessary building blocks for it to work.
It includes a PersonKeypoint class, as well as the

Notable changes:

  • [breaking change] adds support for multi-scale training. This changes INPUT.MIN_SIZE_TRAIN to be a tuple instead of a number. I need to update all the other configs to reflect this change before merging the PR (see the sketch below).
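For illustration, a minimal sketch of what the multi-scale setting means in practice; the tuple values and the helper are assumptions, not the PR's actual defaults:

```python
import random

# Hypothetical values; the PR changes INPUT.MIN_SIZE_TRAIN from a single int
# (e.g. 800) to a tuple of candidate sizes for multi-scale training.
MIN_SIZE_TRAIN = (640, 672, 704, 736, 768, 800)

def pick_train_min_size(sizes=MIN_SIZE_TRAIN):
    # One minimum size is sampled per image at training time.
    return random.choice(sizes)
```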

Differences wrt Detectron

  • In Detectron, we skip iterations that do not have enough keypoints. Here, we still compute the loss for the boxes and only skip the keypoint head.

Here are the statistics:

| backbone | type | lr sched | im / gpu | train mem (GB) | train time (s/iter) | total train time (hr) | inference time (s/im) | box AP | keypoint AP | model id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | Fast | 1x | 2 | 6.7 | 0.5351 | 13.3 | 0.15268 | 50.9 | 58.7 | 6605296 |
| R-50-FPN | Fast | 1x | 2 | 6.7 | 0.5379 | 13.4 | 0.15083 | 50.9 | 59.0 | 6605301 |

We are still a few mAP points worse than the Caffe2 Detectron implementation, but we use less memory and are significantly faster.

I'll keep this PR open until I fix the mAP differences

@facebook-github-bot added the CLA Signed label on Oct 30, 2018
@rbgirshick

Do you know if the difference is in training, inference, or both? I.e., have you loaded trained C2 Detectron keypoint models and verified that inference results are within epsilon?

@fmassa
Contributor Author

fmassa commented Oct 31, 2018

@rbgirshick the difference is in training: I have loaded a pre-trained C2 model, and I reproduced exactly the accuracies reported in the C2 model zoo

@Darknesszlx

Have you fixed the mAP differences?

@fmassa
Contributor Author

fmassa commented Dec 25, 2018

@Darknesszlx I haven't had the chance to look further into it again, but @wat3rBro used it for his modified backbones and it gave the same mAPs

@Darknesszlx

@fmassa Thank you for your reply! But sorry, I can't find keypoint-related code in @wat3rBro 's repository: https://github.com/wat3rBro/maskrcnn-benchmark
Could you please give me a more detailed link?

@wat3rBro
Contributor

@Darknesszlx sorry the code is not published.

@ChangErgou

> @Darknesszlx sorry the code is not published.

Can you share the differences between your implementation and this PR? We are eager to see the modifications. What did you do to improve the performance?

@wat3rBro
Contributor

wat3rBro commented Jan 4, 2019

@ChangErgou It uses mobile models as the backbone and heads, and it also replaces the upsampling part of the keypoint head. I haven't investigated the reason for the accuracy drop with the original model, but I guess the difference might come from the latter part.

@ChangErgou

According to the implementation of mask_head, the negative indices are computed as `neg_inds = matched_idxs == Matcher.BELOW_LOW_THRESHOLD`. However, in the implementation of the keypoint head they are computed as `neg_inds = matched_idxs == Matcher.BETWEEN_THRESHOLDS`.

I think BETWEEN_THRESHOLDS should be replaced by BELOW_LOW_THRESHOLD. This code is meant to remove the negative RoIs whose IoU with the targets is below the low threshold. I have limited GPUs to run the validation, so I will report results later. Maybe you can try this modification.
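For reference, a simplified sketch of the proposed change (not the verbatim PR code); the sentinel values below are the ones the Matcher in maskrcnn-benchmark assigns to unmatched proposals:

```python
# matched_idxs and labels_per_image are torch tensors produced by the Matcher
# and the target preparation step, respectively.
BELOW_LOW_THRESHOLD = -1
BETWEEN_THRESHOLDS = -2

def label_keypoint_proposals(matched_idxs, labels_per_image):
    # Previous keypoint head behaviour: only zero proposals between the thresholds.
    # neg_inds = matched_idxs == BETWEEN_THRESHOLDS
    # Proposed fix (same as mask_head): zero proposals below the low threshold,
    # so poorly matched proposals cannot later be sampled as positives.
    neg_inds = matched_idxs == BELOW_LOW_THRESHOLD
    labels_per_image = labels_per_image.clone()
    labels_per_image[neg_inds] = 0
    return labels_per_image
```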

@fmassa
Contributor Author

fmassa commented Jan 10, 2019

Hi @ChangErgou ,

Thanks for the comment! I'm not sure that that line has an effect, because there are no negative proposals selected for keypoints, and my understanding is that all boxes that are not person will be removed by the visibility check.

But I might be wrong here, and I should definitely try this change out!

@ChangErgou

After replacing BETWEEN_THRESHOLDS with BELOW_LOW_THRESHOLD, I trained the model on COCO 2017 train and validated it on COCO 2017 val. The results are:

```
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.516
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.844
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.561
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.376
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.598
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.636
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.178
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.527
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.608
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.489
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.667
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.717
Loading and preparing results...
DONE (t=3.17s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.665
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.887
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.727
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.615
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.749
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.731
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.930
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.787
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.673
```

@fmassa
Contributor Author

fmassa commented Jan 14, 2019

@ChangErgou this is awesome!

This might indeed be the issue. Did you change anything else in the implementation?

I'll try making this change and launch some trainings. It might be the remaining ingredient to get the PR merged, thanks a lot!

@fmassa
Contributor Author

fmassa commented Jan 16, 2019

For information, I've kicked off a few trainings with the fix from @ChangErgou .
If I reproduce his numbers, I'll clean up the code and (finally) merge it.

@fmassa
Contributor Author

fmassa commented Jan 17, 2019

@ChangErgou quick question: Did you change anything else in the code?

I've made the change you mentioned, and while I got better results than before, they didn't match your keypoint numbers. Here is what I got:

Run 1
Box:      0.514
Keypoint: 0.609

Run 2
Box:      0.517
Keypoint: 0.605

@ChangErgou

Honestly, I know nothing about keypoint detection, so I didn't change anything special. First, is the dataset defined in "DatasetCatalog" an old one? I see something like "person_keypoints_train2017_valminusminival_mod" together with "coco2014"; it looks like you did something to them and named them with "mod". I downloaded and used the original COCO 2017 dataset, so maybe the dataset on your server is wrong. Second, I only have one GPU: I set the learning rate to 0.001 and trained the network for 720k iterations, with 2 images per GPU. In the end I did nothing special, because I am new to keypoint detection.

@fmassa
Contributor Author

fmassa commented Jan 17, 2019

Thanks for the information.

So you effectively used a smaller learning rate (it should have been 0.0025 for 1 GPU, I believe).
Did you also change the learning rate schedule, i.e., the iterations at which the learning rate decreases?

@ChangErgou

Yes. The learning rate schedule and training steps are adjusted according to the learning rate.

@Yishun99

> Hi @ChangErgou ,
>
> Thanks for the comment! I'm not sure that that line has an effect, because there are no negative proposals selected for keypoints, and my understanding is that all boxes that are not person will be removed by the visibility check.
>
> But I might be wrong here, and I should definitely try this change out!

Hi @fmassa

The previous version

```python
neg_inds = matched_idxs == Matcher.BETWEEN_THRESHOLDS
labels_per_image[neg_inds] = 0
```

leaves the proposals that are not matched well (below the low threshold) with a chance of being chosen as positives by the fg_bg_sampler.

@Yishun99

Yishun99 commented Jan 24, 2019

> Yes. The learning rate schedule and training steps are adjusted according to the learning rate.

Hi @ChangErgou ,

How did you set SOLVER.STEPS and SOLVER.MAX_ITER when lr=0.001?
Comparing your results with Detectron's:

R-50-FPN

bbox:  51.6 vs. 52.7
keypoint: 66.5 vs. 64.1

I think your box and keypoint results are inconsistent with each other, and the keypoint mAP is even higher than X-101-64x4d-FPN (66.0) in Detectron. I would be very grateful if you could provide more details.


When I ran with the config (R-50-FPN, lr=0.001, schedule=(480000, 640000), max_iter=720000, num_GPUs=1, batch_size=2, train_data=COCO_2017), I got:

  • COCO_2017_val mAP:

    box:  52.7
    keypoint: 62.7
    
  • COCO_2017_val mAP(keypoint) curve:

    [screenshot: keypoint mAP curve]

    (It's a bit early to drop the learning rate, but judging by the trend it seems hard to reach 66.5 even if we delay the lr drop; see the scaling sketch below.)
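The single-GPU settings above follow the usual linear scaling of the 8-GPU 1x schedule. A small illustrative helper (the 8-GPU baselines are the standard 1x values, assumed here rather than taken from this PR):

```python
def scale_schedule(num_gpus, ims_per_gpu=2, base_lr=0.02, base_max_iter=90000,
                   base_steps=(60000, 80000), base_ims_per_batch=16):
    """Linearly rescale lr/steps/iterations when shrinking the global batch."""
    factor = base_ims_per_batch / float(num_gpus * ims_per_gpu)
    return {
        "SOLVER.BASE_LR": base_lr / factor,                          # 1 GPU -> 0.0025
        "SOLVER.MAX_ITER": int(base_max_iter * factor),              # 1 GPU -> 720000
        "SOLVER.STEPS": tuple(int(s * factor) for s in base_steps),  # -> (480000, 640000)
    }

print(scale_schedule(num_gpus=1))
```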

@jonmorton

BTW, one difference between this implementation and the caffe2 one is in the handling of invalid minibatches (batches with <20 total keypoints). Detectron will retry minibatch loading until a valid batch is found [1], whereas in this diff, we just zero the keypoint loss (but presumably still backprop all the other losses) [2].

Not sure how much of a difference this makes, though.

[1] https://github.com/facebookresearch/Detectron/blob/cbb0236dfdc17790658c146837215d2728e6fadd/detectron/roi_data/loader.py#L130-L135
[2] https://github.com/facebookresearch/maskrcnn-benchmark/pull/69/files#diff-da5a88b627656a466855dbea03e86b75R165
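For illustration, a hedged sketch of the "zero the keypoint loss" behaviour described in [2]; the function name, shapes, and the handling of the empty case are assumptions based on this thread, not the PR's exact code:

```python
import torch
import torch.nn.functional as F

def keypoint_loss_or_zero(keypoint_logits, keypoint_targets, valid_mask):
    """Return the keypoint loss, or a zero that stays connected to the graph
    when no visible keypoints are available in the batch."""
    if valid_mask.sum() == 0:
        # Zero out the keypoint loss; the box/mask/RPN losses still backprop.
        return keypoint_logits.sum() * 0
    return F.cross_entropy(keypoint_logits[valid_mask], keypoint_targets[valid_mask])
```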

@fmassa
Contributor Author

fmassa commented Jan 25, 2019

@jonmorton precisely, this is one difference that I was aware of when I implemented Keypoints.

I printed the number of times that we zeroed the keypoint loss but not the other losses (compared to just fetching a new batch), and over 90k iterations we would zero the loss 2k times, which I'm not sure would account for all the remaining difference (but who knows?).

Making the batch be retried is kind of tricky and doesn't fit well with the PyTorch DataLoader.

@jonmorton

jonmorton commented Jan 25, 2019

EDIT: I realize now that I'm seeing a larger drop rate because I'm using the single-GPU, single-image-per-batch setup. With two images per GPU, the drop rate is around what you saw.

It's a bit unfortunate that Detectron had such a sensitivity to batch size, though.

@fmassa
Contributor Author

fmassa commented Jan 28, 2019

I think for this particular case what I might end up doing is just running for a few more iterations than we had in Detectron (e.g., 95k instead of 90k), and slightly adapting the learning rate schedule to take this into account. This might not reproduce Detectron's results as exactly as the other models do, but it will still be significantly faster and will already add value.

@jonmorton

jonmorton commented Jan 29, 2019

@fmassa Another alternative suggested by @ppwwyyxx when we discussed it was simply filtering out images with fewer than X keypoints before training (X=10 would give roughly the same skip rate as before, right?). This has the advantage of not depending on the batch size.
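A minimal sketch of that pre-filtering idea, assuming COCO-style annotations; the helper name and the threshold are illustrative:

```python
def has_enough_keypoints(annotations, min_keypoints=10):
    """True if the image's annotations contain at least `min_keypoints`
    labeled keypoints (COCO visibility flag v > 0)."""
    total = 0
    for ann in annotations:
        kps = ann.get("keypoints", [])
        # COCO stores keypoints as [x1, y1, v1, x2, y2, v2, ...]
        total += sum(1 for v in kps[2::3] if v > 0)
    return total >= min_keypoints

# Usage sketch: filter image ids once, before building the dataset.
# ids = [i for i in ids if has_enough_keypoints(coco.loadAnns(coco.getAnnIds(imgIds=i)))]
```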

@jonmorton

jonmorton commented Jan 29, 2019

Another small difference between this PR and C2: you have initialized the ConvTranspose2d using kaiming_normal. But in C2 the weights of the transpose conv are initialized using a std=0.1 Gaussian distribution (unlike the convs): https://github.com/facebookresearch/Detectron/blob/master/detectron/modeling/keypoint_rcnn_heads.py#L62
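To make the difference concrete, a small PyTorch sketch of the two initializations being compared; the layer shapes are placeholders, not the PR's actual head:

```python
import torch.nn as nn

# Placeholder shapes (e.g. 17 COCO keypoint classes); only the init differs.
deconv = nn.ConvTranspose2d(512, 17, kernel_size=4, stride=2, padding=1)

# This PR at the time of the comment: Kaiming/MSRA-style init.
nn.init.kaiming_normal_(deconv.weight, mode="fan_out", nonlinearity="relu")
nn.init.constant_(deconv.bias, 0)

# Caffe2 Detectron (when USE_DECONV is enabled): Gaussian with std=0.1.
nn.init.normal_(deconv.weight, std=0.1)
nn.init.constant_(deconv.bias, 0)
```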

@fmassa
Contributor Author

fmassa commented Jan 30, 2019

@jonmorton thanks for the messages and for looking into the potential reasons for the performance differences!

Yes, maybe filtering out the images with fewer than X keypoints would be a cleaner solution, even though I think it is not exactly equivalent to counting the keypoints visible inside the proposals: the proposals are sampled dynamically by the RPN, and they might be bad and contain few keypoints. But maybe just skipping those images has the same intended behavior, and it is much simpler!

About the initialization, I unfortunately think that we don't actually use that configuration in our models, because USE_DECONV is false in Detectron; see the config and the default config.
We instead add the kps_score_lowres, which uses MSRAFill, and I believe it is equivalent to the kaiming_normal_ that we have here (but I might be wrong).

@fmassa
Contributor Author

fmassa commented Feb 6, 2019

With the help of @jonmorton , I discovered what was wrong with my keypoint experiments that made me unable to reproduce Detectron's results: I had a problem with my training data and was using ~15k fewer images during training!

Now that I've reproduced the results I'll clean the code a bit and (finally!) merge this PR.

Thank you all for your help in identifying the issue.

@fmassa
Contributor Author

fmassa commented Feb 6, 2019

For completeness, here are the results I got:

boxes

```
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.538
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.829
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.584
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.367
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.616
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.696
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.184
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.543
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.620
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.475
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.684
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.763
```

keypoints

```
Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.643
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.862
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.698
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.593
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.724
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 20 ] = 0.707
Average Recall     (AR) @[ IoU=0.50      | area=   all | maxDets= 20 ] = 0.905
Average Recall     (AR) @[ IoU=0.75      | area=   all | maxDets= 20 ] = 0.758
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.658
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.777
```

Still need further cleanups and improvements, like adding fields support for the other ops in Keypoints.

@zimenglan-sysu-512
Contributor

hi @fmassa
when are you going to merge this PR into master?

@fmassa
Contributor Author

fmassa commented Feb 11, 2019

@zimenglan-sysu-512 I was just cleaning up the code a bit, rebasing against master, and running some more trainings to verify that I didn't break anything.

It all looks good now, so I'll be merging it early this week.

@fmassa fmassa merged commit e0a525a into master Feb 12, 2019
@fmassa fmassa deleted the keypoints branch February 12, 2019 14:48
@fmassa fmassa changed the title [WIP] Add Keypoint R-CNN Add Keypoint R-CNN Feb 12, 2019
Lyears pushed a commit to Lyears/maskrcnn-benchmark that referenced this pull request Jun 28, 2020
* [WIP] Keypoints inference on C2 models work

* Training seems to work

Still gives slightly worse results

* e2e training works but gives 3 and 5 mAP less

* Add modification proposed by @ChangErgou

Improves mAP by 1.5 points, to 0.514 and 0.609

* Keypoints reproduce expected results

* Clean coco.py

* Linter + remove unnecessary code

* Merge criteria for empty bboxes in has_valid_annotation

* Remove trailing print

* Add demo support for keypoints

Still need further cleanups and improvements, like adding fields support for the other ops in Keypoints

* More cleanups and misc improvements

* Fixes after rebase

* Add information to the readme

* Fix md formatting