Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TOPI, Relay] Support roi_align NHWC layout #7463

Merged
merged 10 commits into from
Feb 18, 2021
Merged

Conversation

masahi
Copy link
Member

@masahi masahi commented Feb 17, 2021

I've been working on optimizing MaskRCNN/FasterRCNN, one I thing I found was that Ansor generates better code for NHWC layout (see https://discuss.tvm.apache.org/t/autoscheduler-why-no-winograd-for-nchw-layout/9139). So I'm looking at improving NHWC end to end support / performance.

roi_align only supports NCHW layout, so layout_transform is inserted before/after each roi_align. This layout transform turned out expensive on GPU, so I added NHWC impl to roi_align to remove layout_transform. This cuts Faster RCNN runtime by 8 milli second.

Now, with Ansor NHWC tuning + roi_align improvement in this PR, we are beating pytorch by a large margin:

TVM NHWC(Ansor) + cublas: 0.0589 milli sec
PyTorch: 0.0738 milli sec

please review @kevinthesun @anijain2305 @comaniac @jwfromm @mbrookhart @electriclilies

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@comaniac
Copy link
Contributor

BTW, did you use ConvertLayout pass to convert the model layout? Would you need to add the corresponding layout converter (e.g., register_convert_op_layout or InferCorrectLayout) for RoI align to eliminate the layout_transform?

Also out of curiousity, what GPU did you use the get the performance number?

@masahi
Copy link
Member Author

masahi commented Feb 17, 2021

@comaniac GPU is geforce 1070 ti

Yeah, roi_align NHWC was like half done in #6443 where ConvertLayout support was added but implementation was not. So until now we can get a relay graph with NHWC roi_align op in it, but we cannot compile it. It is only useful for e.g BYOC.

@comaniac
Copy link
Contributor

Ah I see. That makes sense.

Copy link
Contributor

@mbrookhart mbrookhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@masahi masahi merged commit b7e0cfb into apache:main Feb 18, 2021
Lokiiiiii pushed a commit to Lokiiiiii/tvm that referenced this pull request Mar 2, 2021
* begin nhwc roi align

* integrate mode change from upstream

* adding test

* support nhwc shape func

* update strategy

* refactoring test

* refactor test

* refactoring

* fix lint

* update relay op tests
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2021
* begin nhwc roi align

* integrate mode change from upstream

* adding test

* support nhwc shape func

* update strategy

* refactoring test

* refactor test

* refactoring

* fix lint

* update relay op tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants