[MetaSchedule] Fine-Grained Rewrite Unbound Block #10823

zxybazh
Member

@zxybazh zxybazh commented Mar 30, 2022

This PR introduces more fine-grained loop splitting and reordering for the Rewrite-Unbound-Block postprocessor, based on the given CUDA target's `max_threads_per_block` attribute. After this PR, the performance of non-reduction kernels could improve by ~20%. Regression tests are also added.
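
To illustrate the idea of splitting an unbound loop according to `max_threads_per_block`, here is a minimal sketch in plain Python. It is not the code in `rewrite_unbound_block.cc`; the function names `plan_thread_binding` and `covered_indices` are hypothetical, and the limit of 1024 used in the usage note is only an example value. The sketch assumes a single fused loop whose iterations are distributed over a block index and a thread index, with a bounds check for the tail.

```python
from typing import List, Tuple


def plan_thread_binding(extent: int, max_threads_per_block: int) -> Tuple[int, int]:
    """Return (num_blocks, threads_per_block) for a fused loop of `extent` iterations.

    Hypothetical helper: the thread count is capped by the target limit,
    and the block count is the ceiling of extent / threads.
    """
    if extent <= 0 or max_threads_per_block <= 0:
        raise ValueError("extent and max_threads_per_block must be positive")
    threads = min(extent, max_threads_per_block)
    blocks = (extent + threads - 1) // threads  # ceiling division
    return blocks, threads


def covered_indices(extent: int, max_threads_per_block: int) -> List[int]:
    """Simulate the split loop nest and list the original indices it visits."""
    blocks, threads = plan_thread_binding(extent, max_threads_per_block)
    visited = []
    for block_idx in range(blocks):        # would map to a block index
        for thread_idx in range(threads):  # would map to a thread index
            i = block_idx * threads + thread_idx
            if i < extent:                 # guard for the partial last block
                visited.append(i)
    return visited
```

For example, under these assumptions `plan_thread_binding(4096, 1024)` yields 4 blocks of 1024 threads, while an extent smaller than the limit stays in a single block.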

@zxybazh
Member Author

zxybazh commented Mar 30, 2022

CC @junrushao1994

@zxybazh zxybazh changed the title [Meta Schedule] Add Injective Loop Spliting [MetaSchedule] Add Injective Loop Spliting Mar 30, 2022
@zxybazh zxybazh changed the title [MetaSchedule] Add Injective Loop Spliting [MetaSchedule] Fine-Grained Rewrite Unbound Block Mar 30, 2022
Member

@junrushao junrushao left a comment

Overall looks good, and just a few nitpicks :-)

src/meta_schedule/postproc/rewrite_unbound_block.cc (5 review threads, outdated and resolved)
@junrushao
Member

Also I would love to thank @comaniac for help!

@zxybazh
Member Author

zxybazh commented Mar 30, 2022

@junrushao1994 CI passed and issues addressed!

@junrushao
Member

Niiiioce!

@junrushao junrushao merged commit 72c761c into apache:main Mar 31, 2022
junrushao pushed a commit to junrushao/tvm that referenced this pull request Mar 31, 2022
pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Apr 11, 2022