Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR][Schedule] enhance compute_at and reverse_compute_at primitive to choose possible position #12450

Merged
merged 1 commit into from
Aug 26, 2022

Conversation

yincs-intellif
Copy link
Contributor

@yincs-intellif yincs-intellif commented Aug 16, 2022

Current TIR "compute_at" primitive will compute at it's closest consumers. When a block has multiple producers, whoever
compute at later who is behind. But for some special hardware, we usually hope keep the a certain order whatever it's compute at early or late.
eg: block A and block B are producers of block C. block A compute at block C first and block B compute at block C later. We hope the result is block B->block A->block C under some loop var.

cc @Hzfengsy @junrushao1994

Copy link
Member

@Hzfengsy Hzfengsy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for one nit

@Hzfengsy
Copy link
Member

Do you have ideas about the naming of to_early_stage? cc @junrushao1994 @spectrometerHBH

Copy link
Member

@junrushao junrushao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely a valid concern, while we might want to deliberate on the following two questions:

  • Is it possible to generalize the boolean flag to_early_stage to an integer, and allow users to specify which produecr/consumer it should move to?
  • The naming to_early_stage itself is definitely a bit less informative. Let's consider a better name

Besides, given that we have reverse_compute_at symmetrically, shall we also support a similar flag for it?

Thanks in advance!

@yincs-intellif
Copy link
Contributor Author

Thanks @junrushao1994 I quite agree with your suggestion. Currently I modify it according to scenario we have encountered. Do you have any suggestion about the better name?

@yincs-intellif
Copy link
Contributor Author

yincs-intellif commented Aug 16, 2022

Hi @junrushao1994 , what do you think of my follow suggestion?

  1. we give a integer parameter instead of to_early_stage, which means block number, masked as N.
  2. Default we will get a compute at position, which is after the last producer and before the first consumer, masked as [P, C].
  3. when N in [P,C], we use N for the position of compute at, otherwise use C or report error.

The above rules apply to "reverse_compute_at" at same time

Thanks in advance!

@yincs-intellif
Copy link
Contributor Author

@junrushao Hi, I have changed as you mentioned, could you please help review again?

Thanks in advance!

@junrushao
Copy link
Member

Hey sorry for the late reply! Happy to do another round of review tomorrow!

@yincs-intellif
Copy link
Contributor Author

Hey sorry for the late reply! Happy to do another round of review tomorrow!

@junrushao Hi, I don't receive any feedback information. Did I miss something?

src/tir/schedule/primitive/compute_at.cc Outdated Show resolved Hide resolved
src/tir/schedule/primitive/compute_at.cc Outdated Show resolved Hide resolved
src/tir/schedule/primitive/compute_at.cc Show resolved Hide resolved
@Hzfengsy Hzfengsy self-assigned this Aug 24, 2022
@yincs-intellif yincs-intellif changed the title [TIR][Schedule] enhance compute_at primitive to choose proper position [TIR][Schedule] enhance compute_at and reverse_compute_at primitive to choose possible position Aug 25, 2022
@yincs-intellif yincs-intellif requested review from Hzfengsy and junrushao and removed request for Hzfengsy and junrushao August 25, 2022 03:16
Copy link
Member

@Hzfengsy Hzfengsy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the continuous improvement!

Comment on lines 1308 to 1310
The block index of the loop body subtree blocks
-1 means inserted into the last possible insertion point
-2 means inserted into the first possible insertion point
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The block index of the loop body subtree blocks
-1 means inserted into the last possible insertion point
-2 means inserted into the first possible insertion point
The block index of the loop body subtree blocks:
- `index = -1` means inserted into the last possible insertion point;
- `index = -2` means inserted into the first possible insertion point;
- Otherwise, `index` is a nonnegative number that indicates the insertion point

Comment on lines 302 to 304
* \param index The block index of the loop body subtree blocks
* -1 means inserted into the last possible insertion point
* -2 means inserted into the first possible insertion point
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update the doc accordingly

Comment on lines +165 to +181
// Step 4. Return the possible insertion point according to index
int insert_position;
if (index == -1) {
insert_position = split.first_consumer_position;
} else if (index == -2) {
insert_position = split.last_producer_position + 1;
} else if (index >= 0 && index >= split.last_producer_position + 1 &&
index <= split.first_consumer_position) {
insert_position = index;
} else {
LOG(FATAL) << "Valid index:(-1, -2, [" << split.last_producer_position + 1 << ", "
<< split.first_consumer_position << "]), "
<< "current index=" << index;
throw;
}
return insert_position;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the logic here! It's much clearer now!

/*subtrees=*/AsArray(loop->body),
/*producer_srefs=*/producer_srefs,
/*consumer_srefs=*/consumer_srefs, /*block2realize=*/&block2realize);
/*consumer_srefs=*/consumer_srefs, /*block2realize=*/&block2realize,
/*index*/ index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*index*/ index);
/*index=*/index);

Copy link
Member

@junrushao junrushao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yincs-intellif for the update! The diff is in a pretty good shape now! Let's fix some final nitpicks and then get it merged!

@yincs-intellif
Copy link
Contributor Author

Hi @junrushao @Hzfengsy The final modification has been updated.
Thank you for the high quality review!

@Hzfengsy Hzfengsy merged commit e02f2f9 into apache:main Aug 26, 2022
@Hzfengsy
Copy link
Member

Thanks @yincs-intellif for this PR and continuous update.

xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
…o choose possible position (apache#12450)

Current TIR "compute_at" primitive will compute at it's closest consumers. When a block has multiple producers, whoever compute at later who is behind. But for some special hardware, we usually hope keep the a certain order whatever it's compute at early or late.
eg: block A and block B are producers of block C. block A compute at block C first and block B compute at block C later. We hope the result is block B->block A->block C under some loop var.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants