
[MetaSchedule] Upstream the leftover changes #10689

Merged

Conversation

junrushao (Member)

This PR upstreams the leftover changes from the MetaSchedule upstreaming process, for performance and stability.

junrushao (Member, Author)

CC: @spectrometerHBH @zxybazh

junrushao force-pushed the feature/2022-03-19/meta-schedule-perf-fix branch 5 times, most recently from a1191cd to 053a2e7 on March 20, 2022 at 17:17
junrushao force-pushed the feature/2022-03-19/meta-schedule-perf-fix branch from 053a2e7 to edce105 on March 20, 2022 at 17:22
comaniac (Contributor) left a comment


LGTM. Just a small nit.

@@ -93,6 +91,7 @@ def main():
cache_line_bytes=64,
max_shared_memory_per_block=int(ARGS.target.attrs["max_shared_memory_per_block"]),
max_threads_per_block=int(ARGS.target.attrs["max_threads_per_block"]),
max_local_memory_per_block=12345678,
comaniac (Contributor)


nit: Have a comment saying this is useless but just a placeholder.
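For instance, the line could carry a short inline note along these lines (the wording here is only a suggestion, not the committed text):

    max_local_memory_per_block=12345678,  # arbitrary placeholder; the value is not used on this target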

comaniac merged commit cced5bc into apache:main on Mar 22, 2022
comaniac (Contributor)

Thanks @junrushao1994

while (true) {
  std::vector<int64_t> result = SamplePerfectTile(rand_state, extent, n_splits);
  if (result.back() <= max_innermost_factor) {
    return result;
  }
}
masahi (Member) commented on Mar 28, 2022


I found an interesting issue with this change. After this PR, my VNNI tuning test hangs at https://gist.github.com/masahi/3579be9b2b9506106270eb2217746e74#file-vnni_dense_tir-py-L164, i.e. the second call in

def schedule_rule_batch_matmul_vnni(sch: tir.Schedule, bmm_block):
    sch_copy = sch.copy()
    schedule_batch_matmul(bmm_block, None, True, sch, layout_trans_compute_root=False)
    schedule_batch_matmul(bmm_block, None, True, sch_copy, layout_trans_compute_root=True)  # <- Hangs at sample_perfect_tile
    return [sch, sch_copy]

If I remove the second schedule_batch_matmul, everything is ok. Apparently, during the second call to sample_perfect_tile, SamplePerfectTile above keeps returning 1024 when max_innermost_factor is 128 and extent is 1024. Does this ring any bell?

junrushao (Member, Author)


Oh oops! So this PR basically loops until the innermost factor satisfies a certain constraint, with the assumption that there is always a valid case where the factor is 1, since 1 <= max_innermost_factor always holds.

To make sure I understand this correctly: in your particular case, extent = 1024, max_innermost_factor = 128, and what is n_splits? And it causes this method to hang forever? If so, @zxybazh or I could probably reproduce this bug and get it fixed :-)
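For illustration, here is a minimal Python sketch of the retry pattern quoted above (not TVM's actual implementation; the helper names are made up). The outer loop terminates only if successive draws actually differ, i.e. only if the random state advances between calls:

import random

def sample_perfect_tile(rng, extent, n_splits):
    # Hypothetical stand-in: draw n_splits factors whose product equals extent.
    factors = []
    remaining = extent
    for _ in range(n_splits - 1):
        divisors = [d for d in range(1, remaining + 1) if remaining % d == 0]
        f = rng.choice(divisors)
        factors.append(f)
        remaining //= f
    factors.append(remaining)
    return factors

def sample_with_innermost_limit(rng, extent, n_splits, max_innermost_factor):
    # Mirrors the C++ retry loop quoted above: resample until the innermost
    # factor fits. If the RNG never advances and the first draw puts 1024 in
    # the innermost slot, this spins forever.
    while True:
        result = sample_perfect_tile(rng, extent, n_splits)
        if result[-1] <= max_innermost_factor:
            return result

print(sample_with_innermost_limit(random.Random(0), 1024, 2, 128))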

masahi (Member)


Yes, extent = 1024, max_innermost_factor = 128 and n_splits=2.

It hangs on the second call to schedule_batch_matmul, where SamplePerfectTile keeps returning 1024. The following alternative definition works fine:

def schedule_rule_batch_matmul_vnni(sch: tir.Schedule, bmm_block):
    schedule_batch_matmul(bmm_block, None, True, sch, layout_trans_compute_root=False)
    return [sch]

I think the issue has to be related to the use of rand_state, since this is the only stateful input to SamplePerfectTile(rand_state, extent, n_splits). Note that I'm applying the same schedule on two copies of sch. Maybe when I make a copy of the schedule via sch_copy = sch.copy(), the RNG engine is not copied?
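A minimal repro sketch of this hypothesis (the workload, the seed, and the extents below are illustrative, not the original test):

from tvm import tir
from tvm.script import tir as T

@T.prim_func
def elemwise(a: T.handle, b: T.handle) -> None:
    A = T.match_buffer(a, (1024,), "float32")
    B = T.match_buffer(b, (1024,), "float32")
    for i in T.serial(0, 1024):
        with T.block("B"):
            vi = T.axis.S(1024, i)
            B[vi] = A[vi] + 1.0

sch = tir.Schedule(elemwise, seed=42)
sch_copy = sch.copy()

(i,) = sch.get_loops(sch.get_block("B"))
(i_copy,) = sch_copy.get_loops(sch_copy.get_block("B"))

tiles = sch.sample_perfect_tile(i, n=2, max_innermost_factor=128)
tiles_copy = sch_copy.sample_perfect_tile(i_copy, n=2, max_innermost_factor=128)

# If copy() forks the random state, the two schedules sample independently;
# if the copy keeps a default/stale state, its draws follow a fixed sequence.
print([sch.get(v) for v in tiles])
print([sch_copy.get(v) for v in tiles_copy])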

junrushao (Member, Author) commented on Mar 28, 2022


😱 seems not!!!!!

https://github.com/junrushao1994/tvm/blob/673355b2a423dfbb8f5e311cfd74503ae91fbba9/src/tir/schedule/concrete_schedule.cc#L189

We didn't set the new random state by:

n->rand_state_ = self->ForkSeed(); // (also we need to change the Copy method to a non-const one)

and thus the state is likely always some default value... I'm very surprised that we didn't find this out in previous performance alignment.

Would you mind sending a quick PR fixing this dangerous behavior? Thanks a lot!

masahi (Member)


I did that in commit f9c655a, but it didn't fix the hang problem. I'll look into it more this week.

masahi (Member)


Fixed in #10806

Member


Hi Masa, thanks for the quick fix. Just want to confirm with you: is the hang problem fixed after #10806?

masahi (Member)


Yes, it was fixed in #10806.

pfk-beta pushed a commit to pfk-beta/tvm that referenced this pull request Apr 11, 2022
* [MetaSchedule] Upstream the leftover changes

* Update driver_api.cc