[MetaSchedule] Tuning API cleanup & ergonomics #12895

Conversation

@junrushao (Member) commented Sep 25, 2022

This PR refactors the tuning APIs to improve developer ergonomics and to enable new use cases.

## Introduction

**📅 Original behavior.** The original monolithic tuning API assumes that tuning is an end-to-end process that transforms an IRModule into a runtime Module. For example, the API below is designed for Relay end-to-end tuning:

```python
from tvm import meta_schedule as ms

ms.tune_relay(
  mod: IRModule,              # The Relay program
  target: Union[str, Target], # The compilation target
  config: TuneConfig,         # Configuration, e.g. number of trials
  work_dir: str,              # Working directory for tuning logs
  ...
) -> runtime.Module: ...
```

**🤔 The challenge.** While striving to be "the" API that controls end-to-end tuning, this design ignores the fact that many users want to compile a neural network without going through the tuning process, and the fact that MetaSchedule is capable of doing so when supplied with a pre-tuned database.

**🆕 Our refactoring.** Therefore, this PR caters to those concrete needs by refactoring the monolithic API into 2 or 3 stages, depending on how it is used. Taking `tune_relay` as an example, it is now refactored into 2 separate APIs: the first performs the slower tuning step and returns a database, while the second takes a pre-tuned database for fast Relay compilation.

```python
ms.relay_integration.tune_relay(
    mod: IRModule,
    params: Dict[str, NDArray],
    target: Union[str, Target],
    work_dir: str,
    max_trials_global: int,
    ...
) -> Database: ...

ms.relay_integration.compile_relay(
    database: Database,
    mod: IRModule,
    target: Union[Target, str],
    params: Optional[Dict[str, NDArray]],
    ...
) -> runtime.Module: ...
```
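
The second API enables the "compile without tuning" workflow: if a pre-tuned database already exists, users can skip the expensive tuning step entirely. Below is a minimal sketch of that path, assuming the database was previously serialized as the default JSON files under `work_dir`; the file names and the `mod`/`params` variables here are placeholders, not part of this PR.

```python
from tvm import meta_schedule as ms

# Load a pre-tuned database instead of running tuning again.
database = ms.database.JSONDatabase(
    path_workload="work_dir/database_workload.json",
    path_tuning_record="work_dir/database_tuning_record.json",
)
# Fast Relay compilation using only the pre-tuned records.
lib = ms.relay_integration.compile_relay(
    database=database,
    mod=mod,          # the Relay IRModule to compile (placeholder)
    target="llvm -num-cores 16",
    params=params,
)
```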

## Upgrade guide

### If you are using `ms.tune_relay`

The original monolithic API is used as:

```python
lib = ms.tune_relay(
    mod=mod,
    target=ARGS.target,
    config=ms.TuneConfig(
        strategy="evolutionary",
        num_trials_per_iter=64,
        max_trials_per_task=ARGS.num_trials,
        max_trials_global=ARGS.num_trials,
        adaptive_training=ARGS.adaptive_training,
    ),
    runner=runner,
    work_dir=ARGS.work_dir,
    params=params,
    backend=ARGS.backend,
)
```

The new design is very similar, with 2 notable differences:

- The monolithic API is split into 2 separate APIs
- It no longer requires a second-level configuration object, i.e. `TuneConfig`

As a concrete example, the call above should now be written as:

```python
database = ms.relay_integration.tune_relay(
    mod=mod,
    target=ARGS.target,
    work_dir=ARGS.work_dir,
    max_trials_global=ARGS.num_trials,
    num_trials_per_iter=64,
    params=params,
    runner=runner,
    strategy="evolutionary",
)
lib = ms.relay_integration.compile_relay(
    database=database,
    mod=mod,
    target=ARGS.target,
    params=params,
    backend=ARGS.backend,
)
```

Please refer to the changes in `python/tvm/meta_schedule/testing/tune_relay.py` for a practical example.

### If you are using `ms.tune_extracted_tasks`

In a classic use case, experienced TVM users may want to extract tasks from Relay first and filter them themselves before sending them to the tuning system. This usually involves 3 APIs:

```python
from tvm import meta_schedule as ms

# API 1. Task extraction and filtering
extracted_tasks: List[ExtractedTask] = ms.extract_task_from_relay(relay_mod, target, params)
extracted_tasks = [task for task in extracted_tasks if "conv2d" in task.task_name]

# API 2. Tuning
database = ms.tune_extracted_tasks(
    extracted_tasks,
    ms.TuneConfig(...),
    work_dir=work_dir,
    num_threads=32,
    ...,
)

# API 3. Relay compilation
with database, tvm.transform.PassContext(
    opt_level=3,
    config={"relay.backend.use_meta_schedule": True},
):
    lib = relay.build(relay_mod, target=target, params=params)
```

To provide more fine-grained control over the tuning system, we add an extra API that allows customizing the conversion from `ms.ExtractedTask` to `ms.TuneContext`. More specifically, after this refactoring, the APIs become:

```python
# API 1. Task extraction and filtering
extracted_tasks: List[ExtractedTask] = ms.relay_integration.extract_tasks(relay_mod, target, params)
extracted_tasks = [task for task in extracted_tasks if "conv2d" in task.task_name]

# API 2. Convert `ms.ExtractedTask` to `ms.TuneContext`
tasks: List[TuneContext]
task_weights: List[float]
tasks, task_weights = ms.relay_integration.extracted_tasks_to_tune_contexts(
    extracted_tasks=extracted_tasks,
    work_dir=work_dir,
    space="post-order-apply",  # gives the flexibility to customize the per-task search space
    num_threads=32,
)

# API 3. Tuning
database = ms.tune.tune_tasks(
    tasks=tasks,
    task_weights=task_weights,
    work_dir=work_dir,
    max_trials_global=20000,
)

# API 4. Relay compilation
lib = ms.relay_integration.compile_relay(
    database=database,
    mod=relay_mod,
    target=ARGS.target,
    params=params,
    backend=ARGS.backend,
)
```

Please refer to the changes in `tests/python/integration/test_meta_schedule_auto_tensorize.py` for a practical example.

### Misc changes

- `blocks` in `tune_tir` is moved to `ms.space_generator.PostOrderApply(f_block_filter=...)`
- `adaptive_training` in `tune_{relay}/{tir}/{extracted_tasks}` is moved to `ms.cost_model.XGBModel(adaptive_training=...)`
- `sch_rules`/`postprocs`/`mutators` in `tune_{relay}/{tir}/{extracted_tasks}` are moved to `ms.space_generator.PostOrderApply(...)`; when unspecified, a target-specific default is used (see the sketch below)
- `default_config.py` is broken down into `tvm::meta_schedule::{ScheduleRule}/{Mutator}/{Postproc}::Default{LLVM}/{CPU}/{CUDA}`.
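
Putting the relocated knobs together, here is a minimal sketch of where these options now live when tuning a single TIR workload. The workload `mod`, target string, trial counts, and work directory are placeholders; `"from-target"` requests the target-specific defaults.

```python
from tvm import meta_schedule as ms

# Search-space knobs that used to be `blocks=` / `sch_rules=` / `postprocs=` /
# `mutators=` arguments of the tuning entry points:
space = ms.space_generator.PostOrderApply(
    f_block_filter=None,        # was `blocks=` in tune_tir
    sch_rules="from-target",    # target-specific default schedule rules
    postprocs="from-target",
    mutator_probs="from-target",
)

# Cost-model knob that used to be `adaptive_training=`:
cost_model = ms.cost_model.XGBModel(adaptive_training=True)

database = ms.tir_integration.tune_tir(
    mod=mod,                    # a TIR workload to tune (placeholder)
    target="llvm -num-cores 16",
    work_dir="./tune_tmp",
    max_trials_global=64,
    space=space,
    cost_model=cost_model,
)
sch = ms.tir_integration.compile_tir(database, mod, "llvm -num-cores 16")
```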

## Performance Numbers

The PR is tested end-to-end on a subset of representative models to check for potential performance regressions.

Performance comparison on V100 (AWS P3.2xlarge):

| Model | MetaSchedule @ main (ms) | This PR (ms) | Difference |
|---|---|---|---|
| bert_base | 3.185650996 | 3.222358502 | -1.14% |
| resnet_50 | 1.588203344 | 1.586299171 | 0.12% |
| mobilenet_v2 | 0.4574400258 | 0.4596171817 | -0.47% |
| resnet_18 | 0.6853301584 | 0.6812821976 | 0.59% |
| mobilenet_v3 | 0.7230763281 | 0.7010596015 | 3.14% |
| wide_resnet_50 | 2.864763701 | 2.797114016 | 2.42% |
| densenet_121 | 2.330949968 | 2.332683173 | -0.07% |
| vgg_16 | 2.780654826 | 2.807344907 | -0.95% |

Performance comparison on Intel Skylake (AWS C5.9xlarge):

| Model | MetaSchedule @ main (ms) | This PR (ms) | Difference |
|---|---|---|---|
| bert_base | 12.15242064 | 12.37192344 | -1.77% |
| resnet_50 | 5.225000453 | 5.320676231 | -1.80% |
| mobilenet_v2 | 0.7461500253 | 0.753737067 | -1.01% |
| resnet_18 | 2.103578434 | 2.019274095 | 4.17% |
| mobilenet_v3 | 1.14312758 | 1.15862842 | -1.34% |
| wide_resnet_50 | 11.73288837 | 11.84455867 | -0.94% |
| densenet_121 | 14.90702895 | 15.41747371 | -3.31% |
| vgg_16 | 15.47650269 | 15.42590106 | 0.33% |

In summary, no performance regression is observed after this refactoring.

@junrushao junrushao marked this pull request as ready for review September 25, 2022 05:49
@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch 16 times, most recently from ad84e8d to c4afc4d on September 27, 2022 20:48
@zxybazh (Member) commented Sep 27, 2022

I like this change of decoupling compilation and tuning; the changes to the usage of the default classes also make sense. Please let me know when the PR is ready for review.

@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch 7 times, most recently from be1b58b to 9c28959 on September 28, 2022 04:20
@junrushao (Member Author)

@tqchen @Hzfengsy @spectrometerHBH @zxybazh @vinx13 @yelite The PR is ready for review. Please take a look!

@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch 3 times, most recently from fb0e91e to ea280df on September 29, 2022 00:26
@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch 2 times, most recently from 6943a88 to 83871fe on October 6, 2022 03:25
@junrushao (Member Author)

Hey @masahi, I added an executor parameter to extract_tasks and compile_relay, which controls the default value of relay.FuseOps.link_params in the pass configuration. It's quite confusing to me that executor is lifted out of the pass config and somehow controls the compilation process in a half-functioning way (it only works for GraphExecutor), and I am not sure if I'm using it correctly, so please feel free to suggest what the best way is :-)

On the other hand, as a high-level API, I would prefer not to tweak tune_relay by adding more parameters to it, given that we wanted to give a cleaner interface to introductory-level users. Instead, advanced users can always use extract_tasks + tune_tasks + compile_relay to get fine-grained control over the tuning process.

@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch from 83871fe to 612979f on October 6, 2022 06:06
@masahi (Member) commented Oct 6, 2022

> On the other hand, as a high-level API, I would prefer not to tweak tune_relay by adding more parameters to it, given that we wanted to give a cleaner interface to introductory-level users. Instead, advanced users can always use extract_tasks + tune_tasks + compile_relay to get fine-grained control over the tuning process.

Yes, I agree with this. Part of the reason I didn't want to change the extract_task API before was that the vast majority of users don't need to care about executor details. For Hexagon users, we can add a wrapper API in contrib/hexagon/meta_schedule to simplify the usage. We already require using a Hexagon-specific builder / runner, so the wrapper API can hide such details too.
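
For illustration, such a Hexagon wrapper might look like the sketch below; the name `tune_hexagon_relay` and its parameter list are hypothetical, not an existing API, but it reuses the `get_hexagon_local_builder` / `get_hexagon_rpc_runner` helpers that already appear in this PR's Hexagon tests.

```python
# Hypothetical wrapper sketch; `tune_hexagon_relay` is not an existing API.
from tvm import meta_schedule as ms
from tvm.contrib.hexagon.meta_schedule import (
    get_hexagon_local_builder,
    get_hexagon_rpc_runner,
)

def tune_hexagon_relay(mod, params, target, hexagon_launcher, work_dir, max_trials_global):
    """Tune a Relay module for Hexagon, hiding the Hexagon-specific builder/runner."""
    return ms.relay_integration.tune_relay(
        mod=mod,
        params=params,
        target=target,
        work_dir=work_dir,
        max_trials_global=max_trials_global,
        builder=get_hexagon_local_builder(),
        runner=get_hexagon_rpc_runner(hexagon_launcher, number=10),
    )
```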

@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch from 612979f to d99ac85 on October 6, 2022 14:28
@junrushao junrushao changed the title from "[MetaSchedule] UX: Tuning API cleanup & developer ergonomics" to "[MetaSchedule] Tuning API cleanup & ergonomics" on Oct 6, 2022
@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch 2 times, most recently from 105fd0c to 7b71a2a on October 6, 2022 20:58
@junrushao (Member Author)

@masahi I updated the PR with my latest understanding of the Hexagon pipeline. Would you mind taking another look? Thanks a lot!

@masahi (Member) commented Oct 7, 2022

@junrushao I made one comment, but otherwise the Hexagon change looks good to me. It didn't occur to me before that we can do mod = mod.with_attr(...) from the user script to avoid threading the executor through task extraction, tune_relay, etc.
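
For reference, the `with_attr` approach mentioned above can look roughly like this sketch; the executor name and options are placeholders showing the general pattern of attaching the executor to the module instead of threading it through the tuning APIs.

```python
from tvm import relay

# Attach the executor (and its options) to the IRModule itself, so that task
# extraction, tune_relay, and compile_relay need no extra `executor` argument.
executor = relay.backend.Executor("graph", {"link-params": True})
relay_mod = relay_mod.with_attr("executor", executor)
```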

@junrushao junrushao force-pushed the feature/2022-09-19/tune-api-refactoring branch from 7b71a2a to b3a0191 on October 7, 2022 02:46
@spectrometerHBH spectrometerHBH merged commit 6780c9f into apache:main Oct 7, 2022


@pytest.mark.skip("Requires cascadelake")
def test_vnni_schedule_fn_tune():
Member

This test is broken with the following error:

>               space=ms.space_generator.PostOrderApply(
                    f_block_filter=None,
                    sch_rules=None,
                    postprocs=[],
                    mutator_probs=None,
                ),
            )

test_meta_schedule_vnni_integration.py:213: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../python/tvm/meta_schedule/space_generator/post_order_apply.py:53: in __init__
    sch_rules, postprocs, mutator_probs = _normalize_rules(sch_rules, postprocs, mutator_probs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

sch_rules = None, postprocs = [], mutator_probs = None

    def _normalize_rules(
        sch_rules: ScheduleRuleType,
        postprocs: PostprocType,
        mutator_probs: MutatorProbType,
    ) -> Tuple[
        Optional[List["ScheduleRule"]],
        Optional[List["Postproc"]],
        Optional[Dict["Mutator", float]],
    ]:
        # pylint: disable=import-outside-toplevel
        from ..mutator import Mutator
        from ..postproc import Postproc
        from ..schedule_rule import ScheduleRule
    
        # pylint: enable=import-outside-toplevel
>       assert sch_rules is not None
E       AssertionError

Member Author

will send a fix

Member Author

this should work:

               space=ms.space_generator.PostOrderApply(
                    f_block_filter=None,
                    sch_rules="from-target",
                    postprocs=[],
                    mutator_probs="from-target",
                ),
            )

config = ms.TuneConfig(
strategy="replay_trace",
target = get_hexagon_target("v68")
database = ms.tir_integration.tune_tir(
Member

Two uses of tune_tir in this file have incorrect signatures. I got the following errors:

E               TypeError: tune_tir() got an unexpected keyword argument 'sch_rules' 
E             Check failed: (!checked_type.defined()) is false: Expected Map[meta_schedule.Mutator, FloatImm], but got Array

Member Author

this should work:

            target = get_hexagon_target("v68")
            database = ms.tir_integration.tune_tir(
                mod=workload,
                target=target,
                max_trials_global=8,
                num_trials_per_iter=8,
                max_trials_per_task=8,
                work_dir=work_dir,
                space=ms.space_generator.PostOrderApply(
                    f_block_filter=None,
                    sch_rules=sch_rules,
                    postprocs=postprocs,
                    mutator_probs={},
                ),
                builder=get_hexagon_local_builder(),
                runner=get_hexagon_rpc_runner(hexagon_launcher, number=10),
            )
            sch = ms.tir_integration.compile_tir(database, workload, target)

Comment on lines +192 to +196
def schedule_rule_dense_vnni(sch: Schedule, dense_block: BlockRV):
_schedule_dense(m=None, do_tune=True)(sch, dense_block)
return [sch]

register_func("meta_schedule.dense_vnni", schedule_rule_dense_vnni)
Contributor

@junrushao @masahi or others, may I ask what the difference is between using the TE annotation as described (e.g. attrs={"schedule_rule": "meta_schedule.dense_vnni"}) together with a corresponding packed func defining the schedule to use, as opposed to just generating the space via

space=ms.space_generator.ScheduleFn(
     _schedule_dense,
    ...
),

?

Is it that in this test case we allow auto scheduling for all ops but apply special manual scheduling for certain ops (dense in this case), whereas if we use the ScheduleFn technique for generating a search space we do not allow other operators to be auto scheduled? Thanks!

Member

I think ScheduleFnDatabase is for a completely manual schedule, while the register_func way allows autotvm-style template-based tuning. At least that's what I wanted to demonstrate before this PR, or before ScheduleFnDatabase was introduced.

Contributor

In this case I'm not referring to ScheduleFnDatabase as used in test_vnni_schedule_fn_database. I'm referring to what is done in the test test_vnni_schedule_fn_tune, which uses the TE compute schedule_rule attr annotation along with a global packed function for the schedule that matches the annotation value meta_schedule.dense_vnni. I'm wondering if there is any difference or advantage between using the TE attr annotation and packed func as opposed to specifying an alternate search space with ScheduleFn.

Member

Hi Chris, the ScheduleFn space generator is designed to schedule all blocks in the whole Schedule; it is not block-specific. The annotation-based packed-func scheduling only works with the PostOrderApply space generator, which essentially applies the annotated rule to that specific block and applies the default schedule rules (or the schedule rules given in the user interface) to the other, non-annotated blocks.

Therefore, creating a ScheduleFn takes more effort, while using the annotation-based scheduling is easier because you don't need to worry about scheduling the other blocks.
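
To make the contrast concrete, here is a small sketch of how the two space generators would be constructed, assuming `schedule_all_blocks` is a user-written function that schedules every block and `schedule_rule_dense_vnni` is the packed func from this test:

```python
from tvm import register_func
from tvm import meta_schedule as ms

# Option 1: ScheduleFn — the user function must schedule the whole workload.
space_manual = ms.space_generator.ScheduleFn(schedule_all_blocks)

# Option 2: PostOrderApply — only the block annotated with
# `schedule_rule="meta_schedule.dense_vnni"` uses the registered packed func;
# all other blocks fall back to the default (target-specific) schedule rules.
register_func("meta_schedule.dense_vnni", schedule_rule_dense_vnni)
space_hybrid = ms.space_generator.PostOrderApply(
    f_block_filter=None,
    sch_rules="from-target",
    postprocs="from-target",
    mutator_probs="from-target",
)
```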

Contributor

Ahh, okay I see, thanks for the discussion @zxybazh @masahi, this is helpful.
