
[Ansor][AutoTVM v2.0] Phase 2: Update heavy operations with parallel_for #6348

Merged: 6 commits merged from the parallel_for_update branch into apache:master on Aug 31, 2020

Conversation

@jcf94 (Contributor) commented on Aug 27, 2020

For the full upstream plan, see Ansor RFC.

This PR contains some small fixes:

  • Use parallel_for to speed up some heavy operations.
    Currently added for:
    • ComputeDAG::InferBound
    • GetPerStoreFeaturesFromStates
    • SketchPolicyNode::SampleInitPopulation (some rules use SketchPolicyNode->rand_gen, so this part still has to be protected by a std::mutex; see the sketch after this list)
    • Open question: do we need to add parallel_for to EvolutionarySearch after [Ansor][AutoTVM v2.0] Phase 2: Evolutionary Search #6310 has been merged?
  • Add a SketchPolicy + XGBModel unit test for a complete workflow test
  • Add PruneInvalidState to the RandomModel branch
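
As a rough illustration of the parallel_for-plus-mutex pattern described in the first bullet, here is a minimal, self-contained sketch (not the PR's actual code; plain std::thread stands in for TVM's parallel_for helper, and all names are hypothetical):

    // Hypothetical sketch: guard the shared random generator with a mutex while
    // running the per-state work in parallel.  std::thread emulates parallel_for.
    #include <mutex>
    #include <random>
    #include <thread>
    #include <vector>

    int main() {
      const int n = 64;
      std::vector<int> results(n, 0);  // one pre-allocated slot per task
      std::mt19937 rand_gen(0);        // shared state, analogous to SketchPolicyNode->rand_gen
      std::mutex rng_mutex;

      auto worker = [&](int i) {
        int r;
        {
          // Only the access to the shared random generator is serialized.
          std::unique_lock<std::mutex> lock(rng_mutex);
          r = static_cast<int>(rand_gen() % 100);
        }
        results[i] = r;  // each task writes its own slot, so no lock is needed here
      };

      // Emulates parallel_for(0, n, worker) with plain threads.
      std::vector<std::thread> threads;
      for (int i = 0; i < n; ++i) threads.emplace_back(worker, i);
      for (auto& t : threads) t.join();
      return 0;
    }

The point is that only the shared random generator needs the lock; each task writes to its own pre-allocated slot.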

cc @merrymercy @comaniac

@jcf94 changed the title from "[Ansor][AutoTVM v2.0] Phase 2: Update heavy operation with parallel_for" to "[Ansor][AutoTVM v2.0] Phase 2: Update heavy operations with parallel_for" on Aug 27, 2020
@comaniac (Contributor) left a comment

LGTM. Just a small question.

src/auto_scheduler/search_policy/sketch_policy.cc (review thread: outdated, resolved)
@jcf94 (Contributor, Author) commented on Aug 29, 2020

@merrymercy @FrozenGene Please have a look when you're free. :)

if (out_state.defined()) {
  std::unique_lock<std::mutex> l(m);
  out_states.push_back(std::move(out_state));
}
@merrymercy (Member) commented on Aug 30, 2020

Can we use low-level APIs from ArrayNode to remove this mutex lock?
Because we know the number of states, we can allocate n empty State() in advance.
Do something like

std::vector<State> out_states(n, State());
parallel_for i in 0...n:
    out_states[i] = process(states[i])

@jcf94 (Contributor, Author) replied:

Our original implementation in the Ansor repo worked exactly this way; however, if any InferBound call fails, out_states will contain some empty State() entries.
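
For illustration only, a hedged sketch of the pre-allocation approach and the failure mode described above (hypothetical names; std::optional stands in for a State that may be undefined):

    // Hypothetical sketch of the pre-allocation pattern.  MyState stands in for
    // auto_scheduler's State; std::optional models "defined vs. empty".
    #include <cstddef>
    #include <optional>
    #include <vector>

    struct MyState { int bound; };

    std::optional<MyState> InferBoundOne(int input) {
      if (input % 7 == 0) return std::nullopt;  // simulate an InferBound failure
      return MyState{input * 2};
    }

    std::vector<std::optional<MyState>> InferBoundAll(const std::vector<int>& inputs) {
      // Pre-allocate one empty slot per input so parallel workers never resize.
      std::vector<std::optional<MyState>> out_states(inputs.size());
      // In the real code this loop body runs under parallel_for; each iteration
      // writes only its own slot, so no mutex is required.
      for (std::size_t i = 0; i < inputs.size(); ++i) {
        out_states[i] = InferBoundOne(inputs[i]);
      }
      return out_states;  // failed entries stay empty and must be pruned later
    }

A failed bound inference leaves its slot empty, which is exactly why the output can contain empty State() objects.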

@jcf94 (Contributor, Author) replied:

Added a note to the doc string of ComputeDAG::InferBound and added the necessary PruneInvalidState calls to the search policy.
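
For context, a minimal sketch of what a PruneInvalidState-style cleanup does after the parallel InferBound pass (illustrative types, not TVM's actual implementation):

    // Hypothetical sketch of a PruneInvalidState-style pass: keep only the
    // entries that are still defined after the parallel InferBound step.
    #include <optional>
    #include <vector>

    struct MyState { int bound; };

    std::vector<MyState> PruneInvalid(const std::vector<std::optional<MyState>>& states) {
      std::vector<MyState> valid;
      valid.reserve(states.size());
      for (const auto& s : states) {
        if (s.has_value()) valid.push_back(*s);  // drop empty (failed) entries
      }
      return valid;
    }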

@junrushao (Member) left a comment

LGTM

@jcf94 requested a review from merrymercy on August 31, 2020 at 01:57
@merrymercy merged commit 3ec018d into apache:master on Aug 31, 2020
@jcf94 deleted the parallel_for_update branch on September 1, 2020 at 03:46
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request Sep 17, 2020
…for (apache#6348)

* Update auto_scheduler with parallel_for

* Update

* Update

* Update

* Update inferbound
kevinthesun pushed a commit to kevinthesun/tvm that referenced this pull request Sep 18, 2020
…for (apache#6348)
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Sep 18, 2020
…for (apache#6348)