[Relay][VM] Relay VM memory liveness/lifetime analysis #10026

Merged: 21 commits, Feb 5, 2022

altanh
Contributor

@altanh altanh commented Jan 22, 2022

This PR adds basic memory management to the Relay VM by inserting kill annotations on variables at the end of their lifetimes. Kill annotations are translated to a new VM instruction KillRegister, which nulls out the specified register. By nulling out a register, we destroy the ObjectRef inside, which eventually leads to tensors being freed via refcounting. This approach automatically handles aliasing of tensors (e.g. via tuples, ADTs) as the aliases are reflected in the run-time refcount.
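To illustrate the refcounting argument above, here is a minimal sketch in Python rather than the actual VM code. `ToyVM`, `Tensor`, and `kill_register` are hypothetical names made up for this example, and it relies on CPython reclaiming objects as soon as their refcount reaches zero. It shows that nulling a register frees the tensor only once every alias (here, a tuple in another register) is gone.

```python
import weakref


class Tensor:
    """Stand-in for a refcounted tensor object (hypothetical, not a TVM class)."""

    def __init__(self, name):
        self.name = name


class ToyVM:
    """Toy register file; kill_register mimics the effect described for KillRegister."""

    def __init__(self, num_registers):
        self.registers = [None] * num_registers

    def kill_register(self, index):
        # Nulling the slot drops one reference to whatever object it held.
        self.registers[index] = None


vm = ToyVM(3)
tensor = Tensor("t0")
freed = []
# Record the moment the object is actually reclaimed.
weakref.finalize(tensor, freed.append, "t0")

vm.registers[0] = tensor
vm.registers[1] = (tensor,)  # alias through a tuple, as with Relay tuples/ADTs
del tensor  # from here on, only the registers hold references

vm.kill_register(0)
still_alive_after_first_kill = not freed  # the tuple alias keeps the tensor alive
vm.kill_register(1)
freed_after_second_kill = freed == ["t0"]  # last reference gone, tensor reclaimed
```

No explicit alias tracking is needed at run time in this sketch: the refcount already reflects every alias, which is the property the PR description relies on.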

Lifetime analysis is done using standard data-flow analysis on the CFG of the post-memory-lowering IR. The main tricky bit involves respecting the VM compiler's register aliasing scheme (e.g. var-to-var bindings); an alternative approach would be to move the register aliasing logic into the VM compiler itself, so that kill annotations are only translated to a KillRegister when all aliases of a register have been killed.
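For readers unfamiliar with the data-flow formulation mentioned above, the following is a rough Python sketch of backward liveness over a small CFG using a worklist. It is not the C++ implementation from this PR; the node names, the `use`/`define` sets, and the `liveness` helper are all assumptions made up for illustration.

```python
from collections import deque


def liveness(nodes, succs, use, define):
    """Backward data-flow: live_out = union of successors' live_in,
    live_in = use | (live_out - define)."""
    preds = {n: [] for n in nodes}
    for n in nodes:
        for s in succs.get(n, []):
            preds[s].append(n)
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}
    # Seed with every node, visiting later nodes first to converge faster.
    worklist = deque(reversed(nodes))
    while worklist:
        n = worklist.popleft()
        out = set()
        for s in succs.get(n, []):
            out |= live_in[s]
        new_in = use[n] | (out - define[n])
        live_out[n] = out
        if new_in != live_in[n]:
            live_in[n] = new_in
            # A changed live_in can change the live_out of predecessors.
            worklist.extend(preds[n])
    return live_in, live_out


# Hypothetical CFG: n1 branches to n2 (then) and n3 (else), which join at n4.
nodes = ["n0", "n1", "n2", "n3", "n4"]
succs = {"n0": ["n1"], "n1": ["n2", "n3"], "n2": ["n4"], "n3": ["n4"]}
define = {"n0": {"a"}, "n1": {"b"}, "n2": {"c"}, "n3": {"d"}, "n4": {"e"}}
use = {"n0": set(), "n1": {"a"}, "n2": {"a"}, "n3": {"b"}, "n4": {"b"}}

live_in, live_out = liveness(nodes, succs, use, define)
```

In this toy CFG, `a` stays live out of `n1` because the then-branch still reads it, so it could not be killed immediately after `n1`.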

This PR does not do "static" memory planning in the sense of optimizing an allocation plan. Follow-up work should revive the storage coalescing pass and do static planning within each storage (although this would need static analysis of aliasing).

With this PR, BERT-SQuAD sees around 10x memory reduction (~20GB -> ~2GB on our tested input size).

Other changes in this PR:

  • memory.alloc_storage and memory.alloc_tensor are now marked as non-stateful, since they can be safely eliminated by DCE
  • fix small bug in VM profiling when timing "unknown" instructions

TODOs:

  • conservatively support relay refs (or skip memory planning gracefully at least); note that refs are not supported in the VM currently (see [Relay][VM] Add support for references. #6798)
  • support closures
  • support If
  • support Match
  • add tests
  • add more docs

@altanh altanh changed the title [WIP] Relay VM memory planning [Relay][VM] Relay VM memory planning Jan 31, 2022
@jroesch
Member

jroesch commented Jan 31, 2022

You might also want to read and delete this pass: https://github.com/apache/tvm/blob/main/python/tvm/relay/transform/memory_plan.py#L19. The main difference afaict is that it tries to detect dynamic/static regions then combine them before inserting kills.

@altanh
Contributor Author

altanh commented Feb 1, 2022

You might also want to read and delete this pass: https://github.com/apache/tvm/blob/main/python/tvm/relay/transform/memory_plan.py#L19. The main difference afaict is that it tries to detect dynamic/static regions then combine them before inserting kills.

I'm not going to touch this since it's complementary to the liveness analysis, but it's probably worth reviving this pass in C++ later to reduce the allocation overhead. That pass doesn't overlap allocations over time though, so it would sort of nullify the effect of liveness analysis until static planning is added.
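A back-of-the-envelope sketch of why non-overlapping allocations would cancel out the benefit of liveness: if every tensor keeps its own region for the whole run, peak memory is the sum of all sizes, whereas freeing at last use only pays for tensors that are live at the same time. The `peak_memory` helper, the sizes, and the step numbers below are made up for illustration and are not measurements from this PR.

```python
def peak_memory(tensors, num_steps, free_at_last_use):
    """tensors: list of (size, first_step, last_step), with inclusive steps."""
    peak = 0
    for step in range(num_steps):
        current = 0
        for size, first, last in tensors:
            # Without frees, a tensor occupies its region until the end.
            end = last if free_at_last_use else num_steps - 1
            if first <= step <= end:
                current += size
        peak = max(peak, current)
    return peak


# A chain of four intermediates, each only needed by the next step.
tensors = [(4, 0, 1), (4, 1, 2), (4, 2, 3), (4, 3, 4)]
with_kills = peak_memory(tensors, 5, free_at_last_use=True)
without_kills = peak_memory(tensors, 5, free_at_last_use=False)
```

Here `with_kills` is 8 (two tensors live at once) and `without_kills` is 16 (all four), in arbitrary units.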

@altanh altanh changed the title [Relay][VM] Relay VM memory planning [Relay][VM] Relay VM memory liveness/lifetime analysis Feb 1, 2022
@altanh
Contributor Author

altanh commented Feb 1, 2022

I wrote this as a comment in the code, but reposting here for future reference:

Current Limitations
-------------------
1. For simplicity, we only insert kills when visiting Let bindings, and always emit the kill as a
single subsequent binding. This is slightly inaccurate; for example, if the condition of an If
is dead after the test, we can immediately kill the condition in each branch:
  let %x = if (%dead_cond) {
    let %_0 = memory.kill(%dead_cond);
    ...
  } else {
    let %_1 = memory.kill(%dead_cond);
    ...
  }
as opposed to:
  let %x = if (%dead_cond) ...
  let %_0 = memory.kill(%dead_cond);

2. Killed variables are calculated as live in - live out, which misses variables that are
actually dead but not in live in. Examples include: when the last use of a var is the result
expr of an If branch; when bound vars (i.e. function inputs, pattern matched vars, dead 
bindings) are never used.

3. When the result expr of an If branch is a variable, and this expr is the last use of the
var, we cannot "kill" the var since it is being returned. The VM compiler also emits a Move
instruction to merge the branch results, which creates another ObjectRef to the Object held
by the var. The var is also not in the subsequent live-in (since it is indeed dead by this
point, see above), so it won't be killed.

However, these limitations are unlikely to cause large leaks in practice.
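Limitation 2 can be seen in a small Python sketch for straight-line code (no If/Match), assuming kills are computed as live in - live out per binding. The `kills_for_straight_line` helper and the variable names are hypothetical, not taken from manifest_lifetimes.cc.

```python
def kills_for_straight_line(bindings, result_var):
    """bindings: list of (bound_var, used_vars) in program order.
    Returns, for each binding, the set live_in - live_out."""
    live = {result_var}  # live-out of the last binding
    kills = [None] * len(bindings)
    for i in range(len(bindings) - 1, -1, -1):
        bound, used = bindings[i]
        live_out = live
        live_in = set(used) | (live_out - {bound})
        kills[i] = live_in - live_out
        live = live_in
    return kills


bindings = [
    ("x", ["p"]),       # let %x = f(%p)
    ("unused", ["x"]),  # let %unused = g(%x), a dead binding
    ("y", ["x"]),       # let %y = h(%x)
]
kills = kills_for_straight_line(bindings, result_var="y")
never_killed = all("unused" not in k for k in kills)
```

`%p` is killed after the first binding and `%x` after the last, but `%unused` never shows up in any live-in set, so this scheme never emits a kill for it.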

@altanh
Contributor Author

altanh commented Feb 1, 2022

cc @jroesch @mbs-octoml @mbrookhart

this PR is ready for full review

Contributor

@mbs-octoml mbs-octoml left a comment

just nits & comments I think so feel free to proceed if green and follow up later.

cfg_.let_map[expr] = curr_node;
cfg_.reverse_post_order.push_back(curr_node);

if (const IfNode* ite = AsIgnoringOnDevice<IfNode>(inner_let_node->value)) {
Contributor

nit: Reminder comment that we are dealing with the 'exits' from the bb, hence only looking at the two branch exprs

Contributor Author

let me know if the comment I added helps at all

// expr of an If branch; when bound vars (i.e. function inputs, pattern matched vars, dead
// bindings) are never used.
//
// 3. When the result expr of an If branch is a variable, and this expr is the last use of the
Contributor

Shoot, I missed that one. Could push the 'return kills' onto an internal stack and pop them after each join point, but agree a TODO is fine.

@mbrookhart mbrookhart merged commit 34d70de into apache:main Feb 5, 2022
@mbrookhart
Contributor

Thanks @altanh @mbs-octoml @jroesch

ylc pushed a commit to ylc/tvm that referenced this pull request Feb 16, 2022
* WIP VM memory planning

* tuple projection

* support if

* lint

* remove old comment

* WIP check in attempt at CFG analysis

* rewrite CFG analysis in stages, support ADTs

* lint

* fix small bug in alias elimination, try fix VM profiler error

* update DCE tests since allocations can be DCE'd

* optimize worklist to reduce runtime

* add docs, rename pass to ManifestLifetimes

* add tests, more comments, proper VM profiler fix

* lint

* ci please

* address nits

* retry ci again

* retry ci once again :)

* fix sneaky memory leak due to cyclic refs

* fix didn't work but retry ci anyway

* slightly reduce size of large pretty printer test