rustc_mir: implement an "lvalue reuse" optimization (aka destination propagation aka NRVO). #46321

eddyb · 2017-11-28T06:30:41Z

Replaces a chain of moves, such as a = ...; ... b = move a; ... f(&mut b) ... c = move b, with the final destination, i.e. only c = ...; ... f(&mut c); ... remains (note that borrowing works).
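For readers who want a source-level picture of the move chain described above, here is a minimal sketch. The names `Big`, `fill`, `chained` and `direct` are made up for illustration and do not come from the PR; the pass itself works on MIR, not on source code, so this only shows the intended before/after shape and the fact that both forms behave the same.

```rust
// Hypothetical type standing in for a value that is expensive to copy around.
#[derive(Debug, PartialEq)]
struct Big([u8; 32]);

// Stand-in for `f(&mut b)`: mutates the value through a borrow.
fn fill(b: &mut Big) {
    b.0[0] = 1;
}

// "Before": the value travels through a chain of moves, a -> b -> c.
fn chained() -> Big {
    let a = Big([0; 32]);
    let mut b = a; // b = move a
    fill(&mut b); // f(&mut b)
    let c = b; // c = move b
    c
}

// "After" (written by hand): the value is built directly in the final destination.
fn direct() -> Big {
    let mut c = Big([0; 32]);
    fill(&mut c); // f(&mut c)
    c
}
```

Both functions return the same value; the difference the optimization is after is only how many intermediate stack slots and copies are involved.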

rust-highfive · 2017-11-28T06:30:55Z

r? @pnkfelix

(rust_highfive has picked a reviewer for you, use r? to override)

eddyb · 2017-11-28T06:30:57Z

@bors try

bors · 2017-11-28T06:31:08Z

⌛ Trying commit 69b5139 with merge cf73ed6...

rustc_mir: implement an "lvalue reuse" optimization (aka destination propagation aka NRVO). Replaces a chain of moves, such as `a = ...; ... b = move a; ... f(&mut b) ... c = move b`, with the final destination, i.e. only `c = ...; ... f(&mut c); ...` remains (note that borrowing is allowed). **DO NOT MERGE** only for testing atm. Based on #46142.

eddyb · 2017-11-28T06:31:36Z

r? @nikomatsakis
cc @Mark-Simulacrum Let's throw some benchmarks at this!

est31 · 2017-11-28T06:36:56Z

Is placement new doing the same? Will the only reason to use it then be that you like the syntax?

eddyb · 2017-11-28T06:40:37Z

@est31 No, this doesn't change execution order. Placement has to allocate before evaluating the value into the allocation, while Box::new(expr) allocates (in the callee) after evaluating expr.

OTOH, this optimization only reuses stack-local destinations when doing so can preserve behavior.

bors · 2017-11-28T08:14:04Z

☀️ Test successful - status-travis
State: approved= try=True

Mark-Simulacrum · 2017-11-28T13:21:53Z

Try build added to queue. Will be complete soon (probably a couple hours).

eddyb · 2017-11-28T22:03:21Z

Assuming the perf results are accurate, this regresses compile times. It'd be a shame if we had to disable this, especially the first iteration, which seems straightforward (2 visits over the MIR).

Also, do we have any runtime benchmarks yet?

Mark-Simulacrum · 2017-11-29T01:00:28Z

Hold up, those perf results are invalid. If you rerun try with DEPLOY_ALT removed from rust/.travis.yml (line 25 in 827cb0d):

- env: IMAGE=dist-x86_64-linux DEPLOY_ALT=1

then we should be able to rebenchmark successfully. The current perf results (I believe) are comparing a build without LLVM asserts with an LLVM-asserts-enabled build.

eddyb · 2017-11-29T08:08:53Z

@bors try

bors · 2017-11-29T08:09:03Z

⌛ Trying commit f41425b with merge 9de9a00...


bors · 2017-11-29T10:06:32Z

☀️ Test successful - status-travis
State: approved= try=True

eddyb · 2017-11-29T10:24:05Z

@Mark-Simulacrum I'm not sure my hack worked, I'm probably missing the changes to actually upload the results from #46354.

Mark-Simulacrum · 2017-11-29T14:55:48Z

I think you need to change this condition to $DEPLOY (rust/.travis.yml, line 325 in 9de9a00):

condition: $DEPLOY_ALT = 1

Mark-Simulacrum · 2017-11-29T14:56:42Z

#46354 actually just merged so I think you should be able to just rebase.


eddyb · 2017-11-29T16:29:08Z

@bors try

bors · 2017-11-29T16:29:19Z

⌛ Trying commit e87ab56 with merge 9afa5ec24b58d6d3f271a9d655e1b98c6b550ea7...

bors · 2017-11-29T18:14:23Z

☀️ Test successful - status-travis
State: approved= try=True

Mark-Simulacrum · 2017-11-29T18:38:08Z

Perf started.

eddyb · 2017-11-29T20:31:00Z

Perf results are better, although most wins are minor and there's a regression on inflate-opt.
(The inflate crate has the entirety of the static huffman table logic in one function, with a lot of macro-generated code, and presumably, very many variables, so any MIR optimization will be slow)

ghost · 2017-11-29T20:44:20Z

As a random lurker, I wonder how this PR and this LLVM patch relate. AFAICT, they do very similar optimizations but with different code representations (MIR and LLVM IR).

pcwalton · 2017-11-29T20:47:06Z

@stjepang For one, the patch in this PR can handle borrows across functions, while that LLVM patch cannot.

eddyb · 2017-11-29T20:54:30Z

@stjepang I think @pcwalton abandoned that patch in favor of MIR optimizations, which had a lower priority and very little happened since. @pcwalton's own (forward, unlike this PR) "copy propagation" MIR pass is disabled by default because it was too slow to run (missing some caching).

Also, this PR does more than optimize across basic blocks, allowing borrows without needing to track them. OTOH, LLVM could likely handle field accesses, while this PR doesn't, by itself.

Not sure how we could encode the semantics of Move (invalidating borrows) for LLVM to use.

arielb1 · 2017-11-30T15:07:38Z

I talked with @eddyb on IRC and it appears this optimization is unjustified, especially with NLL.

eddyb · 2017-11-30T15:38:14Z

On IRC @arielb1 pointed out that with NLL this might become valid:

let (a, b, x);
x = None;
loop {
    a = init;
    use(x, &mut a);
    endregion('α);
    b = move a;
    x = Some(&'α b);
}
endregion('α);

This PR, in its current state, would incorrectly reuse b's memory for a.
He also suggested using the existing moveck dataflow analyses which should allow tracking all candidates we might possibly be interested in and the interactions between them, and borrows.
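To make the hazard in the loop above concrete, here is a toy model, not the PR's code and not real MIR: locals are slots in an array and a "borrow" is just a slot index. The function name `run` and the `merge` flag are assumptions made for this sketch; `merge = true` models the rewrite in which `a` and `b` end up sharing one slot.

```rust
// Toy model of the loop: returns what `use(x, &mut a)` would read through `x`
// on each iteration.
fn run(merge: bool) -> Vec<Option<u32>> {
    let mut slots = [0u32; 2];
    let a = 0;
    // When `merge` is true, `b` reuses the slot of `a`.
    let b = if merge { 0 } else { 1 };
    let mut x: Option<usize> = None; // x = None
    let mut observed = Vec::new();
    for init in 1..=3u32 {
        slots[a] = init; // a = init
        observed.push(x.map(|i| slots[i])); // use(x, &mut a): read through x
        slots[b] = slots[a]; // b = move a
        x = Some(b); // x = Some(&b)
    }
    observed
}
```

With separate slots, `x` sees the value `b` held at the end of the previous iteration. With a shared slot, the write `a = init` has already overwritten what `x` points to, so the read observes the new value instead. That change in observable behavior is the kind of problem a loop-aware dataflow analysis would have to rule out.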

nikomatsakis

OK um so I never really finished this review. I was hoping to write up a kind of description of what the code is doing. At this point I'm not sure of current status so I'll just post these incomplete and not that interesting comments for now.

@eddyb -- what is your latest thinking for how to proceed here?

nikomatsakis · 2017-11-29T06:45:07Z

src/librustc_mir/transform/reuse_lvalues.rs

+// except according to those terms.
+
+//! A pass that reuses "final destinations" of values,
+//! propagating the lvalue back through a chain of moves.


I'm missing some kind of meta comment here somewhere in this file. In my ideal world, the explanation would include the "before" and "after" MIR, in such a way that we can refer back to the example in the comments below. I'll take a stab at writing comments as my review to see if I understand what's going on. =)

nikomatsakis · 2017-11-29T06:54:13Z

src/test/mir-opt/copy_propagation.rs

@@ -22,20 +22,17 @@ fn main() {
 // START rustc.test.CopyPropagation.before.mir
 //  bb0: {
 //      ...


Can we get some tests specific to this optimization? Ideally a directory like mir-opt/reuse_lvalues/ with various corner cases, showing before/after --- but anyway at least one? =)

nikomatsakis · 2017-11-29T06:57:29Z

src/librustc_mir/transform/reuse_lvalues.rs

+            // Keep going, in case the move chain doesn't stop here.
+            self.visit_local(local, context, location);
+
+            // Cache the final result, in a similar way to union-find.


nikomatsakis · 2017-11-29T07:02:38Z

src/librustc_mir/transform/reuse_lvalues.rs

+        // as they are guaranteed to have all accesses in between.
+        // Also, the destination local of the move has to also have
+        // a single initialization (the move itself), otherwise
+        // there could be accesses that overlap the move chain.


Example:

L1 = ...
...
L2 = move L1

Here, local is L1 and dest is L2. We will set the reused flag on L2 to true.

nikomatsakis · 2017-11-29T07:04:39Z

src/librustc_mir/transform/reuse_lvalues.rs

+                   &local: &Local,
+                   context: LvalueContext<'tcx>,
+                   _location: Location) {
+        let (ref mut def, ref mut mov) = self.defs_moves[local];


Nit: you can now do let (def, mov) = &mut self.defs_moves[local];
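A small sketch of the two binding styles in this nit, assuming a toolchain where default binding modes are available. The functions `bump_old` and `bump_new` and the tuple-of-`u32` element type are invented for illustration; the second binding is named `mov` because `move` is a reserved keyword.

```rust
// Older style: explicit `ref mut` bindings borrow the fields in place.
fn bump_old(pairs: &mut [(u32, u32)], i: usize) {
    let (ref mut def, ref mut mov) = pairs[i];
    *def += 1;
    *mov += 1;
}

// Newer style: matching a tuple pattern against `&mut (A, B)` binds each
// field as `&mut A` / `&mut B` automatically.
fn bump_new(pairs: &mut [(u32, u32)], i: usize) {
    let (def, mov) = &mut pairs[i];
    *def += 1;
    *mov += 1;
}
```

Both functions mutate the element in place; neither copies the tuple out of the slice.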

nikomatsakis · 2017-11-29T07:10:50Z

src/librustc_mir/transform/reuse_lvalues.rs

+        let (ref mut def, ref mut mov) = self.defs_moves[local];
+        match context {
+            // We specifically want the first direct initialization.
+            LvalueContext::Store |


Note that Store corresponds also to things like SetDiscriminant and InlineAsm -- is that a problem here? I guess not.

carols10cents · 2017-12-12T18:32:30Z

@eddyb -- what is your latest thinking for how to proceed here?

ping @eddyb!

eddyb · 2017-12-12T18:35:47Z

@carols10cents I have discussed a new set of analyses with @nikomatsakis which should make the optimization sound, and I hope to wrap up bug hunting and get back to this PR soon.

scottmcm · 2017-12-20T05:06:24Z

Looks like this will fix #32966

eddyb · 2017-12-20T06:03:33Z

I'll close this PR as a reimplementation is required anyway, for soundness wrt loops.

eddyb · 2017-12-25T22:12:30Z

Note to self, I've tried this:

#![feature(nll)]

struct Foo(u8);

fn without_nll() {
    let (mut a, mut b);
    let r;
    let mut first = true;
    loop {
        a = Foo(0);
        if first {
            first = false;
            b = a;
        } else {
            r = &a;
            break;
        }
    }
    // `a`, `b`, and `*r` are the same location.
    // However, I don't see a way to abuse it.
}

fn with_nll() {
    let (mut a, mut b);
    let mut r = None;
    loop {
        a = Foo(0);
        // `*r` and `a` are the same location.
        drop((r, &mut a));
        b = a;
        r = Some(&b);
    }
}

fn main() {
    without_nll();
    with_nll();
}

with_nll doesn't compile yet: the region still covers too much, and it does conflict with the b = a; assignment.

without_nll is optimized, but I was hoping for an easier way to show how going multiple times around a loop, yet taking different branches inside it, could result in unsound optimizations.

As it stands, all I can do is keep this example for future mir-opt tests, and make sure it's not optimized, preferably without loop-specific special cases, just a generalized dataflow algorithm.

jrmuizel · 2018-11-22T21:29:48Z

Any chance we can see something like this resurrected?

eddyb · 2018-11-22T22:00:44Z

@jrmuizel This PR is naive and was superseded by #47954.

rust-highfive assigned pnkfelix Nov 28, 2017

rust-highfive assigned nikomatsakis and unassigned pnkfelix Nov 28, 2017

kennytm added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Nov 28, 2017

eddyb force-pushed the even-mirer-3 branch from 69b5139 to 019fdb2 Compare November 28, 2017 11:51

eddyb force-pushed the even-mirer-3 branch from 019fdb2 to f41425b Compare November 29, 2017 08:08

rustc_mir: implement an "lvalue reuse" optimization (aka destination propagation aka NRVO).

e87ab56

eddyb force-pushed the even-mirer-3 branch from f41425b to e87ab56 Compare November 29, 2017 16:29

eddyb added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 30, 2017

nikomatsakis reviewed Dec 6, 2017


eddyb closed this Dec 20, 2017

eddyb deleted the even-mirer-3 branch December 20, 2017 06:03

This was referenced May 1, 2019

Unnecessary memcpy when returning a struct #57077

Closed

Do move forwarding on MIR #32966

Open

eddyb mentioned this pull request Jun 10, 2019

Large structs constructed on stack rust-random/rand#817

Closed
