Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary support for MIR in trans #29217

Merged
merged 18 commits into from
Nov 4, 2015
Merged

Conversation

nikomatsakis
Copy link
Contributor

This branch implements a variant of trans that is based on MIR. It is very incomplete (intentionally), and had only the goal of laying out enough work to enable more incremental follow-on patches. Currently, only fns tagged with #[rustc_mir] use the new trans code. I plan to build up a meta-issue as well that tracks the various "not-yet-implemented" points. The only fn that has been tested so far is this amazingly complex "spike" fn:

#[rustc_mir]
fn sum(x: i32, y: i32) -> i32 {
    x + y
}

In general, the most interesting commit is the last one. There are some points on which I would like feedback from @rust-lang/compiler:

  • I did not use Datum. Originally, I thought that maybe just a ValueRef would be enough but I wound up with two very simple structures, LvalueRef and OperandRef, that just package up a ValueRef and a type. Because of MIR's structure, you don't wind up mixing by-ref and by-value so much, and I tend to think that a thinner abstraction layer is better here, but I'm not sure.
  • Related to the above, I expect that sooner or later we will analyze temps (and maybe variables too) to find those whose address is never taken and which are word-sized and which perhaps meet a few other criteria. For those, we'll probably want to avoid the alloca, just because it means prettier code.
  • I generally tried to re-use data structures from elsewhere in trans, though I'm sure we can trim these down.
  • I didn't do any debuginfo primarily because it seems to want node-ids and we have only spans. I haven't really read into that code so I don't know what's going on there.

r? @nrc

@alexcrichton
Copy link
Member

@nikomatsakis
Copy link
Contributor Author

@nrc says he won't have time to review this just now, so I'm going to pick on @dotdash. Let me know if you don't have time. :)

r? @dotdash

@rust-highfive rust-highfive assigned dotdash and unassigned nrc Oct 22, 2015
@nikomatsakis
Copy link
Contributor Author

Note though that the code is intentionally a WIP :)

}
};

let old_region_bounds_pairs_len = self.region_bound_pairs.len();

// Collect the types from which we create inferred bounds.
// For the return type, if diverging, substitute `bool` just
// because it will have no effect.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds something like what would get fixed by making diverging a first class type, no?

@bors
Copy link
Contributor

bors commented Oct 25, 2015

☔ The latest upstream changes (presumably #29254) made this pull request unmergeable. Please resolve the merge conflicts.

from_end: false,
min_length: _ } => {
let base_ty = tr_base.ty.to_ty(tcx);
let lloffset = common::C_u32(bcx.ccx(), offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using C_uint and ConstantIndex be adjusted to use a uint as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thinking about it, that would make the available values differ when cross-compiling. 32bit at least gives consistent results for now.

@dotdash
Copy link
Contributor

dotdash commented Oct 26, 2015

A few things ended up a commit early or late, LGTM otherwise and feels a lot cleaner than our current trans, though that's probably to be expected :-)

I like the more thinner abstraction compared to Datum, but that's just a gut feeling. Besides providing neater IR. analyzing temps and avoiding allocas will probably be a requirement for proper usage of the expect intrinsic IIRC. Maybe @Aatch can comment on that.

I can't really comment on the debuginfo story either. While I know a bit about the DI types, I have no idea how the mapping to source locations work.

r=me with the tidy error fixed

@Aatch
Copy link
Contributor

Aatch commented Oct 27, 2015

I'd like to see rvalues translated in a way that doesn't always require a reified destination. We have a distinction in the MIR between real locations and temporaries, we should make use of it. Trying to deal with the issue of (almost) always needing a real destination causes enough problems in the current trans code that I'd hate to see the same mistake repeated here.

Looking at the code, extending LValueRef to indicate by-value or by-ref, then passing that directly into trans_rvalue is probably enough. With the ValueRef being write-once in the by-value case. In the creation of the MirContext, the LValueRefs can then be created with the appropriate setting. Even if right now we just keep it as "always by-ref", it provides a good base for handling by-value-only lvalues in the future.

Right now the code is set up to assume that LValueRefs are always allocas, which seems inconsistent with the idea that might want to avoid unnecessary allocas in the future. I'd like to see at least a basic handling of that case (i.e. by-value LValueRef) before this lands, even if it's peppered with unimplemented!() or similar.

@nikomatsakis
Copy link
Contributor Author

On Mon, Oct 26, 2015 at 08:21:02PM -0700, James Miller wrote:

Right now the code is set up to assume that LValueRefs are always allocas, which seems inconsistent with the idea that might want to avoid unnecessary allocas in the future. I'd like to see at least a basic handling of that case (i.e. by-value LValueRef) before this lands, even if it's peppered with unimplemented!() or similar.

I was debating about the best way to handle this. I am inclined to
think an LvalueRef should always be an alloca -- that is, an Lvalue
is a memory location, full stop.

My plan was to identify write-once temporaries whose address is never
used and which are of immediate type and then throw those into a set
of "immediate temporaries". Then when we saw an assignment to an
immediate temporary, we would store an OperandRef into a different
table (i.e., there would never be an LvalueRef for that temporary,
because there is no lvalue).

You're right that this will require isolating a path for processing
rvalues that does not take a destination. I figured we would use a
similar delegating scheme, where for many rvalue types (e.g., a+b)
this is the primary version, and the "into" version just does a store.

Anyway, I can try to mock this up to show you what I mean. I'm not
sure whether it makes sense to do it in this PR or not, though.

@Aatch
Copy link
Contributor

Aatch commented Oct 29, 2015

@nikomatsakis well either way I think basic "support" for it should be in the initial PR. Just having the delegation scheme in place should be enough to start. I just really don't want to end up with a massive amount of refactoring work because it was delayed until later and further changes didn't consider it. It'll be way easier to say "use this API instead" in a code review if said API actually exists.

@eddyb
Copy link
Member

eddyb commented Oct 29, 2015

I have to agree with @Aatch, refactoring the existing trans code to support that optimization ended up reaching effectively all of trans and I have no idea how that could have been reviewed or merged.

@nikomatsakis
Copy link
Contributor Author

OK, seems reasonable, I'll take a stab at implementing what I had in mind.

On Thu, Oct 29, 2015 at 3:17 AM, Eduard-Mihai Burtescu <
notifications@github.com> wrote:

I have to agree with @Aatch https://github.com/Aatch, refactoring the
existing trans code to support that optimization ended up reaching
effectively all of trans and I have no idea how that could have been
reviewed or merged.


Reply to this email directly or view it on GitHub
#29217 (comment).

//
// Note that this is currently duplicated with src/libcore/ops.rs
// which does the same thing, and it would be nice to perhaps unify
// these two implementations on day! Also note that we call `fmod`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "on day" -> "one day"

@bors
Copy link
Contributor

bors commented Nov 1, 2015

☔ The latest upstream changes (presumably #29486) made this pull request unmergeable. Please resolve the merge conflicts.

@michaelwoerister
Copy link
Member

I just now saw the debuginfo-related things here. Debuginfo source locations in LLVM not only need the span but also the lexical scope of each source location. This information is later on used in the debugger to determine which variables are currently in scope. For this purpose we currently build a "scope map" for each function, which maps NodeIDs to the appropriate DI/DWARF scope descriptor.

@nikomatsakis
Copy link
Contributor Author

@michaelwoerister ok -- we do preserve scope information in MIR at the moment -- I envisioned this as temporary, but it seems like there are more and more reasons for it to stay around -- we should chat briefly about how best to connect those two pieces.

@nikomatsakis nikomatsakis force-pushed the mir-trans branch 2 times, most recently from 4ad49d7 to 98bd698 Compare November 2, 2015 14:45
@nikomatsakis
Copy link
Contributor Author

@dotdash @eddyb @Aatch the last commit now illustrates the process I had in mind. As you can see, this path is never (currently) used.

llarg
} else if common::type_is_fat_ptr(tcx, arg_ty) {
// we pass fat pointers as two words, but we want to
// represent them internally as a pointer two two words,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: 'to two words'

@Aatch
Copy link
Contributor

Aatch commented Nov 3, 2015

Looks good. Analysing the MIR can be done at a later date, but now the code has to be able to handle non-alloca LValues, even if they don't come up right now.

@nikomatsakis
Copy link
Contributor Author

@Aatch hmm, I had a commit to that effect, I must have accidentally dropped the hunks from the patch. I'll maybe just add one that builds up the hashmap, it's not a lot more code, it'd be nice to see the system working.

@nikomatsakis
Copy link
Contributor Author

@Aatch ok, see the latest pushes (I rebased, but I kept the commits separate so you could see what has changed). In particular d328072 adds the code you were referring to (handle references to temps that have no alloca), and 2b9233a actually skips allocas when possible. Under these changes, my sample test case produces the following IR:

"BB(0)":                                          ; preds = %entry-block
  %2 = load i32, i32* %arg0, align 4
  store i32 %2, i32* %x, align 4
  %3 = load i32, i32* %arg1, align 4
  store i32 %3, i32* %y, align 4
  %4 = load i32, i32* %x
  %5 = load i32, i32* %y
  %6 = add i32 %4, %5
  br label %"BB(1)"

There is still a bit of indirection due to the named variables x and y, but you can see that the temporaries don't have allocas anymore. Instead, they are represented directly by %4 and %5.

@nikomatsakis
Copy link
Contributor Author

Ok, so I added enough new code that somebody ought to review it, I think. @dotdash are you up to reviewing those last four or five commits?

@dotdash
Copy link
Contributor

dotdash commented Nov 3, 2015

I'll take a look this evening

2015-11-03 15:05 GMT+01:00 Niko Matsakis notifications@github.com:

Ok, so I added enough new code that somebody ought to review it, I
think. @dotdash https://github.com/dotdash are you up to reviewing
those last four or five commits?


Reply to this email directly or view it on GitHub
#29217 (comment).

@dotdash
Copy link
Contributor

dotdash commented Nov 3, 2015

The alloca avoidance looks like I had expected, so r=me with nits and that fat-pointer thing fixed.

If you feel like it, it might be nice to check which codegen tests already survive being built through MIR and having them duplicated to cover both current trans and MIR trans, but I don't feel that's something that has to be done right now.

@nikomatsakis
Copy link
Contributor Author

@dotdash

If you feel like it, it might be nice to check which codegen tests already survive being built through MIR and having them duplicated to cover both current trans and MIR trans, but I don't feel that's something that has to be done right now.

@brson and I were discussing an alternative of possibly setting up some kind of "arewemiryet.com" infrastructure that would regularly test with MIR enabled and display which tests are failing with MIR. And once all tests work, it would test which crates are failing in the ecosystem.

In any case, I think the support is as yet still a bit incomplete.

@nikomatsakis
Copy link
Contributor Author

@bors r=dotdash

@bors
Copy link
Contributor

bors commented Nov 3, 2015

📌 Commit b46c0fc has been approved by dotdash

@bors
Copy link
Contributor

bors commented Nov 4, 2015

⌛ Testing commit b46c0fc with merge 22d8bc5...

@bors
Copy link
Contributor

bors commented Nov 4, 2015

💔 Test failed - auto-linux-64-x-android-t

@nikomatsakis
Copy link
Contributor Author

@bors r=dotdash

@bors
Copy link
Contributor

bors commented Nov 4, 2015

📌 Commit e787863 has been approved by dotdash

bors added a commit that referenced this pull request Nov 4, 2015
This branch implements a variant of trans that is based on MIR. It is very incomplete (intentionally), and had only the goal of laying out enough work to enable more incremental follow-on patches. Currently, only fns tagged with `#[rustc_mir]` use the new trans code. I plan to build up a meta-issue as well that tracks the various "not-yet-implemented" points. The only fn that has been tested so far is this amazingly complex "spike" fn:

```rust
#[rustc_mir]
fn sum(x: i32, y: i32) -> i32 {
    x + y
}
```

In general, the most interesting commit is the last one. There are some points on which I would like feedback from @rust-lang/compiler:

- I did not use `Datum`. Originally, I thought that maybe just a `ValueRef` would be enough but I wound up with two very simple structures, `LvalueRef` and `OperandRef`, that just package up a `ValueRef` and a type. Because of MIR's structure, you don't wind up mixing by-ref and by-value so much, and I tend to think that a thinner abstraction layer is better here, but I'm not sure.
- Related to the above, I expect that sooner or later we will analyze temps (and maybe variables too) to find those whose address is never taken and which are word-sized and which perhaps meet a few other criteria. For those, we'll probably want to avoid the alloca, just because it means prettier code.
- I generally tried to re-use data structures from elsewhere in trans, though I'm sure we can trim these down.
- I didn't do any debuginfo primarily because it seems to want node-ids and we have only spans. I haven't really read into that code so I don't know what's going on there.

r? @nrc
@bors
Copy link
Contributor

bors commented Nov 4, 2015

⌛ Testing commit e787863 with merge a216e84...

@bors bors merged commit e787863 into rust-lang:master Nov 4, 2015
@alexcrichton
Copy link
Member

❇️

@nikomatsakis nikomatsakis deleted the mir-trans branch March 30, 2016 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.