Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix repr(align) enum handling #96814

Merged
merged 2 commits into from
Jul 6, 2022
Merged

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented May 7, 2022

enum, for better or worse, supports repr(align). That has already caused a bug in #92464, which was "fixed" in #92932, but it turns out that that fix is wrong and caused #96185.

So this reverts #92932 (which fixes #96185), and attempts another strategy for fixing #92464: special-case enums when doing a cast, re-using the code to load the discriminant rather than assuming that the enum has scalar layout. This works fine for the interpreter.

However, #92464 contained another testcase that was previously not in the test suite -- and after adding it, it ICEs again. This is not surprising; codegen needs the same patch that I did in the interpreter. Probably this has to happen around here. Unfortunately I don't know how to do that -- the interpreter can load a discriminant from an operand, but codegen can only do that from a place. @oli-obk @eddyb @bjorn3 any idea?

@rust-highfive
Copy link
Collaborator

Some changes occured to the CTFE / Miri engine

cc @rust-lang/miri

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label May 7, 2022
@rust-highfive
Copy link
Collaborator

r? @jackh726

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 7, 2022
@rust-log-analyzer

This comment has been minimized.

Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with comment fix (just the typo would be enough, but I got side-tracked using the change suggestion UI and added some more details)

compiler/rustc_middle/src/ty/layout.rs Outdated Show resolved Hide resolved
compiler/rustc_const_eval/src/interpret/cast.rs Outdated Show resolved Hide resolved
@eddyb
Copy link
Member

eddyb commented May 7, 2022

However, #92464 contained another testcase that was previously not in the test suite -- and after adding it, it ICEs again. This is not surprising; codegen needs the same patch that I did in the interpreter. Probably this has to happen around here. Unfortunately I don't know how to do that -- the interpreter can load a discriminant from an operand, but codegen can only do that from a place. @oli-obk @eddyb @bjorn3 any idea?

Ah I missed this, right, so this PR not ready for merging.
There's also this very interesting piece of code I noticed while looking around codegen:

if let Abi::Scalar(scalar) = operand.layout.abi {
if let Int(_, s) = scalar.primitive() {
// We use `i1` for bytes that are always `0` or `1`,
// e.g., `#[repr(i8)] enum E { A, B }`, but we can't
// let LLVM interpret the `i1` as signed, because
// then `i1 1` (i.e., E::B) is effectively `i8 -1`.
signed = !scalar.is_bool() && s;
if !scalar.is_always_valid(bx.cx())
&& scalar.valid_range(bx.cx()).end
>= scalar.valid_range(bx.cx()).start
{
// We want `table[e as usize ± k]` to not
// have bound checks, and this is the most
// convenient place to put the `assume`s.
if scalar.valid_range(bx.cx()).start > 0 {
let enum_value_lower_bound = bx.cx().const_uint_big(
ll_t_in,
scalar.valid_range(bx.cx()).start,
);
let cmp_start = bx.icmp(
IntPredicate::IntUGE,
llval,
enum_value_lower_bound,
);
bx.assume(cmp_start);
}
let enum_value_upper_bound = bx
.cx()
.const_uint_big(ll_t_in, scalar.valid_range(bx.cx()).end);
let cmp_end = bx.icmp(
IntPredicate::IntULE,
llval,
enum_value_upper_bound,
);
bx.assume(cmp_end);
}
}
}

So the discriminant reading should happen before that, with let llval = operand.immediate(); presumably special-cased when operand.layout.ty.is_enum().
The llval value just before the actual int2int cast would be identical to this I'm guessing:

mir::Rvalue::Discriminant(ref place) => {
let discr_ty = rvalue.ty(self.mir, bx.tcx());
let discr_ty = self.monomorphize(discr_ty);
let discr = self
.codegen_place(&mut bx, place.as_ref())
.codegen_get_discr(&mut bx, discr_ty);
(
bx,
OperandRef {
val: OperandValue::Immediate(discr),
layout: self.cx.layout_of(discr_ty),
},
)
}


This suggests another avenue for these casts: enum->integer casts should be lowered in MIR to Rvalue::Discriminant( (which would produce an integer of the discriminant type) followed by the original cast to integer. It also has the advantage that it would work on enums with data just as well, if we ever want to support those in as casts.

@eddyb
Copy link
Member

eddyb commented May 7, 2022

cc @nagisa wrt the above comment, and because of the codegen logic that #75529 touched - if we go the route of not having direct enum->integer casts in MIR (but instead relying on Rvalue::Discriminant, we would have no obvious place to put that assume.

However, it strikes me that today enum->integer casts can be by-value in codegen, whereas Rvalue::Discriminant is memory-only (i.e. requires a PlaceRef) for now. So instead of the assume we'd get a load with !range metadata attached, that might be just as good?

(And/or if/when we get the ability to read discriminants from OperandRefs directly, that could be where the assume would go, replacing the !range metadata?)

@RalfJung
Copy link
Member Author

RalfJung commented May 7, 2022

The llval value just before the actual int2int cast would be identical to this I'm guessing:

I tried that, but couldn't complete the type golf since I didn't have a place to work with.

This suggests another avenue for these casts: enum->integer casts should be lowered in MIR to Rvalue::Discriminant( (which would produce an integer of the discriminant type) followed by the original cast to integer. It also has the advantage that it would work on enums with data just as well, if we ever want to support those in as casts.

Yeah that would be nice, make both Miri's and codegen's job much easier. :D
But MIR building is even scarier than codegen, so I doubt I can do that...

@nagisa
Copy link
Member

nagisa commented May 7, 2022

The enum as scalar cast was always intended to behave as if it was reading out a discriminant, wasn’t it? I'm quite surprised this isn't already happening.


I'm perfectly content with the idea of us losing assumes that have been added before if it helps some agenda that's grander than micro-optimizing a handcrafted test case. I said as such in the linked PR:

I don’t mind adding this either, but I wouldn’t hold my breath that any specific behaviour will last, especially across backend versions.

!range definitely sounds much more palatable and resilient approach too.

@RalfJung
Copy link
Member Author

RalfJung commented May 8, 2022

The enum as scalar cast was always intended to behave as if it was reading out a discriminant, wasn’t it? I'm quite surprised this isn't already happening.

Good, so looks like we have broad agreement here. Then we seem to have 2 options:

  • Either this happens already during MIR building. Then we need someone familiar with that code to pick up this PR; that's a bit too much foreign code for me to pick up right now.
  • Or the codegen backends all do something like that themselves. I did that for Miri; with a little help hopefully I can get it done for the SSA backend. The main problem is that I have an OperandRef but to call the get disicriminant code I need a PlaceRef, and I don't know how to convert from one to the other. In Miri we have force_allocation. In codegen, does something like that exist -- is that even possible? Or do we need to adjust the SSA logic to not treat "enums that are inputs to cast" as by-value so that we are sure we have a by-ref here and somehow can get that by-ref information out of the OperandRef?

Doing it in the MIR seems more elegant, so if we can get some MIR people to help here that would be great. :)

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor

oli-obk commented May 8, 2022

  • Then we need someone familiar with that code to pick up this PR; that's a bit too much foreign code for me to pick up right now.

I'll create a PR

@oli-obk
Copy link
Contributor

oli-obk commented May 9, 2022

I'll create a PR

done: #96862

I'm perfectly content with the idea of us losing assumes that have been added before if it helps some agenda that's grander than micro-optimizing a handcrafted test case.

We lost some of them. I tried to get them back by telling LLVM more things about the result of the discriminant (the input automatically gets that via Operand::load). That ended up giving lots of miscompilations. I think the right level of changing things would be in LLVM opts, as it works for enum Foo { A, B }, but just about nothing else (like not even enum Foo { A = 1, B = 2 } or various other repesentation changes).

@bors
Copy link
Contributor

bors commented May 10, 2022

☔ The latest upstream changes (presumably #96891) made this pull request unmergeable. Please resolve the merge conflicts.

@jackh726
Copy link
Member

r? @eddyb

@rust-highfive rust-highfive assigned eddyb and unassigned jackh726 May 22, 2022
@RalfJung RalfJung added S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 27, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 5, 2022
Change enum->int casts to not go through MIR casts.

follow-up to rust-lang#96814

this simplifies all backends and even gives LLVM more information about the return value of `Rvalue::Discriminant`, enabling optimizations in more cases.
@rustbot
Copy link
Collaborator

rustbot commented Jul 5, 2022

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@RalfJung
Copy link
Member Author

RalfJung commented Jul 5, 2022

Now that #96862 landed, this becomes easy. :) (Other than convincing github to not show outdated diffs.)
@oli-obk ready for review again.

@@ -11,5 +11,13 @@ enum Aligned {
fn main() {
let aligned = Aligned::Zero;
let fo = aligned as u8;
println!("foo {}",fo);
println!("foo {}", fo);
println!("{}", tou8(Aligned::Zero));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be an assertion instead of a print? Otherwise we're only testing that this code runs, not that it does the right thing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added assertions but kept the prints, just in case they were relevant to trigger the ICE.

Copy link
Contributor

@oli-obk oli-obk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with an explanation for the prints or changed to an assert

@rust-log-analyzer

This comment has been minimized.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 5, 2022

@bors r=oli-obk

@bors
Copy link
Contributor

bors commented Jul 5, 2022

📌 Commit d5721ce has been approved by oli-obk

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. labels Jul 5, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 5, 2022
…laumeGomez

Rollup of 6 pull requests

Successful merges:

 - rust-lang#95503 (bootstrap: Allow building individual crates)
 - rust-lang#96814 (Fix repr(align) enum handling)
 - rust-lang#98256 (Fix whitespace handling after where clause)
 - rust-lang#98880 (Proper macOS libLLVM symlink when cross compiling)
 - rust-lang#98944 (Edit `rustc_mir_dataflow::framework::lattice::FlatSet` docs)
 - rust-lang#98951 (Update books)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 3e802d7 into rust-lang:master Jul 6, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 6, 2022
@RalfJung RalfJung mentioned this pull request Jul 6, 2022
@RalfJung RalfJung deleted the enum-repr-align branch July 6, 2022 02:27
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Jul 6, 2022
assert Scalar sanity

With rust-lang#96814 having landed, finally our `Scalar` layouts have the invariants they deserve. :)
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Jul 18, 2022
Change enum->int casts to not go through MIR casts.

follow-up to rust-lang/rust#96814

this simplifies all backends and even gives LLVM more information about the return value of `Rvalue::Discriminant`, enabling optimizations in more cases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
10 participants