Try to fix ICE from re-interning an AllocId with different allocation contents #127442
Conversation
@bors try @rust-timer queue
I'm trying to fix the ICE in rust-lang#126741, and just based on println-debugging, this lock scoping looks strange.

r? `@ghost`
☀️ Try build successful - checks-actions
Finished benchmarking commit (32ec295): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): Results (secondary -0.6%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: Results (secondary -0.5%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 700.852s -> 698.999s (-0.26%)
Definitely looks fishy from the outside as well. Lock to get the alloc ID, release that lock, and lock again to write it in? Sounds racy.
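In code terms, the racy shape being described looks roughly like this minimal sketch (the names `AllocId`, `Allocation`, and `decode_fn` are hypothetical stand-ins, not rustc's actual decoding API):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for the real types.
type AllocId = u64;

#[derive(Clone)]
struct Allocation(Vec<u8>);

// The suspicious shape: check under the lock, decode *outside* it, then
// take the lock again to store the result. Between the two critical
// sections, a second thread can also see "missing", decode the same id,
// and store its own (possibly different) allocation.
fn decode_racy(
    cache: &Mutex<HashMap<AllocId, Allocation>>,
    id: AllocId,
    decode_fn: impl FnOnce() -> Allocation,
) -> Allocation {
    if let Some(alloc) = cache.lock().unwrap().get(&id) {
        return alloc.clone();
    }
    // Lock released here: another thread can start decoding `id` too.
    let alloc = decode_fn();
    cache.lock().unwrap().insert(id, alloc.clone());
    alloc
}
```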
Looks good to me as a member of the peanut gallery.
I'm more surprised it doesn't deadlock. Shouldn't any nested alloc id decoding now run into the lock, even single threaded? Also, as per your analysis on the issue, it seems the core issue is …
Hm, shouldn't a deadlock only arise if we happen to have a cyclic dependency between allocations, and threads happen to decode them in opposite orders? If that's correct, we'd need a program that produces such allocations to be sure this works correctly; that sounds like a situation that wouldn't arise much naturally.
After your change, you're always locking for a specific location. So if we could write

```rust
#![feature(const_ptr_write)]
#![feature(const_mut_refs)]
#![feature(const_heap)]
#![feature(core_intrinsics)]

struct List(Option<&'static List>);

const FOO: &'static List = unsafe {
    let x = std::intrinsics::const_allocate(
        std::mem::size_of::<List>(),
        std::mem::align_of::<List>(),
    ) as *mut List;
    x.write(List(Some(&*x)));
    &*x
};
```

without already overflowing rustc's stack on master (lol), then deserialization would now deadlock in one thread (when it wouldn't before your change). This doesn't matter, so my surprise is resolved.
Hm. I was more concerned with this sort of structure:

```rust
struct Thing {
    inner: &'static Thing,
}

const X: Thing = Thing { inner: &Y };

const Y: Thing = Thing { inner: &X };
```

Is this sort of thing for sure impossible?
Yes; constants will always evaluate other constants, even when just taking a reference to them. You need statics for this kind of thing. If we taught constants such things, we'd use a similar scheme to the one statics use, so there would be no issue.
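For contrast, the statics version of the same shape is accepted today; a minimal sketch, reusing the `Thing` type from the example above:

```rust
struct Thing {
    inner: &'static Thing,
}

// Mutually referential *statics* are fine: each static is its own
// allocation, and taking a reference to a static does not force its
// initializer to be evaluated at that point.
static X: Thing = Thing { inner: &Y };
static Y: Thing = Thing { inner: &X };

fn main() {
    // Each `inner` points into the other static's allocation.
    assert!(std::ptr::eq(X.inner, &Y));
    assert!(std::ptr::eq(Y.inner, &X));
}
```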
Force-pushed from 1486ae4 to 963e157 (Compare).
r? oli-obk
As far as I can tell, the only way to do this is to remove the in-progress decoding states, and then also change the lock scope so that we hold the lock from the moment we discover that a state is Empty until we can store back the Done state. The "in-progress" state that we're waiting to end is the Mutex being locked.
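A minimal sketch of that locking scheme, under the same assumptions as the earlier snippet (hypothetical names; a per-entry `Mutex` stands in for the removed in-progress states):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type AllocId = u64;

#[derive(Clone, Debug)]
struct Allocation(Vec<u8>);

#[derive(Default)]
struct DecodeCache {
    // The outer lock only guards the map; each entry has its own lock.
    entries: Mutex<HashMap<AllocId, Arc<Mutex<Option<Allocation>>>>>,
}

impl DecodeCache {
    fn decode(&self, id: AllocId, decode_fn: impl FnOnce() -> Allocation) -> Allocation {
        // Briefly lock the map to find or create this id's slot.
        let slot = Arc::clone(self.entries.lock().unwrap().entry(id).or_default());
        // Hold the per-entry lock from the moment we see `None` ("Empty")
        // until `Some` ("Done") is stored. A second thread racing on the
        // same id blocks here until decoding finishes, instead of decoding
        // the allocation a second time; the "in-progress" state is exactly
        // "this Mutex is locked". (If `decode_fn` re-entered `decode` with
        // the same id, this would deadlock -- see the discussion above of
        // why such cyclic allocations cannot occur.)
        let mut guard = slot.lock().unwrap();
        if guard.is_none() {
            *guard = Some(decode_fn());
        }
        guard.clone().unwrap()
    }
}

fn main() {
    let cache = DecodeCache::default();
    let a = cache.decode(1, || Allocation(vec![1, 2, 3]));
    let b = cache.decode(1, || unreachable!("already decoded"));
    assert_eq!(a.0, b.0);
}
```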
Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri
Force-pushed from 963e157 to 3623c76 (Compare).
Could you add a comment in the code summarizing the discussion about deadlocks? Digging through GitHub review comments is always cumbersome when reading the code.
@bors r-
I don't trust my knowledge of const-eval enough to be sure my comment is correct; can one of you check it?
```rust
// deadlock with a single thread, because attempting to create such a pointer immediately
// blows the stack.
// It is also impossible to create two allocations (call them A and B) where A is a pointer
// to B, and B is a pointer to A, because const eval itself rejects such cycles.
```
It doesn't reject them; it will cause a query cycle, because eval(A) -> eval(B) -> eval(A). But that's just wording nitpicking.
```rust
// require that we decode other allocations. This cannot deadlock for two reasons:
// It is impossible to create an allocation that contains a pointer to itself and thus
// deadlock with a single thread, because attempting to create such a pointer immediately
// blows the stack.
```
At the time of writing, this is only possible with the unstable `const_allocate` intrinsic, which only exists for testing purposes and has no route to stabilization.

Technically, custom compiler code could generate cyclic allocs directly; not sure what would happen then.
Force-pushed from 2ab9fa2 to 107cf98 (Compare).
Thanks! It looks like I wasn't too far off the mark, so I'll re-apply your approval in a few days, but feel free to send this to the queue earlier.
@bors r+ |
☀️ Test successful - checks-actions
Finished benchmarking commit (ae7b1c1): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression

Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Max RSS (memory usage): Results (secondary -5.4%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: Results (secondary 2.9%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 768.535s -> 770.886s (0.31%)
Visiting for weekly rustc-perf triage.

@rustbot label: +perf-regression-triaged
As far as I can tell, based on my investigation in #126741, the racy decoding scheme implemented here was never fully correct, but the arrangement of Allocations required to ICE the compiler takes some very specific MIR optimizations to create. GVN likes to create the problematic pattern, which is why we're noticing this problem now.

So the solution here is to not do racy decoding: if two threads race to decode an AllocId, one of them is going to sit on a lock until the other is done.