Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to fix ICE from re-interning an AllocId with different allocation contents #127442

Merged
merged 2 commits into from
Jul 22, 2024

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Jul 7, 2024

As far as I can tell, based on my investigation in #126741, the racy decoding scheme implemented here was never fully correct, but the arrangement of Allocations that's required to ICE the compiler requires some very specific MIR optimizations to create. As far as I can tell, GVN likes to create the problematic pattern, which is why we're noticing this problem now.

So the solution here is to not do racy decoding. If two threads race to decoding an AllocId, one of them is going to sit on a lock until the other is done.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 7, 2024
@saethlin
Copy link
Member Author

saethlin commented Jul 7, 2024

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 7, 2024
@bors
Copy link
Contributor

bors commented Jul 7, 2024

⌛ Trying commit 1486ae4 with merge 32ec295...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 7, 2024
Try to fix ICE from re-interning an AllocId with different allocation contents

I'm trying to fix the ICE in rust-lang#126741 and just based on println-debugging this lock scoping looks strange.

r? `@ghost`
@bors
Copy link
Contributor

bors commented Jul 7, 2024

☀️ Try build successful - checks-actions
Build commit: 32ec295 (32ec295edc7aef4bb3ba92cd3896d488e7a4c8c5)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (32ec295): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.3% [0.3%, 0.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results (secondary -0.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.7% [0.7%, 0.7%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.9% [-1.2%, -0.6%] 4
All ❌✅ (primary) - - 0

Cycles

Results (secondary -0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.5% [-0.6%, -0.4%] 2
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 700.852s -> 698.999s (-0.26%)
Artifact size: 328.40 MiB -> 328.43 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 7, 2024
Copy link

@aapoalas aapoalas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely looks fishy from the outside as well. Lock to get the alloc ID, release that lock and lock again go write it in? Sounds racy.

Looks good to me as a member of the peanut gallery.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 8, 2024

I'm more suprised it doesn't deadlock. Shouldn't any nested alloc id decoding now run into the lock, even single threaded?

Also, as per your analysis on the issue, it seems the core issue is Start decoding concurrently. not the holding of the lock. We should probably just sleep until the alloc id has been fully decoded in thr other thread to avoid the racy decoding

@saethlin
Copy link
Member Author

saethlin commented Jul 8, 2024

Hm, shouldn't a deadlock only arise if we happen to have a cyclic dependency between allocations, and threads happen to decode them in the opposite order? If that's correct, we'd need a program that produces such allocations to be sure that this works correctly, that sounds like a situation that wouldn't happen much naturally.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 8, 2024

After your change, you're always locking for a specific location. So if we could write

#![feature(const_ptr_write)]
#![feature(const_mut_refs)]
#![feature(const_heap)]
#![feature(core_intrinsics)]

struct List(Option<&'static List>);
const FOO: &'static List = unsafe {
    let x = std::intrinsics::const_allocate(std::mem::size_of::<List>(), std::mem::align_of::<List>()) as *mut List;
    x.write(List(Some(&*x)));
    &*x
};

without already overflowing rustc's stack on master (lol), then deserialization would now deadlock in one thread (when it wouldn't before your change).

This doesn't matter, so my surprise is resolved.

@saethlin
Copy link
Member Author

saethlin commented Jul 8, 2024

Hm. I was more concerned with this sort of structure:

struct Thing {
    inner: &'static Thing,
}

const X: Thing = Thing {
    inner: &Y,
};

const Y: Thing = Thing {
    inner: &X,
};

Is this sort of thing for sure impossible?

@oli-obk
Copy link
Contributor

oli-obk commented Jul 8, 2024

Is this sort of thing for sure impossible?

yes, constants will always evaluate other constants, even if just taking a reference to them. You need statics for this kind of thing. If we taught constants such things, we'd use a similar scheme to statics, so there would be no issue.

@saethlin saethlin force-pushed the alloc-decoding-lock branch from 1486ae4 to 963e157 Compare July 16, 2024 22:40
@saethlin
Copy link
Member Author

r? oli-obk

We should probably just sleep until the alloc id has been fully decoded in thr other thread to avoid the racy decoding

As far as I can tell, the only way to do this is to remove the in-progress decoding states, then still change the lock scope so that we hold the lock from the moment we discover that a state is Empty to when we can store back the Done state. The "in-progress" state that we're waiting to end is the Mutex being locked.

@saethlin saethlin marked this pull request as ready for review July 16, 2024 22:41
@rustbot
Copy link
Collaborator

rustbot commented Jul 16, 2024

Some changes occurred to the CTFE / Miri engine

cc @rust-lang/miri

@saethlin saethlin force-pushed the alloc-decoding-lock branch from 963e157 to 3623c76 Compare July 17, 2024 22:35
@oli-obk
Copy link
Contributor

oli-obk commented Jul 18, 2024

bors r plus

@bors
Copy link
Contributor

bors commented Jul 18, 2024

📌 Commit 3623c76 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 18, 2024
@cjgillot
Copy link
Contributor

Could you add a comment in the code summarizing the discussion about deadlocks? Looking for github review is always cumbersome when reading the code.

@saethlin
Copy link
Member Author

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 19, 2024
@saethlin
Copy link
Member Author

I don't trust my knowledge of const-eval enough to be sure my comment is correct, can one of you check it?

// deadlock with a single thread, because attempting to create such a pointer immediately
// blows the stack.
// It is also impossible to create two allocations (call them A and B) where A is a pointer to B, and B
// is a pointer to A, because const eval itself rejects such cycles.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't reject them, it will cause a query cycle because eval(A) -> eval(B) -> eval(A), but that's just wording nitpicking

// require that we decode other allocations. This cannot deadlock for two reasons:
// It is impossible to create an allocation that contains a pointer to itself and thus
// deadlock with a single thread, because attempting to create such a pointer immediately
// blows the stack.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time of writing this is only possible with the unstable const_allocate intrinsic, which only exists for testing purposes and has no route to stabilization

Technically custom compiler code could generate cyclic allocs directly, not sure what would happen then.

@saethlin saethlin force-pushed the alloc-decoding-lock branch from 2ab9fa2 to 107cf98 Compare July 21, 2024 16:31
@saethlin
Copy link
Member Author

Thanks! It looks like I wasn't too far off the mark, so I'll re-apply your approval in a few days, but feel free to send this to the queue earlier.

@oli-obk
Copy link
Contributor

oli-obk commented Jul 21, 2024

@bors r+

@bors
Copy link
Contributor

bors commented Jul 21, 2024

📌 Commit 107cf98 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 21, 2024
@bors
Copy link
Contributor

bors commented Jul 22, 2024

⌛ Testing commit 107cf98 with merge ae7b1c1...

@bors
Copy link
Contributor

bors commented Jul 22, 2024

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing ae7b1c1 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 22, 2024
@bors bors merged commit ae7b1c1 into rust-lang:master Jul 22, 2024
7 checks passed
@rustbot rustbot added this to the 1.82.0 milestone Jul 22, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (ae7b1c1): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.8% [0.2%, 2.5%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.7% [-1.4%, -0.3%] 7
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results (secondary -5.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-5.4% [-6.7%, -4.6%] 9
All ❌✅ (primary) - - 0

Cycles

Results (secondary 2.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.9% [2.9%, 2.9%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 768.535s -> 770.886s (0.31%)
Artifact size: 328.90 MiB -> 328.88 MiB (-0.01%)

@rustbot rustbot added the perf-regression Performance regression. label Jul 22, 2024
@saethlin saethlin deleted the alloc-decoding-lock branch July 22, 2024 13:57
@pnkfelix
Copy link
Member

Visiting for weekly rustc-perf triage

  • the regressions are to secondary benchmarks and this is fixing a subtle ICE that arises from a race condition (and may actually represent a chance of miscompilation, maybe?)
  • marking as triaged

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants