-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect when pthread_mutex_t is moved #3784
Conversation
9ec55eb
to
7b044ba
Compare
@rustbot ready |
Hi! Just wanted to quickly check if this fell through the cracks. (But no pressure if you are busy!). Thanks! |
It's in my list. :) Sorry for the slow reply. Too many things that need to be done...
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for giving this a shot! I like the AdditionalMutexData
part, but the way it gets used in unix/sync
should be changed a bit -- see the comments in the review.
src/concurrency/sync.rs
Outdated
@@ -214,6 +246,8 @@ pub(super) trait EvalContextExtPriv<'tcx>: crate::MiriInterpCxExt<'tcx> { | |||
} else { | |||
let new_index = this.machine.sync.mutexes.push(Default::default()); | |||
assert_eq!(next_index, new_index); | |||
let data = initialize_data()?; | |||
this.machine.sync.mutexes[new_index].data = data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of initializing with default values and then changing them, please immediately push the right values.
src/concurrency/sync.rs
Outdated
pub kind: MutexKind, | ||
|
||
/// The address of the mutex. | ||
pub address: Pointer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we really only care about the address, this should be a u64
. Currently you are also storing provenance but we don't actually care about that.
src/shims/unix/sync.rs
Outdated
@@ -118,13 +118,29 @@ fn mutex_get_id<'tcx>( | |||
ecx: &mut MiriInterpCx<'tcx>, | |||
mutex_op: &OpTy<'tcx>, | |||
) -> InterpResult<'tcx, MutexId> { | |||
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr(); | |
// FIXME: might be worth changing mutex_get_or_create_id to take the mplace | |
// rather than the `OpTy`. | |
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr(); |
☔ The latest upstream changes (presumably #3823) made this pull request unmergeable. Please resolve the merge conflicts. |
@rustbot author |
4e17711
to
cd25fd2
Compare
(please write @ |
Yeah, I wanted to post a comment first with a few things I am not sure are correct. I realized it was late after I pushed the changes so I decided to do that today. sorry if it caused confusion. Here is what I wanted to say: To check for the static initializers I have to ignore a "header" in the This also solves another issue. When i was reading the full contents of the mutex, miri thought there was a race condition. I did not spend too much time debugging this. But I believe it is because when trying to lock, for example, a mutex that was initialized with a static initializer, in thread A, all the bytes of Finally, instead of adding a @rustbot ready |
cd25fd2
to
2e1a917
Compare
Ah yes, this has to be done quite carefully to avoid races. I'd say for now just scratch this sanity check. Checking just some of the bytes is a bit too ugly IMO, after all we don't have to catch these kinds of bugs anyway. So let's just use the "zero ID" as indicator that this must be initialized and not do a separate check for now; we can explore how to add such a check separately in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a separate create_id
leads to some duplication in the API surface -- we'll need this for all types that have static and explicit initializers. But on the other hand, having the "explicit initialization" separate from "initialize from static initializer" is also nice, so yeah let's go with that.
src/shims/unix/sync.rs
Outdated
} | ||
|
||
fn mutex_set_kind<'tcx>( | ||
/// Checks if a mutex has been moved since its first use and returns an error if it has. | ||
fn mutex_check_moved<'tcx>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only called once, please inline it at the call site.
src/shims/unix/sync.rs
Outdated
) | ||
/// Returns the kind of a mutex, based on its contents. | ||
/// The contents must not contain the initial "header" where the mutex id is stored. | ||
fn kind_from_contents<'tcx>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind_from_static_initializer
or so would be better IMO, making it clear that we're parsing target-specific constants here but never our own data.
@@ -90,79 +90,120 @@ fn mutex_id_offset<'tcx>(ecx: &MiriInterpCx<'tcx>) -> InterpResult<'tcx, u64> { | |||
Ok(offset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a FIXME to the sanity check logic above that we should also check the other static initializers for platforms where they exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed after you checked that there are no other other initializers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it needs to be checked for Linux.
src/shims/unix/sync.rs
Outdated
ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr(); | ||
if contents == get_contents(ecx, default_initializer)? { | ||
return Ok(MutexKind::new(ecx.eval_libc_i32("PTHREAD_MUTEX_DEFAULT"))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still makes the assumption that the platform-specific field that encodes the mutex kind in the static initializer is after the "header". So it's not enough to be sure we are matching platform behavior, and it's not clean enough IMO to keep it as a half-measure.
I also just realized that this function is also called after the new ID has been written, so a full comparison with pthread_mutex_t
would anyway not work.
So I think the best option is to have a match &*ecx.tcx.sess.target.os
inside this function and then determine for each platform what we have to read to find the initial mutex kind -- basically exactly what mutex_get_kind
did before this PR, but only doing it here when we "parse" a static initializer, rather than doing it each time the mutex is used. On Linux this would read 4 bytes at offset if ecx.pointer_size().bytes() == 8 { 16 } else { 12 }
. For the other supported targets we'll have to check whether they have any static initializers other than PTHREAD_MUTEX_DEFAULT. And if any platform adds such initializers in the future, Miri will just ignore them until we realize that... maybe one day we can do the comparison with PTHREAD_MUTEX_DEFAULT
but it will require reworking get_or_create_id
a bit more so that create_obj
is called before the new ID is written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked and none of the other targets has any other static initializers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. I am not sure I interpreted all of your comment correctly, so let me know.
@@ -573,8 +623,6 @@ pub trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> { | |||
throw_ub_format!("destroyed a locked mutex"); | |||
} | |||
|
|||
// Destroying an uninit pthread_mutex is UB, so check to make sure it's not uninit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please preserve this comment, it is still relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rustbot author
src/shims/unix/sync.rs
Outdated
ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr(); | ||
if contents == get_contents(ecx, default_initializer)? { | ||
return Ok(MutexKind::new(ecx.eval_libc_i32("PTHREAD_MUTEX_DEFAULT"))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked and none of the other targets has any other static initializers.
@rustbot ready |
src/shims/unix/sync.rs
Outdated
ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr(); | ||
if contents == get_contents(ecx, default_initializer)? { | ||
// Only linux has static initializers other than PTHREAD_MUTEX_DEFAULT. | ||
if &*ecx.tcx.sess.target.os != "linux" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make this an exhaustive match, like the old mutex_get_kind
. When we add a new target we need to check that this is still true.
src/shims/unix/sync.rs
Outdated
ecx.machine.layouts.i32, | ||
)? | ||
.to_i32()?; | ||
return Ok(MutexKind::new(kind)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check that kind
is a valid mutex kind.
In fact, probably it would make sense to make this an enum? Then all the other mutex functions won't have to compare this with libc constants any more...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re making this an enum:
I don't disagree, but I am a bit unsure on how to do this. In my mind ideally the MutexKind
would be a type known only to the pthreads implementation, but currently it lives with the generic mutex implementation. At least by making this an i32
it would allow other mutex APIs other than pthreads to possibly reuse it. However if we make it an enum it might be too coupled with pthreads terminology.
As the code is now, we can't make the AdditionalMutexData
generic so that each mutex implementation can define its own type to hold whatever data they want.
On the other hand this might be abstracting way to soon since there is no other implementation that needs additional data. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the code is now, we can't make the AdditionalMutexData generic so that each mutex implementation can define its own type to hold whatever data they want.
We could use dyn Any
trickery.
But IMO it's not worth it. We can just say the enum lists all mutex kinds of all implementations. 🤷 You can mark the enum #[non_exhaustive]
to drive home that point, and force all matches to end with _ => panic!("pthread mutex with invalid mutex kind")
or so.
@rustbot ready |
src/concurrency/sync.rs
Outdated
@@ -69,16 +69,13 @@ declare_id!(MutexId); | |||
/// The mutex kind. | |||
/// The meaning of this type is defined by concrete mutex implementations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That line of the comment seems outdated now.
@@ -0,0 +1,72 @@ | |||
//@ignore-target-windows: No pthreads on Windows | |||
//@[unlock_register]only-target-linux: recursive initializers are non-standard. | |||
//@revisions: lock trylock unlock_register unlock_detect init |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please declare the revisions before using them.
Also, I am confused by the names. lock
really is static_initializer
, right? init
is initializer_function
. And why is unlock_register
only for Linux? Recursive mutexes exist on all platforms, they just don't have a static initializer on non-Linux.
Maybe it just needs comments explaining what these variants are testing, but currently I am quite confused. I expected only 2 tests -- one that uses the static initializer and one that uses pthread_mutex_init
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the tests:
In my mind what needs to be tested is the ability of the pthread_mutex_*
functions to record and then later detect a moved pthread_mutex_t
. So for each such function I call it two times. If the move is caught it means that the function both recorded the original address and detected the move. At least that's the intention.
For the unlock
function however I needed a slight different approach. Static initializers are only supported on linux, and unlock
can only be called on recursive mutexes if they are unlocked. So I used two different revisions, one to test that unlock
correctly registers the address (with a static initializer), and one to test that it detects a move, when called after init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks.
However, given our implementation, we know this is all the same codepath, so it doesn't seem worth testing all combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack, removed the extra tests.
@rustbot ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM, just two comment nits. Then please squash the commits.
a892791
to
aba9bb2
Compare
done thanks! |
Thanks. :-) @bors r+ |
I only just noticed that you even suggested this initially. :D
If you want to do this in a follow-up PR, let me know. :) |
☀️ Test successful - checks-actions |
What I am not sure about this PR is how to support storing the additional mutex data like its address and kind. If I understand correctly the
concurrency::sync::Mutex
struct is to be used by any mutex implementation. This possibly means that different implementation might want to store different data in the mutex. So any additional data should be implementation defined somehow. Solutions that come to mind:Box<dyn Any>
and the implementations can downcast their data when they fetch them.static mut
map betweenMutexID
s and the additional data.Let me know
Fixes #3749