Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When using std::ptr::copy and similar methods, does the memory need to be valid for T? #330

Closed
5225225 opened this issue Apr 24, 2022 · 9 comments · Fixed by rust-lang/rust#99084

Comments

@5225225
Copy link

5225225 commented Apr 24, 2022

Plenty of methods in std::ptr take a generic T and use *mut/const T, but does this mean that the following code is UB?

let src: u8 = 42;
let mut dest: u8 = 10;
std::ptr::copy(&src as *const u8 as *const bool, &mut dest as *mut u8 as *mut bool);

What about copying a u32 as a (u16, u8), or a (u16, u8, !), that doesn't uninit any padding, right?

Same question applies to copy_nonoverlapping, swap(_nonoverlapping), write_bytes. I don't see any other methods that this would apply to.

What about copying a pointer with an integer type, that preserves provenance?


I'd be inclined to say "no, it's not UB, the generic parameter is only used for its size and alignment, no other property of the type matters", but if that's the case, we should make that clear. It definitely seems to be implied.

@digama0
Copy link

digama0 commented Apr 24, 2022

It's not UB necessarily, but it is semantically different. The implementation is permitted to not copy padding bytes and/or make them uninit in the destination if you copy a u32 as a (u16, u8), so reading the destination at u32 later will most likely be UB.

Relevant discussion: rust-lang/rust#63159 . The resolution of that issue will determine whether copy and copy_nonoverlapping copy padding bytes or do a typed copy.

@elichai
Copy link

elichai commented Apr 24, 2022

About the UB issue, AFAIU the whole point of raw pointers operations (ie ptr.write(val) instead of *ptr = val) is to not assert validity, meaning that this code is valid AFAIU:

let src: MaybeUninit<i32> = MaybeUninit::uninit();
let mut dst: MaybeUninit<i32> = MaybeUninit::uninit();
let src_ptr: *const i32 = src.as_ptr();
let dest_ptr: *mut i32 = dest.as_mut_ptr();
unsafe {ptr::copy(src_ptr, dest_ptr, 1);}

because we never created a reference to u8 nor did we read its value, so it should be valid code.

but as @digama0 said, the type is used for more than just size+align, I read this code as "Copy the values like you would with *ptr = value; without asserting the validity of the value", so when you copy a bool it could just copy a single bit and ignore the rest, just like with a regular bool write.

@digama0
Copy link

digama0 commented Apr 24, 2022

@elichai That's not my impression, although you might be right in the specific case of ptr::copy as discussed in the linked issue. For ptr::write and ptr::read though I'm pretty sure these must be typed copies because the type signature of the function includes an explicit value of type T which is typed-copied from the callee's return place to the caller's temporary storage (at least semantically, even if this is optimized away in practice). So something like MaybeUninit::<u32>::uninit().as_ptr().read() looks like it should be UB (I think it is still open whether to do this for u32 but if you use bool instead then this is definitely UB), even if the value is unused.

Edit: Miri will currently report UB if you use ptr::write or ptr::read with uninit bool but not on u32. It will not complain if you ptr::copy an uninit bool.

@elichai
Copy link

elichai commented Apr 26, 2022

Edit: Miri will currently report UB if you use ptr::write or ptr::read with uninit bool but not on u32. It will not complain if you ptr::copy an uninit bool.

Well, to use ptr::write on uninitialized data you need to first read it into the stack as T not as MaybeUninit<T>, so that by itself is UB.
It seems like ptr::copy* are the only functions that allow you to write uninit data without first reading it.

@RalfJung
Copy link
Member

RalfJung commented Apr 30, 2022

As was already noted, this is basically the same as rust-lang/rust#63159: do these operations do a typed copy (like a read followed by a write), or do they propagate the underlying bytes unchanged and T basically doesn't matter (except for its type)?

For backwards compatibility, we probably have to go with the latter here.

@RalfJung
Copy link
Member

Edit: Miri will currently report UB if you use ptr::write or ptr::read with uninit bool but not on u32.

With -Zmiri-check-number-validity it will also complain on u32.

It seems like ptr::copy* are the only functions that allow you to write uninit data without first reading it.

👍

@5225225
Copy link
Author

5225225 commented Jul 8, 2022

Looks like (swap/copy) overlapping/non-overlapping were all documented as being untyped in rust-lang/rust#97712.

So the answer is "no", for all but write_bytes. Which says

Additionally, the caller must ensure that writing count * size_of::() bytes to the given region of memory results in a valid value of T. Using a region of memory typed as a T that contains an invalid value of T is undefined behavior.

Which I'm not sure if that's saying "of course you can't use a [42_u8] as a bool", or if the very act of doing the write is UB.

fn main() {
    let mut x: u8 = 0;

    unsafe {
        std::ptr::write_bytes::<bool>(&mut x as *mut u8 as *mut bool, 42, 1);
    }
}

By the strictest reading of the docs, this is UB. Miri doesn't complain on this.

@RalfJung
Copy link
Member

RalfJung commented Jul 8, 2022

Oh, good catch. Yes as far as I am concerned that was always meant to just mean "of course you can't use a [42_u8] as a bool".

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Jul 25, 2022
clarify how write_bytes can lead to UB due to invalid values

Fixes rust-lang/unsafe-code-guidelines#330

Cc `@5225225`
JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 25, 2022
clarify how write_bytes can lead to UB due to invalid values

Fixes rust-lang/unsafe-code-guidelines#330

Cc ``@5225225``
@5225225
Copy link
Author

5225225 commented Jul 26, 2022

Looks like this is done! The answer for all ptr methods that don't take/return a T by value (swap, copy, write_bytes) is "no. Only the size and align is used".

Now if that's actually respected is another story, see rust-lang/rust#99604. But I think at least the docs say that won't happen, so that's an implementation bug that can be tracked there.

@5225225 5225225 closed this as completed Jul 26, 2022
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
clarify how write_bytes can lead to UB due to invalid values

Fixes rust-lang/unsafe-code-guidelines#330

Cc ``@5225225``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants