PathBuf: replace transmuting by accessor functions #124410

Merged
merged 1 commit into from
Apr 27, 2024

Conversation

RalfJung
Member

@RalfJung RalfJung commented Apr 26, 2024

The existing repr(transparent) was insufficient anyway, as OsString was not repr(transparent). Furthermore, on Windows it was blatantly wrong, as OsString wraps Wtf8Buf, which is a repr(Rust) type with 2 fields:

/// An owned, growable string of well-formed WTF-8 data.
///
/// Similar to `String`, but can additionally contain surrogate code points
/// if they’re not in a surrogate pair.
#[derive(Eq, PartialEq, Ord, PartialOrd, Clone)]
pub struct Wtf8Buf {
    bytes: Vec<u8>,
    /// Do we know that `bytes` holds a valid UTF-8 encoding? We can easily
    /// know this if we're constructed from a `String` or `&str`.
    ///
    /// It is possible for `bytes` to have valid UTF-8 without this being
    /// set, such as when we're concatenating `&Wtf8`'s and surrogates become
    /// paired, as we don't bother to rescan the entire string.
    is_known_utf8: bool,
}

So let's just be honest about what happens and add accessor methods that make this abstraction-breaking act of PathBuf visible on the APIs that it pierces through.

Fixes #124409
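As a rough illustration of the approach described above (not the actual std code), the sketch below uses hypothetical stand-in types `Wtf8BufLike`, `OsStringLike`, and `PathBufLike`. Only the accessor name `as_mut_vec_for_path_buf` is taken from this thread; every other name is an assumption made for the example. The point is that each layer hands out the inner `Vec<u8>` through an explicitly named method, so the abstraction break shows up in the API instead of hiding behind a transmute.

```rust
// Hypothetical stand-ins for the std-internal types; not the real definitions.
struct Wtf8BufLike {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}

impl Wtf8BufLike {
    fn new(s: &str) -> Self {
        Wtf8BufLike { bytes: s.as_bytes().to_vec(), is_known_utf8: true }
    }

    fn is_known_utf8(&self) -> bool {
        self.is_known_utf8
    }

    // The awkward name makes it visible that the caller bypasses the type's
    // invariants and is responsible for upholding them.
    fn as_mut_vec_for_path_buf(&mut self) -> &mut Vec<u8> {
        &mut self.bytes
    }
}

struct OsStringLike {
    inner: Wtf8BufLike,
}

impl OsStringLike {
    // Forwarding accessor: no assumption about layout is needed.
    fn as_mut_vec_for_path_buf(&mut self) -> &mut Vec<u8> {
        self.inner.as_mut_vec_for_path_buf()
    }
}

struct PathBufLike {
    inner: OsStringLike,
}

impl PathBufLike {
    fn new(s: &str) -> Self {
        PathBufLike { inner: OsStringLike { inner: Wtf8BufLike::new(s) } }
    }

    fn as_bytes(&self) -> &[u8] {
        &self.inner.inner.bytes
    }

    // Previously this kind of access would have gone through a transmute.
    fn as_mut_vec(&mut self) -> &mut Vec<u8> {
        self.inner.as_mut_vec_for_path_buf()
    }
}

fn main() {
    let mut path = PathBufLike::new("dir/file.txt");
    path.as_mut_vec().truncate(3);
    println!("{}", String::from_utf8_lossy(path.as_bytes()));
    println!("flag still set: {}", path.inner.inner.is_known_utf8());
}
```

Note that this sketch, like the PR as described, does not update `is_known_utf8` when the bytes are mutated; it only makes the access explicit.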

@rustbot
Collaborator

rustbot commented Apr 26, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 26, 2024
@RalfJung
Member Author

RalfJung commented Apr 26, 2024

Turns out when this transmute was originally added 9 years ago, Wtf8Buf was still defined as

pub struct Wtf8Buf {
    bytes: Vec<u8>
}

So, it wasn't repr(transparent), but it was just a newtype. At some point is_known_utf8 got added and then the PathBuf assumption was silently broken.

A good lesson in why we should not use transmutes to circumvent field privacy.
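To make the layout point concrete, here is a small sketch with two hypothetical structs, `OneField` and `TwoFields`, standing in for the old and new shapes of `Wtf8Buf`. Adding a `bool` next to a `Vec<u8>` necessarily makes the struct larger than a bare `Vec<u8>`, so any code that reinterprets one as the other is no longer even size-compatible. The single-field case matches in size on current compilers, but `repr(Rust)` does not guarantee that either.

```rust
use std::mem::size_of;

// Hypothetical stand-in for the original single-field definition.
#[allow(dead_code)]
struct OneField {
    bytes: Vec<u8>,
}

// Hypothetical stand-in for the definition after the flag was added.
#[allow(dead_code)]
struct TwoFields {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}

// Returns the sizes of Vec<u8>, OneField, and TwoFields, in that order.
fn sizes() -> (usize, usize, usize) {
    (size_of::<Vec<u8>>(), size_of::<OneField>(), size_of::<TwoFields>())
}

fn main() {
    let (vec_size, one, two) = sizes();
    println!("Vec<u8>: {vec_size} bytes");
    println!("OneField: {one} bytes");
    println!("TwoFields: {two} bytes");
}
```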

@RalfJung
Member Author

RalfJung commented Apr 26, 2024

FWIW I did not review whether the things PathBuf does with direct access to the internals of Wtf8Buf are actually correct. The only mutation it does is truncate, and for the field that got added to Wtf8Buf, is_known_utf8, it sounds like truncation would preserve the invariant if it is done at character boundaries, but I can't dig into this right now. This PR doesn't make things any worse, it just makes them more visible.

A better fix may be to add a private truncate method to OsString for use by PathBuf (and it needs a last() as well).
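A minimal sketch of what such a narrower API could look like, assuming a hypothetical `Buf` type with the same two fields. The boundary check here uses the plain UTF-8 rule (a byte of the form `10xxxxxx` is a continuation byte) and is an assumption for illustration; real WTF-8 handling may need more care. The methods `truncate` and `last` are named after the suggestion above but are not the std implementation.

```rust
// Hypothetical buffer type mirroring the two fields discussed in this thread.
struct Buf {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}

impl Buf {
    fn new(s: &str) -> Self {
        Buf { bytes: s.as_bytes().to_vec(), is_known_utf8: true }
    }

    fn is_known_utf8(&self) -> bool {
        self.is_known_utf8
    }

    // True if `index` is the end of the buffer or the start of a code point.
    fn is_boundary(&self, index: usize) -> bool {
        match self.bytes.get(index) {
            None => index == self.bytes.len(),
            // Continuation bytes look like 10xxxxxx.
            Some(&b) => (b & 0xC0) != 0x80,
        }
    }

    // Truncating at a boundary cannot turn valid UTF-8 into invalid UTF-8,
    // so the flag can stay as it is.
    fn truncate(&mut self, new_len: usize) {
        assert!(self.is_boundary(new_len), "truncate must land on a boundary");
        self.bytes.truncate(new_len);
    }

    // Read-only access to the final byte, e.g. to test for a separator.
    fn last(&self) -> Option<u8> {
        self.bytes.last().copied()
    }
}

fn main() {
    let mut buf = Buf::new("dir/é");
    buf.truncate(4);
    println!("last byte: {:?}, flag: {}", buf.last(), buf.is_known_utf8());
}
```

With only these two methods exposed, the caller never gets a `&mut Vec<u8>`, so the invariant stays under the control of the type that owns it.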

@RalfJung RalfJung force-pushed the path-buf-transmute branch from 998fedb to c47978a Compare April 26, 2024 16:09
@Noratrieb
Member

lovely.. i wonder how many sussy transmutes there are in other places...
r=me

@RalfJung
Member Author

@bors r=Nilstrieb

@bors
Contributor

bors commented Apr 26, 2024

📌 Commit c47978a has been approved by Nilstrieb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 26, 2024
@RalfJung
Member Author

@bors rollup

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Apr 26, 2024
…strieb

PathBuf: replace transmuting by accessor functions

The existing `repr(transparent)` was anyway insufficient as `OsString` was not `repr(transparent)`. And furthermore, on Windows it was blatantly wrong as `OsString` wraps `Wtf8Buf` which is a `repr(Rust)` type with 2 fields:

https://github.com/rust-lang/rust/blob/51a7396ad3d78d9326ee1537b9ff29ab3919556f/library/std/src/sys_common/wtf8.rs#L131-L146

So let's just be honest about what happens and add accessor methods that make this abstraction-breaking act of PathBuf visible on the APIs that it pierces through.

Fixes rust-lang#124409
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 26, 2024
…iaskrgr

Rollup of 5 pull requests

Successful merges:

 - rust-lang#124341 (resolve: Remove two cases of misleading macro call visiting)
 - rust-lang#124383 (Port run-make `--print=native-static-libs` to rmake.rs)
 - rust-lang#124391 (`rustc_builtin_macros` cleanups)
 - rust-lang#124408 (crashes: add more tests)
 - rust-lang#124410 (PathBuf: replace transmuting by accessor functions)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 27, 2024
@crlf0710
Member

@RalfJung @Nilstrieb If it's exposing a mutable reference to the buffer, I think Wtf8Buf's is_known_utf8 field needs to be revalidated after certain modifications...

@bors bors merged commit 7cbba53 into rust-lang:master Apr 27, 2024
10 checks passed
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 27, 2024
Rollup merge of rust-lang#124410 - RalfJung:path-buf-transmute, r=Nilstrieb

@rustbot rustbot added this to the 1.79.0 milestone Apr 27, 2024
@RalfJung
Member Author

Yes, see what I wrote above: #124410 (comment)

Feel free to open an issue to track a proper cleanup here. I just have the time for a basic soundness fix right now, not the time to make it nice.

@RalfJung RalfJung deleted the path-buf-transmute branch April 27, 2024 07:50
@workingjubilee
Member

Hmm, it would have been sufficient to set the field to false every time that as_mut_vec_for_path_buf was called (on the assumption that whatever happens next would violate that).
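The suggestion above could look roughly like the following sketch. `Buf` is a hypothetical stand-in with the same two fields; only the method name `as_mut_vec_for_path_buf` comes from this thread. Clearing the flag is always safe because `is_known_utf8 == false` only means "not known", so the cost is a possible rescan later rather than a broken invariant.

```rust
// Hypothetical buffer type mirroring the two fields discussed in this thread.
struct Buf {
    bytes: Vec<u8>,
    is_known_utf8: bool,
}

impl Buf {
    fn new(s: &str) -> Self {
        Buf { bytes: s.as_bytes().to_vec(), is_known_utf8: true }
    }

    fn is_known_utf8(&self) -> bool {
        self.is_known_utf8
    }

    // Pessimistically assume the caller will invalidate the UTF-8 guarantee,
    // and drop the flag before handing out mutable access.
    fn as_mut_vec_for_path_buf(&mut self) -> &mut Vec<u8> {
        self.is_known_utf8 = false;
        &mut self.bytes
    }
}

fn main() {
    let mut buf = Buf::new("dir/file");
    println!("before: {}", buf.is_known_utf8());
    buf.as_mut_vec_for_path_buf().truncate(3);
    println!("after: {}", buf.is_known_utf8());
}
```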

Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

PathBuf incorrectly transmutes OsString to Vec<u8>
7 participants