Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AsFd impl for Stdout/StdoutLock can violate locking guarantees of those types. #114140

Open
the8472 opened this issue Jul 27, 2023 · 21 comments
Open
Labels
A-io Area: std::io, std::fs, std::net and std::path C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@the8472
Copy link
Member

the8472 commented Jul 27, 2023

The stdout/err APIs guarantee that they're globally synchronized:

Each handle returned is a reference to a shared global buffer whose access is synchronized via a mutex. If you need more explicit control over locking, see the Stdout::lock method.

This can be relied on to write whole lines at a time, e.g. for json-lines style logging where multiple threads shouldn't interleave their output. It can be ok if some random other library or thread outputs non-JSON output, but it's important that it's not emitted in the middle of a line that's currently being emitted.

The combination of impl AsFd for Stdout, try_clone_to_owned() and impl From<OwnedFd> for File can then be used to break that guarantee. AsRawFd is somewhat less of an issue because any use of the created RawFd should be unsafe.

Possible solutions:

  • weaken the guarantees
  • deprecate or remove the impls
  • add "don't do that" warnings

Related: rust-lang/libs-team#148 (comment) and following.

@the8472 the8472 added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-bug Category: This is a bug. A-io Area: std::io, std::fs, std::net and std::path labels Jul 27, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 27, 2023
@the8472 the8472 removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 27, 2023
@ChrisDenton
Copy link
Member

I think try_clone_to_owned is the weak link in the chain.

Converting back and forth between OwnedFd and File should be fine.

@the8472
Copy link
Member Author

the8472 commented Jul 27, 2023

No that's not enough. You can get multiple Stdout instances. They all share the underlying lock, but they have separate lifetimes so you can get another BorrowedFd even when the lock is taken.

Well, I suppose we could acquire the lock before returning the borrowedfd... but we'd have to tell libraries to not use dup() on borrowed descriptors.
But that's still weird because at least part of the motivation of the BorrowedFd lifetime is to prevent double-close. Which the try_clone_to_owned does properly solve. Removing it because Stdout made additional promises that are incompatible with cloning seems like shooting the messenger.

@ChrisDenton
Copy link
Member

Fair. But using the fd returned by AsRawFd is unsafe and OwnedFd -> File should be a no-op. Only try_clone_to_owned actually does something. And tbh, there has been some discussion recently about how safe dup actually is.

However, I do see your point.

@sunfishcode
Copy link
Member

a shared global buffer whose access is synchronized via a mutex

The mutex protects the buffer.

AsRawFd is somewhat less of an issue because any use of the created RawFd should be unsafe.

Lots of existing Rust code does things like unsafe { libc::ioctl(stdout.as_raw_fd(), libc::TCSETS, ptr) };. This use of an unsafe block means code like this is assuming it knows the safety condition needed to call libc like this. Consequently, we need to be careful before claiming that we can make the unsafety condition stronger.

The existing long-standing situation in Rust is that the raw fd safety condition does not include "you must respect the Stdout lock".

I recommend just adding documentation to warn users of the potential hazards.

@the8472
Copy link
Member Author

the8472 commented Jul 27, 2023

Stdout says

Each handle shares a global buffer of data to be written to the standard output stream. Access is also synchronized via a lock and explicit control over locking is available via the lock method.

i.e. access is synchronized, not just the buffer.

@the8472 the8472 changed the title AsRawFd impl for Stdout/StdoutLock can violate locking guarantees of those types. AsFd impl for Stdout/StdoutLock can violate locking guarantees of those types. Jul 27, 2023
@SUPERCILEX
Copy link
Contributor

My vote is for weakening the guarantees, but I also agree that @the8472 is reading too much into the existing guarantees and that they don't say anything about the stdout stream.

Each handle shares a global buffer of data to be written to the standard output stream.

When I read this line I hear "There's a global buffer where everyone can play nice and synchronize their writes to stdout, but that says nothing about others writing to stdout elsewhere".

@the8472
Copy link
Member Author

the8472 commented Jul 27, 2023

but that says nothing about others writing to stdout elsewhere".

Nominally that doesn't exist inside Rust. We assume exclusive ownership of file descriptors. This is part of IO Safety.

@SUPERCILEX
Copy link
Contributor

I need to reread the io safety docs, but then doesn't that mean any method of getting a file descriptor through unsafe to 0,1,2 is disallowed? I would be quite annoyed if a systems programming language said you can't mess with those FDs.

@sunfishcode
Copy link
Member

Accessing stdio fds through 0,1,2 is fine, in std-using code. std holds these descriptors live with an effectively 'static lifetime, so if std is present, you know they'll always be open, and you can BorrowedFd<'static>::borrow_raw(1) and so on. C code can work "as if" it works this way too.

@the8472
Copy link
Member Author

the8472 commented Jul 28, 2023

I disagree. Just because you know they're open does not mean they're safe to use. If a regular file is passed as Stdout and it gets mmaped then something futzing around with the file can lead to unsafety.
Acquiring a lock on Stdout should make that access exclusive because Stdout is the nominal owner of that file descriptor.

@the8472
Copy link
Member Author

the8472 commented Jul 28, 2023

I would be quite annoyed if a systems programming language said you can't mess with those FDs.

But many things already assume exclusive access to the FDs. @sunfishcode already mentions piping a jpeg file to stdout in rust-lang/libs-team#148 (comment). This can only work with exclusive access.

If we say that code cannot assume exclusive ownership of the stdio handles then reliable communication with the environment (via that mechanism) becomes impossible. That's nonsense imo.

@the8472
Copy link
Member Author

the8472 commented Jul 28, 2023

The situation could perhaps be papered over if we had a io::stdout().take() that acquires... "even more exclusive" access to the fd by leaving a /dev/null behind in its place.

@RalfJung
Copy link
Member

The question is whether outside code "interrupting" things printed to stdout is considered a logic problem (like a bad Hash implementation) or a violation of std's library invariants. The choice can be made either way, but it needs to be made consistently and documented properly.

Interrupting some jpeg or JSON printed to stdout sounds to me to be more of a logic issue than a potential soundness issue?

@RalfJung
Copy link
Member

I think try_clone_to_owned is the weak link in the chain.

I do agree this method is very unfortunate. It means that when I get an owned FD, I cannot know whether anyone else is also reading/writing that FD. This makes it fundamentally impossible to have guarantees like "uninterrupted stdout" if even just a BorrowedFd is ever handed out to an unknown party.

I would prefer if IO safety actually expressed exclusive ownership of the FD. Right now one could use IO safety to define a custom ExclusivelyOwnedFd type that ensures that there are no dups, but the existing OwnedFd doesn't provide this guarantee.

@the8472
Copy link
Member Author

the8472 commented Jul 28, 2023

Interrupting some jpeg or JSON printed to stdout sounds to me to be more of a logic issue than a potential soundness issue?

In principle a tool could try to mmap its inputs/outputs (for performance reasons). Then something else accessing fd 0/1/2 could become unsound.
Which is why I brought up a hypothetical io::stdout().take(). If one does that early enough in main then the unreliable locking becomes less of an issue.

@RalfJung
Copy link
Member

In principle a tool could try to mmap its inputs/outputs (for performance reasons). Then something else accessing fd 0/1/2 could become unsound.

Yes, in principle. The question is whether that's something that makes sense to support. You can just use a different FD instead, one you actually control and make sure nobody will dup.

@RalfJung
Copy link
Member

I opened #114167 for the general problem around try_clone. This discussion here is more specifically about how std manages ownership of the 0,1,2 handles.

@carbotaniuman
Copy link
Contributor

carbotaniuman commented Sep 22, 2023

This is a guarantee that we can't really make: if it's unsound to write to 0, 1, 2, then it becomes impossible to safely wrap a C library that calls printf. set_var should've taught us that making these kind of global guarantees is ripe with pain, and I do not believe we can salvage this in a way that does not result in set_var.

@the8472
Copy link
Member Author

the8472 commented Sep 22, 2023

Well, such a library is unsound if you're trying to write a file to stdout, e.g. a binary that's going to be executed later. The C library would just corrupt that. Contracts matter!

But as I commented previously #114140 (comment) it's probably better to add a take() method if we can't even ensure that libraries don't randomly clobber things.

@carbotaniuman
Copy link
Contributor

Right, but Rust code can do arbitrary bad misbehavior which is not unsound - I don't think breaking existing code and practice is worth it to ensure a guarantee that we might not even be making. I think a take() might be useful, but I think that is more a feature request and not really related to this.

@RalfJung
Copy link
Member

RalfJung commented Sep 22, 2023

Stdout/Stderr are very much shared resources, so I don't think we can say that it is a soundness bug when a binary dumped there gets interrupted by other things. That's a logic bug IMO. Even if the binary is executed later -- it's not a Rust soundness bug for a "compiler implemented in Rust" to produce a bad binary. For instance, a bug in a MIR transform is a logic bug from the perspective of Rust-the-language-used-to-implement-rustc, not a soundness bug. (It becomes a soundness bug only via the bootstraping cycle, but that's not a concern here.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: std::io, std::fs, std::net and std::path C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants