stdio handles (stdin, stdout, stderr) should be Seek at least on Unix #72802

Open
ijackson opened this issue May 31, 2020 · 5 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@ijackson
Contributor

Currently, std::io::Stdin et al do not implement Seek.

It is true that a Unix program is not guaranteed that its stdin/stdout/stderr will be seekable. Often they won't be. But, they may be. If stdin is redirected from a file, for example, lseek (the system call) will work fine.

The Seek trait does not promise that seek actually works. seek returns io::Result (quite properly). Indeed, a std::fs::File object can easily refer to a non-seekable object, depending on what path was passed.
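
To illustrate the point that holding a File does not guarantee seekability, here is a minimal Unix-only sketch. It wraps one end of a socket pair in a File, as an illustrative stand-in for opening a FIFO or terminal by path. The function name is hypothetical, and it assumes a toolchain recent enough to have the `std::os::fd` module. The call compiles; the failure only shows up at run time.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom};
use std::os::fd::OwnedFd;
use std::os::unix::net::UnixStream;

/// Hypothetical helper: wrap a socket in a `File` and try to seek on it.
/// The type system allows the call; on Unix the kernel rejects it,
/// because sockets (like pipes and FIFOs) are not seekable.
fn seek_on_socket_backed_file() -> io::Result<u64> {
    let (sock, _peer) = UnixStream::pair()?;
    // Safe conversions: UnixStream -> OwnedFd -> File.
    let mut file = File::from(OwnedFd::from(sock));
    file.seek(SeekFrom::Start(0))
}
```

Calling `seek_on_socket_backed_file()` should return an `Err`, which is the same kind of run-time failure a `Seek` impl on stdin would report when stdin is a pipe.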

Conversely, not implementing Seek makes it impossible to perform this operation in safe Rust. There are perhaps some workarounds like asking to open /dev/stdin, but they are non-portable and often quite horrible. Using unsafe { libc::lseek(...) } is quite straightforward (apart, perhaps, from a question about whether lseek64 should be used instead) but we shouldn't expect our users to do that when we could have provided a safe way.

There are file data integrity and synchronisation issues with Seek. But these are not significantly worse for stdin/out/err than for std::fs::File. They are not a memory safety issue and we have already chosen to address this in the documentation rather than through the type system.

I come from a Unix background. I don't know what the situation is on Windows. But, the fact that in Standard C, it is legal to call fseek on stdin (for example) suggests that the correct answer for Rust is to permit the user to try to call seek on all platforms.

@LeSeulArtichaut LeSeulArtichaut added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 31, 2020
@retep998
Member

retep998 commented May 31, 2020

The stdin/stdout handles on Windows can also be file handles, in which case seeking can function correctly. So this feature request applies equally well to Windows.

That said, I question what sort of use case you have where you need to seek stdin/stdout.

@koutoftimer

@retep998 In ACM-style programming contests with tight memory limits and a lot of input, you may want to read the input several times instead of allocating large amounts of additional memory.
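
A minimal sketch of that use case, written against any `Read + Seek` source because `Stdin` does not implement `Seek` today. The function name and the one-integer-per-line input format are assumptions made for illustration. It scans the input twice and keeps only two integers in memory instead of storing every value.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Hypothetical two-pass scan: find the maximum value, then rewind and
/// count how often it occurs. Lines that do not parse as `i64` are skipped.
fn max_and_count<R: Read + Seek>(input: &mut R) -> io::Result<(i64, usize)> {
    // Pass 1: find the maximum.
    let mut max = i64::MIN;
    for line in BufReader::new(&mut *input).lines() {
        if let Ok(value) = line?.trim().parse::<i64>() {
            max = max.max(value);
        }
    }

    // Rewind to the beginning; this is the step that needs `Seek`.
    input.seek(SeekFrom::Start(0))?;

    // Pass 2: count the lines equal to the maximum.
    let mut count = 0;
    for line in BufReader::new(&mut *input).lines() {
        if line?.trim().parse::<i64>() == Ok(max) {
            count += 1;
        }
    }
    Ok((max, count))
}
```

With an in-memory `std::io::Cursor` over `"3\n7\n7\n1\n"` this returns `(7, 2)`. With a seekable stdin the same function could be handed the stdin handle directly.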

@SolraBizna

I'm currently writing a tool that needs to be able to produce a very large output file, and write each block of that output file in a semi-random order. Its output file is specified by redirecting stdout. This particular tool could be changed to accept its output file as a command line argument, but if I were writing a tool that needed to avoid file creation race conditions (like the ones tmpnam creates), passing an open file descriptor to the tool would be the only safe way to do it.
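
A rough sketch of the out-of-order writing pattern described here, generic over `Write + Seek` so it would apply to a seekable stdout if one existed. `BLOCK_SIZE` and `write_block` are hypothetical names, and the fixed block size is an assumption; the actual tool's layout is not given in this thread.

```rust
use std::io::{self, Seek, SeekFrom, Write};

/// Hypothetical fixed block size in bytes.
const BLOCK_SIZE: u64 = 4;

/// Write `block` at the position of block number `index`,
/// regardless of the order in which blocks arrive.
fn write_block<W: Write + Seek>(out: &mut W, index: u64, block: &[u8]) -> io::Result<()> {
    out.seek(SeekFrom::Start(index * BLOCK_SIZE))?;
    out.write_all(block)
}
```

Writing blocks 2, 0, 1 into a `std::io::Cursor<Vec<u8>>` yields the same bytes as writing them in order, which is the behaviour the tool would want from a redirected stdout.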

@panzi

panzi commented Apr 19, 2022

I use this hack where it is OK. It loses already buffered input and is platform specific, though:

        use std::mem::ManuallyDrop;

        let lock = std::io::stdin().lock();

        // ManuallyDrop stops the temporary File from closing stdin
        // when it goes out of scope.
        #[cfg(any(target_family="unix", target_family="wasi"))]
        let mut seekable_stdin = ManuallyDrop::new(unsafe {
            use std::os::unix::io::{AsRawFd, FromRawFd};
            std::fs::File::from_raw_fd(lock.as_raw_fd())
        });

        #[cfg(target_family="windows")]
        let mut seekable_stdin = ManuallyDrop::new(unsafe {
            use std::os::windows::io::{AsRawHandle, FromRawHandle};
            std::fs::File::from_raw_handle(lock.as_raw_handle())
        });

My question: with non-lexical lifetimes, how can I be sure the lock is held long enough? If I put an explicit drop(lock); after all my I/O, can I be sure it really happens in that order?
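
A variation on the hack above that avoids `unsafe` on Unix, as a sketch only. It assumes a toolchain with the `std::os::fd` module and its `AsFd` and `BorrowedFd::try_clone_to_owned` APIs, which postdate the start of this issue. `seek_via_dup` is a hypothetical name. Like the hack, it does nothing about input already sitting in the stdin buffer.

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom};
use std::os::fd::AsFd;

/// Hypothetical helper: seek on whatever open file `handle` refers to.
/// The duplicated descriptor shares its file offset with the original,
/// so the seek is visible through `handle` as well. Dropping the
/// duplicate closes only the duplicate, never the original descriptor.
fn seek_via_dup<H: AsFd>(handle: &H, pos: SeekFrom) -> io::Result<u64> {
    let dup = handle.as_fd().try_clone_to_owned()?;
    let mut file = File::from(dup);
    file.seek(pos)
}
```

For stdin this would be called as `seek_via_dup(&lock, SeekFrom::Start(0))` with `lock` being the `StdinLock` from above. On the lifetime question: non-lexical lifetimes change how long borrows last, not when destructors run, so a guard bound to a named variable stays locked until the end of its scope or an explicit `drop(lock)`.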

@corneliusroemer

corneliusroemer commented Mar 6, 2023

What's the way forward here? I'm confused about why this is classified as an "enhancement". Isn't this arguably more of a "bug" or a question of design? By the way, here's the current implementation for File on Unix:

        pub fn seek(&self, pos: SeekFrom) -> io::Result<u64> {
            let (whence, pos) = match pos {
                // Casting to `i64` is fine, too large values will end up as
                // negative which will cause an error in `lseek64`.
                SeekFrom::Start(off) => (libc::SEEK_SET, off as i64),
                SeekFrom::End(off) => (libc::SEEK_END, off),
                SeekFrom::Current(off) => (libc::SEEK_CUR, off),
            };
            let n = cvt(unsafe { lseek64(self.as_raw_fd(), pos as off64_t, whence) })?;
            Ok(n as u64)
        }
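
The comment about casting in the quoted code can be checked in isolation. This small sketch uses a hypothetical function name and reproduces only the `off as i64` conversion from the `SeekFrom::Start` arm, not the system call.

```rust
/// Mirrors the `off as i64` cast in the quoted `SeekFrom::Start` arm.
/// Offsets above `i64::MAX` wrap around to negative values.
fn start_offset_as_i64(off: u64) -> i64 {
    off as i64
}
```

For example, `start_offset_as_i64(u64::MAX)` is `-1`, and a negative absolute offset is what the seek system call then rejects, so an oversized offset turns into an `io::Error` rather than a silent wrap.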
