Current handling of Unix close() can lead to silent data loss #98338
If - assuming we're talking about regular files here - you need durability, then relying on close() is not the right tool. You want something like:

```rust
let file = File::from(owned_fd);
file.sync_all()?;
```

Of course that's expensive since it forces the OS to do immediate writeback, so it should only be done when durability is critical.
I think it makes sense to have a close method for OwnedFd. The API proposal is just:

```rust
impl OwnedFd {
    fn close(self) -> io::Result<()>;
}
```
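A minimal sketch of what such a method could do, written outside std against a plain `OwnedFd` (the free function `close_fd` is my name for illustration, standing in for the proposed method; the raw `close` declaration assumes a Unix C library):

```rust
use std::fs::File;
use std::io;
use std::os::fd::{IntoRawFd, OwnedFd};

// Raw close(2), linked from the platform C library on Unix.
extern "C" {
    fn close(fd: i32) -> i32;
}

// Hypothetical stand-in for the proposed OwnedFd::close. It consumes
// the OwnedFd, so a second close is impossible, and it reports the
// result of close(2) instead of discarding it.
fn close_fd(fd: OwnedFd) -> io::Result<()> {
    // into_raw_fd defuses OwnedFd's Drop, so close(2) runs exactly once.
    let raw = fd.into_raw_fd();
    if unsafe { close(raw) } == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

fn main() -> io::Result<()> {
    let file = File::open("/dev/null")?;
    let owned: OwnedFd = file.into();
    close_fd(owned)?; // an error from close(2) is now observable
    println!("closed cleanly");
    Ok(())
}
```

Consuming `self` matches the manpage's advice that the descriptor must not be closed again even on error.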
@the8472 close(2) on Linux even states:
But it also includes the comment I quoted above about the importance of still checking the return value of close().

@ChrisDenton A scenario in which an error here may be relevant could be this: let's say you are processing items from a queue of some sort. If processing is successful, you remove the item from the queue; if not, it remains on the queue. An error from close would be a reason to leave the item on the queue, or to signal failure to the network client, etc. I don't know the particulars of the NFS issue, but I can imagine them. I also imagine various FUSE filesystems (eg, s3fs) do the bulk of their work only when close is called.

Another scenario I could imagine would involve pipes. If a prior write() put data into the pipe buffer, but the process on the other end of the pipe crashed before reading it, the only way to detect this would be via SIGPIPE (which is blocked by default in Rust; see #97889) or, I would imagine, by the return value of close(). (I have not personally verified this.)

@ChrisDenton Yes, you are correct about my proposal. It should consume and cause the dropping of the OwnedFd because, as the close(2) manpage notes, you really shouldn't attempt to close it again, even on error (with a side asterisk that on certain Unices, such as HP-UX, you SHOULD close it again after EINTR, but this is ambiguous in POSIX and is not the case on Linux at least; I'd argue that if somebody ports Rust to one of those OSs, EINTR handling should be internal to fn close there). I would also like to see it bubbled up to the higher-level interfaces (File, sockets, etc). This shouldn't be difficult to use in cross-platform code, I'd hope. I don't know if Windows has the same close() semantics, but even if on Windows it is identical to drop, having an interface available would be helpful. The one trick I'd see would be preventing a double-close (by Drop) when there has been a call to fn close.
My understanding is that looking at the return value can only ever be a diagnostic thing. Unless you're opening a file with O_SYNC or O_DIRECT, close doesn't guarantee anything, due to writeback.

NFS supports fsync. Anything that ignores fsync is not a durable filesystem and you shouldn't entrust critical data to it.

Close has no special treatment for pipes. Any data still stuck in a pipe buffer is lost.
FUSE, for its part, documents that filesystems should avoid doing this.
@the8472 and @sunfishcode I would like to understand -
If 1, how do you reconcile that with the strong admonishment in close(2) that "failing to check the return value when closing a file may lead to silent loss of data"? If 2, I'd like to hear more.

Over at https://stackoverflow.com/questions/24477740/what-are-the-reasons-to-check-for-error-on-close there is a conversation about this which I found interesting. Among the points are: "Consider the inverse of your question: 'Under what situations can we guarantee that close will succeed?' The answer is: when you call it correctly, and when you know that the file system the file is on does not return errors from close in this OS and Kernel version."

It is this second point that has me particularly troubled. On my Debian Linux box, there are 58 in-kernel filesystem modules and dozens more implemented via FUSE. Some of them (eg, ext4) are specifically designed for POSIX semantics. Some of them (eg, cifs, exfat) were not. I use sshfs and NTFS via FUSE on a regular basis and who knows about them; I am certain they are not fully POSIX compliant. Even NFS has never been fully compliant with POSIX semantics, only getting somewhat better in more recent years.

I'm not arguing for everything to be perfect here; in a strict sense, if one wants perfect ACID, one would not just fsync, but also check the return value of fsync and close, and fsync the parent directory after a rename as well. Except on macOS, where fsync doesn't actually force data to durable storage, unlike Linux, where fsync explicitly does. I just want the tools needed to be a careful programmer wherever possible, that's all.
This is probably a separate issue, but an incidental remark: I don't see any way to fsync a directory in std, either. File::open() won't open a directory (perhaps via O_DIRECTORY in custom_flags, but there is nothing in std that exposes O_DIRECTORY), and read_dir doesn't expose the fd for syncing either.
I myself am.
Ignoring errors from close is longstanding practice. Rust has ignored errors from close in Drop since 1.0. If we approach this issue as "Do we care about this kind of data loss?", it's difficult to justify saying no. But we can instead ask "Who is responsible for this kind of data loss?", and there are multiple options:
Sometimes fixing existing code to correctly handle errors from close is easy, but often it isn't.

To my knowledge, the main filesystems where close reports errors in practice are network filesystems such as NFS. If NFS were as common as it once was, things might look different, but in practice, POSIX filesystem APIs, with all the guarantees and common assumptions, aren't a great fit for networks. Instead of exposing the async nature of networks to applications via mechanisms that could map to Rust's Result type, NFS mostly hides it.

Neither option is without downsides. I don't have data, but I assume the costs of passing on the responsibility for errors from close to every user would outweigh the benefits.

As a separate consideration, stdout can be redirected to a file, and all the same data-loss scenarios are possible when writing to a file via stdout. If we add an expectation to Rust that code check for errors from close, that expectation would seem to extend to stdout as well.
This is why it's useful to make these decisions at a broad level, like Rust (or ideally higher, but one step at a time), so that we can collectively agree on what should be expected of individual careful programmers. I also aspire to be a careful programmer, and I once added logic to LLVM to check for such errors.
I'm ambivalent about adding the API. Generally it does not make sense to check for errors on close. But I do dislike the justification. Checking the result of close is not what you do when you care about your data making it to storage intact. As I said earlier, fsync is the go-to mechanism here. To me it would be purely a diagnostic that something has gone horribly wrong.
But it does? (playground)
Sure, but would you run a production database on a dodgy USB stick formatted with exfat?
Thank you for these thoughtful comments. I definitely do not want to create a precedent that requires libraries to check close(). You are right that this would be unworkable. Already I wouldn't necessarily trust libraries with writing (are they doing the POSIX-safe way to do atomic writes, with fsyncs and renames? Almost uniformly not, but then you may not always want that). I want the option to be able to check it myself in a safe way, that's all.

There are a lot of machines out there running on filesystems that I wouldn't count on to apply these guarantees, or that have other errors that may be signaled at close. FreeBSD can return ECONNRESET, for instance. An entire class of devices (Raspberry Pis) typically runs on dodgy micro SD cards. A particular bane of my existence for years has been VMs (and even hardware) that fail to propagate fsync. I *frequently* work with software that has to deal with USB sticks formatted with exfat or NTFS (or, heck, vfat). While NFS no doubt isn't as prolific as it once was, it is still with us, and now we have a bunch of others in that range: afs, coda, glusterfs, ocfs2, ceph, cifs, davfs, sshfs, s3fs, rclone -- to name a few. We have a *proliferation* of filesystems or filesystem-like things, actually.

It is an explicit design goal of the program I was working on when I created this issue to be maximally compatible with filesystems that differ from POSIX semantics in significant ways; for instance, s3fs, in which a newly created file may not be able to be opened for reading right away due to eventual consistency.

I care about data integrity. I am realistic that checking the return value from close is not itself sufficient to guarantee it. I don't believe there exists a perfect guarantee, in fact, because hardware isn't perfect. But I want to take every step I reasonably can, and which documentation has shown I ought to, along that path -- in part because I realize I am dealing with filesystems with a lot more failure modes than ext4 on a local SSD.

I wouldn't try to run PostgreSQL atop S3. But for Filespooler, written in Rust, this is an explicit design goal and documented practice. Although written in Go, a similar tool, NNCP, is something I use for getting data to airgapped backup & archival systems.
There seem to be some separate scenarios here.
Ok. Then imo it should be documented in a way that redirects users to fsync/sync_all for durability concerns.
Apparently we actually were debug-asserting on this since 1.0, and the only reason it hasn't come up is the precompiled stdlib.
The new issue is about closing |
While working on #98209 , I went searching for where close() is called. I found it in std/src/os/fd/owned.rs:
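The Drop impl in question discards the result of close(2). A paraphrase of its shape, demonstrated on a stand-in type rather than quoting the std source verbatim (`DemoOwnedFd` is my illustrative name):

```rust
use std::os::fd::IntoRawFd;

// Raw close(2), linked from the platform C library on Unix.
extern "C" {
    fn close(fd: i32) -> i32;
}

// Stand-in for std's OwnedFd, showing the shape of its Drop impl:
// the return value of close(2) is discarded.
struct DemoOwnedFd {
    fd: i32,
}

impl Drop for DemoOwnedFd {
    fn drop(&mut self) {
        // A double close would be far worse than a missed error, but
        // any I/O error close(2) reports here is silently lost.
        let _ = unsafe { close(self.fd) };
    }
}

fn main() {
    let raw = std::fs::File::open("/dev/null").unwrap().into_raw_fd();
    let demo = DemoOwnedFd { fd: raw };
    drop(demo); // close(2) runs; its result, success or failure, vanishes
}
```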
The Linux close(2) manpage states:
Anyhow, checking the return value of close(2) is necessary in a number of cases. But here we have a conundrum, because what can we possibly do with a failure of close(2) inside a Drop impl?
These all have their pros and cons. But aren't we looking for something more like this?
```rust
fn close(self) -> io::Result<()>
```

In fact, a Close trait could define this function, and it could be implemented on files, pipes, sockets, etc.

Meta
rustc --version --verbose: