-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std::fs::copy returns the combined size of all streams on NTFS on Windows #44532
Comments
I have a fix ready if this is deemed an issue. |
cc @retep998 |
Personally I don't mind whether we report the size of all streams or just the main stream, and really I wish |
Technically, the breaking change happened with PR #26751 which changed the return value of ’std::fs::copy’ for multi-stream files on Windows + NTFS. This truly makes the return value of ’copy’ useless for platform/filesystem-agnostic code in the sense that it cannot be compared reliably with the file size. As noted in the doc of ’copy’, the underlying called function may change, so one should not rely on it to copy alternate streams and if it matters to them, they should use a platform specific function instead to be robust against implementation changes in the std library. |
Oops, I didn't mean to close. |
Regardless of whether Fortunately in this case we don't have to change what OS function we use, we just have to change the callback that we pass to |
Sorry if my comment was unclear. I don't mean to change the underlying function called to perform the copy, just the callback implementation to use the stream index and stream byte count. I agree that the implementation of std::fs::copy should go on calling the OS function for the reasons you mentioned. I was just pointing out that if an application requires such a platform-specific behavior (i.e. ensuring all streams in a file are copied), relying on the std function to do so may not be a robust and future-proof solution and would not work correctly when copying a file on a NTFS partition from Linux for example. |
The libs team discussed this today and the conclusion was that we agreed with sticking with the current implementation but we should figure out how to get the size reported the same one as returned by |
I can work on this if it helps. I have a fix ready and just need to add a unit test. |
Awesome, thanks @stephaneyfx! Any help is greatly appreciated :) |
@alexcrichton, I modified the documentation of |
Sure yeah, sounds great! |
On Windows with the NTFS filesystem, `fs::copy` would return the sum of the lengths of all streams, which can be different from the length reported by `metadata` and thus confusing for users unaware of this NTFS peculiarity. This makes `fs::copy` return the same length `metadata` reports which is the value it used to return before PR rust-lang#26751. Note that alternate streams are still copied; their length is just not included in the returned value. This change relies on the assumption that the stream with index 1 is always the main stream in the `CopyFileEx` callback. I could not find any official document confirming this but empirical testing has shown this to be true, regardless of whether the alternate stream is created before or after the main stream. Resolves rust-lang#44532
Made `fs::copy` return the length of the main stream On Windows with the NTFS filesystem, `fs::copy` would return the sum of the lengths of all streams, which can be different from the length reported by `metadata` and thus confusing for users unaware of this NTFS peculiarity. This makes `fs::copy` return the same length `metadata` reports which is the value it used to return before PR #26751. Note that alternate streams are still copied; their length is just not included in the returned value. This change relies on the assumption that the stream with index 1 is always the main stream in the `CopyFileEx` callback. I could not find any official document confirming this but empirical testing has shown this to be true, regardless of whether the alternate stream is created before or after the main stream. Resolves #44532
NTFS files may contain alternate data streams besides the main one. Since PR #26751,
std::fs::copy
returns the combined size of all streams, which is confusing and likely unexpected unless the user is aware of this Windows/NTFS peculiarity.An example where Windows uses alternate data streams seems to be when storing whether a file was downloaded from the internet (e.g. to display a security section and an "Unblock" button in the file properties). Such alternate streams are more like metadata, and to be consistent with other file metadata (attributes, permissions, extended attributes, etc.), their size should not be reflected in the value returned by
copy
.In summary, I think
std::fs::copy
should return the size of the main stream for the following reasons (alternate data streams are still copied, their size is just not reported, like other metadata):copy
.@retep998, what do you think?
Steps to reproduce
std::fs::copy
and note the number of bytes returned.Expected result
The number of bytes copied reported by
copy
is equal to the file size.Actual result
copy
reports a number of bytes greater than the file size.The text was updated successfully, but these errors were encountered: