Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cp: implement --sparse option #3362

Closed
jfinkels opened this issue Apr 4, 2022 · 1 comment
Closed

cp: implement --sparse option #3362

jfinkels opened this issue Apr 4, 2022 · 1 comment
Labels

Comments

@jfinkels
Copy link
Collaborator

jfinkels commented Apr 4, 2022

The --sparse option to cp is not implemented yet. This is an option that controls whether an optimization is applied in storing the file on the filesystem, so if we just want to get things running, it probably suffices for us to just add this argument and ignore it.

GNU cp:

$ touch a && cp --sparse=always a b
# no output

uutills cp

$ touch a && ./target/release/cp --sparse=always a b
./target/release/cp: Option 'sparse' not yet implemented.
@jfinkels jfinkels added the U - cp label Apr 4, 2022
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 1, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 1, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 1, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 2, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 2, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
pimzero added a commit to pimzero/coreutils that referenced this issue Aug 3, 2022
This begins to address uutils#3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.
sylvestre added a commit that referenced this issue Aug 4, 2022
* cp: Refactor `reflink`/`sparse` handling to enable `--sparse` flag

`--sparse` and `--reflink` options have a lot of similarities:
 - They have similar options (`always`, `never`, `auto`)
 - Both need OS specific handling
 - They can be mutually exclusive

Prior to this change, `sparse` was defined as `CopyMode`, but `reflink`
wasn't. Given the similarities, it makes sense to handle them similarly.

The idea behind this change is to move all OS specific file copy
handling in the `copy_on_write_*` functions. Those function then
dispatch to the correct logic depending on the arguments (at the moment,
the tuple `(reflink, sparse)`).

Also, move the handling of `--reflink=never` from `copy_file` to the
`copy_on_write_*` functions, at the cost of a bit of code duplication,
to allow `copy_on_write_*` to handle all cases (and later handle
`--reflink=never` with `--sparse`).

* cp: Implement `--sparse` flag

This begins to address #3362

At the moment, only the `--sparse=always` logic matches the requirement
form GNU cp info page, i.e. always make holes in destination when
possible.

Sparse copy is done by copying the source to the destination block by
block (blocks being of the destination's fs block size). If the block
only holds NUL bytes, we don't write to the destination.

About `--sparse=auto`: according to GNU cp info page, the destination
file will be made sparse if the source file is sparse as well. The next
step are likely to use `lseek` with `SEEK_HOLE` detect if the source
file has holes. Currently, this has the same behaviour as
`--sparse=never`. This `SEEK_HOLE` logic can also be applied to
`--sparse=always` to improve performance when copying sparse files.

About `--sparse=never`: from my understanding, it is not guaranteed that
Rust's `fs::copy` will always produce a file with no holes, as
["platform-specific behavior may change in the
future"](https://doc.rust-lang.org/std/fs/fn.copy.html#platform-specific-behavior)

About other platforms:
 - `macos`: The solution may be to use `fcntl` command `F_PUNCHHOLE`.
 - `windows`: I only see `FSCTL_SET_SPARSE`.

This should pass the following GNU tests:
 - `tests/cp/sparse.sh`
 - `tests/cp/sparse-2.sh`
 - `tests/cp/sparse-extents.sh`
 - `tests/cp/sparse-extents-2.sh`

`sparse-perf.sh` needs `--sparse=auto`, and in particular a way to skip
holes in the source file.

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
@sylvestre
Copy link
Contributor

I think it is now done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

No branches or pull requests

2 participants