Skip to content
This repository has been archived by the owner on Jun 10, 2019. It is now read-only.

Issue with moving sparse files with tmpfs_workspace plugin #442

Open
mengelmann opened this issue Jan 21, 2018 · 6 comments
Open

Issue with moving sparse files with tmpfs_workspace plugin #442

mengelmann opened this issue Jan 21, 2018 · 6 comments

Comments

@mengelmann
Copy link
Contributor

There is an issue with copying sparse files (e.g. raw format truncated with minimize_size plugin /yep, shrinking plugin supports raw, see PR #441/) when using plugin tmpfs_workspace: files are moved from temporary to final destination as a normal, not sparsed file.

I think it is because the file crosses filesystem boundaries and looks like Python function shutil.move() does not support sparse files in this case.

shutil.move(info.volume.image_path, destination)

It does not affect formats that do not use sparse files but just truncates original size (e.g. qcow2).

I do not know if we can do something about it. Workaround is to use fallocate --dig-hole <file> or use Python code to "punch hole" after moving the image to final destination, see: https://unix.stackexchange.com/questions/52012/can-a-file-that-was-originally-sparse-and-expanded-be-made-re-sparse/105948

Log with with added du -m - it's easy to see where we lost sparse:

[...]
[121944.415808] INFO: Shrinking the volume with qemu-img
[121944.756985] DEBUG: Executing: qemu-img convert -O raw /media/dist/9d00a825/volume.raw /media/dist/9d00a825/shrunk.raw
[122650.207996] DEBUG: Executing: du -m /media/dist/9d00a825/volume.raw /media/dist/9d00a825/shrunk.raw
[122654.117823] DEBUG: 2016     /media/dist/9d00a825/volume.raw
[122654.292822] DEBUG: 516      /media/dist/9d00a825/shrunk.raw
[122905.475855] DEBUG: Executing: du -m /media/dist/9d00a825/volume.raw
[122909.615993] DEBUG: 516      /media/dist/9d00a825/volume.raw
[123009.863853] INFO: Moving volume image
[124908.543825] INFO: The volume image has been moved to /media/dist/debian-stretch-amd64-TEST-20180121.raw
[124909.185886] DEBUG: Executing: du -m /media/dist/debian-stretch-amd64-TEST-20180121.raw
[124913.273811] DEBUG: 2049     /media/dist/debian-stretch-amd64-TEST-20180121.raw
[125013.417006] INFO: Unmounting tmpfs-based workspace
[...]

Pinging @liori just in case.

BTW: In my case tmpfs_workspace plugin boosts building image from 3m30s to 2m10s (SSD-backed VM).

@andsens
Copy link
Owner

andsens commented Jan 21, 2018

Heh, I mean there's a really obvious solution here: Change the order :-)
Shrinking the image after moving it would solve the issue. However, I'm pretty sure there are some tasks that rely on the image being "done" when MoveImage has completed, so it's maybe not a very clean solution, hmm...

@mengelmann
Copy link
Contributor Author

Changing the order is an obvious one, but I do not know bootstrap-vz internals well enough and assumed that we are not supposed to touch the image after it has been moved to the final destination and we can "touch" the image only when it is in the temporary directory.

@andsens
Copy link
Owner

andsens commented Jan 22, 2018

No you're right. It's not a rule per se, but it would definitely be something one would expect. I think we need to find another solution.

@mengelmann
Copy link
Contributor Author

Probably this issue has a very narrow impact - it affects only if three conditions are met:

  • raw image format
  • plugin tmpfs_workspace
  • plugin minimize_size

Maybe it's enough just to add a notice in tmpfs_workspace or minimize_size plugin documentation that in this combination raw images are not shrunk.

@andsens
Copy link
Owner

andsens commented Jan 22, 2018

Ideally one could just add a check to validate_manifest() that throws an error when minimize_size has shrink enabled and the image type is raw.

@liori
Copy link
Contributor

liori commented Jan 24, 2018

Some notes. Starting from Python 3.5, shutil.move has an optional parameter copy_function, that is supposed to copy a single file. It can be replaced by a sparse file-aware copying routine. I can't find any implementation though, quite surprising… Some discussion on how to do that properly is here: https://yakking.branchable.com/posts/moving-files-2-sparseness/ The required OS API was implemented in Python 3.3.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants