Enable trimming of the diffdisk #356

Closed
jandubois opened this issue Oct 21, 2021 · 21 comments · Fixed by #1102
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@jandubois
Member

The diffdisk starts out small, but grows quickly as users create and delete container images. This can become an issue on laptops with limited free space.

I tried to enable trim support just for Alpine, for testing, but couldn't get it to work. Here is the patch I tried:

--- pkg/cidata/cidata.TEMPLATE.d/boot/05-persistent-data-volume.sh
+++ pkg/cidata/cidata.TEMPLATE.d/boot/05-persistent-data-volume.sh
@@ -14,7 +14,7 @@ DATADIRS="/etc /home /tmp /usr/local /var/lib"
 if [ "$(awk '$2 == "/" {print $3}' /proc/mounts)" == "tmpfs" ]; then
        mkdir -p /mnt/data
        if [ -e /dev/disk/by-label/data-volume ]; then
-               mount -t ext4 /dev/disk/by-label/data-volume /mnt/data
+               mount -t ext4 -o discard /dev/disk/by-label/data-volume /mnt/data
        else
                # Find an unpartitioned disk and create data-volume
                DISKS=$(lsblk --list --noheadings --output name,type | awk '$2 == "disk" {print $1}')
@@ -32,7 +32,7 @@ if [ "$(awk '$2 == "/" {print $3}' /proc/mounts)" == "tmpfs" ]; then
                                echo 'type=83' | sfdisk --label dos /dev/"${DISK}"
                                PART=$(lsblk --list /dev/"${DISK}" --noheadings --output name,type | awk '$2 == "part" {prin
                                mkfs.ext4 -L data-volume /dev/"${PART}"
-                               mount -t ext4 /dev/disk/by-label/data-volume /mnt/data
+                               mount -t ext4 -o discard /dev/disk/by-label/data-volume /mnt/data
                                for DIR in ${DATADIRS}; do
                                        DEST="/mnt/data$(dirname "${DIR}")"
                                        mkdir -p "${DIR}" "${DEST}"
--- pkg/qemu/qemu.go
+++ pkg/qemu/qemu.go
@@ -261,7 +261,7 @@ func Cmdline(cfg Config) (string, []string, error) {
                args = appendArgsIfNoConflict(args, "-boot", "order=c,splash-time=0,menu=on")
        }
        if diskSize, _ := units.RAMInBytes(cfg.LimaYAML.Disk); diskSize > 0 {
-               args = append(args, "-drive", fmt.Sprintf("file=%s,if=virtio", diffDisk))
+               args = append(args, "-drive", fmt.Sprintf("file=%s,if=virtio,discard=unmap", diffDisk))
        } else if !isBaseDiskCDROM {
                args = append(args, "-drive", fmt.Sprintf("file=%s,if=virtio", baseDisk))
        }

I created some files with dd if=/dev/urandom of=1.bin bs=64M count=64 iflag=fullblock (and similar commands) and verified that the diffdisk grew accordingly.

I then deleted the *.bin files and ran:

lima-alpine:~$ sudo fstrim -v /mnt/data
/mnt/data: 105074479104 bytes trimmed
lima-alpine:~$ sudo fstrim -v /mnt/data
/mnt/data: 0 bytes trimmed

But the size of the disk never shrinks.

I wonder if this is a macOS limitation, i.e. that qemu doesn't implement the sparse file logic for APFS.

Thoughts?

@jandubois jandubois added enhancement New feature or request help wanted Extra attention is needed labels Oct 21, 2021
@dee-kryvenko
Contributor

Will this patch+fstrim+qemu-img do the trick? That will be slightly better than building virt-sparsify on a Mac or running it in a secondary VM.

@jandubois
Member Author

Will this patch+fstrim+qemu-img do the trick?

I don't know (can you try and let us know?); I was hoping for virtio-blk to do the right thing by itself. Another thing to try would be virtio-scsi, which has had "discard" support for years, but I believe it is now fully functional in virtio-blk as well.

Having to copy the image after running fstrim is not a general solution, except in the case where the fs becomes almost empty, as you will need significant temporary storage to hold the copy at a time when you are likely to be close to out-of-space.

I'm not sure how well macOS supports sparse files directly; I'm only aware of .sparseimage and .sparsebundle objects, but not of regular files that have unallocated holes in them. So maybe this isn't implemented in qemu for macOS. One day I will have to hunt down the sources to check, but not today.

@jandubois
Member Author

Another thing to test would be to try this patch on Linux and see if it just works (I fully expect it to). That way we'll know if the issue is with the patch, or with the host OS.

@afbjorklund
Member

afbjorklund commented Oct 22, 2021

As far as I remember, you had to copy the image offline (with qemu-img convert) to reclaim space.

@jandubois
Member Author

As far as I remember, you had to copy the image offline (with qemu-img convert) to reclaim space.

Yeah, that's what @dee-kryvenko is referring to above. The problem is that when you want to reclaim space (because you have run out, or are close to it), you most likely don't have space for another (temporary) copy.
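
For anyone weighing the copy approach, here is a minimal sketch of a pre-flight check: it tests whether the filesystem holding a directory has room for a worst-case copy, assuming the copy could end up as large as the space the original currently occupies. The function name enough_space and the paths in the usage comment are illustrative only and not part of Lima.

```shell
# Succeed if the filesystem holding DIR has at least NEEDED_KIB available.
enough_space() (
    needed_kib=$1
    dir=$2
    # df -P prints one POSIX-format line per filesystem; with -k,
    # column 4 of the second line is the available space in KiB.
    avail_kib=$(df -Pk -- "$dir" | awk 'NR == 2 {print $4}')
    [ "$avail_kib" -ge "$needed_kib" ]
)

# Usage (hypothetical paths): check before making a temporary copy.
#   needed=$(du -k ~/.lima/default/diffdisk | awk '{print $1}')
#   enough_space "$needed" ~/.lima/default || echo "not enough room for a copy"
```

The check uses du rather than the file length, because the allocated size is what a full copy would have to fit next to.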

@emilte

emilte commented Jul 13, 2022

What happens if I delete the file? It has grown to 60GB on my system.

@jandubois
Member Author

What happens if I delete the file? It has grown to 60GB on my system.

It will break the VM. Maybe Alpine would recover because it is just a separate volume and not a real diffdisk, but it would be better/cleaner to just delete and recreate the VM.

@yumauri

yumauri commented Sep 17, 2022

As far as I remember, you had to copy the image offline (with qemu-img convert) to reclaim space.

Could someone please share the full procedure, step by step with commands, for reclaiming space while I still have some left 😅
Without purging all images and without reinstalling lima and/or the docker VM.
I couldn't find it :(

@afbjorklund
Member

afbjorklund commented Sep 17, 2022

Something like this:

$ limactl stop
$ qemu-img convert -O qcow2 -B ~/.lima/default/basedisk ~/.lima/default/diffdisk ~/.lima/default/diffdisk.new
$ mv ~/.lima/default/diffdisk ~/.lima/default/diffdisk.old
$ mv ~/.lima/default/diffdisk.new ~/.lima/default/diffdisk
$ limactl start

It's not always obvious what the real file size is though, with sparse files and such.

So you might also need to check with qemu-img, or with du --apparent-size, etc.

Assuming that your question was not how to clean up space inside the VM?


EDIT: Nope, that would require some more work before it actually works.

It is important to keep the same base disk and format as the original image.
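
Since the conversion may not actually produce a smaller file, the two mv steps above could be guarded. The following is only a sketch under that assumption: swap_if_smaller is a made-up helper name, it does not run qemu-img itself, and it compares allocated sizes as reported by du.

```shell
# Swap NEW into the place of OLD only when NEW occupies less disk space.
# OLD is kept as OLD.old so the change can be undone by hand.
swap_if_smaller() (
    old=$1
    new=$2
    # du -k reports allocated space in KiB, which is what matters for
    # sparse image files; the apparent length can be misleading.
    old_kib=$(du -k -- "$old" | awk '{print $1}')
    new_kib=$(du -k -- "$new" | awk '{print $1}')
    if [ "$new_kib" -lt "$old_kib" ]; then
        mv -- "$old" "$old.old" && mv -- "$new" "$old"
    else
        echo "not swapping: $new ($new_kib KiB) is not smaller than $old ($old_kib KiB)" >&2
        false
    fi
)

# Usage (hypothetical paths, with the VM stopped):
#   swap_if_smaller ~/.lima/default/diffdisk ~/.lima/default/diffdisk.new
```

If the helper refuses to swap, the original diffdisk is left untouched and the candidate file can simply be deleted.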

@yumauri

yumauri commented Sep 18, 2022

Thank you for the answer, but the resulting diffdisk.new file is the same size (31G in my case) as diffdisk ._.
Meanwhile, inside the VM df reports that 9.8G is used on /dev/vda1

@afbjorklund
Member

afbjorklund commented Sep 18, 2022

I think the fstrim step mentioned earlier was supposed to "zero out" all the unused disk space in the image?

Ultimately it would need some more clever tool within lima, like https://libguestfs.org/virt-sparsify.1.html

@yumauri

yumauri commented Oct 1, 2022

I did it the hard way :)

  • Copied basedisk and diffdisk to an external drive with plenty of space
  • Using vagrant and virtualbox, created a new virtual machine with nested virtualization support and 4 GB of memory
  • Added a new sync folder to the virtualbox machine, pointing to the external drive and mounted inside the virtualbox machine at exactly the same path as my local lima files
  • Installed libguestfs-tools inside the virtualbox machine
  • Executed sudo virt-sparsify --in-place diffdisk inside the virtualbox machine
  • Executed qemu-img convert ... inside the virtualbox machine
  • Replaced lima's diffdisk with the shrunken one

This way I managed to reduce the diffdisk file from 31 GB to 6.7 GB, which is even less than what df shows inside the lima machine. I don't know how, but it looks like everything still works :)

@CzBiX

CzBiX commented Oct 7, 2022

In my test, just adding discard=on to qemu's arguments and running fstrim in the guest OS immediately reduces the size of the diffdisk.
Please note that qemu uses a sparse file, so the apparent file size won't change. You should use du -h diffdisk to check its real size.
I'm using APFS on a Mac, but I believe it will be the same on Linux.

Right now I have to get qemu args from debug logs, then manually exec qemu with the added discard argument. @jandubois can you verify this and apply your patch to the repo?
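
To make the difference between the two sizes visible, here is a small sketch that prints both for a given file. It assumes only standard ls, du and awk; the function name sparse_report and the path in the usage comment are illustrative.

```shell
# Print the apparent size and the allocated size of a file, both in KiB.
# For a sparse file the apparent size can be far larger than the space
# the file actually occupies on disk.
sparse_report() (
    file=$1
    # Apparent size: the length stored in the inode (field 5 of ls -ln).
    apparent_bytes=$(ls -ln -- "$file" | awk '{print $5}')
    # Allocated size: blocks actually in use, reported by du in KiB.
    allocated_kib=$(du -k -- "$file" | awk '{print $1}')
    echo "apparent_kib=$((apparent_bytes / 1024)) allocated_kib=$allocated_kib"
)

# Usage (hypothetical path): run before and after fstrim in the guest.
#   sparse_report ~/.lima/default/diffdisk
```

If trimming works, allocated_kib should drop after fstrim while apparent_kib stays the same.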

@chrisx8
Contributor

chrisx8 commented Oct 13, 2022

@CzBiX, I can confirm that with discard=on in QEMU's args, running fstrim in the guest will reduce the size of the disk.

Here's the QEMU arg:

-drive file=/Users/chris/.lima/fedora/diffdisk,if=virtio,discard=on
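
For reference, a sketch of how such a -drive value could be assembled in a script. The helper name drive_arg is made up for illustration. The only assumption beyond the argument shown above is QEMU's documented rule that a comma inside a file name must be doubled, because commas separate options. QEMU's documentation also lists discard=on as an alias for discard=unmap, the value used in the patch at the top of this issue.

```shell
# Build the value for QEMU's -drive option with discard enabled.
# Commas separate options in this string, so a comma that is part of
# the file name is written twice.
drive_arg() (
    path=$(printf '%s' "$1" | sed 's/,/,,/g')
    printf 'file=%s,if=virtio,discard=on\n' "$path"
)

# Usage (hypothetical path):
#   qemu-system-x86_64 ... -drive "$(drive_arg "$HOME/.lima/fedora/diffdisk")"
```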

@jandubois
Member Author

@chrisx8 Sorry to abuse this issue to contact you, but do you happen to know a way to resize the diffdisk, i.e. to modify the max size it can grow to? I assume you cannot shrink it once the sparse filesize has expanded, but can you increase the max size after the fact (on macOS)?

@chrisx8
Contributor

chrisx8 commented Oct 14, 2022

@jandubois You can resize the virtual disks with qemu-img. It is part of the qemu formula (Homebrew package), so it runs on macOS.

qemu-img has a resize option, which can grow (and shrink - can be DANGEROUS) disks. See https://qemu-project.gitlab.io/qemu/tools/qemu-img.html#cmdoption-qemu-img-arg-resize and qemu-img --help.

Some examples:

# Grow diffdisk to 200GB
qemu-img resize ~/.lima/default/diffdisk 200G
# Increase the size by 10GB (note the + sign)
qemu-img resize ~/.lima/default/diffdisk +10G

Also, I wouldn't mind helping out if you'd like! Feel free to open an issue and tag me. If you prefer email, let me know here and I'll reach out.

@PKizzle

PKizzle commented Oct 14, 2022

Is there already a follow-up issue to automatically run fstrim when stopping the VM? If there is a way to also run fstrim after nerdctl rmi or rm that would be even nicer.

@Atemu

Atemu commented Oct 14, 2022

@PKizzle you'd have to configure a service in the VM to do that for you.

@jandubois
Member Author

There is no good way to run something from the guest-agent when you stop the VM. I'm not sure if we would get a SIGTERM before it gets killed.

It would be trivial to run fstrim during boot, which probably would be good enough for 95% of the users: if you delete a massive amount of images and need to reclaim some disk space on the host, just restart the VM.
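
A rough sketch of what such a boot-time step might look like. This is not Lima's actual boot script: the function names and the list of filesystem types are assumptions, and the mounts table is read in /proc/mounts format, the same way the 05-persistent-data-volume.sh script quoted above does with awk.

```shell
# List mount points whose filesystem type is one we want to trim.
# Reads a table in /proc/mounts format: device, mount point, type, ...
trim_targets() (
    mounts_file=${1:-/proc/mounts}
    awk '$3 == "ext4" || $3 == "xfs" || $3 == "btrfs" {print $2}' "$mounts_file" | sort -u
)

# Intended for a boot script: trim every target, and keep going if one
# filesystem does not support discard.
trim_all() {
    trim_targets "$@" | while IFS= read -r mountpoint; do
        fstrim -v "$mountpoint" || echo "fstrim failed for $mountpoint" >&2
    done
}
```

The util-linux version of fstrim also has an --all option that does something similar in one call; the explicit loop is shown here because a minimal guest image may ship a different fstrim implementation.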

@NedWilbur

NedWilbur commented Aug 17, 2023

qemu-img has a resize option, which can grow (and shrink - can be DANGEROUS) disks. See qemu-project.gitlab.io/qemu/tools/qemu-img.html#cmdoption-qemu-img-arg-resize and qemu-img --help.

I also want to highlight that there is a specific --shrink option for reducing the size.

# Shrink the diffdisk to 50GB
qemu-img resize --shrink ~/.lima/default/diffdisk 50G
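
Because shrinking is the dangerous direction, one option is to make a script spell out when --shrink is being used. The sketch below only prints the command and never runs it. resize_command is a hypothetical helper, and the current virtual size is assumed to be supplied by the caller (for example taken from the virtual-size field of qemu-img info --output=json).

```shell
# Print the qemu-img command for resizing IMAGE from CURRENT_BYTES to
# TARGET_BYTES, adding --shrink only when the image would get smaller.
# Nothing is executed; the caller decides whether to run the output.
# The image path is printed unquoted, so paths with spaces need care.
resize_command() (
    image=$1 current_bytes=$2 target_bytes=$3
    if [ "$target_bytes" -lt "$current_bytes" ]; then
        echo "qemu-img resize --shrink $image $target_bytes"
    else
        echo "qemu-img resize $image $target_bytes"
    fi
)

# Usage (hypothetical values): 100 GiB image, 50 GiB target.
#   resize_command ~/.lima/default/diffdisk 107374182400 53687091200
```

Note that shrinking the virtual size below the end of the guest's partition or filesystem loses data, so the filesystem inside the VM would have to be shrunk first.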

@Taylor150

qemu-project.gitlab.io/qemu/tools/qemu-img.html#cmdoption-qemu-img-arg-resize

Hello, is there an easy way to find where the diffdisk is located? Mine does not seem to be at ~/.lima/default/diffdisk; when I run the resize I get "No such file or directory".

I cannot prune my images or containers because the docker daemon doesn't start without rancher desktop...
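
One way to look for it is to search the likely Lima directories for files named diffdisk. The sketch below assumes the default ~/.lima location or a LIMA_HOME override; the Rancher Desktop directory in the usage comment is a guess and may differ on your system. find_diffdisks is an illustrative name.

```shell
# Print every regular file named "diffdisk" below the given directories.
# Directories that do not exist are skipped silently.
find_diffdisks() (
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -type f -name diffdisk
        fi
    done
)

# Usage (the second path is an assumption about Rancher Desktop on macOS):
#   find_diffdisks "${LIMA_HOME:-$HOME/.lima}" \
#       "$HOME/Library/Application Support/rancher-desktop/lima"
```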

DennisRasey pushed a commit to DennisRasey/lima that referenced this issue Jan 11, 2024
* Resolves lima-vm#356
* Add 'discard=on' argument to '-drive' flag for basedisk and diffdisk, so that
running `fstrim` in the guest would reduce the size of QEMU virtual
disks.

Signed-off-by: Chris Xiao <30990835+chrisx8@users.noreply.github.com>