Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add virtfs/9p mounts, instead of sshocker/sshfs #726

Merged
merged 2 commits into from
Apr 8, 2022

Conversation

afbjorklund
Copy link
Member

@afbjorklund afbjorklund commented Mar 13, 2022

This PR allows selecting mount type, as "9p"

The default mount type is still as it was before.

Add mount option variable, for "rw" vs "ro"

Add mount type, for "reverse-sshfs" vs "9p"

Issue #20

QEMU with 9p-darwin patches available here:
https://github.com/afbjorklund/homebrew-core/blob/qemu-9p-darwin/Formula/qemu.rb
Patches from: https://github.com/willcohen/qemu/commits/v6.2.0-9p-darwin (or use HEAD)
You can also use the latest/greatest QEMU HEAD.

EDIT: The qemu in brew now supports virtfs, also for darwin systems.


Example mount when booting with "sshfs":

:/tmp/lima on /tmp/lima type fuse.sshfs (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000,allow_other)

Example mount when booting with "9p":

mount1 on /tmp/lima type 9p (rw,sync,dirsync,relatime,access=client,msize=131072,trans=virtio)

docs/internal.md Outdated Show resolved Hide resolved
examples/default.yaml Outdated Show resolved Hide resolved
examples/default.yaml Outdated Show resolved Hide resolved
pkg/qemu/qemu.go Outdated Show resolved Hide resolved
pkg/qemu/qemu.go Outdated Show resolved Hide resolved
@afbjorklund afbjorklund force-pushed the virtfs branch 2 times, most recently from 269c799 to a802a7b Compare March 13, 2022 16:00
@afbjorklund
Copy link
Member Author

afbjorklund commented Mar 13, 2022

It is possible that it should not be called "mountType", and that the values should not be "sshfs" and "9p"

Those are the technical implementation details, it could be called "mountDriver" or whatever

And the human readable choices could be reverse-sshfs and virtfs, rather than the file system types.

I have decoupled them now, so there is a human type with constants and there are strings for the scripts...

@afbjorklund
Copy link
Member Author

afbjorklund commented Mar 13, 2022

@AkihiroSuda

In order to test this on the Mac, we need to either patch qemu in brew or wait until the PR has been merged

It can meanwhile be tested on Linux, with the right GHA-fu.

EDIT: It has been merged

@afbjorklund
Copy link
Member Author

afbjorklund commented Mar 13, 2022

Instead of having to transport all the mount variables in environment variables,
maybe it would be better to run the mount over ssh - similar to the ReverseSSHFS

In the initial PoC, I just added the mount directly after the mkdir -p of the mountpoint

pkg/cidata/cidata.TEMPLATE.d/boot/25-guestagent-base.sh

And the hostagent calls are basically no-op, instead of running ssh like with sshfs

        rsf := &reversesshfs.ReverseSSHFS{
                SSHConfig:           a.sshConfig,
                LocalPath:           expanded,
                Host:                "127.0.0.1",
                Port:                a.sshLocalPort,
                RemotePath:          expanded,
                Readonly:            !(*m.Writable),
                SSHFSAdditionalArgs: []string{"-o", sshfsOptions},
        }

There is no long-lived host process (but qemu) keeping the ssh connection open, though.
The ssh command would terminate, right after the mount (or the umount) is complete.

@chancez
Copy link
Contributor

chancez commented Mar 13, 2022

I tried to test this on my m1 Mac using qemu with patches from Homebrew/homebrew-core#96655 but got an error:

(⎈ |rancher-desktop:default) ~/p/lima ❯❯❯ which qemu-system-aarch64                                                                                                                                                                                                                                                                           virtfs
/opt/homebrew/bin/qemu-system-aarch64
(⎈ |rancher-desktop:default) ~/p/lima ❯❯❯ ls -l `which qemu-system-aarch64`                                                                                                                                                                                                                                                                   virtfs
lrwxr-xr-x  1 chancezibolski  admin  46 Mar 13 11:30 /opt/homebrew/bin/qemu-system-aarch64 -> ../Cellar/qemu/6.2.0_1/bin/qemu-system-aarch64
(⎈ |rancher-desktop:default) ~/p/lima ❯❯❯ less ~/.lima/default/ha.stderr.log                                                                                                                                                                                                                                                                  virtfs
{"level":"debug","msg":"firmware candidates = [/opt/homebrew/share/qemu/edk2-aarch64-code.fd /usr/share/AAVMF/AAVMF_CODE.fd /usr/share/qemu-efi-aarch64/QEMU_EFI.fd]","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"OpenSSH version 8.6.1 detected","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"AES accelerator seems available, prioritizing aes128-gcm@openssh.com and aes256-gcm@openssh.com","time":"2022-03-13T14:58:20-07:00"}
{"level":"info","msg":"Starting QEMU (hint: to watch the boot progress, see \"/Users/chancezibolski/.lima/default/serial.log\")","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"qCmd.Args: [/opt/homebrew/bin/qemu-system-aarch64 -cpu host -machine virt,accel=hvf,highmem=off -smp 4,sockets=1,cores=4,threads=1 -m 4096 -drive if=pflash,format=raw,readonly=on,file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd -boot order=c,splash-time=0,menu=on -drive file=/Users/chancezibolski/.lima/default/dif
{"level":"debug","msg":"qemu[stderr]: qemu-system-aarch64: -virtfs local,path=/Users/chancezibolski,mount_tag=mount0,security_model=mapped-xattr,readonly: warning: short-form boolean option 'readonly' deprecated","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"qemu[stderr]: Please use readonly=on instead","time":"2022-03-13T14:58:20-07:00"}
{"level":"info","msg":"Waiting for the essential requirement 1 of 3: \"ssh\"","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"qemu[stderr]: qemu-system-aarch64: -virtfs local,path=/tmp/lima,mount_tag=mount1,security_model=mapped-xattr: cannot initialize fsdev 'mount1': failed to open '/tmp/lima': No such file or directory","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"executing script \"ssh\"","time":"2022-03-13T14:58:20-07:00"}
{"error":"exit status 1","level":"info","msg":"QEMU has exited","time":"2022-03-13T14:58:20-07:00"}
{"level":"debug","msg":"executing ssh for script \"ssh\": /usr/bin/ssh [ssh -F /dev/null -o IdentityFile=\"/Users/chancezibolski/.lima/_config/user\" -o IdentityFile=\"/Users/chancezibolski/.ssh/id_rsa\" -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o NoHostAuthenticationForLocalhost=yes -o GSSAPIAuthentication=no -o Preferr
{"level":"debug","msg":"stdout=\"\", stderr=\"ssh: connect to host 127.0.0.1 port 60022: Connection refused\\r\\n\", err=failed to execute script \"ssh\": stdout=\"\", stderr=\"ssh: connect to host 127.0.0.1 port 60022: Connection refused\\r\\n\": exit status 255","time":"2022-03-13T14:58:20-07:00"}

Seems like perhaps lima was creating the mount location if it didn't exist previously? This was using the default template by using _output/bin/limactl start, interactively editing the config and setting mountType: 9p and modifying nothing else.

Things generally work well with 9p otherwise so far.

@AkihiroSuda AkihiroSuda modified the milestones: v0.9.1, V0.10.0 Mar 14, 2022
examples/README.md Outdated Show resolved Hide resolved
pkg/cidata/cidata.go Outdated Show resolved Hide resolved
@afbjorklund
Copy link
Member Author

afbjorklund commented Mar 14, 2022

Seems like perhaps lima was creating the mount location if it didn't exist previously?

Yeah, it was created too late. Please try again now.

@jandubois
Copy link
Member

I guess the next step will be to try to reproduce these issues with plain qemu, just so we can report them upstream if they are reproducible there and not specific to lima. I probably won't have time for further testing until next week though.

@afbjorklund
Copy link
Member Author

afbjorklund commented Apr 6, 2022

the weird additional attributes created on the host

This is due to using mapped-xattr, you can store them in a separate file with mapped-file if you prefer.

So, it's a feature.

See https://wiki.qemu.org/Documentation/9psetup

mapped-xattr: Some of the file attributes like uid, gid, mode bits and link target are stored as file attributes. This is probably the most reliable and secure option.
mapped-file: The attributes are stored in the hidden .virtfs_metadata directory. Directories exported by this security model cannot interact with other unix tools.


EDIT: oops, accidentally hardcoded security_model in one of the rebases, will fix to use SecurityModel

This PR allows selecting mount type, as "9p"

The default mount type is still as before.

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
External filesystems can be mounted in /etc/fstab.

It is only the reverse-sshfs that needs to use ssh.

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
@abiosoft
Copy link

abiosoft commented Apr 6, 2022

EDIT: oops, accidentally hardcoded security_model in one of the rebases, will fix to use SecurityModel

So I fixed this locally and tried the four security models. I noticed the following.

Security Model Read/Write existing files Write new files Chown Symlink Remarks
passthrough designed for rootful Qemu
none it is basically a fail-silent passthrough
mapped-xattr (default) still the best option for rootless Qemu
mapped-file unlike mapped-xattr, no device timeout errors for symlinks. Host created symlinks on guest are empty files, guest created symlinks on host are text files with symlink dest path

I do not notice a difference between mapped-xattr and mapped-file, however I am more cautious of mapped-file due to the following warning on the documentation page. Though I have no idea what it means.

Directories exported by this security model cannot interact with other unix tools.

@AkihiroSuda
Copy link
Member

@jandubois Could you report your issue to https://gitlab.com/qemu-project/qemu/-/issues ?

What should we do now? Are we ready to merge this and then explore how to optimize the performance ?

Copy link
Member

@jandubois jandubois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@AkihiroSuda AkihiroSuda merged commit 939f79b into lima-vm:master Apr 8, 2022
@jandubois
Copy link
Member

@jandubois Could you report your issue to https://gitlab.com/qemu-project/qemu/-/issues ?

I want to reproduce the issue with plain qemu before reporting it upstream. I've been notoriously short on time lately, so this will probably take a while.

What should we do now? Are we ready to merge this and then explore how to optimize the performance ?

Yes, I'm fine with merging it, and then iterating over it later. I think the remaining issues are not with this PR, but within upstream code (pending verification).

Only thing is that maybe there should be some text in default.yaml to make it clear that 9p support is still experimental. That can be added in a later PR before the release though.

Another thought that just occurred to me: is it possible to check if the qemu version supports 9p, and show an error if it doesn't?

@AkihiroSuda
Copy link
Member

suda@lima-default:/Users/suda/gopath/src/github.com/lima-vm/lima$ time du -s
33868   .

real    0m0.598s
user    0m0.000s
sys     0m0.346s
suda@lima-9p:/Users/suda/gopath/src/github.com/lima-vm/lima$ time du -s
32492   .

real    0m2.408s
user    0m0.001s
sys     0m0.291s

Why is the du result is different? 🤔

@jandubois
Copy link
Member

My gut feeling is that there is some code somewhere in the 9p implementation in qemu that doesn't take into account some networking differences between Linux and FreeBSD/Darwin, similar to the issue I fixed in vde_switch virtualsquare/vde-2#35 (on FreeBSD/Darwin you can get an ENOBUFS error on a blocking send call, and you have to retry; on Linux the call will just block until buffers become available again).

The errors I'm seeing are mostly "Network dropped connection on reset", which looks like some error occurs, and connections are being retried. That could explain the unexpected slowdown of filesystem operations as well.

But so far this is just a hunch, not backed by actual investigation.

@willcohen
Copy link

It seems that some additional followup notes for 9p are being flagged: https://lists.gnu.org/archive/html/qemu-devel/2022-04/msg00957.html

@afbjorklund
Copy link
Member Author

It seems to be that people have unreasonable expectations on what can be done on a remote file system, though I would have thought that symlinks would work better even if things like inotify wasn't available remotely. Maybe that's because of Docker ?

Probably using NFS would have stayed more in the "known bad" part of things, but I would still recommend using local files and something like Mutagen if having issues with running builds or databases or whatever over the network. It just doesn't work.

@jandubois
Copy link
Member

It seems to be that people have unreasonable expectations on what can be done on a remote file system, though I would have thought that symlinks would work better even if things like inotify wasn't available remotely. Maybe that's because of Docker ?

Yes, people have unrealistic expectations because things like inotify work with osxfs on Docker Desktop.

But I don't think expecting symlinks is unreasonable. I'm honestly surprised that they don't work; I would have expected them to work automatically without special support; they are really nothing more than a file with a special mode bit (or so I thought).

@afbjorklund
Copy link
Member Author

It seems the main issue with symlinks was creating on one end and using on the other, I thought they worked on the "same" side

@jandubois
Copy link
Member

It seems the main issue with symlinks was creating on one end and using on the other, I thought they worked on the "same" side

That seems to be the case, but doesn't help if you expect symlinks on the host to transparently work inside the guest (as long as the symlink target is also mounted). It may not matter to you if you don't use symlinks, but I happen to use them extensively both within and across device and filesystem boundaries.

This is kind of a basic assumption of lima that you can run nerdctl inside the VM, and it behaves just as if it was running on the host.

I'm obviously ignorant about how this actually works; I always thought a symlink is just a text file containing the target name, and a single mode bit in the directory telling the system that this is a symlink and not a regular file. I assumed the redirection happened outside the filesystem driver, as you can have cross-filesystem symlinks. Therefore it is rather surprising to me how this can be broken.

@abiosoft
Copy link

abiosoft commented Apr 9, 2022

Probably using NFS would have stayed more in the "known bad" part of things, but I would still recommend using local files and something like Mutagen if having issues with running builds or databases or whatever over the network. It just doesn't work.

I was actually surprised to the see amount of requests for writable volumes when Colima had readonly volumes as the default. I personally would only use a local volume for anything serious and could care less if the mounted volumes are not writable.

But I don't think expecting symlinks is unreasonable. I'm honestly surprised that they don't work; I would have expected them to work automatically without special support; they are really nothing more than a file with a special mode bit (or so I thought).

Symlinks actually work if you change the security policy to passthrough or none. That should be fine if you only want readonly mounts as the main issue with those policies (in my limited tests) are write permissions.

domq pushed a commit to epfl-si/wp-dev that referenced this pull request Oct 31, 2022
This is an attempt to placate sketchy `docker` replacement wannabees who believe that symlinks should be handled by Plan 9, or something. [I wish I was making that up...](lima-vm/lima#726 (comment))
domq pushed a commit to epfl-si/wp-dev that referenced this pull request Oct 31, 2022
This is an (ultimately unsuccessful) attempt to placate sketchy `docker` replacement wannabees who believe that symlinks should be handled by Plan 9, or something. [I wish I was making that up...](lima-vm/lima#726 (comment))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants