simplest way to setup a "virtual" link (or a switch) between two QEMU VMs #1103

Open
vmaffione opened this issue Feb 20, 2017 · 30 comments

@vmaffione
Contributor

Hi,
I'd like to test Snabb as a fast "virtual" link between two QEMU VMs (using vhost-user as a backend for the virtio-net-pci VM network device).
Is there a minimal Snabb AppEngine program implementing this (i.e. two vhost-user ports with snabb unconditionally forwarding from one port to the other)?

Is there a minimal Snabb AppEngine program implementing a "virtual switch" like the standard linux in-kernel bridge (i.e. the one you create with brctl), where you can attach multiple vhost-user ports and a single ixgbe port?

Are there up-to-date instructions on which QEMU command line should be used?

Thanks!

@lukego
Member

lukego commented Feb 22, 2017

Howdy!

For the second question - make Snabb run as a bridge between VMs sharing an ixgbe port - check out the snabbnfv getting started.

For the first question - make a Snabb script that wires two VM virtio-net ports together directly - you can write a script that creates two VhostUser app instances and run it with snabb snsh myscript.lua. Give a yell if you have more questions about this :)

@vmaffione
Contributor Author

vmaffione commented Feb 22, 2017

Thanks a lot for the quick reply!
I think this does the job (first question):

module(..., package.seeall)

local vhostuser = require("apps.vhost.vhost_user")

function run (parameters)
   -- Expect exactly two vhost-user UNIX socket paths.
   if #parameters ~= 2 then
      print("Usage: vm2vm <unix-socket-1> <unix-socket-2>")
      main.exit(1)
   end
   local usock1 = parameters[1]
   local usock2 = parameters[2]

   -- One VhostUser app per VM; Snabb connects as the client (is_server=false).
   local c = config.new()
   config.app(c, "vh1", vhostuser.VhostUser, {socket_path=usock1, is_server=false})
   config.app(c, "vh2", vhostuser.VhostUser, {socket_path=usock2, is_server=false})

   -- Cross-connect the two apps in both directions.
   config.link(c, "vh1.tx -> vh2.rx")
   config.link(c, "vh2.tx -> vh1.rx")

   engine.configure(c)
   engine.main({report = {showlinks=true, showapps=true}})
end

I was not able to make it work because QEMU segfaults when using vhost-user, after the vhost-user session has been established with a Snabb VhostUser app (I see Snabb logging the vhost-user connection setup). This is the QEMU command that I use (I have set up the hugepages):

# qemu-system-x86_64 /path/to/image.qcow2 -m 512M -object memory-backend-file,id=mem,size=512M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -chardev socket,id=chr0,server,path=/var/run/vhu1.sock -netdev type=vhost-user,id=net0,chardev=chr0 -device virtio-net-pci,netdev=net0

And I run Snabb with

#  src/snabb vm2vm /var/run/vhu1.sock /var/run/vhu2.sock

where "vm2vm" is my script above.
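
Since the script takes two sockets, each VM needs its own vhost-user chardev and shared hugepage memory backend. A minimal bash sketch that reuses only the options from the QEMU command above; the helper name vhost_user_args and its parameters are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the vhost-user related QEMU options for one VM.
# Every option comes from the command line quoted in this comment; only the
# function name and its two parameters are assumptions.
vhost_user_args() {
    local sock="$1"   # UNIX socket path (QEMU listens on it: "server")
    local mem="$2"    # guest memory size, e.g. 512M
    local args=(
        -m "$mem"
        -object "memory-backend-file,id=mem,size=$mem,mem-path=/dev/hugepages,share=on"
        -numa node,memdev=mem
        -chardev "socket,id=chr0,server,path=$sock"
        -netdev type=vhost-user,id=net0,chardev=chr0
        -device virtio-net-pci,netdev=net0
    )
    # Join the options with single spaces on one line.
    printf '%s\n' "${args[*]}"
}

# Intended use, one VM per socket (left commented out because it needs a
# disk image, hugepages and root privileges):
# qemu-system-x86_64 /path/to/image.qcow2 $(vhost_user_args /var/run/vhu1.sock 512M) &
# qemu-system-x86_64 /path/to/image.qcow2 $(vhost_user_args /var/run/vhu2.sock 512M) &
# src/snabb vm2vm /var/run/vhu1.sock /var/run/vhu2.sock
```

The ids (mem, chr0, net0) can stay the same for both VMs because each QEMU process has its own namespace; only the socket path has to differ.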

Do you think I'm getting something wrong?
Do you happen to have some tutorials on how to set up QEMU + vhost-user + Snabb?

Thanks!

@lukego
Member

lukego commented Feb 22, 2017

Have you seen the HOWTO linked above? That goes into a lot of detail about running all the bits including QEMU. For me it is always crafting the QEMU command line that is the hardest part...

@vmaffione
Contributor Author

Ah ok, I will check, thanks!
Btw, I was using upstream QEMU, because vhost-user is now a standard feature that is also used by Open vSwitch.

@lukego
Member

lukego commented Feb 22, 2017

Upstream should be fine, except that if you restart Snabb you may also need to restart the VM.

@vmaffione
Contributor Author

vmaffione commented Feb 22, 2017 via email

@vmaffione
Contributor Author

Hi,
I've at least managed to run the two VMs with vhost-user + Snabb. I see some initialization going on in the Snabb output (vm2vm is the simple Snabb script above):

sudo src/snabb vm2vm /var/run/vm10-10.socket /var/run/vm11-11.socket
Get features 0x18428001
 VIRTIO_F_ANY_LAYOUT VIRTIO_NET_F_MQ VIRTIO_NET_F_CTRL_VQ VIRTIO_NET_F_MRG_RXBUF VIRTIO_RING_F_INDIRECT_DESC VIRTIO_NET_F_CSUM
Get features 0x18428001
 VIRTIO_F_ANY_LAYOUT VIRTIO_NET_F_MQ VIRTIO_NET_F_CTRL_VQ VIRTIO_NET_F_MRG_RXBUF VIRTIO_RING_F_INDIRECT_DESC VIRTIO_NET_F_CSUM
vhost_user: Skipped old feature cache in /tmp/vhost_features___var__run__vm10-10.socket
vhost_user: Caching features (0x18008001) in /tmp/vhost_features___var__run__vm10-10.socket
Set features 0x18008001
 VIRTIO_F_ANY_LAYOUT VIRTIO_NET_F_MRG_RXBUF VIRTIO_RING_F_INDIRECT_DESC VIRTIO_NET_F_CSUM
rxavail = 0 rxused = 0
rxavail = 0 rxused = 0

However, I don't see any traffic. Something is failing, and indeed QEMU reports:

qemu-system-x86_64: unable to start vhost net: 1: falling back on userspace virtio

which means that it is not really using vhost.
Tracing through the QEMU code, I see that the failure happens in the vhost_user_set_vring_enable function: QEMU checks for VHOST_USER_F_PROTOCOL_FEATURES (bit 30), which apparently is not advertised by the Snabb VhostUser app. Any idea why?

I'll also try with the Snabb QEMU fork.
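
The feature masks printed in the Snabb log can be decoded by hand to confirm that bit 30 is absent. A minimal bash sketch; the helper names set_bits and has_bit are made up for illustration, and the mask is the one from the "Get features" line above:

```shell
#!/usr/bin/env bash
# Print the positions of the bits that are set in a feature mask.
set_bits() {
    local mask=$(( $1 )) bit out=""
    for (( bit = 0; bit < 64; bit++ )); do
        if (( (mask >> bit) & 1 )); then
            out+="${out:+ }$bit"
        fi
    done
    printf '%s\n' "$out"
}

# Succeed (exit status 0) if bit number $2 is set in mask $1.
has_bit() {
    (( ($1 >> $2) & 1 ))
}

set_bits 0x18428001   # mask offered by Snabb ("Get features")
if has_bit 0x18428001 30; then
    echo "bit 30 (VHOST_USER_F_PROTOCOL_FEATURES) offered"
else
    echo "bit 30 (VHOST_USER_F_PROTOCOL_FEATURES) not offered"
fi
```

For 0x18428001 this lists bits 0 15 17 22 27 28, one per feature name in the log, and reports that bit 30 is not offered.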

@lukego
Member

lukego commented Mar 31, 2017

@vmaffione which QEMU version are you using? Currently our CI is testing interoperability with several upstream QEMU versions (2.1.3, 2.2.1, 2.3.1, 2.4.1, 2.5.1, 2.6.0, and our patched version) and seeing exactly the same behavior/performance with all of them. However, this list needs to be updated to include the latest QEMU releases and it is possible that they have made some changes that are not backwards compatible (Virtio-net is a moving target and version 1.0 was finalized after we wrote our code.)

@vmaffione
Contributor Author

Hi,
Indeed, your QEMU branch also works for me. I'm using upstream QEMU, that is, the latest code on the master branch of the QEMU git repository (git://git.qemu-project.org/qemu.git). The latest QEMU release is 2.8.0 (the one installed on my system as the distribution's QEMU package).

@lukego
Member

lukego commented Mar 31, 2017

Thanks for the report. I will add the latest QEMU versions to our CI coverage as the first step towards fixing support and ensuring that it works going forwards.

@lukego
Member

lukego commented Mar 31, 2017

@vmaffione I updated the list of QEMU versions to test and the CI is running that now. This triggered around 2500 new test runs that take around one minute each (booting VMs and sending traffic through snabbnfv with iperf or DPDK) and are spread across 10 servers. We can check the result once it completes and maybe identify a compatibility problem between Snabb and newer QEMU.

@lukego
Member

lukego commented Mar 31, 2017

(The reason it takes so many tests is that it is checking many permutations of configurations and also software setup inside the guests e.g. many versions of DPDK for the tests that need it.)

@vmaffione
Contributor Author

Ok sure. It should fail at feature negotiation time, so your VMs should not be able to receive/send any traffic through the virtio-net interfaces.

@lukego
Member

lukego commented Mar 31, 2017

One CI extension that we have talked about but never done is to also test the master branches of projects like QEMU and DPDK so that we are aware of any interop issues before they are released and have the chance to make a fix ahead of time and/or engage with them to understand what is going on. But - CI ambitions are infinite... :)

@lukego
Member

lukego commented Mar 31, 2017

Odd. The tests all passed including with QEMU 2.7.1 and 2.8.0. This is testing with the unmodified sources from the upstream QEMU release tarballs.

Here is one randomly chosen set of logs (stdout) from two QEMU VMs connected by a Snabb process and running iperf in between: logs. Can you spot what might be a difference compared with your setup? (Are you using the upstream QEMU or a vendor branch?) If the difference is not clear could you please gist more complete information including the full QEMU command-line and output?

@vmaffione
Contributor Author

I'm not using a QEMU release (tarball), I just use the current master branch on the official QEMU git repository (git://git.qemu-project.org/qemu.git). It may be that the difference is due to developments after 2.8.0, that are already in the master branch (note that the QEMU master branch is stable code).

@vmaffione
Contributor Author

QEMU just tells me

qemu-system-x86_64: unable to start vhost net: 1: falling back on userspace virtio

as explained above, and the cause is the feature bit.

This is the full QEMU command line I use:

qemu-system-x86_64 /home/vmaffione/git/vm/netmap.qcow2 -enable-kvm -smp 2 -m 512M -vga std -nographic -snapshot -device e1000,netdev=mgmt,mac=00:AA:BB:CC:0a:99 -netdev user,id=mgmt,hostfwd=tcp::20010-:22 -numa node,memdev=mem0 -object memory-backend-file,id=mem0,size=512M,mem-path=/dev/hugepages,share=on -device virtio-net-pci,netdev=data10,mac=00:AA:BB:CC:0a:0a,ioeventfd=on,mrg_rxbuf=on -chardev socket,id=char10,path=/var/run/vm10-10.socket,server -netdev type=vhost-user,id=data10,chardev=char10

@lukego
Member

lukego commented Mar 31, 2017

Interesting. I don't see anything relevant in the QEMU 2.9.0-rc release notes. I also don't immediately see a reason when browsing the commits to master:

qemu$  git log --oneline v2.8.0..master | grep vhost-user
0c0eb30 tests: fix vhost-user-test leaks
e7c83a8 vhost-user: delay vhost_user_stop
79cad2f qemu-options.hx: add missing id=chr0 chardev argument in vhost-user example
e0b283e vhost-user: delete chardev on cleanup
c5f048d vhost-user: Add MTU protocol feature and op
2858bc6 virtio: avoid using guest_notifier_mask in vhost-user mode
e10e798 tests/vhost-user-bridge: use contrib/libvhost-user
7b2e5c6 contrib: add libvhost-user
98206d4 tests/vhost-user-bridge: do not accept more than one connection
9652f57 tests/vhost-user-bridge: indicate peer disconnected
4e4212d tests/vhost-user-bridge: remove unnecessary dispatcher_remove
3d1ad18 tests/vhost-user-bridge: remove false comment

The error message strikes me as a little suspicious. Are you sure this is coming from the vhost-user subsystem? It kind of sounds like it is from trying to talk with the kernel where it would like to use /dev/vhost-net acceleration but that fails so it falls back on read()/write() from userspace. Could possibly be a bug introduced on the master branch that breaks argument parsing so that you are not really testing vhost-user even?

If you have the patience it would be interesting to git bisect to identify the relevant commit on QEMU.
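
A bisect session could be started roughly as follows. This is only a sketch: the wrapper name bisect_between is hypothetical, it must be run inside a QEMU git checkout, and which refs count as good and bad is for the tester to establish first.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "git bisect start <bad> <good>".
# Run it from inside a clone of the QEMU repository.
bisect_between() {
    if [ "$#" -ne 2 ]; then
        echo "Usage: bisect_between <good-ref> <bad-ref>" >&2
        return 2
    fi
    local good="$1" bad="$2"
    git bisect start "$bad" "$good"
}

# Example session (commented out; the refs are placeholders to adapt):
# bisect_between <good-ref> <bad-ref>
# ...rebuild QEMU, rerun the VM + Snabb test by hand, then mark the result:
# git bisect good    # vhost-user started correctly at this commit
# git bisect bad     # QEMU fell back on userspace virtio at this commit
# ...repeat until git names the first bad commit, then clean up:
# git bisect reset
```

Each step requires a QEMU rebuild, so narrowing the range with released tags first keeps the number of builds down.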

@vmaffione
Contributor Author

vmaffione commented Mar 31, 2017

I understand your doubts, but I'm sure :)
I pointed you at the exact "if branch" that causes the failure, and it is the first check in the function vhost_user_set_vring_enable, line 387 of hw/virtio/vhost-user.c. So it is vhost user code.
I just added a printf() statement there to be sure that check is failing.

This function (and the associated check) is not present on your QEMU fork, that's why.

@vmaffione
Contributor Author

The missing bit VHOST_USER_F_PROTOCOL_FEATURES has been in QEMU since 2015, whereas I see that your QEMU fork is aligned with 2014. This explains why your QEMU fork works and the current one doesn't.

@lukego
Member

lukego commented Mar 31, 2017

The curious thing is that our CI is testing 9 different versions of QEMU i.e. our old fork + the last 8 major upstream releases without any patches. None of these is failing in the CI tests. So I need to double-check that the CI is really testing what it is supposed to.

@lukego
Member

lukego commented Mar 31, 2017

@vmaffione It would help if you can verify that the problem happens with a released version of QEMU like 2.8.0. Just to rule out the possibility of interference from a recent change in master that is not included in the tests that I am doing.

@vmaffione
Contributor Author

I can do that if you point me at the release tarball.

@lukego
Member

lukego commented Mar 31, 2017

@lukego
Member

lukego commented Mar 31, 2017

Hm... I see something suspicious in the CI: it may be building the right QEMU but actually running the test with the standard packaged version. In that case the CI is not really testing what it thinks it is. @domenkozar do you agree with that assessment based on the runtime dependencies graph showing qemu-2.5.1 on a test that is supposed to be using 2.7.1?
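
One way a test script could guard against this kind of mismatch is to compare the version banner of the binary it is about to run with the version it claims to cover. A minimal bash sketch; the function name qemu_version_matches is made up, and it assumes the banner contains the word "version" followed by a dotted number:

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only if the version number found in a QEMU
# banner string is exactly the expected one.
qemu_version_matches() {
    local expected="$1" banner="$2"
    local re='version ([0-9]+(\.[0-9]+)*)'
    # Extract the dotted version that follows the word "version".
    [[ $banner =~ $re ]] || return 1
    [ "${BASH_REMATCH[1]}" = "$expected" ]
}

# Intended use in a test script (commented out because it needs QEMU):
# qemu_version_matches 2.7.1 "$(qemu-system-x86_64 --version)" || {
#     echo "wrong QEMU binary on PATH" >&2
#     exit 1
# }
```

Matching the extracted number exactly, rather than searching for a substring, keeps 2.7.1 from being confused with a longer version string that merely starts with it.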

@vmaffione
Contributor Author

With 2.7.1 I don't even see the virtio-net interface, because of yet another (unrelated) problem:

[    1.214409] virtio_net virtio0: virtio: device uses modern interface but does not have VIRTIO_F_VERSION_1

You could try to run snabb and qemu manually to check for the error I'm getting.

@lukego
Member

lukego commented Mar 31, 2017

Thanks @vmaffione for the extra details. Could be that we are lagging badly on QEMU support due to a CI error. Looking into it.

@lukego
Member

lukego commented Mar 31, 2017

@vmaffione I have manually tested 2.7.1 and 2.8.0 now. 2.7.1 is working for me but 2.8.0 is showing the "falling back on userspace virtio" error message. So - thanks for persisting with the report. This seems to be both an interop problem between Snabb and QEMU 2.8.0 and also likely a CI bug, since the problem is not visible there (but I cannot yet rule out something else, e.g. related to peculiar command line arguments).

@vmaffione
Contributor Author

ok!

lukego added the bug label Mar 31, 2017

@domenkozar
Member

@lukego it is possible to specify qemu and dpdk from git: https://github.com/snabblab/snabblab-nixos/blob/master/jobsets/snabb-matrix.nix#L41

I've just pushed a fix for qemu testing in snabblab/snabblab-nixos@8621ed4

dpino pushed a commit to dpino/snabb that referenced this issue Jun 19, 2018
Add ability to "snabb top" to skim back through time