
Virtio net #681

Merged: 13 commits merged into snabbco:master on Feb 1, 2016
Conversation

@ghost commented Dec 22, 2015

This is an implementation of the virtio-net driver part of the virtio specification. The driver uses indirect descriptor tables and preallocated virtio headers for better performance. Checksum offload packet preparation is available as an optional feature.

It obsoletes #645.

Nikolay Nikolaev added 6 commits December 15, 2015 18:55
The functions will be used from the userspace virtio_net driver
to prepare the packet before sending it to the virtio device.
The full checksum is done by the device.

Fix the long line in the tcp4 selftest.
This makes the packet an object and allows, for example, p:free() instead
of packet.free(p).

Add a new method `physical` to get the physical address of packet.data.
The fstruct function creates a dynamic structure out of a ctype string definition.

Also add two accessor functions for reading and writing structures assigned to a file descriptor.
The VIRTIO_NET_F_MRG_RXBUF option changes the format of the virtio-net
packet header. Make it optional by adding an argument to the
VirtioNetDevice new method: disable_mrg_rxbuf.
An implementation of the virtio-net driver specification, intended to run
inside a guest virtual machine.

Using indirect descriptors allows more effective use of the descriptor
table. Each ring descriptor is statically tied to an indirect descriptor table
in which a preallocated packet header is placed. Adding a buffer is then simply
a matter of finding the index of the descriptor and putting the buffer's
physical address into its addr field, as sketched below.
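
A minimal sketch of that layout. The names (ring, indirect, init_slot, add_buffer) are illustrative rather than the actual driver symbols; only the descriptor fields and the VRING_DESC_F_NEXT flag follow the standard vring layout.

-- Sketch only: ring.indirect[i] is assumed to be the indirect descriptor
-- table statically assigned to ring slot i.
local VRING_DESC_F_NEXT = 1  -- from the virtio ring ABI

-- Entry 0 of each slot's indirect table holds the preallocated virtio-net
-- header and chains to entry 1, which is filled in per packet.
local function init_slot(ring, i, header_paddr, header_len)
   local tbl = ring.indirect[i]
   tbl[0].addr  = header_paddr
   tbl[0].len   = header_len
   tbl[0].flags = VRING_DESC_F_NEXT
   tbl[0].next  = 1
end

-- Adding a buffer is just writing its physical address and length into
-- entry 1 of the slot's indirect table.
local function add_buffer(ring, i, buf_paddr, buf_len)
   local tbl = ring.indirect[i]
   tbl[1].addr  = buf_paddr
   tbl[1].len   = buf_len
   tbl[1].flags = 0
end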
local min_features = C.VIRTIO_RING_F_INDIRECT_DESC + C.VIRTIO_NET_F_CSUM
local want_features = C.VIRTIO_F_ANY_LAYOUT +
                      C.VIRTIO_RING_F_INDIRECT_DESC +
                      C.VIRTIO_NET_F_CSUM +

Contributor:
We don't actually want to negotiate checksum offloading, do we?

Author:
Right, this should depend on the use_checksum argument.
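
Something like this could work (a sketch only; it assumes the constructor receives a use_checksum flag as discussed, and reuses the feature constants quoted above):

-- Sketch: gate the CSUM feature bit on the use_checksum argument.
local want_features = C.VIRTIO_F_ANY_LAYOUT + C.VIRTIO_RING_F_INDIRECT_DESC
local min_features  = C.VIRTIO_RING_F_INDIRECT_DESC
if use_checksum then
   -- Only negotiate (and require) checksum offload when asked to.
   want_features = want_features + C.VIRTIO_NET_F_CSUM
   min_features  = min_features  + C.VIRTIO_NET_F_CSUM
end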

@wingo (Contributor) commented Dec 23, 2015

In general, looking pretty good to me! Some comments above. Especially looking forward to a version that avoids checksumming :)

I realize this might not be a thing you can do quickly, but what are your thoughts on testing and benchmarking? We should put some measures in place to make sure the functionality doesn't regress, and, if we can, it would be good to ensure that performance doesn't regress either. /Cc @lukego for his thoughts there.

end

function fieldwr(field, fd, val)
   local buf = ffi.typeof('$ [1]', field.ct)()

Contributor:
This function is a source of a trace abort.

e.g. in a -jv output there are these lines

[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE  22 (20/0) zone.lua:38 -- fallback to interpreter]

AFAIU this means that trace 22, coming from trace 20, met something it couldn't compile, so it falls back to the interpreter. I don't know why it printed that line four times; presumably it retried, hoping the trace would go somewhere else. Either way, this allocation won't be sunk and the whole thing will run on the interpreted path. It's only used in one place; let's just inline it into its caller. Same with fstruct and fieldrd.

Author:
The virtio PCI BAR0 access is done at initialization; at runtime only the device notification is done, and with this approach it will be slow.
Unfortunately, the mmap of resource0 for virtio-pci is not failing. It is not entirely clear why, though.

By inlining, do you mean to just replace ffi.typeof('$ [1]', field.ct) with ffi.typeof('struct pci_whatever')?
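
For illustration, one way to avoid the NYI is to build the boxed ctype once at setup time, so the parameterized ffi.typeof call never appears on a trace. Here field.box_ct is a hypothetical cached field and the rest of fieldwr's body, not shown in the diff, is elided:

local ffi = require("ffi")

-- Built once per field, outside any hot loop: the parameterized form of
-- ffi.typeof is interpreter-only, so keep it out of traced code.
local function make_field(ct)
   return { ct = ct, box_ct = ffi.typeof("$ [1]", ct) }
end

local function fieldwr(field, fd, val)
   local buf = field.box_ct()  -- plain ctype constructor, JIT-compiled
   buf[0] = val
   -- ... write buf to fd as in the original ...
end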


## Configuration

The `VhostUser` app accepts a table as its configuration argument. The

Member:
Shouldn't it say “VirtioNet” here?

Nikolay Nikolaev added 2 commits January 5, 2016 00:20
Instead, use memory.virtual_to_physical directly.

@eugeneia (Member):
Merged into max-next.

eugeneia added a commit to eugeneia/snabb that referenced this pull request Jan 19, 2016

@lukego (Member) commented Jan 22, 2016

I read somewhere (I can't find where now) that @wingo said this virtio-net implementation has now been verified to meet the performance targets of the snabb-lwaftr application. If so, a hearty congratulations to all concerned!

@wingo (Contributor) commented Jan 22, 2016

Yes! When we were testing, we had a setup in which one host process provided the vhost-user interface for two NICs, not via Snabb-NFV but written in Snabb using the Snabb vhost-user bits. It turned out the performance was limited by the host, not the guest. When running under snabb-nfv we were able to hit our 4 Mpps target for the lwaftr. Excellent work @nnikolaev-virtualopensystems and team!

The test application in the guest, which was similar to DPDK's l2fwd, was also seeing some trace aborts. It turned out this was something to do with the guest's configuration; I don't remember the details, @dpino knows more. Anyway, with a different guest program, or even with the lwaftr, we don't spend any time in the interpreter.

@wingo (Contributor) commented Jan 22, 2016

Clarification: switching to snabb-nfv in the host provided the speedup because you have to run it in a core-per-NIC configuration, so the host side had two cores to service traffic instead of one.

@lukego (Member) commented Jan 22, 2016

Super cool.

I would love to have a CI benchmark that is equivalent to the DPDK-VM one we already have but running Snabb with Virtio-net inside the VM. The entrypoint to that code is packetblaster_bench.sh if you guys are interested. That way we could keep track of VirtioNet performance both to catch regressions and also to see whether it can take advantage of optimizations that we make on the host underneath.

@lukego (Member) commented Jan 22, 2016

Once more: This is a great piece of work!

Just reading the code now on the max-next branch and really enjoying it. Really nice touch with the diagram of the vrings :-)

@eugeneia eugeneia merged commit 7a17d0a into snabbco:master Feb 1, 2016