-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Virtio net #681
Virtio net #681
Conversation
The functions will be used from the userspace virtio_net dirver, to prepare the packet before sending to the virtio device. The full checksum is done by the device. Fix the long line in the tcp4 selftest.
This makes the paket an object, and allows for example p:free() instead of packet.free(p) Add a new method `physical` to get the physical address of packet.data
The fstruct function creates a dynamic structure out of a ctype string definition. Add also 2 accessor functions for read/write of file descriptr assigned structures.
VIRTIO_NET_F_MRG_RXBUF option changes the format of the virtio net packet header. Make this optional by adding an argument to the VirtioNetDevice new method: disable_mrg_rxbuf.
An implementation of the virtio-net driver specification, intended to run inside a guest virtual machine. Using indirect descriptors allow for more effective usage of the descriptor table. Each statically allocated to a indirect descriptor table, where a preallocated packet header is placed. Adding a buffer is simply finding the index of the descriptor and putting the buffer physical address into its addr field.
local min_features = C.VIRTIO_RING_F_INDIRECT_DESC + C.VIRTIO_NET_F_CSUM | ||
local want_features = C.VIRTIO_F_ANY_LAYOUT + | ||
C.VIRTIO_RING_F_INDIRECT_DESC + | ||
C.VIRTIO_NET_F_CSUM + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually want to negotiate checksum offloading, do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this should depend on the use_checksum
argument.
In general, looking pretty good to me! Some comments above. Especially looking forward to a version that avoids checksumming :) I realize this might not be a thing you can do quickly, but what are your thoughts about testing and benchmarking? We should put some measure in place to make sure the functionality doesn't regress, and if we can, it would be good to ensure that performance doesn't regress either. /Cc @lukego for his thoughts there. |
end | ||
|
||
function fieldwr(field, fd, val) | ||
local buf = ffi.typeof('$ [1]', field.ct)() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is a source of a trace abort.
e.g. in a -jv output there are these lines
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE --- (20/0) zone.lua:38 -- NYI: unsupported variant of FastFunc ffi.typeof at lib.lua:713]
[TRACE 22 (20/0) zone.lua:38 -- fallback to interpreter]
AFAIU This means that a trace 22, coming from trace 20, met something it couldn't compile, so it is falling back to the interpreter. I don't know why it printed it like 4 times, did it try 4 times? I don't know. Probably so, thinking maybe if it tried again the trace would go somewhere else. It means that this allocation won't be sunk, the whole thing will go on the interpreted path, etc. It's only used in one place; let's just inline it into its caller. Same with fstruct and fieldrd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The virtio PCI BAR0 access is done on initialization. At runtime, only the device notification is done. And with this approach it will be slow.
Unfortunately, the mmap of rersource0 for virtio-pci is not failing. It is not entirely clear why,though.
By inlining - do you mean to just replace ffi.typeof('$ [1]', field.ct)
with ffi.typeof('struct pci_whatever')
?
|
||
## Configuration | ||
|
||
The `VhostUser` app accepts a table as its configuration argument. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it say “VirtioNet” here?
Instead use directly memory.virtual_to_physical
Merged into |
I read somewhere - can't find where now - that @wingo said this virtio-net implementation has now been verified to meet the performance targets of the |
Yes! When we were testing, we had a test setup in which one host process was providing the vhost-user interface for two NICs, not via Snabb-NFV but written in Snabb and using the Snabb vhost-user bits. Turns out the performance was limited by the host and not the guest. When running under snabb-nfv, we were able to hit our 4MPPS target for the lwaftr. Excellent work @nnikolaev-virtualopensystems and team! The test application in the guest, which was similar to the DPDK's l2fwd, was also seeing some trace aborts. It turned out this was something to do with the guest's configuration. I don't remember the details; @dpino knows more. Anyway with a different guest program or even with the lwaftr we don't spend any time in the interpreter. |
Clarification: switching to snabb-nfv in the host provided the speedup because you have to run it in a core-per-NIC configuration, so the host side had two cores to service traffic instead of one. |
Super cool. I would love to have a CI benchmark that is equivalent to the DPDK-VM one we already have but running Snabb with Virtio-net inside the VM. The entrypoint to that code is packetblaster_bench.sh if you guys are interested. That way we could keep track of VirtioNet performance both to catch regressions and also to see whether it can take advantage of optimizations that we make on the host underneath. |
Once more: This is a great piece of work! Just reading the code now on the |
This is an implementation of the virtio-net driver part of the virtio specification. The driver uses indirect descriptor tables and preallocated virtio headers for better performance. Checksum offload packet preparation is available as an optional feature.
It obsoletes #645.