Tiny improvements #392
These are build-related files.
@eugeneia SnabbBot failure?
@lukego Seems to have been a one-off. Hmmm. We should probably investigate how the
@lukego A quick further investigation led to something: it seems to depend on the CPU core in use. When I use taskset to force a CPU on a different NUMA node than the NIC, it fails; likewise, I can force the task to run on the node local to the NIC and it works. So removing the taskset altogether led to random failures. I now explicitly pin the snabb_bot process to CPU 6, which is on the same node as the NIC used by snabb_bot. Let's see if my theory holds and we won't see any more ping failures.
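For reference, the pinning described above can be sketched in Python instead of taskset. This is only an illustration under assumptions: the helper names and the PCI address in the usage note are hypothetical and do not come from snabb_bot; the sysfs paths are the standard Linux ones.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-5,12-17" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def nic_numa_node(pci_address):
    """Return the NUMA node of a PCI device (sysfs reports -1 if unknown)."""
    path = Path(f"/sys/bus/pci/devices/{pci_address}/numa_node")
    return int(path.read_text())


def node_cpus(node):
    """Return the CPUs that belong to the given NUMA node."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    return parse_cpulist(path.read_text())


def pin_to_nic_node(pci_address, pid=0):
    """Restrict pid (0 = calling process) to the CPUs local to the NIC."""
    node = nic_numa_node(pci_address)
    if node < 0:
        return None  # no NUMA information exposed, leave affinity alone
    cpus = node_cpus(node)
    os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling `pin_to_nic_node("0000:01:00.0")` (a made-up address) would restrict the current process to every CPU on the NIC's node rather than to one hard-coded core such as CPU 6.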
Cool! Let us get to the bottom of this. I am guessing the problem will ultimately be failure to allocate a HugeTLB page. Snabb Switch and QEMU both need these and they are allocated from per-NUMA-node pools. Do you have any of this information?
Or I could be barking up the wrong tree entirely and maybe it is about the relationship between the node of Snabb Switch and the node of the NIC and/or QEMU.
@lukego Snabb does not crash. Is there a scenario in which Snabb endures a "failure to allocate a HugeTLB page" without at least throwing an error? Regarding 3.: Regarding 4.:
I do suspect it has something to do with the NIC simply because the symptom is packets not arriving. There is no crash or anything, just absence of I/O.
Cool problem. Could be NIC related but the evidence seems weak to me. "PING failed" is also what you see if the VM fails to boot, right?
Also, we have tested various NUMA combinations between NIC/Snabb/QEMU and never seen any bugs of this kind before; the worst case has been ~33% performance impact. So it is possible, but I would like to audit logs and eliminate other possible causes too.
Thinking of the future: it would be nice if SnabbBot attached more log files to the gist. I would quite like to see the output of the snabb process, the two QEMU processes on the host, and the ping/iperf/etc. processes inside the VMs.
I still suspect the issue is related to hugetlb allocation. Here are more thoughts:
No, since for a VM to be considered "up" the caller needs to successfully telnet into the VM and have it ping itself.
OK. Have increased verbosity in my
Good question. The man page only talks about the target process, not the children spawned by it. The Regarding isolcpus, as far as I can tell we don't do that on davos (or chur) because it broke
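On the question of child processes: on Linux, a child created with fork inherits its parent's CPU affinity mask, and the mask is preserved across exec. A minimal way to check this on a given host, sketched in Python (the function name is made up for illustration, and it needs a Linux system):

```python
import os
import subprocess
import sys


def child_affinity_after_pinning():
    """Pin this process to one CPU, spawn a child, and report the child's mask."""
    original = os.sched_getaffinity(0)
    target = {min(original)}  # pick one CPU we are already allowed to use
    os.sched_setaffinity(0, target)
    try:
        # The child prints its own affinity mask as a sorted list.
        out = subprocess.run(
            [sys.executable, "-c",
             "import os; print(sorted(os.sched_getaffinity(0)))"],
            capture_output=True, text=True, check=True,
        ).stdout
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask
    return target, out.strip()
```

If the child reports the same single CPU as `target`, processes spawned by a pinned parent stay on that CPU unless they change their own affinity.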
@lukego |
Check this out:
Looks like each NUMA node has 1024 hugepages available. QEMU 1 wants 512, QEMU 2 wants 512, Snabb Switch wants ~16... there won't be enough if all three processes are running on the same node. Can you try this?
Then hopefully we have 4096 huge pages (=8GB) available on each node. Could also edit the grub config to add this kernel parameter:
so that it happens automatically on boot. (It can happen that the kernel fails to allocate hugepages long after boot due to fragmentation, when it cannot find enough contiguous regions.)
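The hugepage budget above can be checked with a few lines of Python. This is a sketch under assumptions: the 2 MB page size is inferred from "4096 huge pages (=8GB)", the process labels are illustrative, and `node_hugepages` is a hypothetical helper that reads the standard per-node sysfs counters.

```python
from pathlib import Path

HUGEPAGE_SIZE_MB = 2  # assumption: 2 MB pages, consistent with 4096 pages = 8 GB


def hugepage_shortfall(pool_pages, demands):
    """Return how many pages are missing if all demands land on one node."""
    return max(0, sum(demands.values()) - pool_pages)


def pool_size_gb(pages):
    """Convert a page count into gigabytes."""
    return pages * HUGEPAGE_SIZE_MB / 1024


def node_hugepages(node):
    """Read (total, free) 2 MB hugepages for one NUMA node from sysfs."""
    base = Path(f"/sys/devices/system/node/node{node}/hugepages/hugepages-2048kB")
    total = int((base / "nr_hugepages").read_text())
    free = int((base / "free_hugepages").read_text())
    return total, free


# Figures quoted in the comment above.
demands = {"qemu1": 512, "qemu2": 512, "snabb": 16}
print(hugepage_shortfall(1024, demands))  # pages missing with 1024 per node
print(hugepage_shortfall(4096, demands))  # pages missing with 4096 per node
print(pool_size_gb(4096))                 # size of a 4096-page pool in GB
```

With the numbers from the thread, a 1024-page node is 16 pages short when all three processes share it, while a 4096-page node has room to spare.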
But isn't our failure case happening in the opposite scenario, when all three processes are not on the same node? I tried anyway, and increasing the number of hugepages does not fix the issue.
Thanks for checking on the hugepages. I'm surprised that was not the issue. On interlaken I can run snabbnfv on a different node to the NIC and guests can connect to that switch and ping each other. So it is not a black-and-white issue of the traffic process having to be on the same node as the NIC. How about the intel_app selftest, does that work on both nodes for the same NIC? (Can we reproduce this problem in a simpler way?) I'll see if I wake up with new ideas.
I guess the most curious thing is that you actually have to |
'lwaftr query' improvements
This branch contains tiny improvements. Initially an update of .gitignore so that git status can be clean after a build.