-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn on ingress drop in NFV #1045
Conversation
Before, snabbnfv would run the engine for one-second intervals, checking for restart every second and restarting the engine. However this interacts poorly with the latency-tracking mechanism, which expects to be able to call non-performant functions like ffi.typeof() at setup, because it's just setup. With this fix we have less trace creation and interpreter bailout at run-time.
Note that the first patch is already part of |
Ingress drop monitor can warn, or warn and flush. Change lwaftr command line argument to take a parameter.
f6c1691
to
de93573
Compare
This has the side effect of removing the ingress drop monitor from the NFV; see upstream #1045.
Yes, I am on board with taking in the extensions to ingress drop monitor and running it on NFV. Just wanted to take the opportunity to muse on the future because the current implementation feels like a short-term expedient thing to me (e.g. will only detect network-sourced overloads and not VM-sourced overloads). I am also not sure yet on whether it's better to make global JIT flushes or selective ones or a combination of both. Interested in getting to the bottom of this and having a really robust solution. Some musing at LuaJIT/LuaJIT#218 (comment). |
I suggest putting off merging this until #1047 lands, FWIW. I think it has a minor conflict. |
This also addresses the review comments from upstream #1047, noting that the shm.exists() API changed.
Merged to |
Extend find-limit to support multiple NICs.
WDYT @lukego? We found it useful to know when the slow part of our system was the NFV and not the guest. Otherwise you can NACK it, that's fine, we keep it in the lwaftr as it is.
Following up with your comments from #1039 (comment):
All of what you say makes sense :) The overloaded/underloaded terms make sense and I would be happy to detect that via any of the options, though (3) does sound appealing. Still, detecting the precise number of packets dropped on ingress is important to some of our users and that's a useful thing too, and might even be orthogonal to a (3) solution.
I look forward to your trace-specific watchdog work -- that sounds much much better than doing a heavy-handled
jit.flush()
!