setsockopt out of memory causes babeld failure #24
Comments
I have installed a babeld-monitor on both the HE and Psychz exit nodes to detect and apply a workaround for the issue reported in #24. Using a systemd timer, babeld-monitor.timer, the babeld log is scanned for a specific memory error every 10 minutes. If the error is detected, babeld is restarted and all active tunnel interfaces are re-added to babeld. All of this can be observed in the systemd logs, and it is now also set up when using create_exitnode via the sudomesh/exitnode repo. Please see https://github.com/sudomesh/exitnode/tree/master/src/opt/babeld-monitor and https://github.com/sudomesh/exitnode/tree/master/src/etc/systemd/system if you'd like to learn more about this.
I hope we can remove this hack once the root cause of the babeld error can be found and fixed.
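For readers who want the gist without opening the repo, here is a rough Python sketch of the decision logic described above. It is not the actual babeld-monitor script: the error pattern, the `systemctl restart babeld` command, and the function names are assumptions made for illustration. Only `babeld -a` comes from this thread. The sketch builds the commands and does not execute them.

```python
from __future__ import annotations

import re

# Assumed error signature; the real babeld-monitor may grep for something else.
MEMORY_ERROR = re.compile(r"setsockopt\(IPV6_JOIN_GROUP\): Out of memory")


def needs_restart(log_text: str) -> bool:
    """Return True if the babeld log text contains the memory error."""
    return MEMORY_ERROR.search(log_text) is not None


def recovery_commands(log_text: str, tunnel_interfaces: list[str]) -> list[list[str]]:
    """Build the commands one monitor run would execute (nothing is run here)."""
    if not needs_restart(log_text):
        return []
    # Assumption: babeld is managed as a systemd unit named "babeld".
    commands = [["systemctl", "restart", "babeld"]]
    # Re-add every active tunnel interface, the same way the up hook does.
    commands += [["babeld", "-a", ifname] for ifname in tunnel_interfaces]
    return commands
```

A real monitor also has to avoid firing again on the same old log line after the restart, for example by only scanning entries written since the previous run.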
@jhpoelen nice haxxx! I'm reading up on systemd now... Would love to figure out the root cause too. Raw socket land seems like a daunting land tho. Maybe need to use a phone-a-friend.
Tried to write a dead-simple stress test today at the software working group with @eenblam and @squeeesh, but we were unable to reproduce the bug. I think our test did not go quite deep enough. Probably a better test would involve creating fresh network interfaces and adding them to babeld instead of adding/removing my computer's default interface over and over again :-P I'm not sure what's a good way to create a bunch of functional network interfaces...
Also, @eenblam noticed that in the re6stnet commit, they seem to suggest that their fix was to clean up their tunnels less aggressively. So: maybe babeld needs to ...
And @squeeesh found this cool and terrifying network stress test lib: https://github.com/dtaht/rtod.
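One possible shape for the "fresh interfaces" test, sketched in Python. It assumes Linux dummy interfaces (`ip link add ... type dummy`) are close enough to tunnel interfaces to trigger the bug, which nobody in this thread has verified. The `stress` name prefix and both function names are made up. `babeld -a` / `babeld -x` are the add/remove invocations used by the tunneldigger hooks in this issue. It defaults to a dry run that only prints the commands.

```python
from __future__ import annotations

import subprocess


def churn_commands(rounds: int, prefix: str = "stress") -> list[list[str]]:
    """Commands that create, add, remove, and delete one dummy interface per round."""
    commands: list[list[str]] = []
    for i in range(rounds):
        ifname = f"{prefix}{i}"
        commands += [
            ["ip", "link", "add", "name", ifname, "type", "dummy"],
            ["ip", "link", "set", "dev", ifname, "up"],
            ["babeld", "-a", ifname],  # add the fresh interface to babeld
            ["babeld", "-x", ifname],  # remove it again
            ["ip", "link", "delete", ifname],
        ]
    return commands


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them (needs root) when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run(churn_commands(1000))` prints the full sequence. Passing `dry_run=False` on a throwaway machine while tailing the babeld log would show whether the error appears after enough rounds.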
Oh, and if it /is/ a matter of giving babeld a chance to
from https://github.com/wlanslovenija/tunneldigger/blob/master/HISTORY.rst. (Hook scripts are executed in their own processes.)
This resolves the issue described in sudomesh/bugs#24, where babel is unable to free its resources for the interface and runs out of memory.
This pulls the latest version of the kernel_setup_interface function in from upstream with the hope that it fixes some obscure issues we're having.
setsockopt(IPV6_JOIN_GROUP): Out of memory
setsockopt(IPV6_LEAVE_GROUP): Address not available
Warning: cannot restore old configuration for wgA.
Warning: cannot save old configuration for wgB.
We keep seeing these sorts of error messages on long-running production nodes, presumably due to the race condition outlined in sudomesh/bugs#24.
Obviously it would be best if we could recover from these errors in Babel rather than having to try to reduce them on the side of the interfacing application. That being said, this isn't a well-considered change; it may be that we have to clean up old_if in this error case in a way upstream has not considered.
Update CHANGES. Implement mandatory bits in all TLVs. Big fixes while parsing sub-TLVs. - Hello is not ignored if there is a mandatory sub-TLV, - non-wildcard Updates also, - Duplicated check for Requests, - wrong size for the beginning of sub-TLVs for Seqno Requests, - wrong size for source specific Requests and Seqno Requests. Fix unlikely corner-cases (not bugs). Fix parsing of sub-TLVs. Update handling of sub-TLVs to comply with latest spec. Remove keep_unfeasible, in compliance with rfc6126bis. Ignore unicast Hellos (for now). Implement unscheduled Hellos. This also removes special casing of late Hellos. Move hello history into a separate structure. Maintain unicast Hello history. Use unicast Hellos for reachability on wired links. Update CHANGES. Fix forgotten call to send_request_resend. Take unicast Hellos into account when scheduling neighbours check. Update CHANGES. Fix typo in send_request. Remove calls to send wildcard requests. Since send_request was buggy, these weren't doing anything. Don't change the behaviour, sending wildcard requests at startup is not a good idea. Fix parsing of source prefix length in filters for IPv4 routes. Fix parser memory leaks. Fix: ignore peer address of point-to-point interfaces. Point-to-point interfaces are bound to two link-local addresses: the local address and the peer address [1]. The former is advertised with an IFA_LOCAL TLV and the latter with an IFA_ADDRESS TLV. [1] $ip addr show [...] inet6 fe80::1234 peer fe80::5678/128 scope link Improve the test scripts in tests/: This commit improves the tests by making the rtt test almost identical to multihop-gdb (formerly known as multihop-hand). Also, it fixes a small grep problem in multihop-smoketest (formerly known as multihop-basic). Another thing which happens here is some name changes to more descriptive names. 
travis tests A quick set of compilation, linting and integration tests to run on patches Improve the price/quality multiplier: This commit intends to harden and document the price/quality tradeoff knob better. Here's what it does: * Change its name to quality_multiplier * Change its type to uint16_t * Start using strtoul() to parse it from the command line * Add an entry for -a in the manpage and the usage string Tidy up the Althea extensions to Babel This commit aims to make Althea-specific changes to Babel more robust and integrated with the implementation. babeld.c: * Switch the price to the uint32_t type * Check the price and multiplier more strictly * Add both of the new flags to the usage message babeld.man: * Add entries for the price and the quality multiplier configuration.c: * Make getuint() check for errno after the strtoul() call * Add a config option for the quality multiplier local.c: * Explicitly list "full-path-rtt" in dumps (used to be just "rtt") message.c: * Correct variable names to explicitly mention the full path RTT util.c: * Remove the redundant parse_price function xroute.c: * Hardcode a 0 price value in add_xroute so that we don't bill our neighbours and only for forwarding. tests/multihop-gdb-rtt.sh: * typo Rename price to fee This commit aims to make the price metric code more intuitive by renaming a node's profits from "per_byte_cost" to "fee" so that all price-related variables can be viewed from the running node's perspective, e.g.: Alice runs a node which takes a *fee* of $5 for forwarding a byte. It receives several *prices* from her neighbours Bob, Kevin and Charlotte. She then computes a new *price* equal to *received_price + fee* for every non-xroute route she wants to advertise. 
The previous approach would name nearly every aspect of the price differently: * The CLI argument was `-P` * The socket config value was `price` * The C code fee variable was `per_byte_cost` With this commit these names change as follows: * CLI arg is now `-F` * The socket config value is now `fee` * The C code fee variable is now just `fee` The price-related members of the different route structs around the code are still named `price`, because a *price* means a "retail" advertised (to us or by us) value with all fees included. Weatherproof the tests: This commit tries to make obvious errors easier to catch both in CI and manual testing. Note: multihop-smoketest.sh contains commented out suboptimal routes. Even though Babel is capable of converging on the best routes in mere seconds, the suboptimal paths in the graph are often incorrect or possibly riddled with cycles. The suboptimal routes will get uncommented/deleted once the problem is resolved/explained. tests/multihop-gdb-rtt.sh: tests/multihop-gdb.sh: * Change netlab-4 fee to 7 - no two prices give the same sum anymore tests/multihop-smoketest.sh: * Extend the node layout to a 4-node diamond * Minimize hello and update intervals to speed up convergence * Decrease the delay to 5 seconds (12 times quicker, baby!) * Strengthen route checking - installed route optimality is now precisely verified Useless initialization (do_filter do the job). Rename price to fee in the usage string Account for the 0% loss breaking change in netem test.sh: typo: cppchecki -> cppcheck Add a script for backwards compatibility testing: This commit introduces a simple script based on multihop-smoketest.sh which takes any two revisions of Babel and Checks whether they can talk to each other. 
.gitignore: * Ignore test-time temporary repos test.sh: * Add the compat test Add reachability testing to the smoketest Make all statements follow debugging rules This debugging statement was wrapped in a printf instead of a debugf resulting in noisy normal operation Useless initialization (do_filter do the job). Update CHANGES for 1.8.1. Fix parsing of source length in filters. This fixes a bug that was introduced in commit 4f4e3cb, and prevented non-source-specific IPv4 routes from being redistributed. Thanks to Niklas Yann Wettengel for the detective work. Update CHANGES for 1.8.2. Tests: change ports to Rita integration test ones Make test scripts automatically cd into the test dir Fix runtime fee changes The problem was an old name for the "fee" config value (used to be "price") which was included in a parser if statement which would exclude unrecognized keywords. Harden and rename the getuint() function (getuint32_t() from now on) Modify price/quality behavior This commit implements the log2()-based price/quality metric: metric(p, m, f) = log2(p) + log2(m) * f p: The price m: Babel's traditional quality-based metric value f: The metric factor - decides how much we value metric (quality) improvement vs. 
price improvement when comparing routes Makefile: * Link libm to get math.h to work babeld.c: * Change quality_multiplier to metric_factor, widen it to uint32_t and change the default to 1900 (1.9) * Change the metric factor's CLI option to 'q' babeld.h: disambiguation.c: local.c: message.c: neighbour.c: resend.c: route.h: source.c: xroute.c * INFINITY -> BABEL_INFINITY to resolve a conflict with math.h configuration.c: * INFINITY -> BABEL_INFINITY to resolve a conflict with math.h * Add metric-factor to the weird blacklist if in parse_option() route.c: * INFINITY -> BABEL_INFINITY to resolve a conflict with math.h * Implement the new metric formula tests/multihop-smoketest.sh: * Account for the metric factor option name change * Add a debug mode stop before test assertions too Don't use %d for unsigned numbers This caused our unsigned integer prices to be interpreted and printed as signed integers. Increase allowed management socket connections FIX: NO SUCH DEVICE when adding routes When adding unreachable routes and setting the RTNH_F_ONLINK flag, a device is required to be specified. In Linux kernel 4.16 support for this flag was added. Until now it was ignored. If RTNH_F_ONLINK is specified while the device is missing, newer kernels will respond with No such device. The result is: * spam in the log file * missing routes for both ipv4 and ipv6 Pull in upstream kernel_setup_interface This pulls the latest version of the kernel_setup_interface function in from upstream with the hope that it fixes some obscure issues we're having. setsockopt(IPV6_JOIN_GROUP): Out of memory setsockopt(IPV6_LEAVE_GROUP): Address not available Warning: cannot restore old configuration for wgA. Warning: cannot save old configuration for wgB. 
We keep seeing these sorts of error messages on long-running production nodes, presumably due to the race condition outlined in sudomesh/bugs#24. Obviously it would be best if we could recover from these errors in Babel rather than having to try to reduce them on the side of the interfacing application. That being said, this isn't a well-considered change; it may be that we have to clean up old_if in this error case in a way upstream has not considered.
Set MAX_INTERFACES = 2
increase max interfaces to 10000
Thanks to https://peoplesopen.net/monitor it is now easier to track this. See #8 for early observations.
The Bug
On a fresh boot of the psychz exit node, home nodes dig tunnels, babel babels, and everybody's routing tables get filled with mesh routes. But...
Over time (after about 24-48 hours), routes start to slowly disappear from the routing table, and they don't return until `babeld` and `tunneldigger-broker` are restarted on the exit node.
Debugging
This appears to be due to a memory leak in babeld. When the exit node is in the bad state, looking at `/var/log/babeld.log` during a tunnel connect shows a `setsockopt` out-of-memory error, i.e. babeld tries to add the socket to its IPv6 multicast group and fails due to a memory allocation error.
When the exit node is in a healthy state, no such errors get logged to `/var/log/babeld.log`, and the mesh routes get added to the routing table as expected.
Conclusion
It looks like there's a socket option memory leak in babeld. I think we're only seeing this bug now in the last month because someone happens to be running a weird node that disconnects and reconnects its tunnel every 5 minutes. You can see this behavior by watching `/var/log/syslog` on the psychz node for 5 minutes.
Every time the rogue node destroys and recreates a tunnel, the tunneldigger up and down hooks are run, the old tunnel interface is removed from babeld (`babeld -x $ifname`) and the new tunnel interface is added (`babeld -a $ifname`).
It seems that removing an interface from babeld does not properly clean up all used memory, and eventually babeld is unable to `setsockopt` on new sockets.
Todo
Look into socket option memory allocation? Halp!
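As a starting point for that, here is a small Python sketch of what the join/leave pair looks like at the socket level. It is an illustration, not babeld's code (babeld is written in C), and the helper names are made up. `ff02::1:6` is Babel's link-local multicast group from RFC 6126. As far as I know, Linux charges each multicast membership against the socket's option memory limit (`net.core.optmem_max`), so one unconfirmed hypothesis is that memberships which are never left (for example because the leave fails once the interface is already gone) pile up on babeld's single protocol socket until the next join gets "Out of memory".

```python
import socket
import struct

BABEL_GROUP = "ff02::1:6"  # Babel's link-local multicast group (RFC 6126)


def pack_ipv6_mreq(group: str, ifindex: int) -> bytes:
    """Pack a struct ipv6_mreq: 16-byte group address, then the interface index."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def join_and_leave(sock: socket.socket, ifname: str) -> None:
    """Join the Babel group on ifname, then leave it again on the same socket."""
    mreq = pack_ipv6_mreq(BABEL_GROUP, socket.if_nametoindex(ifname))
    # Each successful join allocates a membership record tied to this socket.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
    # The matching leave releases it; skipping or failing this is the suspected leak.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_LEAVE_GROUP, mreq)
```

To experiment, create one `socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)`, call only the join half for many different interfaces, and watch for the point where `setsockopt` raises `OSError`.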