-
Notifications
You must be signed in to change notification settings - Fork 895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore memleak detect, fix leaks we missed. #5055
Merged
rustyrussell
merged 19 commits into
ElementsProject:master
from
rustyrussell:guilt/restore-memleak-detect
Feb 26, 2022
Merged
Restore memleak detect, fix leaks we missed. #5055
rustyrussell
merged 19 commits into
ElementsProject:master
from
rustyrussell:guilt/restore-memleak-detect
Feb 26, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ACK 6109132 |
rustyrussell
force-pushed
the
guilt/restore-memleak-detect
branch
from
February 25, 2022 10:42
6109132
to
8f2defc
Compare
These tests have proven to be: a) very expensive, as they spin up many nodes, and perform long setup b) are not testing anything specific, they just fuzz functionality that is already tested otherwise c) have not helped pinpoint any issues in living memory d) are very flaky, making for really bad signal-to-noise, so much that devs usually just restart without even looking at the logs e) even if we were to look at the logs, we'd be unable to reproduce due to the inherent randomness involved in these tests f) are really noisy neighbors, causing other tests to flake as well, further muddying the water All in all, these tests are a waste of time, and source of frustration. [ Cleaned up python unused imports --RR ] Changelog-None
client_read_next(…) calls io_read_wire(…), passing &c->msg_in as the address of a pointer that will be set to the address of the buffer that io_read_wire_(…) will allocate, and passing c (a pointer to the struct client instance) as the parent for the new allocation. As long as the struct client instance eventually gets freed, the allocated message buffer will be freed too, so there is no "leak" in the strict sense of the term, but the freeing of the buffer may not occur for an arbitrarily long time after the buffer has become disused, and indeed many millions of message buffers may be allocated within the lifetime of one struct client instance. handle_client(…) ultimately hands off the c->msg_in to one of several message-type-specific handler functions, and those functions are not TAKES or STEALS on their message buffer parameters and do not free their message buffer arguments. Consequently, each successive call to client_read_next(…) will cause io_read_wire_(…) to overwrite the c->msg_in pointer with the address of a newly allocated message buffer, and the old buffer will be left dangling off of the struct client instance indefinitely. Fix this by initializing c->msg_in to NULL in new_client(…) and then having client_read_next(…) do `c->msg_in = tal_free(c->msg_in)` prior to calling io_read_wire(…). That way, the previous message buffer will be freed just before beginning to read the next message. The same strategy is already employed in common/daemon_conn.c, albeit without nulling out dc->msg_in after freeing it. Fixes: ElementsProject#5035 Changelog-Fixed: hsmd: Fixed a significant memory leak
Seeing if this helps my build box which previously gives: ``` /home/rusty/lightning-ltest/lightningd/lightning_connectd: libbacktrace: unrecognized DWARF version in .debug_info at 6 /home/rusty/lightning-ltest/lightningd/lightning_connectd: libbacktrace: no debug info in ELF executable ``` Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. p is a child of cmd, so it's freed by command_failed. 2. cltv_budget is set a few lines up to the same thing already. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They can only follow pointers (without help) if they point to the *start* of a tal object. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Not really a leak, but it would last as long as the peer. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The current client is in dbid_zero_clients (i.e. it's lightningd). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. We don't keep a pointer to payments (unlike pay, which keeps a linked list), so mark it notleak. 2. plugin->our_features is overloaded for "features we want to set" (used by keysend) and then "features we have". Create a new field, which is cleaner. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This causes weirdness in pay, which tal_steal's b11->routes and expects to get it all. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. The dijkstra can be temporary, doesn't need to last as long as pay cmd. 2. We fail multiple times in several places, so don't leak old failreason. 3. Make payments findable by our memleak detector. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Tell memleak about our linked-list of current payments. 2. Don't remove them from list until we actually free them (in destructor, naturally). 3. Decode invoices into tmpctx (we steal / copy what we want anyway). 4. Free params after we've used them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Not actually leaks, but they do live longer than they need. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We allocate the default, then callback allocates over the top. Mark params with a default, so we can free that when it's called. (We can't do this generally, since not all param args are actually pointers to pointers, though opt_param_def has to be). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell
force-pushed
the
guilt/restore-memleak-detect
branch
from
February 26, 2022 00:50
8f2defc
to
56a0d7c
Compare
It's a small leak, but real. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Accidentally removed in 6c9b752. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We were ignoring the *parent* of the notleak pointers, not the notleak pointer itself! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We often hand an exclude pointer (usually the current command) to memleak. But when we encountered this we would stop iterating, rather than just ignore it: this means we would often ignore significant siblings. In particular, fixing this (which has always been there) reveals many previously-undetected leaks. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell
force-pushed
the
guilt/restore-memleak-detect
branch
from
February 26, 2022 04:06
56a0d7c
to
b0cbf1a
Compare
Ack b0cbf1a |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Based on #5051 this fixes all the other (far less important!) leaks, and fixes the leak detector which had multiple problems.
Of course, I wrote it in the other order, to make sure it found #5051!
(Rebased on #5056 now #5051 is merged)