Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common/gossip_store: optimize case where entries are filtered out. #5328

Merged

Conversation

rustyrussell
Copy link
Contributor

@whitslack complained of large CPU usage by connectd at startup;
I ran perf record on connectd on my machine (which sees a little spike, only)
and I see the cost of reading and discarding the entries:

-   95.52%     5.24%  lightning_conne  lightning_connectd  [.] gossip_store_next
   - 90.28% gossip_store_next 
      + 40.27% tal_alloc_arr_ 
      + 22.78% tal_free
      + 11.74% crc32c
      + 9.32% fromwire_peektype
      + 4.10% __libc_pread64 (inlined)
        1.70% be32_to_cpu

Much of this is caused by the search for our own gossip: keeping this separately
would be even better, but this fix is minimal.

@rustyrussell rustyrussell added this to the v0.11.2 milestone Jun 20, 2022
@whitslack complained of large CPU usage by connectd at startup;
I ran perf record on connectd on my machine (which sees a little spike, only)
and I see the cost of reading and discarding the entries:

```
-   95.52%     5.24%  lightning_conne  lightning_connectd  [.] gossip_store_next
   - 90.28% gossip_store_next
      + 40.27% tal_alloc_arr_
      + 22.78% tal_free
      + 11.74% crc32c
      + 9.32% fromwire_peektype
      + 4.10% __libc_pread64 (inlined)
        1.70% be32_to_cpu
```

Much of this is caused by the search for our own gossip: keeping this separately
would be even better, but this fix is minimal.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: connectd: reduce initial CPU load when connecting to peers.
@rustyrussell rustyrussell force-pushed the optimize-gossip-serve branch from cf1bc8a to b28adf1 Compare June 20, 2022 02:44
@whitslack
Copy link
Collaborator

@rustyrussell: To be clear, it's not just at startup. lightning_connectd never settles down. I started my current Core Lightning process 171 hours ago, and lightning_connectd has used 152 hours of CPU time and is still pegging. That's an average of 88.9% CPU utilization over the lifetime of the process.

@rustyrussell
Copy link
Contributor Author

We will do this scan every time someone reconnects: for me that's worst at startup, but if you have enough peers....

This patch should drop it by a factor of ~10. We can get more sophisticated (keep our own gossip separately for this case) but that's a more invasive change which risks introducing other bugs.

@cdecker
Copy link
Member

cdecker commented Jun 22, 2022

This patch should drop it by a factor of ~10. We can get more sophisticated (keep our own gossip separately for this case) but that's a more invasive change which risks introducing other bugs.

That would be great for sharing/copying a single gossip file (for bootstrapping) among multiple nodes as the non-local gossip messages would not be intermingled with the public ones. 👍

@cdecker
Copy link
Member

cdecker commented Jun 22, 2022

ACK b28adf1

Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK b28adf1

@rustyrussell rustyrussell merged commit 0b78973 into ElementsProject:master Jun 24, 2022
@rustyrussell
Copy link
Contributor Author

Not sure how much this helps #5301, which this should have referred to!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants