-
Notifications
You must be signed in to change notification settings - Fork 919
Description
Description
We are aware of an issue on the mainnet network which is causing Lighthouse to consume more memory than it should. This is leading to Out of Memory (OOM) process terminations on some machines.
The root cause of the issue (we believe) are messages being queued on gossipsub to be sent out. This is a combination of messages being published, messages being forward and gossipsub control messages. The queues are filling up and the memory is not being dropped. This appears to only be occuring on mainnet, we assume in part to the size of the network and the number of messages being transmitted.
There are a number of solutions being put in place and being tested. This issue is mainly a tracking issue, so users can follow along with development updates as we correct this issue.
Primarly the end solution will consist of more efficient memory management (avoid duplicating any memory in messages when sending) this should reduce allocations, a priortisation of messages so that we can prioritise published, forward and control messages individually and finally a dropping mechanism that allows us to drop messages when the queues grow too large.
Memory Allocations:
- feat(quick-protobuf-codec): reduce allocations during encoding libp2p/rust-libp2p#4782
- Arc-ing messages in gossipsub: libp2p/rust-libp2p@d28ff63
Message Prioritisation: