pinning: Notification block pinning limit reached for warp-sync #4389
Comments
I will warp sync a new kusama node on a dev machine and see if I can find anything fishy.
I was able to reproduce this. As a quick recap, we do not prune blocks that have active pins. It looks like during initialization, the BEEFY worker expects a given header ancestry to be available and is basically waiting for the headers to become available here:

polkadot-sdk/substrate/client/consensus/beefy/src/lib.rs, lines 640 to 646 (at efc2132)

In my test it took 6 hours for this loop to continue. But in the meantime we have a finality stream that is not getting polled here. So the notifications pile up and the pinning cache limit is reached. I am not too familiar with the BEEFY logic, but maybe we could drain the stream while waiting for the headers or, if we need to act on every notification, we could map them to a different type that does not include the pin handle. @acatangiu What do you think?
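A minimal sketch of that drain-while-waiting idea, assuming a simplified `FinalityNotification` placeholder and a generic `headers_available` future instead of the real BEEFY worker types; the point is just that the notification stream keeps getting polled (and its pin handles dropped) during the long wait:

```rust
use futures::StreamExt;

/// Hypothetical stand-in for the real notification type, which carries a block pin handle.
struct FinalityNotification;

async fn wait_and_drain(
    mut notifications: impl futures::Stream<Item = FinalityNotification> + Unpin,
    headers_available: impl std::future::Future<Output = ()>,
) {
    // Keep the header-wait future pinned so it can be polled across loop iterations.
    tokio::pin!(headers_available);
    loop {
        tokio::select! {
            // The potentially hours-long wait for the header ancestry to become available...
            _ = &mut headers_available => break,
            // ...while still draining notifications so their pin handles are released.
            maybe_note = notifications.next() => {
                match maybe_note {
                    Some(note) => drop(note), // or map to an unpinned type and enqueue it
                    None => break,
                }
            }
        }
    }
}
```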
As I already proposed here: #3945 (comment), the BEEFY networking should not be registered before BEEFY is active.
(Basically this entire code should not do anything until BEEFY is ready.)
I believe this is a different case that would still be a problem even if BEEFY registered networking only after becoming active. In the case of warp sync, only mandatory headers get synced, and the chain's "normal operation" starts while the rest of the headers sync in the background. In either case, the BEEFY worker needs to register for finality notifications because it is exclusively driven by GRANDPA finality. Simplified initialization steps:
Therefore, this is not helped by #3945 (comment).
Yes, either drain the notifications and map/enqueue just the relevant data, or the full header, to some BEEFY worker queue. Once the sync has finished, the worker needs to process the queue, then switch back to processing notifications once it has caught up.
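A rough sketch of that queue-then-catch-up flow, assuming a hypothetical `QueuedFinality` type and `handle_finality` function rather than the real BEEFY worker API: the worker first works through the backlog that accumulated during sync, then handles notifications as they arrive.

```rust
use tokio::sync::mpsc;

/// Hypothetical unpinned finality data enqueued for the worker.
struct QueuedFinality {
    hash: [u8; 32],
    number: u64,
}

async fn run_worker(mut queued: mpsc::UnboundedReceiver<QueuedFinality>) {
    // Phase 1: drain the backlog that piled up while headers were still syncing.
    while let Ok(item) = queued.try_recv() {
        handle_finality(item);
    }
    // Phase 2: caught up; from now on process notifications as they come in.
    while let Some(item) = queued.recv().await {
        handle_finality(item);
    }
}

fn handle_finality(_item: QueuedFinality) {
    // Placeholder for the worker's real finality handling.
}
```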
This should prevent excessive pinning of blocks while we are waiting for the block ancestry to be downloaded after gap sync. We spawn a new task that gets polled to transform finality notifications into an unpinned counterpart. Before this PR, finality notifications were kept in the notification channel. This led to pinning cache overflows.

Fixes #4389

Co-authored-by: Bastian Köcher <git@kchr.de>
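A minimal sketch of the idea behind this fix, not the actual implementation; `FinalityNotification`, `PinHandle`, and `UnpinnedFinalityNotification` are illustrative stand-ins for the real substrate types. A dedicated task keeps the pinned stream polled and forwards only unpinned data, so pin handles are dropped immediately:

```rust
use futures::StreamExt;
use tokio::sync::mpsc;

/// Hypothetical pin guard: the real handle unpins its block when dropped.
struct PinHandle;

/// Hypothetical stand-in for a finality notification that keeps a block pinned.
struct FinalityNotification {
    hash: [u8; 32],
    number: u64,
    _pin_handle: PinHandle,
}

/// Counterpart without the pin handle; safe to buffer for hours.
struct UnpinnedFinalityNotification {
    hash: [u8; 32],
    number: u64,
}

/// Spawn a task that keeps the pinned stream polled and forwards unpinned copies.
fn spawn_transformer(
    mut notifications: impl futures::Stream<Item = FinalityNotification> + Unpin + Send + 'static,
) -> mpsc::UnboundedReceiver<UnpinnedFinalityNotification> {
    let (tx, rx) = mpsc::unbounded_channel();
    tokio::spawn(async move {
        while let Some(n) = notifications.next().await {
            // Forward only the data; `n` (and with it the pin handle) is dropped at the
            // end of this iteration, so the block is unpinned right away.
            let _ = tx.send(UnpinnedFinalityNotification { hash: n.hash, number: n.number });
        }
    });
    rx
}
```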
The following warning is continuously printed for a warp-syncing kusama node (running the litep2p backend) at roughly 20-second intervals.
However, the warning did not reproduce for a full-syncing node.
This was discovered during a long-running node triage (around 2 weeks running litep2p backend).
It seems that during the warp sync the pinning limit is reached; from then on the pinning cache always holds around 1k pinned blocks. When a new block is discovered, the LRU block is unpinned, causing a warning.
It might be the case that references to pinned blocks are kept around for too long (maybe because we have too many blocks).
The warning is coming from:
polkadot-sdk/substrate/client/service/src/client/notification_pinning.rs, lines 96 to 103 (at 0fedb4d)
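For orientation, a simplified sketch of the behavior described above (not the actual notification_pinning.rs code; `Hash`, `PinningCache`, and `unpin` are illustrative placeholders): once the cache is at its limit, every new pin evicts and force-unpins the least recently used block, which is what produces the repeated warning.

```rust
use std::collections::VecDeque;

type Hash = [u8; 32];

/// Bounded cache of pinned block hashes; front = least recently used.
struct PinningCache {
    limit: usize,
    pinned: VecDeque<Hash>,
}

impl PinningCache {
    fn pin(&mut self, hash: Hash) {
        if self.pinned.len() >= self.limit {
            // The situation behind the repeated warning: the cache is full, so the least
            // recently used block gets force-unpinned for every newly pinned one, even if
            // something still holds a reference to its notification.
            if let Some(evicted) = self.pinned.pop_front() {
                log::warn!("Notification block pinning limit reached, unpinning block {:?}", evicted);
                unpin(evicted);
            }
        }
        self.pinned.push_back(hash);
    }
}

fn unpin(_hash: Hash) {
    // Placeholder for the backend call that actually releases the pin.
}
```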
cc @paritytech/sdk-node @paritytech/networking