Fix peer down-scoring behaviour when gossip blobs/columns are received after `getBlobs` or reconstruction #6686
…hey are received from the EL or available via column reconstruction.
✅ The pull request has been merged automatically at 847c801
Issue Addressed
Fixed a peer disconnection issue recently reported by @hangleang.
On a PeerDAS devnet, he noticed that the proposer Grandine node gets disconnected by Lighthouse after publishing its columns. Logs show that Lighthouse performed reconstruction after receiving 50% (64 of 128) of the data columns and ignored the remaining columns received from gossip. Lighthouse also penalises the peer for each of these columns. Even though each penalty is only a `HighToleranceError`, the large number of data column messages that can arrive after reconstruction (50%, or up to 64 columns per slot) meant the node got disconnected immediately. He also noticed that this happens not only to Grandine peers but to Lighthouse peers as well.
We should not penalise a node for sending us valid blobs / data columns on time. The problem is less obvious in Deneb because of the lower message count, but gets worse in PeerDAS, where columns and reconstruction produce a much higher message count.
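To make the arithmetic concrete, here is a minimal sketch of how many small penalties can add up to a disconnection within a single slot. The penalty size and disconnect threshold below are hypothetical values chosen for illustration, not figures taken from this PR or from Lighthouse's scoring code; only the count of 64 columns comes from the description above.

```rust
/// Hypothetical score change for one low-severity penalty (illustration only).
const HIGH_TOLERANCE_PENALTY: f64 = -1.0;
/// Hypothetical score at or below which a peer is disconnected (illustration only).
const DISCONNECT_THRESHOLD: f64 = -20.0;

/// Score of a peer that started at 0.0 and was penalised once for each
/// gossip column that arrived after reconstruction.
fn score_after_late_columns(late_columns: u32) -> f64 {
    f64::from(late_columns) * HIGH_TOLERANCE_PENALTY
}

/// Whether the accumulated penalties cross the disconnect threshold.
fn would_disconnect(late_columns: u32) -> bool {
    score_after_late_columns(late_columns) <= DISCONNECT_THRESHOLD
}
```

Under these assumed values, a single late column is harmless, but the up to 64 columns that can arrive after reconstruction in one slot push the score well past the threshold.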
Proposed Changes
Remove the peer penalty for blobs and columns that are received from gossip for the first time but are already available via other channels, such as `getBlobs`, supernode column reconstruction, or RPC.

Note that duplicates received after the first message are filtered by gossip. We only send `IDONTWANT` when publishing columns, and because we do gradual publication, `IDONTWANT` may not be sent to peers immediately. Something to consider: send `IDONTWANT` immediately but publish gradually to avoid excessive bandwidth usage.

Thanks @hangleang for reporting this!
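The proposed change can be sketched roughly as follows. This is a simplified illustration rather than the actual Lighthouse code: `GossipError`, `Acceptance`, `Penalty` and `handle_gossip_result` are hypothetical names, and the penalty chosen for invalid messages is an assumption. Only `HighToleranceError` is named in the description above.

```rust
/// Why a gossip blob or data column was not imported (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GossipError {
    /// First copy seen on gossip, but the data was already obtained through
    /// another channel (getBlobs, column reconstruction, or RPC).
    AlreadyAvailable,
    /// The message failed validation.
    Invalid,
}

/// Gossip validation outcome reported back to the network layer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Acceptance {
    Accept,
    Ignore,
    Reject,
}

/// Peer penalty levels (hypothetical; only `HighToleranceError` is named in the PR).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Penalty {
    HighToleranceError,
    LowToleranceError,
}

/// Decide how to treat the sender of a gossip blob or column.
fn handle_gossip_result(result: Result<(), GossipError>) -> (Acceptance, Option<Penalty>) {
    match result {
        Ok(()) => (Acceptance::Accept, None),
        // Valid and on time, just not needed any more: ignore without a penalty.
        // Before this change, this case would have carried a HighToleranceError.
        Err(GossipError::AlreadyAvailable) => (Acceptance::Ignore, None),
        // Assumed handling for invalid data: reject and penalise the sender.
        Err(GossipError::Invalid) => (Acceptance::Reject, Some(Penalty::LowToleranceError)),
    }
}
```

Keeping "already available" separate from real validation failures means the message is still ignored, so it is not propagated twice, while the sending peer's score is left untouched.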