-
Notifications
You must be signed in to change notification settings - Fork 772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Network improvements #1036
Network improvements #1036
Conversation
Codecov Report
Flags with carried forward coverage won't be shown. Click here to find out more. |
3398978
to
4afa40f
Compare
… RLPx peer connections to improve networking distribution and connection reliability
…on execution log msg output
…banning old peers and replacing with a new peer on KBucket ping
…dditions of new neighbour peers
e1028a7
to
c110b70
Compare
Ah, forgot a bit about this PR, this is actually ready for review already for some days. 😄 As mentioned yesterday in the call, the initial goal here was to improve network connection quality on a level that there is a reliable connection to non-mainnet networks possible, before it was rather not possible at all any more to connect to e.g. This PR does the following:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me!
(Just have a question about the how the refill interval changes work.)
} else { | ||
// Still keep peer in queue | ||
return true | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, do you know why this technique improves connection reliability?
With this change the connection attempts are spaced out and for any given peer there's a 1 in 10 chance it will be made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This distributes the networking traffic over time, so there's not all connections hammered out at once - then followed by a longer non-networking-activity - and subsequently many of the replies from the peers also coming in at a similar PIT (in the old version). Instead the traffic gets out and in more frequently and distributed. I am not 100% sure about the reliability impacts in a narrower sense, so e.g. if this directly prevents connection losses by avoiding concurrent network handling, my deeper networking experience is too limited for that. But this definitely leads to a healthier protocol flow - this long waiting period is avoided - becoming e.g. very apparent when the last peer connection gets lost and there is another 30 second wait - but there is always some activity/connection reattempts going on. I would also think that these distributes better on available protocol respectively implementation-given resource restrictions like the maximum number of allowed peer connections or the number of slots for the queue, but not completely sure here.
However benefit reasons are distributed: this is one of the changes which can be really "felt" when running the client, so the first connection pickup after start is just directly and apparently getting so much faster that it gets a bit out of question if there "might be" some positive effect. 😄
No description provided.