Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

consider peer reputation score when deciding to disconnect #6187

Merged

Conversation

macfarla
Copy link
Contributor

Noticed in peering logs we are disconnecting quite useful peers with timeouts and "useless responses". This exacerbates peering issues because we are quite early to disconnect, whereas if we just wait a bit longer the peer can still be useful. More of an issue in low peer count networks.

  • consider peer reputation score when deciding whether to disconnect peer for repeated timeouts/useless responses
  • increase threshold for timeout counts
  • decrease score points deducted for timeout/useless response
➜  besu grep  "Disconnect - Outbound" besu03.log
2023-11-17 17:47:27.730+10:00 | nioEventLoopGroup-3-5 | DEBUG | EthProtocolManager | Disconnect - Outbound - 0x0b TIMEOUT - 0x45d5... - 5 peers left
2023-11-17 17:49:10.232+10:00 | nioEventLoopGroup-3-1 | DEBUG | EthProtocolManager | Disconnect - Outbound - 0x0b TIMEOUT - 0xcbba... - 4 peers left
2023-11-17 18:02:33.476+10:00 | nioEventLoopGroup-3-8 | DEBUG | EthProtocolManager | Disconnect - Outbound - 0x03 USELESS_PEER - 0x6a7... - 3 peers left

…useful

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Copy link

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@macfarla macfarla changed the title don't disconnect if peer has enough of a score increase to have been … consider peer reputation score when deciding to disconnect Nov 20, 2023
}

LOG.trace(
"Not triggering disconnect for {} repeated timeouts for requestCode {} because peer has high score {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we adjust the score in this case? I might be mis-reading but if a previously useful peer becomes useless, would we never disconnect?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the score still does get decremented on line 73

eg

2023-11-20 15:13:42.657+10:00 | nioEventLoopGroup-3-4 | TRACE | PeerReputation | Not triggering disconnect for exceeding useless response threshold because peer has high score 167
2023-11-20 15:13:42.716+10:00 | nioEventLoopGroup-3-2 | TRACE | PeerReputation | Not triggering disconnect for exceeding useless response threshold because peer has high score 116
2023-11-20 15:13:43.492+10:00 | nioEventLoopGroup-3-2 | DEBUG | PeerReputation | Disconnection triggered by exceeding useless response threshold, score 109
2023-11-20 14:28:19.456+10:00 | nioEventLoopGroup-3-5 | TRACE | PeerReputation | Not triggering disconnect for exceeding useless response threshold because peer has high score 117
2023-11-20 14:28:19.582+10:00 | nioEventLoopGroup-3-5 | TRACE | PeerReputation | Not triggering disconnect for exceeding useless response threshold because peer has high score 111
2023-11-20 14:28:22.063+10:00 | EthScheduler-Timer-0 | DEBUG | PeerReputation | Disconnection triggered by 13 repeated timeouts for requestCode 15, peer score 107

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Copy link
Contributor

@siladu siladu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@macfarla macfarla enabled auto-merge (squash) November 21, 2023 01:18
@macfarla macfarla merged commit ea376ba into hyperledger:main Nov 21, 2023
17 of 18 checks passed
jflo pushed a commit to jflo/besu that referenced this pull request Dec 4, 2023
…er#6187)

* don't disconnect if peer has enough of a score increase to have been useful

* use threshold not increase

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

---------

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Justin Florentine <justin+github@florentine.us>
jflo pushed a commit to jflo/besu that referenced this pull request Dec 4, 2023
…er#6187)

* don't disconnect if peer has enough of a score increase to have been useful

* use threshold not increase

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

---------

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
@macfarla macfarla mentioned this pull request Dec 5, 2023
@fab-10
Copy link
Contributor

fab-10 commented Dec 5, 2023

It think that PR should go along with these two PR that switch peer on retry, otherwise it is possible that we get stuck with a useless peer
#5508 #5509

gfukushima pushed a commit to gfukushima/besu that referenced this pull request Dec 15, 2023
…er#6187)

* don't disconnect if peer has enough of a score increase to have been useful

* use threshold not increase

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>

---------

Signed-off-by: Sally MacFarlane <macfarla.github@gmail.com>
Signed-off-by: Gabriel Fukushima <gabrielfukushima@gmail.com>
@macfarla macfarla deleted the peer-reputation-high-score-dont-disconnect branch January 3, 2024 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants