-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ban peer that advertises higher PoW than able to provide #3478
fix: ban peer that advertises higher PoW than able to provide #3478
Conversation
52f1e6f
to
2d4a8fb
Compare
- Can only transition to `HeaderSync` if claimed chain metadata is advertised - `HeaderSync` is now aware of the claimed `ChainMetadata` - `HeaderSync` now assumes (invariant) that all sync peers have claimed a higher accumulated PoW and will ban them if the validated accumulated difficulty does not reach the claimed acc diff. - Adds ban condition in `determine_sync_status` phase, if a peer is not able to improve on the local chain strength (because we know that in order to be selected for header sync it must have advertised a stronger chain) - Adds ban condition if header sync completes but is less than the claimed PoW. This is not strictly necessary since they were still able to provide a stronger chain as per Nakamoto consensus, but could still indicate some malicious intent. - If sync fails for all peers, the state machine continues rather than `WAITING`. This removes the disruption that false metadata can cause. - fix `select_sync_peers` to include peers claim that provide a enough full blocks for _our_ pruning horison (fixes cucumber test) higher than the local pruned
2d4a8fb
to
d124e02
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy with the code for the most part, but I don't like the extra pruned_height
check in Listening
…ct-chain-metadata * development: fix: fix recovery test reporting message (tari-project#3479) chore: improve cucumber tests to wait for broadcast (tari-project#3461) test: use TCP node for daily sync test (tari-project#3464)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, but the attack is still possible. We need to check each peer to its provided meta_data not just what headers it can provide.
actual: None, | ||
local: local_total_accumulated_difficulty, | ||
}), | ||
SyncStatus::WereAhead => Err(BlockHeaderSyncError::PeerSentInaccurateChainMetadata { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when we are the first node in our peer list to receive a block via sync? Won't we ban all our other peers?
We select our sync peers not only on who sent meta_data to us but also on whom we are connected.
So a peer might have provided accurate metadata and did not cause the activation of sync mode, but it can get banned here because some other peer provided fake metadata.
So I am thinking of this, what happens if a peer sends invalid metadata, but does not respond to the dial request. Your node won't sync to that peer because it thinks it's offline and keep it out of sync, but it will ban the other good nodes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check each peer to its provided meta_data not just what headers it can provide.
Not sure I follow, validation of headers (PoW) is the only way to determine if the peer is honest about its metadata claim
We select our sync peers not only on who sent meta_data to us but also on whom we are connected.
Good point, which is why I made this no longer the case. In order for the bn to enter the HeaderSync state, the peer MUST have made a claim to have a better block. This is an invariant of the HeaderSync state. So basically, I removed the code that dials all connected peers and now leave it up to the listening state to provide peers that have made a better metadata claim.
So I am thinking of this, what happens if a peer sends invalid metadata, but does not respond to the dial request.
Another good point. However, a peer is already connected when they make a metadata claim. A peer could time it to disconnect, which, as you say, would cause the header sync to fail. So to combat this, if a header sync fails the state machine immediately continues to listening state. This should cause no disruption (that is, a component listening for state events SHOULD NOT disrupted by the node entering the HeaderSync state - e.g mining should continue unless the chain has been confirmed to be behind)
EDIT: I believe the WAITING state after a failed header sync is the primary cause of disruptions, the change here is to follow a "all peers couldn't live up to their promises" header sync with going straight back to is_synced = true
listening state as though nothing happened (and ban the peers that were naughty)
it will ban the other good nodes?
No it will only ban the node that was unable to back up its claim, due to the change that a claim has to be made to be included in HeaderSync sync_peers list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too sure of the behaviour of upstream components (miner, maybe mempool, that's it?). I will look into and run a test on Sunday, but AFAIK it will not interrupt mining. The miner will only not begin if is_bootstrapped is false but after that will always continue on - will run a test to be sure, could be wrong.
EDIT: if necessary, another flag could be added is_synced
that is set to false when the headers have proven a stronger PoW
EDIT2: Another potential change would be for header sync to select the active connection, rather than using dial_peer
which will dial if necessary, which should never be - but I don't think this can be called an attack if there is no disruption caused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, then we cover this.
EDIT: passed every time 🤷 - rerunning on CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ct-chain-metadata
b177c09
b177c09
to
3018cfe
Compare
…ct-chain-metadata
12b7be7
to
5689976
Compare
5689976
to
9e6fc82
Compare
…ct-chain-metadata
…ct-chain-metadata * development: (32 commits) test: improve cucumber with wallets (tari-project#3507) feat: optimize pending transactions inbound query (tari-project#3500) test: add trace logs to wallet's base node monitor (tari-project#3502) fix: improve test Wallet should display transactions made (tari-project#3501) test: change timeouts in ci to reasonable values (tari-project#3494) fix: correct panic in tracing for comms (tari-project#3499) feat: optimize get transactions query (tari-project#3496) fix: fix config file whitespace issue when auto generated in windows (tari-project#3491) bump to rerun tests fix: improve responsiveness of wallet base node switching (tari-project#3488) feat: add decay_params method (tari-project#3454) Revert "macos-11" macos-11 clean remove scripts after install install to tmp then use script to copy to home wip wip path wip ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man this PR is getting stuck in the tumbler. I am going to merge it since the CI seems to have broken elsewhere
Description
HeaderSync
if claimed chain metadata isadvertised
HeaderSync
is now aware of the claimedChainMetadata
HeaderSync
now assumes (invariant) that all sync peers haveclaimed a higher accumulated PoW and will ban them if the validated
accumulated difficulty does not reach the claimed acc diff.
determine_sync_status
phase, if a peer is notable to improve on the local chain strength (because we know that in
order to be selected for header sync it must have advertised a
stronger chain)
This is not strictly necessary since they were still able
to provide a stronger chain as per Nakamoto consensus, but could still
indicate some malicious intent.
WAITING
. This removes the disruption that false metadata can cause.select_sync_peers
to include peers that claim to provide enough fullblocks for our pruning horizon (fixes cucumber test)
Motivation and Context
A peer may disrupt a base node by sending false chain metadata. This PR ensures that if the remote peer cannot provide
the claimed chain, it will get banned and the base node will continue in listening state immediately.
How Has This Been Tested?
Ran sync cucumber tests
Manually, by creating a node that inflates it's advertised accumulated Pow by a small amount and checking that the node is banned by the syncing node.
not entirely related: add block sync RPC unit tests