Use block requests to check if block responses are correct #7653

bkchr · 2020-12-02T10:11:50Z

Before this pr sync relied on recently announced blocks to check if a
given peer response is correct. However this could lead to situations
where we requested a block from a peer and it gave us the requested, but
we rejected the response because this peer never send us an announcement
for the given block. See the added tests for a reproduction of the
problem.
With this pr, we now take the block request to check if a given response
matches the request. A node should not send us a block response
without a request anyway.

Essentially there is still a bug, because as you see in the test, we are
requesting block 2, while we already have this block imported. It even
happens that we request a block from the network that we have authored.
However a fix for this would require some more refactoring of the sync code.

Before this pr sync relied on recently announced blocks to check if a given peer response is correct. However this could lead to situations where we requested a block from a peer and it gave us the requested, but we rejected the response because this peer never send us an announcement for the given block. See the added tests for a reproduction of the problem. With this pr, we now take the block request to check if a given response matches the request. A node should not send us a block response without a request anyway. Essentially there is still a bug, because as you see in the test, we are requesting block 2, while we already have this block imported. It even happens that we request a block from the network that we have authored. However a fix for this would require some more refactoring of the sync code.

client/network/src/block_requests.rs

client/network/src/protocol/sync.rs

arkpar · 2020-12-02T15:19:22Z

Essentially there is still a bug, because as you see in the test, we are
requesting block 2, while we already have this block imported.

In some cases this is inevitable. If a peer suddenly sends an announcement for a future block that we don't know anything about, we request a chain starting from the last known common block. If they did not announce that they now have intermediate blocks, we can't be sure if that future block is continuing a known chain.

It looks like in your tests some peers skip some announcements. I. e. Peer1 was on block 1 but suddenly sends and announcement for block 4? During normal operation this should not happen. All blocks are announced to all peers, unless full sync mode.

arkpar · 2020-12-02T15:23:03Z

client/network/src/protocol/sync.rs

@@ -1685,20 +1668,60 @@ fn validate_blocks<Block: BlockT>(blocks: &Vec<message::BlockData<Block>>, who:
 			}
 		}
 	}
+
+	if let Some(request) = request {


This should be checked first, before doing more expensive checks, such as computing trie roots.
Also, some of the checks against request are being performed on the protocol level here:
https://github.com/paritytech/substrate/blob/master/client/network/src/protocol.rs#L709
Might be a good idea to put them in one place.

bkchr · 2020-12-02T15:28:24Z

It looks like in your tests some peers skip some announcements. I. e. Peer1 was on block 1 but suddenly sends and announcement for block 4? During normal operation this should not happen. All blocks are announced to all peers, unless full sync mode.

I can give you a trace. This is exactly what the logs showed 🤷

arkpar · 2020-12-02T15:40:46Z

I can give you a trace. This is exactly what the logs showed 🤷

Would like to check it. Might be a further problem with how announcements are sent out or ignored when received.

Don't get me wrong, this is still a good test as we should account for all kinds of misbehaviour. And peers might legitimately skip announcements if they fall back to full sync mode for some reason, but this should not happen regularly. And if they do skip announcements and make us download extra blocks, we should not disconnect, but decrease reputation by a bit. This is what the current code does.

bkchr · 2020-12-02T15:43:23Z

Yeah :) The problem we have seen is that peers are getting disconnected really fast with latest polkadot master. I will send you the log

arkpar · 2020-12-02T16:34:28Z

Examining the logs, I guess the issue is that we don't receive back announcements for blocks that we have already announced to peers (e.g. blocks that are collated by this node). So the original issue is that recently_announced should really include not only received, but sent out announcements as well. But since you've removed it anyway, is should be OK

Looks good, modulo @tomaka's concerns.

tomaka

No objection from me

…h#7653) * Use block requests to check if block responses are correct Before this pr sync relied on recently announced blocks to check if a given peer response is correct. However this could lead to situations where we requested a block from a peer and it gave us the requested, but we rejected the response because this peer never send us an announcement for the given block. See the added tests for a reproduction of the problem. With this pr, we now take the block request to check if a given response matches the request. A node should not send us a block response without a request anyway. Essentially there is still a bug, because as you see in the test, we are requesting block 2, while we already have this block imported. It even happens that we request a block from the network that we have authored. However a fix for this would require some more refactoring of the sync code. * Revert change * Give the test a proper name * Add moar logging * Move cheaper checks * Move checks to common place

bkchr added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Dec 2, 2020

bkchr requested review from tomaka and arkpar December 2, 2020 10:11

Merge remote-tracking branch 'origin/master' into bkchr-sync-fix-39434

035321b

tomaka reviewed Dec 2, 2020

View reviewed changes

client/network/src/block_requests.rs Outdated Show resolved Hide resolved

tomaka reviewed Dec 2, 2020

View reviewed changes

client/network/src/protocol/sync.rs Outdated Show resolved Hide resolved

bkchr added 3 commits December 2, 2020 11:21

Revert change

2904e58

Give the test a proper name

55bc7cd

Add moar logging

afd9f16

arkpar reviewed Dec 2, 2020

View reviewed changes

bkchr added 3 commits December 2, 2020 16:48

Move cheaper checks

f243ccb

Move checks to common place

042fb09

Merge remote-tracking branch 'origin/master' into bkchr-sync-fix-39434

be75b2d

tomaka approved these changes Dec 2, 2020

View reviewed changes

arkpar approved these changes Dec 2, 2020

View reviewed changes

bkchr merged commit 864ce58 into master Dec 3, 2020

bkchr deleted the bkchr-sync-fix-39434 branch December 3, 2020 14:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use block requests to check if block responses are correct #7653

Use block requests to check if block responses are correct #7653

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020

arkpar Dec 2, 2020

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020 •

edited

Loading

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020

tomaka left a comment

Use block requests to check if block responses are correct #7653

Use block requests to check if block responses are correct #7653

Conversation

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020

arkpar Dec 2, 2020

Choose a reason for hiding this comment

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020 • edited Loading

bkchr commented Dec 2, 2020

arkpar commented Dec 2, 2020

tomaka left a comment

Choose a reason for hiding this comment

arkpar commented Dec 2, 2020 •

edited

Loading