Optimize fork sync targets handling #1199
Closed
Commits (6):
- 6221582 Optimize fork sync targets handling (prybalko)
- 606c3c8 fix fmt (prybalko)
- fc259ff Merge branch 'master' into optimize-sync-fork-targets (prybalko)
- 6b341ef Merge branch 'master' into optimize-sync-fork-targets (prybalko)
- 30736f3 Merge branch 'master' into optimize-sync-fork-targets (prybalko)
- 430e3e5 Merge branch 'master' into optimize-sync-fork-targets (prybalko)
Could you explain this logic in more detail? It checks the fork targets in reverse order by `(number, hash)`, so the last block of the fork is checked first, and if it's not available, it proceeds to the parent of the latest block. Am I understanding this correctly?
Not quite. The logic goes like this: if there are multiple `fork_targets`, in what order should they be checked? I believe the optimal approach is to begin with the `fork_target` with the highest number (i.e. the latest). If it's available, download `(number - finalized)` blocks. If not, move on to the next `fork_target` (which can be unrelated to the first one). The optimization lies in the fact that other `fork_targets` with lower numbers might be part of the same fork that has already been downloaded.

Note: at this point, the `fork_targets` have already been filtered by `number < best_num + max_blocks_per_request`.
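To illustrate the ordering described above, here is a simplified sketch. The types and the `peer_has` callback are invented stand-ins for illustration; the real `ChainSync` code uses the runtime's block hash and number types and different data structures.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the real chain types (illustration only).
type Hash = u64;
type BlockNumber = u32;

struct ForkTarget {
    number: BlockNumber,
}

/// Pick a fork target for a peer, checking targets in reverse
/// (number, hash) order, i.e. latest first. If the latest target is
/// available, everything from the finalized block up to it is requested,
/// so lower-numbered targets on the same fork come along for free.
fn pick_fork_target(
    fork_targets: &HashMap<Hash, ForkTarget>,
    finalized: BlockNumber,
    peer_has: impl Fn(&Hash) -> bool,
) -> Option<(Hash, BlockNumber)> {
    // Sort descending by (number, hash): highest-numbered target first.
    let mut targets: Vec<_> = fork_targets.iter().collect();
    targets.sort_by(|a, b| (b.1.number, b.0).cmp(&(a.1.number, a.0)));

    for (hash, target) in targets {
        if peer_has(hash) {
            // Download all blocks between the finalized block and the target.
            return Some((*hash, target.number.saturating_sub(finalized)));
        }
        // Not available from this peer: fall through to the next target,
        // which may belong to an entirely unrelated fork.
    }
    None
}
```
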
I haven't run the original test with this optimization applied yet; do you remember how much better it got?

It looks like it won't even traverse the fork targets that are too far in the future, which is what we want, but I think this could be optimized even further. Right now, if a peer is not able to provide any fork targets, we still have to go through all of them, which is not optimal. Which fork targets a peer is able to provide should probably be stored, so the list of targets wouldn't have to be looped through on each call to `fork_sync_request()`. This will admittedly complicate the design, and as @dmitry-markin is already working on the syncing code, it may be preferable not to make significant refactors until we've decided how `ChainSync` should be structured.
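One way the per-peer bookkeeping suggested here could look is sketched below. All names and types are invented for illustration; this is not the real `ChainSync` API.

```rust
use std::collections::{HashMap, HashSet};

// Invented, simplified types for illustration only.
type PeerId = u32;
type Hash = u64;

/// Hypothetical cache remembering which fork targets a peer has already
/// failed to provide, so that fork_sync_request() can skip them instead
/// of looping over the full target list on every call.
#[derive(Default)]
struct PeerForkTargets {
    unavailable: HashMap<PeerId, HashSet<Hash>>,
}

impl PeerForkTargets {
    /// Record that `peer` could not provide `target`.
    fn mark_unavailable(&mut self, peer: PeerId, target: Hash) {
        self.unavailable.entry(peer).or_default().insert(target);
    }

    /// Should `target` still be requested from `peer`?
    fn should_request(&self, peer: PeerId, target: &Hash) -> bool {
        self.unavailable
            .get(&peer)
            .map_or(true, |set| !set.contains(target))
    }
}
```

Entries would need to be pruned when a fork target is imported or dropped, which is part of the added design complexity mentioned above.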
Although the `fork_targets` were clearly not stored in an optimal way, I didn't see any significant change in performance. In my tests, the difference was negligible, as the time spent in `fork_sync_request` is measured in microseconds even for 10k blocks.
The logic itself looks good, but I'm not sure this improvement makes a significant difference. The thing is, due to the bug paritytech/substrate#12830, `fork_sync_request` is only capable of downloading forks shorter than `max_blocks_per_request` (128 blocks), and the performance difference at that scale may not be that big. For longer forks, maybe we somehow invert (or at least should invert) the notion of fork vs. canonical chain and download the fork as a canonical chain, if we are lucky (or get stuck, as in the issue above).

Also, as a downside: if we have multiple fork targets on the same fork, this change increases the chances that we hit the 128-block limit (since we start from the higher fork targets) and never import the fork.

The problems described are due to the bug paritytech/substrate#12830, not due to this PR per se.