
Optimize fork sync targets handling #1199

Closed
prybalko wants to merge 6 commits from the optimize-sync-fork-targets branch

Conversation

prybalko

Resolves #525 (Fork sync targets are handled inefficiently).

Utilize a BTreeMap to store fork_targets, providing efficient fork selection based on the block number.
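
A minimal sketch of the data layout, with simplified stand-in types (the real syncing code is generic over the Block type, and ForkTarget here is a trimmed-down illustration, not the actual struct):

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the real block number and hash types.
type BlockNumber = u64;
type Hash = [u8; 32];

// Trimmed-down illustration of a fork target entry.
struct ForkTarget {
    peers: Vec<&'static str>,
}

fn main() {
    // Keying the map by (number, hash) keeps fork targets ordered by
    // block number, so selecting targets below some limit becomes a
    // cheap range query instead of a scan over an unordered map.
    let mut fork_targets: BTreeMap<(BlockNumber, Hash), ForkTarget> = BTreeMap::new();
    fork_targets.insert((10, [1; 32]), ForkTarget { peers: vec!["alice"] });
    fork_targets.insert((42, [2; 32]), ForkTarget { peers: vec!["bob"] });
    fork_targets.insert((99, [3; 32]), ForkTarget { peers: vec!["carol"] });

    // Only targets with number < limit are worth requesting right now.
    let limit: BlockNumber = 50;

    // `[0u8; 32]` is the smallest possible hash, so this upper bound
    // selects every key with number < limit. Visit the highest
    // eligible target first (see the review thread below).
    for ((number, _hash), target) in fork_targets.range(..(limit, [0u8; 32])).rev() {
        println!("candidate fork head #{number} announced by {:?}", target.peers);
    }
}
```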

@prybalko prybalko added the R0-silent (Changes should not be mentioned in any release notes) and I9-optimisation (An enhancement to provide better overall performance in terms of time-to-completion for a task) labels Aug 28, 2023
@prybalko prybalko requested a review from altonen August 28, 2023 07:01
Comment on lines +3004 to +3006
```rust
// Iterate in reverse order to download target with the highest number first.
// Other targets might belong to the same fork, so we don't need to download them separately.
for ((number, hash), r) in targets.range(range).rev() {
```
Contributor

Could you explain this logic in more detail? It checks the fork targets in reverse order by (number, hash), so the last block of the fork is checked first, and if it's not available, it proceeds to the parent of the latest block. Am I understanding this correctly?

Author
@prybalko prybalko Aug 31, 2023

Not quite. The logic goes like this:

If there are multiple fork_targets, in what order should they be checked?
I believe the optimal approach would be to begin with the fork_target with the highest number (i.e. the latest). If it's available, download (number - finalized) blocks. If not, move on to the next fork_target (which can be unrelated to the first one).
The optimization lies in the fact that other fork_targets with lower numbers might be part of the same fork that has already been downloaded.

Note: at this point fork_targets have already been filtered by number < best_num + max_blocks_per_request
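
A hedged sketch of that selection order, reusing the simplified types from the sketch in the description above (select_fork_request and peer_has are illustrative names and signatures, not the real ChainSync API):

```rust
use std::collections::BTreeMap;

type BlockNumber = u64;
type Hash = [u8; 32];
struct ForkTarget; // payload elided for the sketch

// Hypothetical selection helper; `peer_has` stands in for "this peer
// can serve this block" (not the real ChainSync API).
fn select_fork_request(
    targets: &BTreeMap<(BlockNumber, Hash), ForkTarget>,
    limit: BlockNumber,     // best_num + max_blocks_per_request
    finalized: BlockNumber, // number of the last finalized block
    peer_has: impl Fn(&Hash) -> bool,
) -> Option<(Hash, BlockNumber)> {
    // Walk eligible targets (number < limit) from the highest number down.
    for ((number, hash), _target) in targets.range(..(limit, [0u8; 32])).rev() {
        if peer_has(hash) {
            // Request `number - finalized` blocks descending from `hash`;
            // lower targets on the same fork are covered by this download.
            return Some((*hash, number - finalized));
        }
        // Otherwise fall through to the next-highest target, which may
        // belong to a completely unrelated fork.
    }
    None
}
```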

Contributor

I haven't run the original test with this optimization applied yet, but how much better did it get, do you remember?

It looks like it won't even traverse the fork targets that are too far into the future, which is what we want, but I think this could be optimized even further. Right now, if a peer is not able to provide any fork targets, we still have to go through all of them, which is not optimal. Which fork targets each peer is able to provide should probably be stored, so the list of targets wouldn't have to be looped through each time we call fork_sync_request(); a rough sketch of that idea follows below. Admittedly this will complicate the design, and as @dmitry-markin is already working on the syncing code, it may be preferable not to make significant refactors until we've decided how ChainSync should be structured.
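
A hedged sketch of that per-peer bookkeeping (PeerTargetCache, PeerId, and the invalidation rules are assumptions for illustration, not the actual design):

```rust
use std::collections::{BTreeSet, HashMap};

type BlockNumber = u64;
type Hash = [u8; 32];
type PeerId = u64; // stand-in for the real peer identifier

// Hypothetical cache: remember which fork targets a peer could not
// provide, so fork_sync_request() can skip them next time instead of
// re-walking the whole target list on every call.
#[derive(Default)]
struct PeerTargetCache {
    unavailable: HashMap<PeerId, BTreeSet<(BlockNumber, Hash)>>,
}

impl PeerTargetCache {
    fn mark_unavailable(&mut self, peer: PeerId, target: (BlockNumber, Hash)) {
        self.unavailable.entry(peer).or_default().insert(target);
    }

    fn should_skip(&self, peer: PeerId, target: &(BlockNumber, Hash)) -> bool {
        self.unavailable
            .get(&peer)
            .map_or(false, |set| set.contains(target))
    }

    // Entries must be invalidated when targets are removed or a peer's
    // best block advances; those rules are exactly the added design
    // complexity mentioned above.
    fn forget_peer(&mut self, peer: PeerId) {
        self.unavailable.remove(&peer);
    }
}
```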

Author
@prybalko prybalko Aug 31, 2023

> how much better did it get, do you remember?

Although fork_targets were clearly not stored in an optimal way, I didn't see any significant changes in performance. In my tests, the difference was negligible, as the time spent in fork_sync_request is in microseconds even for 10k blocks.

Contributor

The logic itself looks good, but I'm not sure this improvement makes a significant difference. The thing is, due to the bug paritytech/substrate#12830, fork_sync_request is only capable of downloading forks shorter than max_blocks_per_request (128 blocks), and the performance difference at this scale may not be that big. For longer forks, maybe we somehow invert (or at least should) the notion of fork vs. canonical chain and download the fork as a canonical chain if we are lucky (or get stuck, as in the issue above).

Also, as a downside, it looks like if we have multiple fork targets on the same fork, this change increases the chances that we hit the 128-block limit (as we are starting from the higher fork targets) and never import the fork.

The problems described are due to the bug paritytech/substrate#12830 and not due to this PR per se.

@altonen altonen requested a review from dmitry-markin August 31, 2023 11:35
Member

bkchr commented Nov 8, 2023

@altonen @dmitry-markin what are we gonna do with this?

Contributor

altonen commented Nov 8, 2023

> @altonen @dmitry-markin what are we gonna do with this?

This would need a Versi burn-in at least, and the last time I spoke with Pavel about this, there was some confusion (at least on my side) about why this didn't improve the original test that was very slow when I last debugged it a year ago. I think we can keep this open and revisit once the sync refactoring @dmitry-markin has started is done and we have some time to look into the fork targets.

bkchr pushed a commit that referenced this pull request Apr 10, 2024
@bkchr bkchr closed this Jul 17, 2024
@bkchr bkchr deleted the optimize-sync-fork-targets branch July 17, 2024 12:28