snap sync does not work on polygon fork #25965
So adjusting the pivot block is not really doable, unless the whole peer ecosystem changes the pruning thresholds.
Anyway, tl;dr: if you change the trienode heal request/response size, you may shoot yourself in the foot. Changing the size on other types (account/storage) is probably less dangerous, but you should keep an eye out for timeouts -- if a request times out the data is thrown away, so that would be a net loss.
This should work fine, but there are a couple of caveats... Let's call the full-synced node If So what you need to do is basically to start it up, then import 128 blocks, and then take it offline. Or, set it to

Also, here are some recent fixes we've made to snap-sync: We are going to backport them and make a new release with them. I recommend that you also make use of these fixes.

Lastly, re places to dig further: it would benefit you to understand your own trie-churn, given your time ( Now, let's say it's for example 5k modifications, spread out across the trie. Then you could model how many trie heal requests would be needed to heal that.

I'll leave this ticket open for a while longer, in case @karalabe has anything to add.
Thanks a lot for the response.
I see. Also, just to confirm: peers would only respond if they have those blocks in the difflayer, which is present due to the
Alright. Yes, we did try using a static timeout (~30s I believe) instead of the dynamic logic there. Re that experiment, we didn't think of this. We'll try to pull in some of the changes/fixes you've made and re-run that experiment, making sure that the last 128 block states are available on the full-synced node.
Any scripts that can help us speed up the process? We'll get moving on these action items quickly and share the results here. CC: @JekaMas
In order for a peer to deliver responses to account/storage requests, they need to have the snapshot layer for that root, and the trie nodes for that root. In order for a peer to deliver responses to trie healing requests, they theoretically only need to have the trie nodes for that root. In practice, however, they required having the snapshot layer too, until this PR (#25644).
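The serving conditions described above can be captured in a small sketch. This is an assumed model for illustration, not geth's actual implementation; the type and function names (`peerState`, `canServe`) are hypothetical.

```go
package main

import "fmt"

// peerState is a toy model of what a peer has available for a given
// state root (keys are hypothetical root identifiers).
type peerState struct {
	hasSnapshotLayer map[string]bool // root -> snapshot layer available
	hasTrieNodes     map[string]bool // root -> trie nodes available
}

// canServe models the conditions from the comment above: account/storage
// requests need both the snapshot layer and the trie nodes; heal requests
// theoretically need only the trie nodes, but in practice also required
// the snapshot layer until PR #25644.
func canServe(p peerState, root string, isHealRequest, afterPR25644 bool) bool {
	if isHealRequest && afterPR25644 {
		return p.hasTrieNodes[root]
	}
	return p.hasTrieNodes[root] && p.hasSnapshotLayer[root]
}

func main() {
	p := peerState{
		hasSnapshotLayer: map[string]bool{},
		hasTrieNodes:     map[string]bool{"0xroot": true},
	}
	fmt.Println(canServe(p, "0xroot", true, false)) // false: snapshot layer also required
	fmt.Println(canServe(p, "0xroot", true, true))  // true: trie nodes alone suffice
}
```

The practical upshot is that, before that PR, a fork with aggressive snapshot pruning shrinks the set of roots peers can heal against even when the trie data itself is still present.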
Nothing off the top of my head, no. This might also be interesting to look into: #25022
Seems answered, closing.
Hey team, we’ve been unable to run snap sync on our mainnet (https://github.com/maticnetwork/bor). To be more specific, it completes the state sync phase, but keeps running in a never-ending healing phase. We are aware that the sync has to run faster than block production and state growth in order to finish. We have performed some experiments and checks on our end (the block time and block gas limit for our mainnet are 2s and 30M respectively). Also, we know that the process is I/O- and network-heavy, and we've allocated more than enough of both on these machines.
We’re currently exploring some ways to understand the mechanism and internals through tests, as we thought just tweaking the parameters might not help and would make the process much longer. It would be great if you could suggest some important points/places to dig further, or some experiments to conduct and ways to do so (like finding more internal details about the trie nodes and the rate at which they’re being produced vs downloaded, etc.).
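One quick experiment along the lines of "production rate vs download rate" is a convergence check: healing can only finish if trie nodes are fetched faster than the chain dirties them. This is a hedged sketch with made-up inputs (backlog size, churn and download rates are all hypothetical); `healETA` is an illustrative helper, not a geth API.

```go
package main

import "fmt"

// healETA estimates how long healing takes given an outstanding backlog
// of trie nodes, the rate at which new nodes are dirtied by block
// production (churn), and the node download rate. If churn matches or
// exceeds downloads, healing never catches up.
func healETA(backlog, churnPerSec, downloadPerSec float64) (float64, bool) {
	net := downloadPerSec - churnPerSec
	if net <= 0 {
		return 0, false // healing never converges
	}
	return backlog / net, true
}

func main() {
	// Hypothetical: 1M outstanding heal nodes, 12k nodes/s churned
	// (e.g. 24k dirty nodes per 2s block), 15k nodes/s downloaded.
	eta, ok := healETA(1e6, 12000, 15000)
	if ok {
		fmt.Printf("healing converges in ~%.0f seconds\n", eta)
	} else {
		fmt.Println("healing never converges; raise bandwidth or reduce churn")
	}
}
```

Measuring your actual churn (dirty nodes per block) and sustained download throughput and plugging them into a check like this would tell you quickly whether the never-ending heal phase is a throughput problem or something else.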
Let us know if there's anything more we can share from our end. Thanks!
EDIT: the tag was auto-chosen as "docs". I'd put it under "help wanted".