snap sync does not work on polygon fork #25965

Closed
manav2401 opened this issue Oct 11, 2022 · 4 comments

manav2401 commented Oct 11, 2022

Hey team, we've been unable to run snap sync on our mainnet (https://github.com/maticnetwork/bor). To be more specific, it completes the state sync phase, but keeps running in a never-ending healing phase. We are aware that the sync has to outpace block production and state growth in order to finish. We have performed some experiments and checks on our end (the block time and block gas limit for our mainnet are 2s and 30M respectively). Also, we know that the process is I/O and network heavy, and we've allocated more than enough of both on these machines.

  1. We tried running a node in snap sync on our Mumbai testnet to make sure the issue has nothing to do with the other components of our PoS chain, specifically the consensus. It works well on the testnet.
  2. We tried scaling up some of the parameters involved in the sync mechanism, like the pivot marker, the size (in bytes) of data to be received in the snap sync trie node (and storage) requests, and the dynamic timeout, to observe the behaviour. We did not see any significant changes in the mechanism. Also, we're not sure which metrics would be appropriate to watch while modifying these parameters. To be specific, modifying the pivot parameters didn't really work: it stopped the healing phase and the node stopped syncing entirely.
  3. We conducted an experiment where we took a fully synced mainnet node, disconnected it from the outside world, and only let one fresh node snap sync from it. We saw a lot of peer connectivity issues, and at some point the snap syncing node wasn't able to connect to the fully synced peer (maybe it figured out that the peer was stale?). This was an attempt to see whether the issue is the state moving too fast.

We're currently exploring some ways to understand the mechanism and internals through tests, as we thought just tweaking the parameters might not help and would make the process much longer. It would be great if you could suggest some important points/places to look at to dig further, or some experiments to conduct and ways to do so (like finding more internal details about the trie nodes and the rate at which they're being produced vs downloaded, etc.).

Let us know if there's anything more we can share from our end. Thanks!

EDIT: the label was auto-chosen as "docs". I'd put this under "help wanted".


holiman commented Oct 11, 2022

We tried scaling up some of the parameters involved in the sync mechanism, like the pivot marker, the size (in bytes) of data to be received in the snap sync trie node (and storage) requests, and the dynamic timeout, to observe the behaviour.

  1. Pivot. Ideally, you never want to pivot. Pivoting is not something we want to do; it's a necessity due to the fact that peers do in-memory pruning. The (geth) in-memory pruning does not touch the most recent 128 states, but once a state becomes older than 128 blocks, we gc it as best we can. So, if you want your peers to deliver, you need to ask them about roots that are within the last 128 blocks, otherwise you'll get no responses.

So adjusting the pivot block is not really doable, unless the whole peer ecosystem changes the pruning thresholds.
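
A minimal sketch of that constraint, with hypothetical constants and names (this is not the actual downloader code): the root you sync against has to stay inside your peers' ~128-state pruning window, so the pivot can only move forward with the chain, never further back.

```go
package main

import "fmt"

const (
	triesInMemory = 128 // assumed: states a serving geth peer keeps before gc
	pivotOffset   = 64  // assumed: distance behind head at which a fresh pivot is picked
	safetyMargin  = 16  // assumed: re-pivot a bit before peers actually prune the root
)

// choosePivot returns the block whose state to sync, and whether the pivot had
// to be moved because the old root is about to fall out of peers' windows.
func choosePivot(head, currentPivot uint64) (uint64, bool) {
	if head < pivotOffset {
		return 0, false
	}
	if currentPivot == 0 || head-currentPivot >= triesInMemory-safetyMargin {
		return head - pivotOffset, true
	}
	return currentPivot, false
}

func main() {
	pivot := uint64(0)
	for _, head := range []uint64{1000, 1040, 1120} {
		var moved bool
		pivot, moved = choosePivot(head, pivot)
		fmt.Printf("head=%d pivot=%d repivoted=%v\n", head, pivot, moved)
	}
}
```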

  2. Size of data. Retrieving larger trie node responses during healing may have some effect, but it's not an easy problem. The trienode healing is essentially a classic fast-sync, and we can only start storing to disk once we reach the leaf level. If we ask for too much data, we expand the trie iteration too much in breadth -- what we ideally want is to go depth-first. We recently made a fix in this area (eth/protocols/snap: throttle trie heal requests when peers DoS us #25666), specifically for this problem.

Anyway, tl;dr: if you change the trienode heal request/response size, you may shoot yourself in the foot. Changing the size for other types (account/storage) is probably less dangerous, but you should keep an eye out for timeouts -- if a request times out, the data is thrown away, so that would be a net loss.
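
To make the "net loss" point concrete, here is a trivial sketch (illustrative numbers, not geth's scheduler): a bigger request fetches more per round trip, but a response that misses the deadline is discarded wholesale.

```go
package main

import (
	"fmt"
	"time"
)

type response struct {
	bytes   int           // payload size delivered by the peer
	elapsed time.Duration // how long the request/response round trip took
}

// usefulBytes returns how much of a response is actually kept: anything that
// arrives after the timeout is thrown away in full.
func usefulBytes(r response, timeout time.Duration) int {
	if r.elapsed > timeout {
		return 0
	}
	return r.bytes
}

func main() {
	timeout := 5 * time.Second
	small := response{bytes: 128 * 1024, elapsed: 2 * time.Second} // modest request, arrives in time
	large := response{bytes: 512 * 1024, elapsed: 7 * time.Second} // oversized request, times out
	for _, r := range []response{small, large} {
		fmt.Printf("asked for %d bytes, kept %d\n", r.bytes, usefulBytes(r, timeout))
	}
}
```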

We conducted an experiment where we took a fully synced mainnet node, disconnected it from the outside world, and only let one fresh node snap sync from it.

This should work fine, but there are a couple of caveats... Let's call the fully synced node A.

If A is shut down, then restarted with e.g. --netrestrict so that it only sees local nodes, then you have a problem. The shutdown will cause it to store trie nodes for only three states: head, head-1 and head-127 (iirc). It will happily try to serve snap data, and it has a lot of snap data for all the most recent 128 layers, but it will only be able to provide proofs (from the trie) for those three specific states. Requests for any other root will not yield any response.

So what you need to do is basically start it up, import 128 blocks, and then take it offline. Or set it to --gcmode=archive, let it import 128 blocks in archive mode, and shut it down. After that, you can boot it up and it will have all of the last 128 states available.
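
Before re-running the single-peer experiment, a quick sanity check along these lines might help: probe node A for state at each of its last 128 blocks over JSON-RPC; heights whose state was pruned come back with a "missing trie node"-style error. A rough sketch, assuming node A exposes HTTP RPC locally (the endpoint and probe address are placeholders):

```go
package main

import (
	"context"
	"fmt"
	"math/big"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	ctx := context.Background()
	client, err := ethclient.Dial("http://127.0.0.1:8545") // node A's RPC endpoint (assumed)
	if err != nil {
		panic(err)
	}
	head, err := client.BlockNumber(ctx)
	if err != nil {
		panic(err)
	}
	probe := common.HexToAddress("0x0000000000000000000000000000000000000000") // any address works
	available := 0
	for i := uint64(0); i < 128 && i <= head; i++ {
		n := new(big.Int).SetUint64(head - i)
		// BalanceAt needs the state for that block; it errors if that state was pruned.
		if _, err := client.BalanceAt(ctx, probe, n); err == nil {
			available++
		}
	}
	fmt.Printf("state available for %d of the last 128 blocks\n", available)
}
```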

Also, here are some recent fixes we've made to snap-sync:
https://github.com/ethereum/go-ethereum/pulls?q=is%3Apr+is%3Aclosed+label%3Abackport

We are going to backport them and make a new release with them. I recommend that you also make use of these fixes.

Lastly, re places to dig further: it would benefit you to understand your own trie churn, given your block time (2s) and gas limit (30M). If you put the node in archive mode, each new trie node will be stored to disk after each block. During commit, you should thus be able to get some raw figures on exactly how many trie updates are performed during a block.

Now, let's say it's for example 5k modifications, spread out across the trie. Then you could model how many trie heal requests would be needed to heal that.
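
A back-of-envelope version of that model (all numbers below are illustrative assumptions, not measurements): healing only terminates if trie nodes are downloaded faster than the chain invalidates them.

```go
package main

import "fmt"

func main() {
	const (
		blockTime        = 2.0    // seconds per block (bor mainnet)
		churnPerBlock    = 5000.0 // assumed trie nodes modified per block
		nodesPerResponse = 500.0  // assumed trie nodes delivered per heal response
		roundTrip        = 0.3    // assumed seconds per heal request/response
	)

	churnRate := churnPerBlock / blockTime   // nodes invalidated per second
	healRate := nodesPerResponse / roundTrip // nodes healed per second from a single peer
	fmt.Printf("churn: %.0f nodes/s, heal: %.0f nodes/s\n", churnRate, healRate)
	if healRate <= churnRate {
		fmt.Println("healing never catches up: more peers/bandwidth needed, or less churn")
	} else {
		fmt.Println("healing can catch up; the backlog shrinks over time")
	}
}
```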

I'll leave this ticket open for a while longer, in case @karalabe has anything to add.

@manav2401 (Contributor, Author)

Thanks a lot for the response.

So adjusting the pivot block is not really doable, unless the whole peer ecosystem changes the pruning thresholds.

I see. Also, just to confirm, peers would only respond if they have those blocks in the difflayer which is present due to the --snapshot flag, right?

Anyway, tl;dr: if you change the trienode heal request/response size, you may shoot yourself in the foot. Changing the size for other types (account/storage) is probably less dangerous, but you should keep an eye out for timeouts -- if a request times out, the data is thrown away, so that would be a net loss.

Alright. Yes, we did try using a static timeout (~30s I believe) instead of the dynamic logic there.

Re that experiment, we didn't think of this. We'll pull in some of the changes/fixes you've made and re-run the experiment, making sure that the last 128 block states are available on the fully synced node.

Lastly, re places to dig further: it would benefit you to understand your own trie churn, given your block time (2s) and gas limit (30M). If you put the node in archive mode, each new trie node will be stored to disk after each block. During commit, you should thus be able to get some raw figures on exactly how many trie updates are performed during a block.

Any scripts that can help us speed up the process?

We'll get moving on these action items quickly and share the results here. CC: @JekaMas


holiman commented Oct 11, 2022

I see. Also, just to confirm, peers would only respond if they have those blocks in the difflayer which is present due to the --snapshot flag, right?

In order for a peer to deliver responses to account/storage requests, it needs to have both the snapshot layer for that root and the trie nodes for that root.

In order for a peer to deliver responses to trie healing requests, it theoretically only needs to have the trie nodes for that root. In practice, however, it required having the snapshot layer too, until this PR (#25644).
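
Put as a purely illustrative predicate (hypothetical types, not geth's actual handler code), the two serving requirements look like this:

```go
package main

import "fmt"

type peerState struct {
	snapshotLayers map[string]bool // flat snapshot layers the peer holds, keyed by state root
	trieNodes      map[string]bool // complete trie node sets the peer holds, keyed by state root
}

// canServeRanges: account/storage range requests need the snapshot (to iterate
// flat data) plus the trie (to build the Merkle proofs for the range edges).
func (p *peerState) canServeRanges(root string) bool {
	return p.snapshotLayers[root] && p.trieNodes[root]
}

// canServeHeal: trie-node (healing) requests only need the trie itself; before
// PR #25644 geth in practice demanded the snapshot layer here too.
func (p *peerState) canServeHeal(root string) bool {
	return p.trieNodes[root]
}

func main() {
	p := &peerState{
		snapshotLayers: map[string]bool{"rootRecent": true},
		trieNodes:      map[string]bool{"rootRecent": true, "rootPersisted": true},
	}
	fmt.Println(p.canServeRanges("rootPersisted"), p.canServeHeal("rootPersisted")) // false true
}
```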

Any scripts that can help us speed up the process?

Nothing off the top of my head, no. This might also be interesting to look into: #25022


holiman commented Mar 8, 2023

Seems answered, closing
