Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Reset mmr storage #12915

Closed
Closed

Conversation

Lederstrumpf
Copy link
Contributor

This resets mmr storage, as followup to discussion #12864.
It does not touch the offchain db, which will be overwritten, but resets NumberOfLeaves and Nodes.

I'm ooo atm, so will resume working on this next week and address review comments accumulated in the meantime then.

@Lederstrumpf Lederstrumpf added A3-in_progress Pull request is in progress. No review needed at this stage. B0-silent Changes should not be mentioned in any release notes E1-runtimemigration C1-low PR touches the given topic and has a low impact on builders. D2-notlive 💤 PR contains changes in a runtime directory that is not deployed to a chain that requires an audit. labels Dec 13, 2022
@Lederstrumpf
Copy link
Contributor Author

Lederstrumpf commented Dec 13, 2022

With a local rococo chain initialized with runtime 9330 and then upgraded at block 20:

> block=19; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$block']}' localhost:9933

{"jsonrpc":"2.0","error":{"code":8014,"message":"Error performing numeric op","data":"Error::InvalidNumericOp"},"id":1}
> block=20; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$block']}' localhost:9933

{"jsonrpc":"2.0","result":{"blockHash":"0xe866d821cf35400eb2e2fcb6ae3ef4edb9846400b8f048f1baba5bd7d4be45dc","leaves":"0x04c5010000000000e724b707b6d345d04670554f41d15c966f9fc56197486dfc588326864fd07e08010000000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000","proof":"0x040000000000000000010000000000000000"},"id":1}
> curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[[20, 30, 50], 50]}' localhost:9933 | jq .result -c > proof
> curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_verifyProof", "params":['`cat proof`']}' localhost:9933

{"jsonrpc":"2.0","result":true,"id":1}

Copy link
Contributor

@acatangiu acatangiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good! But let's merge after we make sure the gadget can gracefully handle this reset (#12864 (comment)).


#[cfg(feature = "try-runtime")]
fn post_upgrade(_: Vec<u8>) -> Result<(), &'static str> {
assert_eq!(Pallet::<T>::on_chain_storage_version(), 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

maybe also check NumberOfLeaves is 0 and Nodes is empty

@serban300
Copy link
Contributor

It does not touch the offchain db, which will be overwritten, but resets NumberOfLeaves and Nodes.

The keys at canonical paths will be overwritten indeed, but the ones at temp paths will remain. Would it be possible to empty everything that we have in the offchain db related to MMR ?

@Lederstrumpf
Copy link
Contributor Author

With the new MMR client gadget I’m not sure if we can reset the pallet (If resetting the pallet = setting NUM_MMR_LEAVES to 0). The MMR gadget will take the first block with MMR and compute leaf numbers based on it. After that, if we set NUM_MMR_LEAVES to 0 later, I think we’ll get errors for canonicalization and pruning.

@serban300, moving discussion from #12864 over here since other thread changed topic and more relevant here: your suspicion that we’d run into issues once hitting canonicalization window was correct:
Taking my above chain where I performed a runtime upgrade at block 20:

Generating proofs using the runtime state up until block 4134 works completely fine.

> block=39; bestblock=39; statequeryblock=4134; statequeryblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$statequeryblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$statequeryblockhash']}' localhost:9933


| jsonrpc":"2.0 | blockHash":"0xf62b6b5b47734b2ae76b2a5d025799f9c243c895f1ba4315e3a00af2b0f8f2ca | 0x04c501002600000028a81e1c437deb3ac36cda98c798a9d22dab87f9cf26411c9b444cce5038ce90040000000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x04130000000000000014000000000000000c3a2589e52aa19462c846552fd1dab20c4f8324e89da700b67370bc6584a8ce15b9e84321e791cedbc7bf10e02676d5dcefc74c4e9bad0343582416ebd69813f9adf1358edd444443bbf57afec592aa9931e5c87a1a6070e25e1639134893ca58 | id:1} |

However, if I bump the statequeryblock to 4135 (an offset of 4096 from block 39, i.e. the canonicalization window) or higher, the leaf cannot be found.

> block=38; bestblock=38; statequeryblock=4135; statequeryblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$statequeryblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$statequeryblockhash']}' localhost:9933

| jsonrpc":"2.0 | blockHash":"0xedf40070d6375ebf6475a8e49524410454be394c7d758eb35238e98a719616a1 | 0x04c501001200000099bca24c634f54a7b77f06a92f1bc3d6382f9d57845add31acd7c1d0a5ce5fe2020000000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x0412000000000000001300000000000000083a2589e52aa19462c846552fd1dab20c4f8324e89da700b67370bc6584a8ce15adf1358edd444443bbf57afec592aa9931e5c87a1a6070e25e1639134893ca58 | id:1} |
> block=38; bestblock=39; statequeryblock=4135; statequeryblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$statequeryblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$statequeryblockhash']}' localhost:9933

| jsonrpc":"2.0 | code | Error generating proof | Error::GenerateProof | id:1} |

node log:

    2022-12-21 18:26:24.983 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 34: leaf idx 18, canon key [12, 109, 109, 114, 34, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:24.983 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 30: leaf idx 15, canon key [12, 109, 109, 114, 30, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:24.983 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 33: leaf idx 17, canon key [12, 109, 109, 114, 33, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:51.201 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 34: leaf idx 18, canon key [12, 109, 109, 114, 34, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:51.201 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 30: leaf idx 15, canon key [12, 109, 109, 114, 30, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:51.201 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 35: leaf idx 19, canon key [12, 109, 109, 114, 35, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:51.201 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 35: leaf idx 19, hash 0x0000000000000000000000000000000000000000000000000000000000000000, temp key [12, 109, 109, 114, 35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:26:51.201 ERROR tokio-runtime-worker runtime::mmr: [<wasm:stripped>] MMR error: InconsistentStore
> block=39; bestblock=39; statequeryblock=4135; statequeryblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$statequeryblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$statequeryblockhash']}' localhost:9933

| jsonrpc":"2.0 | code | Leaf was not found | Error::LeafNotFound | id:1} |

node log:

    2022-12-21 18:22:33.548 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 30: leaf idx 15, canon key [12, 109, 109, 114, 30, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:22:33.548 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 34: leaf idx 18, canon key [12, 109, 109, 114, 34, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:22:33.548 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 33: leaf idx 17, canon key [12, 109, 109, 114, 33, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:22:42.477 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 35: leaf idx 19, canon key [12, 109, 109, 114, 35, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:22:42.477 DEBUG tokio-runtime-worker runtime::mmr::offchain: offchain db get 35: leaf idx 19, hash 0x0000000000000000000000000000000000000000000000000000000000000000, temp key [12, 109, 109, 114, 35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    2022-12-21 18:22:42.477 DEBUG tokio-runtime-worker runtime::mmr: [<wasm:stripped>] MMR error: Ok(None)

My best take for now is that we should reset both mmr_gadget::offchain_mmr::OffchainMmr::{first_mmr_block, best_canonicalized} when resetting the pallet storage. Or restarting the MMR gadget on the runtime upgrade where the MMR pallet storage is reset should also do it, fwict.

@bkchr
Copy link
Member

bkchr commented Dec 22, 2022

If you want to reset the storage, you can just issue a frame_system::set_storage call on the chain. There you can overwrite all the storage items as you do it here.

@serban300
Copy link
Contributor

@Lederstrumpf Thanks for checking !

My best take for now is that we should reset both mmr_gadget::offchain_mmr::OffchainMmr::{first_mmr_block, best_canonicalized} when resetting the pallet storage. Or restarting the MMR gadget on the runtime upgrade where the MMR pallet storage is reset should also do it, fwict.

I'm not sure what happens when restarting the client gadget. If it starts processing finality notifications generated after the pallet reset, it should work. If it starts again processing from the first block it probably won't work. Anyway, I posted this PR that adds support for pallet reset in the client gadget. If we merge this before reseting the pallet, it should be ok.

@acatangiu
Copy link
Contributor

If you want to reset the storage, you can just issue a frame_system::set_storage call on the chain. There you can overwrite all the storage items as you do it here.

Basti is right! We're only resetting runtime storage in this migration, and that can be done with a root origin call on the chain 👍

@acatangiu
Copy link
Contributor

acatangiu commented Dec 22, 2022

The keys at canonical paths will be overwritten indeed, but the ones at temp paths will remain. Would it be possible to empty everything that we have in the offchain db related to MMR ?

Following this reset we are orphaning a bit of both canon and temp data in offchain db, but this will only happen on Rococo right now and the leaked data is negligible in size, so just don't worry about it. I wouldn't add and maintain pallet complexity just to cleanup mistakes we did during development on test networks 😄

@Lederstrumpf
Copy link
Contributor Author

I'm not sure what happens when restarting the client gadget. If it starts processing finality notifications generated after the pallet reset, it should work. If it starts again processing from the first block it probably won't work. Anyway, I posted #12999 that adds support for pallet reset in the client gadget. If we merge this before reseting the pallet, it should be ok.

Can confirm the issue's resolved with #12999 on master now (canonicalization issue was masked with my archival validators stalling due to #13028 on master prior to #13051, so required a cherry), thanks!

If you want to reset the storage, you can just issue a frame_system::set_storage call on the chain. There you can overwrite all the storage items as you do it here.

Thanks for the heads-up! I tried that too (LeafCount reset: system -> setStorage {0xa8c65209d47ee80f56b0011e8fd91f508156209906244f2341137c136774c91d, 0x0000000000000000}, Nodes reset: system -> killPrefix {0xa8c65209d47ee80f56b0011e8fd91f50519dfc7fdad21b84f64a5310fa178ef2, 0} ), and can also confirm it works, so long as Nodes is reset prior or simultaneous to NumberOfLeaves (if Nodes is not reset at all, it will only be overwritten gradually, and if it's not reset simultaneously end up with some InconsistentStore issues for the interim blocks, which is fine):


block=39; bestblock=39; stateblock=4134; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | blockHash":"0x0ba17e697afc47c18922208090f09d9ba81b940996e91da8f2915e294a1bde63 | 0x04c50100161100009b8aadf2ad4600fa398bf78f279db693585762564360550037141f0ca3de861eb60100000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x04260000000000000027000000000000000c91a6e8b95dee47e8f567c3a8d8d9d6c3618b688f085e53e87a8ec358bca688c8bffb6f42455d4fb543aa961f7565dc8a713dd4a8416f3b9d6ed1e449c4c0a594b272e1c37172cdb50a676f4d593b7c318bb5ac0b9c44025ff52bed9988089e3b | id:1} |

State reset around block 4500:

block=39; bestblock=39; stateblock=4600; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | code | Error performing numeric op | Error::InvalidNumericOp | id:1} |
block=5000; bestblock=6000; stateblock=$bestblock; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | blockHash":"0x307c4556783472b38f52891b13e8ac610c9c9a3c251d948ef25374100c01ec61 | 0x04c50100871300002bf5047c693985cad65ba6a21b281397835ec95548b68a1a2bc9067cb7bab1bdf40100000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x04970200000000000080060000000000002c76dcc61da626c163ff32e2cf7a19ec180c6f51499363b6521f612864280bea3a01c232aef4c50707c4854d787091bc8122e2fd50bccaed7d4b574f6960f50dc9ab4cc624880ab9e799b01f96ddbf3a55f155c7c081a78fd67dd90554955957f5e9e93b25914eb66d4ba14cb872ed1628e79c5160c1839fd355b00ae98e48f59849af9551f99f17320d5df1d8dc2103e20ec691081e3139ef64d1e3e535e2b92f4fca8fb21b938862a50052d7700c031350f9e1a10ea9fbda648947e1ad053070dee72706207f99a129bbb94e242efdf931053040ba5ce67117d90dd1d3d4447ad41971ca59ddb9311962b99b7eb90f4aa7b54faaa1e68d4721bb5b562cd7a42f6f2bd7de45444217444feffc117751fb9ce5cef74a7cc235ad673482f9a6bc4f62f97bf30f7ae3b8a1b88e35cacdcde6feeab28d6897fee671910ac3df3fe1a9422148d83ce12ca0c98c53aaff8293bdee7e542e9335497eaa10f5911144b8f7 | id:1} |
block=5000; bestblock=6000; stateblock=12340; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | blockHash":"0x5dee75a8c958ac1dbd052c5dd815ff6be616dc30ac7dce7b8f5dfd080ecfab86 | 0x04c50100871300002bf5047c693985cad65ba6a21b281397835ec95548b68a1a2bc9067cb7bab1bdf40100000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x04970200000000000080060000000000002c76dcc61da626c163ff32e2cf7a19ec180c6f51499363b6521f612864280bea3a01c232aef4c50707c4854d787091bc8122e2fd50bccaed7d4b574f6960f50dc9ab4cc624880ab9e799b01f96ddbf3a55f155c7c081a78fd67dd90554955957f5e9e93b25914eb66d4ba14cb872ed1628e79c5160c1839fd355b00ae98e48f59849af9551f99f17320d5df1d8dc2103e20ec691081e3139ef64d1e3e535e2b92f4fca8fb21b938862a50052d7700c031350f9e1a10ea9fbda648947e1ad053070dee72706207f99a129bbb94e242efdf931053040ba5ce67117d90dd1d3d4447ad41971ca59ddb9311962b99b7eb90f4aa7b54faaa1e68d4721bb5b562cd7a42f6f2bd7de45444217444feffc117751fb9ce5cef74a7cc235ad673482f9a6bc4f62f97bf30f7ae3b8a1b88e35cacdcde6feeab28d6897fee671910ac3df3fe1a9422148d83ce12ca0c98c53aaff8293bdee7e542e9335497eaa10f5911144b8f7 | id:1} |

Equivalent runtime upgrade with storage migration at block 12540:

block=5000; bestblock=6000; stateblock=12540; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | code | Error performing numeric op | Error::InvalidNumericOp | id:1} |
block=12540; bestblock=12545; stateblock=12545; stateblockhash=`curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlockHash", "params":['$stateblock']}' localhost:9933 | jq '.result'`; curl -H "Content-Type: application/json" -d '{"id":1, "jsonrpc":"2.0", "method": "mmr_generateProof", "params":[['$block'], '$bestblock', '$stateblockhash']}' localhost:9933

| jsonrpc":"2.0 | blockHash":"0x9cd31b8b50b851a9569c6a1a601b649e2d0716fc0bdbee634a0962cd1c91b2af | 0x04c50100fb3000006b2717d45be0c5f9ad43bcf71c4b589bebbbdd50e8e9ae8e199c4a4832c39b9ae60400000000000002000000697ea2a8fe5b03468548a7a413424a6292ab44a82a6f5cc594c3fa7dda7ce4020000000000000000000000000000000000000000000000000000000000000000 | 0x0409000000000000000f0000000000000010c60c3637f0847223d92ceded923f4eaa6c87dea6a56ec81bf3a6f77758794649078ccacba26413652fa90ff041cb716d8e2f637a2e9fe69922119946664b5fc19ef4e4d9e9b28fc738cc44bf34419bc96b74bef6ab1a3d20cd183b6f93702d4e893b280fec97d1d5546ff160e491b28ed7bcd4d4e60d9441a20157ad79e7e63a | id:1} |

So given the sudo luxury on rococo and its lower baggage vs. runtime upgrade, let's manually reset storage as suggested by @bkchr and close this PR once it's worked. I'll reach out to devops regarding this once I've addressed #11797.

@stale
Copy link

stale bot commented Feb 3, 2023

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label Feb 3, 2023
@acatangiu
Copy link
Contributor

Closed in favor of https://github.com/paritytech/devops/issues/2287

@acatangiu acatangiu closed this Feb 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A3-in_progress Pull request is in progress. No review needed at this stage. A5-stale Pull request did not receive any updates in a long time. No review needed at this stage. Close it. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D2-notlive 💤 PR contains changes in a runtime directory that is not deployed to a chain that requires an audit.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants