-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: tx-pause
and safe-mode
pallets integration
#1388
Conversation
tx-pause
and safe-mode
pallets integration
runtime/shibuya/src/lib.rs
Outdated
// System and Timestamp are required for block production | ||
| RuntimeCall::System(_) | ||
| RuntimeCall::Timestamp(_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb question - aren't these calls inherent though?
If such calls also need to be filtered, parachain system should also be included here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove Timestamp
block production stops:
❌️ Mandatory inherent extrinsic returned error. Block cannot be produced.
But, I removed SafeMode
since the safe-mode pallet cannot disable it's own calls and I added ParachainSystem
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm strange, have to admit that I'm surprised :)
runtime/shibuya/src/lib.rs
Outdated
// Adjusting the base extrinsic weight to account for the additional database | ||
// read introduced by the `tx-pause` pallet during extrinsic filtering. | ||
weights.base_extrinsic = ExtrinsicBaseWeight::get().saturating_add(RocksDbWeight::get().reads(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the way to do it 🙂
RocksDb
weights is a config param of theframe_system
and we should use it from there- That aside, this shouldn't be necessarily hardcoded but it should be benchmarked. There is a command to measure base weight and perhaps now is the time to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean using the /bench
command from the workflows? Should I run it specifically for frame-system
and pallet-tx-pause
, or should I include others as well? I’m thinking of pallets that involve dispatches, such as proxy, collective-proxy, utility, and scheduler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a specific command to measure overhead, or base extrinsic weight.
E.g. using the frame-omni-bencher, it should be frame-omni-bencher v1 benchmark overhead
.
I guess we'd need to adapt the CI a bit 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out this command to me; I was not aware of its existence! The overhead command is not yet available in frame-omni-bencher
v0.7.0 but should be released in the next Polkadot SDK stable2412 version. I've adjusted the CI to include it alongside our existing benchmarking approach. I will run it below using the /bench
command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, very neat!
pub type HostFunctions = ( | ||
cumulus_client_service::ParachainHostFunctions, | ||
moonbeam_primitives_ext::moonbeam_ext::HostFunctions, | ||
); | ||
|
||
/// Host functions required for kitchensink runtime and Substrate node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need this. The benchmark host functions is injected by benchmarking-cli itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am getting this error if I remove benchmarking host functions:
Finished release [optimized] target(s) in 3m 14s
Running `target/release/astar-collator benchmark overhead --dev --header=./.github/license-check/headers/HEADER-GNUv3`
Error: Service(Client(VersionInvalid("Other error happened while constructing the runtime: runtime requires function imports which are not present on the host: 'env:ext_benchmarking_proof_size_version_1', 'env:ext_benchmarking_add_to_whitelist_version_1', 'env:ext_benchmarking_current_time_version_1', 'env:ext_benchmarking_commit_db_version_1', 'env:ext_benchmarking_read_write_count_version_1', 'env:ext_benchmarking_wipe_db_version_1', 'env:ext_benchmarking_get_read_and_written_keys_version_1', 'env:ext_benchmarking_reset_read_write_count_version_1', 'env:ext_benchmarking_set_whitelist_version_1'")))
2024-12-02 13:52:50 Cannot create a runtime error=Other("runtime requires function imports which are not present on the host: 'env:ext_benchmarking_proof_size_version_1', 'env:ext_benchmarking_add_to_whitelist_version_1', 'env:ext_benchmarking_current_time_version_1', 'env:ext_benchmarking_commit_db_version_1', 'env:ext_benchmarking_read_write_count_version_1', 'env:ext_benchmarking_wipe_db_version_1', 'env:ext_benchmarking_get_read_and_written_keys_version_1', 'env:ext_benchmarking_reset_read_write_count_version_1', 'env:ext_benchmarking_set_whitelist_version_1'")
PS: I will fix the comment (copied)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/paritytech/polkadot-sdk/blob/0845044454c005b577eab7afaea18583bd7e3dd3/substrate/utils/frame/benchmarking-cli/src/pallet/command.rs#L66
Not sure why you're getting this error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like overhead doesn't include it, wonder why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make an issue or PR for it, probably it's small effort.
runtime/shibuya/src/lib.rs
Outdated
@@ -248,7 +249,7 @@ parameter_types! { | |||
pub RuntimeBlockWeights: BlockWeights = BlockWeights::builder() | |||
.base_block(BlockExecutionWeight::get()) | |||
.for_class(DispatchClass::all(), |weights| { | |||
weights.base_extrinsic = ExtrinsicBaseWeight::get(); | |||
weights.base_extrinsic = weights::base_extrinsic::ExtrinsicBaseWeight::get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this. Why not use ExtrinsicBaseWeight and add 2 extra reads. We don't have to maintain this benchmark. Also make sure the fee calculation reflects this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommended to use this so I can answer - we benchmark everything else essentially, relying on automated measurements when possible.
It's easier to have something like this, then to hardcode it, IMO.
More correct & consistent also.
Shouldn't be that much of a maintenance since only an additional command is executed within an existing loop, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Dinonard if that's the case then benchmark is missing reads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ipapandinas @Dinonard this overhead benchmark doesn't count any IO. It just measures the overhead of an extrinsic. So there's no use for this. We just need to manually add 2 reads and it should be fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't checked the code, only the description of the command.
According to it, it should measure everything?
Maybe it doesn't measure IO operations discretely, but it accounts for them as part of the overall measurement.
Running benchmarks before & after inclusion of the new check could show this I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it doesn't, we should make an issue about it.
It should also measure PoV.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've checked the overhead benchmark command code.
Remark extrinsics pushed into the block used during the benchmark are executed in the client runtime via the FrameExecutive::apply_extrinsic
. These extrinsics are dispatched, and the FrameSystem::BaseCallFilter
is checked alongside the origin. The new SafeMode
& TxPause
filters are then measured.
When running the command locally, here are the results I obtained:
Without base call filtering:
type BaseCallFilter = frame_support::traits::Everything;
Average Result: 95_705 nanoseconds
With base call filtering:
type BaseCallFilter = InsideBoth<SafeMode, TxPause>;
Average Result: 103_702 nanoseconds
This represents an 8% increase.
PoV is also measured - reference: paritytech/substrate#11637
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However I have the following error when performing the overhead benchmark command via shibuya-dev
in the Github workflow:
Finished release [optimized] target(s) in 38m 52s
2024-12-02 14:51:56 Using the chain spec instead of the runtime to generate the genesis state is deprecated. Please remove the `--chain`/`--dev`/`--local` argument, point `--runtime` to your runtime blob and set `--genesis-builder=runtime`. This warning may become a hard error any time after December 2024.
2024-12-02 14:51:57 assembling new collators for new session 0 at #0
2024-12-02 14:51:57 assembling new collators for new session 1 at #0
2024-12-02 14:51:57 Loading WASM from genesis state
[+] Benchmarking 1 Astar collator pallets.
[+] Benchmarking pallet_tx_pause with weight file ./benchmark-results/shibuya-dev/tx_pause_weights.rs
[+] Benchmarking block and extrinsic overheads...
2024-12-02 14:52:03 assembling new collators for new session 0 at #0
2024-12-02 14:52:03 assembling new collators for new session 1 at #0
2024-12-02 14:52:05 🔨 Initializing Genesis block/state (state: 0x81fd…3aae, header-hash: 0x69b1…ac53)
2024-12-02 14:52:05 panicked at /home/astar-admin/.cargo/git/checkouts/polkadot-sdk-cff69157b985ed76/b98e0b3/cumulus/pallets/parachain-system/src/lib.rs:1297:30:
included head not present in relay storage proof
Error: Client(RuntimeApiError(Application(Execution(AbortedDueToTrap(MessageWithBacktrace { message: "wasm trap: wasm `unreachable` instruction executed", backtrace: Some(Backtrace { backtrace_string: "error while executing at wasm backtrace:\n 0: 0x1680a - <unknown>!rust_begin_unwind\n 1: 0x52f5 - <unknown>!core::panicking::panic_fmt::hc677f413e2f2c888\n 2: 0x6c74b0 - <unknown>!frame_support::storage::transactional::with_transaction::hdd4b19daa5635076\n 3: 0x19b9ee - <unknown>!<cumulus_pallet_parachain_system::pallet::Call<T> as frame_support::traits::dispatch::UnfilteredDispatchable>::dispatch_bypass_filter::{{closure}}::hf5e7b72922057598\n 4: 0x15bf8b - <unknown>!<shibuya_runtime::RuntimeCall as frame_support::traits::dispatch::UnfilteredDispatchable>::dispatch_bypass_filter::hb7c74cdeb42c4a6a\n 5: 0x154f5b - <unknown>!<shibuya_runtime::RuntimeCall as sp_runtime::traits::Dispatchable>::dispatch::he9dab724d3fc532c\n 6: 0x5d1038 - <unknown>!frame_executive::Executive<System,Block,Context,UnsignedValidator,AllPalletsWithSystem,COnRuntimeUpgrade>::apply_extrinsic::hfe4db20b64a1ce4c\n 7: 0x61f955 - <unknown>!BlockBuilder_apply_extrinsic" }) })))))
2024-12-02 14:52:05 1 storage transactions are left open by the runtime. Those will be rolled back.
2024-12-02 14:52:05 1 storage transactions are left open by the runtime. Those will be rolled back.
[-] Failed to benchmark the block and extrinsic overheads. Error written to ./benchmark-results/shibuya-dev/bench_errors.txt; continuing...
[+] Benchmarking the machine...
[-] Benchmarks failed: ./benchmark-results/shibuya-dev/bench_errors.txt
/bench shibuya-dev pallet_tx_pause |
Benchmarks job is scheduled at https://github.com/AstarNetwork/Astar/actions/runs/12121139663. |
Benchmarks have been finished. |
I would like to have some e2e tests for this, especially what will happen when chain receives a xcm call. Maybe we can use chopsticks to test different scenarios |
Don't the existing tests cover this? |
@Dinonard No, I mean to test xcm when chain is on restriction mode |
/bench astar-dev pallet_balances |
Benchmarks job is scheduled at https://github.com/AstarNetwork/Astar/actions/runs/12158641670. |
Benchmarks have been finished. |
Co-authored-by: Ermal Kaleci <ermalkaleci@gmail.com>
Co-authored-by: Ermal Kaleci <ermalkaleci@gmail.com>
Minimum allowed line rate is |
Pull Request Summary
This PR integrates the SafeMode and TxPause pallets into the Shibuya runtime. Key configuration decisions are as follows:
SafeMode Configuration
EnterDepositAmount
andExtendDepositAmount
are set to None, making safe-mode non-permissionless.EnterDuration
is set to 4 hours, with extensions allowed in increments of 2 hours.EnsureRootOrHalfTechnicalCommittee
origin can force enter or extend safe-modeEnsureRootOrTwoThirdsTechnicalCommittee
origin can force exit safe-modeNotify
when entering/exiting safe-mode.TxPause Configuration
EnsureRootOrHalfTechnicalCommittee
origin. The same origin is used by the dAppStaking maintenance manager.These pallets are designed to complement each other effectively, for example if halting block production after entering safe-mode is required, this can be performed by pausing
System
orTimestamp
calls by using tx-pause.Benchmarks
The additional DB read introduced by the
TxPause
base filter in the extrinsic base weight is hardcoded with 1 db read for now. This is a temporary fix, a proper benchmark must be done in the future.Check list