update elastic scaling guide #6739
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,11 +12,6 @@ | |
//! to lower the latency between a transaction being submitted and it getting built in a parachain | ||
//! block. | ||
//! | ||
//! At present, with Asynchronous Backing enabled, a parachain can only include a block on the relay | ||
//! chain every 6 seconds, irregardless of how many cores the parachain acquires. Elastic scaling | ||
//! builds further on the 10x throughput increase of Async Backing, enabling collators to submit up | ||
//! to 3 parachain blocks per relay chain block, resulting in a further 3x throughput increase. | ||
//! | ||
//! ## Current limitations of the MVP | ||
//! | ||
//! The full implementation of elastic scaling spans across the entire relay/parachain stack and is | ||
|
@@ -41,18 +36,15 @@ | |
//! (measured up to 10 collators) is utilising 2 cores with authorship time of 1.3 seconds per | ||
//! block, which leaves 400ms for networking overhead. This would allow for 2.6 seconds of | ||
//! execution, compared to the 2 seconds async backing enabled. | ||
//! [More experiments](https://github.com/paritytech/polkadot-sdk/issues/4696) are being | ||
//! conducted in this space. | ||
//! 3. **Trusted collator set.** The collator set needs to be trusted until there’s a mitigation | ||
//! that would prevent or deter multiple collators from submitting the same collation to multiple | ||
//! backing groups. A solution is being discussed | ||
//! [here](https://github.com/polkadot-fellows/RFCs/issues/92). | ||
//! 4. **Fixed scaling.** For true elasticity, the parachain must be able to seamlessly acquire or | ||
//! The development required for lifting this limitation is tracked by | ||
//! [this issue](https://github.com/paritytech/polkadot-sdk/issues/5190) | ||
//! 2. **Fixed scaling.** For true elasticity, the parachain must be able to seamlessly acquire or | ||
//! sell coretime as the user demand grows and shrinks over time, in an automated manner. This is | ||
//! currently lacking - a parachain can only scale up or down by “manually” acquiring coretime. | ||
//! This is not in the scope of the relay chain functionality. Parachains can already start | ||
//! implementing such autoscaling, but we aim to provide a framework/examples for developing | ||
//! autoscaling strategies. | ||
//! Tracked by [this issue](https://github.com/paritytech/polkadot-sdk/issues/1487). | ||
//! | ||
//! Another hard limitation that is not envisioned to ever be lifted is that parachains which create | ||
//! forks will generally not be able to utilise the full number of cores they acquire. | ||
|
@@ -66,9 +58,10 @@ | |
//! - Ensure the `AsyncBackingParams.max_candidate_depth` value is configured to a value that is at | ||
//! least double the maximum targeted parachain velocity. For example, if the parachain will build | ||
//! at most 3 candidates per relay chain block, the `max_candidate_depth` should be at least 6. | ||
//! - Use a trusted single collator for maximum throughput. | ||
//! - Ensure enough coretime is assigned to the parachain. For maximum throughput the upper bound is | ||
//! 3 cores. | ||
//! - Ensure the `CandidateReceiptV2` node feature is enabled on the relay chain configuration (node | ||
//! feature bit number 3). | ||
//! | ||
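The `max_candidate_depth` rule in the list above is plain arithmetic. As a minimal sketch, assuming a hypothetical helper named `min_candidate_depth` (it is not part of the SDK and only restates the "at least double the velocity" rule):

```rust
/// Hypothetical helper, not an SDK function: the smallest
/// `max_candidate_depth` recommended for a given parachain velocity,
/// i.e. double the number of candidates built per relay chain block.
pub const fn min_candidate_depth(velocity: u32) -> u32 {
    2 * velocity
}
```

For a parachain that builds at most 3 candidates per relay chain block, `min_candidate_depth(3)` gives the value 6 quoted in the guide.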
//! <div class="warning">Phase 1 is NOT needed if using the <code>polkadot-parachain</code> or | ||
//! <code>polkadot-omni-node</code> binary, or <code>polkadot-omni-node-lib</code> built from the | ||
|
@@ -89,17 +82,63 @@ | |
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", slot_based_colator_import)] | ||
//! | ||
//! 2. In `start_consensus()` | ||
//! - Remove the `overseer_handle` param (also remove the | ||
//! - Remove the `overseer_handle` and `relay_chain_slot_duration` params (also remove the | ||
//! `OverseerHandle` type import if it’s not used elsewhere). | ||
//! - Rename `AuraParams` to `SlotBasedParams`, remove the `overseer_handle` field and add a | ||
//! `slot_drift` field with a value of `Duration::from_secs(1)`. | ||
//! - Replace the single future returned by `aura::run` with the two futures returned by it and | ||
//! spawn them as separate tasks: | ||
//! - Rename `AuraParams` to `SlotBasedParams`, remove the `overseer_handle` and | ||
//! `relay_chain_slot_duration` fields and add a `slot_drift` field with a value of | ||
//! `Duration::from_secs(1)`. Also add a `spawner` field initialized to | ||
//! `task_manager.spawn_handle()`. | ||
//! - Replace the `aura::run` with the `slot_based::run` call and remove the explicit task | ||
//! spawn: | ||
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", launch_slot_based_collator)] | ||
//! | ||
//! 3. In `start_parachain_node()` remove the `overseer_handle` param passed to `start_consensus`. | ||
//! | ||
//! ### Phase 2 - Activate fixed factor scaling in the runtime | ||
//! 3. In `start_parachain_node()` remove the `overseer_handle` and `relay_chain_slot_duration` | ||
//! params passed to `start_consensus`. | ||
//! | ||
//! ### Phase 2 - Configure core selection policy in the parachain runtime | ||
//! | ||
//! With the addition of [RFC-103](https://polkadot-fellows.github.io/RFCs/approved/0103-introduce-core-index-commitment.html), | ||
//! the parachain runtime has the responsibility of selecting which of the assigned cores to build | ||
//! on. It does so by implementing the `SelectCore` trait. | ||
Review comment on lines +100 to +102: I would rephrase this to something: |
||
#![doc = docify::embed!("../../cumulus/pallets/parachain-system/src/lib.rs", SelectCore)] | ||
//! | ||
//! For the vast majority of use cases though, you will not need to implement a custom core | ||
//! selector. There are two core selection policies to choose from (without implementing your own): | ||
//! `DefaultCoreSelector` and `LookaheadCoreSelector`. | ||
//! | ||
//! - The `DefaultCoreSelector` implements a round-robin selection on the cores that can be | ||
//! occupied by the parachain at the very next relay parent. This is equivalent to what all | ||
//! parachains on production networks have been using so far. | ||
Review comment: Hmm. Shall we rename this as part of this PR? It seems like `LookaheadCoreSelector` should be the "default", as we expect any new parachain to use asynchronous backing? |
||
//! | ||
//! - The `LookaheadCoreSelector` also does a round robin on the assigned cores, but not those that | ||
//! can be occupied at the very next relay parent. Instead, it uses the ones after. In other words, | ||
//! the collator gets more time to build and advertise a collation for an assignment. This makes no | ||
//! difference in practice if the parachain is continuously scheduled on the cores. This policy is | ||
//! especially desirable for parachains that are sharing a core or that use on-demand coretime. | ||
//! | ||
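The difference between the two policies described above can be modelled in a few lines. This is an illustrative sketch only: `Policy` and `pick_core` are hypothetical names, and the claim queue offsets of 0 and 1 are an assumption standing in for "the very next relay parent" and "the ones after". The real selectors live in `cumulus-pallet-parachain-system` and may differ in detail.

```rust
/// Illustrative model of the two built-in policies (hypothetical type,
/// not the SDK's `SelectCore` implementations).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    /// Round-robin over cores that can be occupied at the very next relay parent.
    Default,
    /// Round-robin over cores that can be occupied one relay parent later.
    Lookahead,
}

/// Returns `(core position, claim queue offset)` for a parachain block.
/// Both policies rotate over the assigned cores by block number; they
/// differ only in the assumed offset (0 for `Default`, 1 for `Lookahead`).
pub fn pick_core(policy: Policy, block_number: u32, assigned_cores: u32) -> (u32, u8) {
    assert!(assigned_cores > 0, "at least one core must be assigned");
    let position = block_number % assigned_cores;
    let offset = match policy {
        Policy::Default => 0,
        Policy::Lookahead => 1,
    };
    (position, offset)
}
```

With 3 assigned cores, consecutive blocks cycle through positions 0, 1, 2 under either policy; only the offset changes, which is what gives the collator more time to build and advertise a collation under the lookahead policy.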
//! In your /runtime/src/lib.rs, define a `SelectCore` type and use this to set the `SelectCore` | ||
//! property (overwrite it with the chosen policy type): | ||
#![doc = docify::embed!("../../templates/parachain/runtime/src/lib.rs", default_select_core)] | ||
//! ```ignore | ||
//! impl cumulus_pallet_parachain_system::Config for Runtime { | ||
//! ... | ||
//! type SelectCore = SelectCore<Runtime>; | ||
//! ... | ||
//! } | ||
//! ``` | ||
//! | ||
//! Next, we need to implement the `GetCoreSelectorApi` runtime API. In the `impl_runtime_apis` block | ||
//! for your runtime, add the following code: | ||
//! | ||
//! ```ignore | ||
//! impl cumulus_primitives_core::GetCoreSelectorApi<Block> for Runtime { | ||
//! fn core_selector() -> (cumulus_primitives_core::CoreSelector, cumulus_primitives_core::ClaimQueueOffset) { | ||
//! ParachainSystem::core_selector() | ||
//! } | ||
//! } | ||
//! ``` | ||
//! | ||
//! ### Phase 3 - Configure fixed factor scaling in the runtime | ||
Review comment: This paragraph seems fuzzy. It is not really clear what fixed factor scaling is and where the elasticity comes from. I think what we are configuring here is the maximum scaling factor. If it is fixed or elastic, that is a concern which is not yet implemented. Currently we rely on an external party to provision arbitrary amounts of cores to the parachain at each relay chain block via either bulk or on-demand. |
||
//! | ||
//! This phase consists of a couple of changes needed to be made to the parachain’s runtime in order | ||
//! to utilise fixed factor scaling. | ||
|
@@ -108,8 +147,10 @@ | |
//! produce per relay chain block (in direct correlation with the number of acquired cores). This | ||
//! should be either 1 (no scaling), 2 or 3. This is called the parachain velocity. | ||
//! | ||
//! If you configure a velocity which is different from the number of assigned cores, the measured | ||
//! velocity in practice will be the minimum of these two. | ||
//! <div class="warning">If you configure a velocity which is different from the number of assigned | ||
//! cores, the measured velocity in practice will be the minimum of these two. However, be mindful | ||
//! that if the velocity is higher than the number of assigned cores, it's possible that | ||
//! <a href="https://github.com/paritytech/polkadot-sdk/issues/6667"> only a subset of the collator set will be authoring blocks.</a></div> | ||
Review comment: The question is why do we need to configure a velocity at all, seems redundant.
Review comment: Once the slot based collator can produce multiple blocks per slot we should also add that we recommend slot durations of at least 6s, preferably even 12. (better censorship resistance) |
||
//! | ||
//! The chosen velocity will also be used to compute: | ||
//! - The slot duration, by dividing the 6000 ms relay chain slot duration by the | ||
|
@@ -127,10 +168,10 @@ | |
//! const MAX_BLOCK_PROCESSING_VELOCITY: u32 = 3; | ||
//! ``` | ||
//! | ||
//! 2. Set the `MILLISECS_PER_BLOCK` to the desired value. | ||
//! 2. Set the `MILLI_SECS_PER_BLOCK` to the desired value. | ||
//! | ||
//! ```ignore | ||
//! const MILLISECS_PER_BLOCK: u32 = | ||
//! const MILLI_SECS_PER_BLOCK: u32 = | ||
//! RELAY_CHAIN_SLOT_DURATION_MILLIS / MAX_BLOCK_PROCESSING_VELOCITY; | ||
//! ``` | ||
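The relationship between configured velocity, assigned cores and block time can be checked with a small sketch. The function names `effective_velocity` and `millisecs_per_block` are hypothetical and used only for illustration; the 6000 ms relay chain slot duration and the "minimum of the two" rule are taken from the guide text.

```rust
/// Relay chain slot duration in milliseconds, as stated in the guide.
pub const RELAY_CHAIN_SLOT_DURATION_MILLIS: u32 = 6000;

/// Hypothetical helper: the velocity observed in practice is the minimum
/// of the configured velocity and the number of assigned cores.
pub fn effective_velocity(configured: u32, assigned_cores: u32) -> u32 {
    configured.min(assigned_cores)
}

/// Hypothetical helper: parachain block time for a given velocity.
/// The velocity must be non-zero (the guide uses 1, 2 or 3).
pub const fn millisecs_per_block(velocity: u32) -> u32 {
    RELAY_CHAIN_SLOT_DURATION_MILLIS / velocity
}
```

A velocity of 3 yields 2000 ms blocks, while a parachain configured for 3 but assigned only 2 cores would in practice run at a velocity of 2.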
//! Note: for a parachain which measures time in terms of its own block number, changing block | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,12 @@ | ||||||
title: update elastic scaling guide for untrusted collator set | ||||||
Suggested change
|
||||||
|
||||||
doc: | ||||||
- audience: [Node Dev, Runtime Dev] | ||||||
description: | | ||||||
"Updates the elastic scaling guide for parachains, taking into consideration the completed implementation of | ||||||
[RFC-103](https://github.com/polkadot-fellows/RFCs/pull/103), which enables an untrusted collator set for | ||||||
elastic scaling. Adds the necessary instructions for configuring the parachain so that it can leverage this implementation." | ||||||
|
||||||
crates: | ||||||
- name: cumulus-pallet-parachain-system | ||||||
bump: none |
Review comment: I would name it an optimization rather than a limitation.