update elastic scaling guide #6739

Draft · wants to merge 3 commits into base: master

1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions cumulus/pallets/parachain-system/Cargo.toml
@@ -12,6 +12,7 @@ workspace = true
[dependencies]
bytes = { workspace = true }
codec = { features = ["derive"], workspace = true }
docify = { workspace = true }
environmental = { workspace = true }
impl-trait-for-tuples = { workspace = true }
log = { workspace = true }
1 change: 1 addition & 0 deletions cumulus/pallets/parachain-system/src/lib.rs
@@ -192,6 +192,7 @@ pub mod ump_constants {
}

/// Trait for selecting the next core to build the candidate for.
#[docify::export]
pub trait SelectCore {
/// Core selector information for the current block.
fn selected_core() -> (CoreSelector, ClaimQueueOffset);
91 changes: 66 additions & 25 deletions docs/sdk/src/guides/enable_elastic_scaling_mvp.rs
@@ -12,11 +12,6 @@
//! to lower the latency between a transaction being submitted and it getting built in a parachain
//! block.
//!
//! At present, with Asynchronous Backing enabled, a parachain can only include a block on the relay
//! chain every 6 seconds, irregardless of how many cores the parachain acquires. Elastic scaling
//! builds further on the 10x throughput increase of Async Backing, enabling collators to submit up
//! to 3 parachain blocks per relay chain block, resulting in a further 3x throughput increase.
//!
//! ## Current limitations of the MVP
//!
//! The full implementation of elastic scaling spans across the entire relay/parachain stack and is
@@ -41,18 +36,15 @@
//! (measured up to 10 collators) is utilising 2 cores with authorship time of 1.3 seconds per
//! block, which leaves 400ms for networking overhead. This would allow for 2.6 seconds of
//! execution, compared to the 2 seconds async backing enabled.
//! [More experiments](https://github.com/paritytech/polkadot-sdk/issues/4696) are being
//! conducted in this space.
//! 3. **Trusted collator set.** The collator set needs to be trusted until there’s a mitigation
//! that would prevent or deter multiple collators from submitting the same collation to multiple
//! backing groups. A solution is being discussed
//! [here](https://github.com/polkadot-fellows/RFCs/issues/92).
//! 4. **Fixed scaling.** For true elasticity, the parachain must be able to seamlessly acquire or
//! The development required for lifting this limitation is tracked by
Contributor review comment: I would name it an optimization rather than limitation.

//! [this issue](https://github.com/paritytech/polkadot-sdk/issues/5190)
//! 2. **Fixed scaling.** For true elasticity, the parachain must be able to seamlessly acquire or
//! sell coretime as the user demand grows and shrinks over time, in an automated manner. This is
//! currently lacking - a parachain can only scale up or down by “manually” acquiring coretime.
//! This is not in the scope of the relay chain functionality. Parachains can already start
//! implementing such autoscaling, but we aim to provide a framework/examples for developing
//! autoscaling strategies.
//! Tracked by [this issue](https://github.com/paritytech/polkadot-sdk/issues/1487).
//!
//! Another hard limitation that is not envisioned to ever be lifted is that parachains which create
//! forks will generally not be able to utilise the full number of cores they acquire.
@@ -66,9 +58,10 @@
//! - Ensure the `AsyncBackingParams.max_candidate_depth` value is configured to a value that is at
//! least double the maximum targeted parachain velocity. For example, if the parachain will build
//! at most 3 candidates per relay chain block, the `max_candidate_depth` should be at least 6.
//! - Use a trusted single collator for maximum throughput.
//! - Ensure enough coretime is assigned to the parachain. For maximum throughput the upper bound is
//! 3 cores.
//! - Ensure the `CandidateReceiptV2` node feature is enabled on the relay chain configuration (node
//! feature bit number 3).
//!
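//! As a rough illustration only (the `AsyncBackingParams`, `SchedulerParams`, `NodeFeatures` and
//! `FeatureIndex` names below come from the relay chain runtime and are assumptions to verify
//! against the runtime version you target), a test relay chain genesis satisfying these
//! prerequisites for a velocity of 3 could contain a `HostConfiguration` fragment such as:
//!
//! ```ignore
//! async_backing_params: AsyncBackingParams {
//!     // At least double the targeted parachain velocity (here: 3 candidates per relay block).
//!     max_candidate_depth: 6,
//!     allowed_ancestry_len: 2,
//! },
//! scheduler_params: SchedulerParams {
//!     // Enough cores for the parachain to acquire up to 3 of them.
//!     num_cores: 3,
//!     ..Default::default()
//! },
//! node_features: {
//!     // CandidateReceiptV2 is node feature bit number 3.
//!     let mut features = NodeFeatures::new();
//!     features.resize(FeatureIndex::CandidateReceiptV2 as usize + 1, false);
//!     features.set(FeatureIndex::CandidateReceiptV2 as usize, true);
//!     features
//! },
//! ```
//!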
//! <div class="warning">Phase 1 is NOT needed if using the <code>polkadot-parachain</code> or
//! <code>polkadot-omni-node</code> binary, or <code>polkadot-omni-node-lib</code> built from the
@@ -89,17 +82,63 @@
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", slot_based_colator_import)]
//!
//! 2. In `start_consensus()`
//! - Remove the `overseer_handle` param (also remove the
//! - Remove the `overseer_handle` and `relay_chain_slot_duration` params (also remove the
//! `OverseerHandle` type import if it’s not used elsewhere).
//! - Rename `AuraParams` to `SlotBasedParams`, remove the `overseer_handle` field and add a
//! `slot_drift` field with a value of `Duration::from_secs(1)`.
//! - Replace the single future returned by `aura::run` with the two futures returned by it and
//! spawn them as separate tasks:
//! - Rename `AuraParams` to `SlotBasedParams`, remove the `overseer_handle` and
//! `relay_chain_slot_duration` fields and add a `slot_drift` field with a value of
//! `Duration::from_secs(1)`. Also add a `spawner` field initialized to
//! `task_manager.spawn_handle()`.
//! - Replace the `aura::run` with the `slot_based::run` call and remove the explicit task
//! spawn:
#![doc = docify::embed!("../../cumulus/polkadot-omni-node/lib/src/nodes/aura.rs", launch_slot_based_collator)]
//!
//! 3. In `start_parachain_node()` remove the `overseer_handle` param passed to `start_consensus`.
//!
//! ### Phase 2 - Activate fixed factor scaling in the runtime
//! 3. In `start_parachain_node()` remove the `overseer_handle` and `relay_chain_slot_duration`
//! params passed to `start_consensus`.
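//!
//! As a minimal sketch of the result (assuming the field names used by the parachain template's
//! `start_consensus()`; the exact fields and generic parameters depend on your node
//! implementation):
//!
//! ```ignore
//! let params = SlotBasedParams {
//!     create_inherent_data_providers: move |_, ()| async move { Ok(()) },
//!     block_import,
//!     para_client: client.clone(),
//!     keystore,
//!     collator_key,
//!     para_id,
//!     proposer,
//!     collator_service,
//!     authoring_duration: Duration::from_millis(2000),
//!     reinitialize: false,
//!     // New compared to `AuraParams`:
//!     slot_drift: Duration::from_secs(1),
//!     spawner: task_manager.spawn_handle(),
//!     /* ...remaining fields unchanged from the lookahead collator setup... */
//! };
//! // `slot_based::run` spawns its worker tasks through the provided spawner, so the previous
//! // explicit `task_manager.spawn_essential_handle().spawn("aura", ...)` call is no longer needed.
//! slot_based::run::<Block, sp_consensus_aura::sr25519::AuthorityPair, _, _, _, _, _, _, _, _, _>(
//!     params,
//! );
//! ```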
//!
//! ### Phase 2 - Configure core selection policy in the parachain runtime
//!
//! With the addition of [RFC-103](https://polkadot-fellows.github.io/RFCs/approved/0103-introduce-core-index-commitment.html),
//! the parachain runtime has the responsibility of selecting which of the assigned cores to build
//! on. It does so by implementing the `SelectCore` trait.
Contributor review comment (on lines +100 to +102): I would rephrase this to something like: RFC-103 enables parachain runtimes to constrain the execution of each block to a specified core. This ensures better security and liveness properties as described in the RFC. To make use of this feature ...

#![doc = docify::embed!("../../cumulus/pallets/parachain-system/src/lib.rs", SelectCore)]
//!
//! For the vast majority of use cases, though, you will not need to implement a custom core
//! selector. There are two core selection policies to choose from without implementing your own:
//! `DefaultCoreSelector` and `LookaheadCoreSelector`.
//!
//! - The `DefaultCoreSelector` implements a round-robin selection on the cores that can be
//! occupied by the parachain at the very next relay parent. This is equivalent to what all
//! parachains on production networks have been using so far.
Member review comment: Hmm. Shall we rename this as part of this PR? It seems like LookaheadCoreSelector should be the "default" as we expect any new parachain to use asynchronous backing?

//!
//! - The `LookaheadCoreSelector` also does a round-robin on the assigned cores, but not on those
//! that can be occupied at the very next relay parent. Instead, it uses the ones after. In other words,
//! the collator gets more time to build and advertise a collation for an assignment. This makes no
//! difference in practice if the parachain is continuously scheduled on the cores. This policy is
//! especially desirable for parachains that are sharing a core or that use on-demand coretime.
//!
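//! To make the round-robin behaviour concrete, a purely conceptual sketch (not the actual
//! `DefaultCoreSelector` code) of a selector rotating through the assigned cores at claim queue
//! offset 0 could look like:
//!
//! ```ignore
//! // Derive the core selector from the parachain block number, so consecutive blocks map to
//! // consecutive assigned cores. The relay chain interprets `CoreSelector` modulo the number of
//! // cores assigned to the parachain; `ClaimQueueOffset(0)` targets the cores that can be
//! // occupied at the very next relay parent.
//! fn selected_core() -> (CoreSelector, ClaimQueueOffset) {
//!     let block_number = frame_system::Pallet::<Runtime>::block_number();
//!     (CoreSelector(block_number as u8), ClaimQueueOffset(0))
//! }
//! ```
//!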
//! In your /runtime/src/lib.rs, define a `SelectCore` type and use this to set the `SelectCore`
//! property (overwrite it with the chosen policy type):
#![doc = docify::embed!("../../templates/parachain/runtime/src/lib.rs", default_select_core)]
//! ```ignore
//! impl cumulus_pallet_parachain_system::Config for Runtime {
//! ...
//! type SelectCore = SelectCore<Runtime>;
//! ...
//! }
//! ```
//!
//! Next, we need to implement the `GetCoreSelector` runtime API. In the `impl_runtime_apis` block
//! for your runtime, add the following code:
//!
//! ```ignore
//! impl cumulus_primitives_core::GetCoreSelectorApi<Block> for Runtime {
//! fn core_selector() -> (cumulus_primitives_core::CoreSelector, cumulus_primitives_core::ClaimQueueOffset) {
//! ParachainSystem::core_selector()
//! }
//! }
//! ```
//!
//! ### Phase 3 - Configure fixed factor scaling in the runtime
Contributor review comment: This paragraph seems fuzzy. It is not really clear what fixed factor scaling is and where the elasticity comes from. I think what we are configuring here is the maximum scaling factor. Whether scaling is fixed or elastic is a concern that is not yet implemented; currently we rely on an external party to provision arbitrary amounts of cores to the parachain at each relay chain block, via either bulk or on-demand coretime.

//!
//! This phase consists of a couple of changes that need to be made to the parachain’s runtime in
//! order to utilise fixed factor scaling.
@@ -108,8 +147,10 @@
//! produce per relay chain block (in direct correlation with the number of acquired cores). This
//! should be either 1 (no scaling), 2 or 3. This is called the parachain velocity.
//!
//! If you configure a velocity which is different from the number of assigned cores, the measured
//! velocity in practice will be the minimum of these two.
//! <div class="warning">If you configure a velocity which is different from the number of assigned
//! cores, the measured velocity in practice will be the minimum of these two. However, be mindful
//! that if the velocity is higher than the number of assigned cores, it's possible that
//! <a href="https://github.com/paritytech/polkadot-sdk/issues/6667"> only a subset of the collator set will be authoring blocks.</a></div>
Member review comment: The question is why we need to configure a velocity at all; it seems redundant.

Member review comment: Once the slot-based collator can produce multiple blocks per slot, we should also add that we recommend slot durations of at least 6 s, preferably even 12 s (better censorship resistance).

//!
//! The chosen velocity will also be used to compute:
//! - The slot duration, by dividing the 6000 ms relay chain slot duration by the
@@ -127,10 +168,10 @@
//! const MAX_BLOCK_PROCESSING_VELOCITY: u32 = 3;
//! ```
//!
//! 2. Set the `MILLISECS_PER_BLOCK` to the desired value.
//! 2. Set the `MILLI_SECS_PER_BLOCK` to the desired value.
//!
//! ```ignore
//! const MILLISECS_PER_BLOCK: u32 =
//! const MILLI_SECS_PER_BLOCK: u32 =
//! RELAY_CHAIN_SLOT_DURATION_MILLIS / MAX_BLOCK_PROCESSING_VELOCITY;
//! ```
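//!
//! For example, with `MAX_BLOCK_PROCESSING_VELOCITY = 3` and the 6000 ms relay chain slot, the
//! parachain slot (and block) time becomes 6000 / 3 = 2000 ms, which matches the 2000 ms
//! authoring duration configured for the collator.
//!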
//! Note: for a parachain which measures time in terms of its own block number, changing block
12 changes: 12 additions & 0 deletions prdoc/pr_6739.prdoc
@@ -0,0 +1,12 @@
title: update elastic scaling guide for untrusted collator set
Contributor review comment (suggested change): replace the title `update elastic scaling guide for untrusted collator set` with `elastic scaling documentation changes for RFC103`.


doc:
- audience: [Node Dev, Runtime Dev]
description: |
"Updates the elastic scaling guide for parachains, taking into consideration the completed implementation of
[RFC-103](https://github.com/polkadot-fellows/RFCs/pull/103), which enables an untrusted collator set for
elastic scaling. Adds the necessary instructions for configuring the parachain so that it can leverage this implementation."

crates:
- name: cumulus-pallet-parachain-system
bump: none
19 changes: 8 additions & 11 deletions templates/parachain/node/src/service.rs
@@ -15,7 +15,9 @@ use polkadot_sdk::*;
use cumulus_client_cli::CollatorOptions;
use cumulus_client_collator::service::CollatorService;
#[docify::export(lookahead_collator)]
use cumulus_client_consensus_aura::collators::lookahead::{self as aura, Params as AuraParams};
use cumulus_client_consensus_aura::collators::slot_based::{
self as slot_based, Params as SlotBasedParams,
};
use cumulus_client_consensus_common::ParachainBlockImport as TParachainBlockImport;
use cumulus_client_consensus_proposer::Proposer;
use cumulus_client_service::{
Expand All @@ -28,7 +30,7 @@ use cumulus_primitives_core::{
relay_chain::{CollatorPair, ValidationCode},
ParaId,
};
use cumulus_relay_chain_interface::{OverseerHandle, RelayChainInterface};
use cumulus_relay_chain_interface::RelayChainInterface;

// Substrate Imports
use frame_benchmarking_cli::SUBSTRATE_REFERENCE_HARDWARE;
@@ -178,10 +180,8 @@ fn start_consensus(
relay_chain_interface: Arc<dyn RelayChainInterface>,
transaction_pool: Arc<sc_transaction_pool::TransactionPoolHandle<Block, ParachainClient>>,
keystore: KeystorePtr,
relay_chain_slot_duration: Duration,
para_id: ParaId,
collator_key: CollatorPair,
overseer_handle: OverseerHandle,
announce_block: Arc<dyn Fn(Hash, Option<Vec<u8>>) + Send + Sync>,
) -> Result<(), sc_service::Error> {
let proposer_factory = sc_basic_authorship::ProposerFactory::with_proof_recording(
@@ -201,7 +201,7 @@
client.clone(),
);

let params = AuraParams {
let params = SlotBasedParams {
create_inherent_data_providers: move |_, ()| async move { Ok(()) },
block_import,
para_client: client.clone(),
@@ -213,17 +213,16 @@
keystore,
collator_key,
para_id,
overseer_handle,
relay_chain_slot_duration,
slot_drift: Duration::from_secs(1),
proposer,
collator_service,
authoring_duration: Duration::from_millis(2000),
reinitialize: false,
spawner: task_manager.spawn_handle(),
};
let fut = aura::run::<Block, sp_consensus_aura::sr25519::AuthorityPair, _, _, _, _, _, _, _, _>(
slot_based::run::<Block, sp_consensus_aura::sr25519::AuthorityPair, _, _, _, _, _, _, _, _, _>(
params,
);
task_manager.spawn_essential_handle().spawn("aura", None, fut);

Ok(())
}
@@ -398,10 +397,8 @@ pub async fn start_parachain_node(
relay_chain_interface,
transaction_pool,
params.keystore_container.keystore(),
relay_chain_slot_duration,
para_id,
collator_key.expect("Command line arguments do not allow this. qed"),
overseer_handle,
announce_block,
)?;
}
6 changes: 6 additions & 0 deletions templates/parachain/runtime/src/apis.rs
@@ -86,6 +86,12 @@
}
}

impl cumulus_primitives_core::GetCoreSelectorApi<Block> for Runtime {
fn core_selector() -> (cumulus_primitives_core::CoreSelector, cumulus_primitives_core::ClaimQueueOffset) {
ParachainSystem::core_selector()
}
}

impl sp_api::Core<Block> for Runtime {
fn version() -> RuntimeVersion {
VERSION
8 changes: 4 additions & 4 deletions templates/parachain/runtime/src/configs/mod.rs
@@ -61,9 +61,9 @@ use super::{
weights::{BlockExecutionWeight, ExtrinsicBaseWeight, RocksDbWeight},
AccountId, Aura, Balance, Balances, Block, BlockNumber, CollatorSelection, ConsensusHook, Hash,
MessageQueue, Nonce, PalletInfo, ParachainSystem, Runtime, RuntimeCall, RuntimeEvent,
RuntimeFreezeReason, RuntimeHoldReason, RuntimeOrigin, RuntimeTask, Session, SessionKeys,
System, WeightToFee, XcmpQueue, AVERAGE_ON_INITIALIZE_RATIO, EXISTENTIAL_DEPOSIT, HOURS,
MAXIMUM_BLOCK_WEIGHT, MICRO_UNIT, NORMAL_DISPATCH_RATIO, SLOT_DURATION, VERSION,
RuntimeFreezeReason, RuntimeHoldReason, RuntimeOrigin, RuntimeTask, SelectCore, Session,
SessionKeys, System, WeightToFee, XcmpQueue, AVERAGE_ON_INITIALIZE_RATIO, EXISTENTIAL_DEPOSIT,
HOURS, MAXIMUM_BLOCK_WEIGHT, MICRO_UNIT, NORMAL_DISPATCH_RATIO, SLOT_DURATION, VERSION,
};
use xcm_config::{RelayLocation, XcmOriginToTransactDispatchOrigin};

@@ -204,7 +204,7 @@ impl cumulus_pallet_parachain_system::Config for Runtime {
type ReservedXcmpWeight = ReservedXcmpWeight;
type CheckAssociatedRelayNumber = RelayNumberMonotonicallyIncreases;
type ConsensusHook = ConsensusHook;
type SelectCore = cumulus_pallet_parachain_system::DefaultCoreSelector<Runtime>;
type SelectCore = SelectCore<Runtime>;
}

impl parachain_info::Config for Runtime {}
4 changes: 4 additions & 0 deletions templates/parachain/runtime/src/lib.rs
@@ -240,6 +240,10 @@ type ConsensusHook = cumulus_pallet_aura_ext::FixedVelocityConsensusHook<
UNINCLUDED_SEGMENT_CAPACITY,
>;

#[docify::export(default_select_core)]
/// Core selection policy
type SelectCore<Runtime> = cumulus_pallet_parachain_system::DefaultCoreSelector<Runtime>;

/// The version information used to identify this runtime when compiled natively.
#[cfg(feature = "std")]
pub fn native_version() -> NativeVersion {