
Elastic scaling: use up to 12 cores #6858

Closed
sandreim opened this issue Dec 12, 2024 · 6 comments · Fixed by #6983
@sandreim
Contributor

We need to run some tests with 12 cores and 500ms blocks.

There are no explicit limits on how many cores can be assigned to a parachain in the runtime, but on the node side there is at least this one limit.

@alexggh
Contributor

alexggh commented Dec 12, 2024

A first try that doesn't completely work could be found here: master...alexggh/hackaton.

@alindima alindima self-assigned this Dec 16, 2024
@alindima alindima moved this from Backlog to In Progress in parachains team board Dec 16, 2024
@alindima
Contributor

The current status with this branch is that the collator builds 12 blocks but is not able to use async backing (somehow it does not end up backing 12 new candidates right on the next block, so it builds 12 blocks and then waits 6 seconds before backing another set of 12).

I managed to debug this and it boils down to this constant not being big enough to handle this many unincluded candidates. Therefore, parent searches performed after the first set of unincluded candidates is built will keep returning the same candidate until the previous set is included. And because the core selector uses the block number to choose the core, all further candidates would use the same core, so the slot-based collator skips them.
Bumping this constant to 24 fixed it. We should have this limit dynamically configured based on the maximum parachain velocity.
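A minimal sketch of how such a limit could be derived from velocity (the helper name and the factor of two are assumptions for illustration, not polkadot-sdk code):

```rust
// Hypothetical helper, not from polkadot-sdk: derive the parent-search
// depth from the maximum parachain velocity (parachain blocks per relay
// chain block). The assumption is that the current relay block's
// candidates plus the previous, not-yet-included set must both fit in
// the search window, hence the factor of two.
fn parent_search_depth(max_velocity: usize) -> usize {
    2 * max_velocity
}

fn main() {
    // With velocity 12, this gives 24, matching the value that fixed
    // the issue here.
    println!("depth for velocity 12: {}", parent_search_depth(12));
}
```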

@alindima
Contributor

CC: @skunert 🔝

@alindima
Contributor

I discovered another issue that surfaces from time to time which makes the throughput drop for a relay chain block:

We're using a u8 for the core selector, so it will wrap around every 256th parachain block. To choose the core index to build on, we modulo the core selector with the number of assigned cores (in this case 12). In this specific scenario of using 12 cores, it can happen that we get to core index 0 multiple times in the same relay chain block. Example:

| block number | core index |
| --- | --- |
| 250 | 10 |
| 251 | 11 |
| 252 | 0 |
| 253 | 1 |
| 254 | 2 |
| 255 | 3 |
| 0 (wrapped around) | 0 |

Once we have wrapped around, we will not build any more blocks on this relay chain block, because we already built a block for core 0 (block number 252).

I think the impact of this is pretty low and acceptable, since it only happens once every 256 blocks.
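The wrap-around behaviour described above can be reproduced with a small sketch (illustrative only; the selector arithmetic is assumed from the description, not taken from the collator code):

```rust
// Illustrative sketch, not polkadot-sdk code: a u8 core selector,
// reduced modulo the number of assigned cores, wraps every 256
// parachain blocks and can repeat core index 0 within one relay
// chain block.
fn core_index(selector: u8, num_cores: u8) -> u8 {
    selector % num_cores
}

fn main() {
    let num_cores = 12u8;
    // Parachain block selectors near the u8 wrap-around, as in the
    // table above.
    for selector in [250u8, 251, 252, 253, 254, 255, 0] {
        println!("selector {selector:3} -> core {}", core_index(selector, num_cores));
    }
    // Selector 252 maps to core 0, and so does selector 0 after the
    // wrap, so block building stops for the rest of the relay block.
}
```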

@alindima
Contributor

I've also seen the occasional known throughput decrease when building on relay chain forks: #4547

@alindima
Contributor

PR that bumps this limit and adds a zombienet-sdk test: #6983

github-merge-queue bot pushed a commit that referenced this issue Jan 28, 2025
…aling (#6983)

On top of #6757

Fixes #6858 by bumping
the `PARENT_SEARCH_DEPTH` constant to a larger value (30) and adds a
zombienet-sdk test that exercises the 12-core scenario.

This is a node-side limit that restricts the number of allowed pending availability candidates when choosing the parent parablock during authoring.
This limit is rather redundant, as the parachain runtime already restricts the unincluded segment length to the configured value in the [FixedVelocityConsensusHook](https://github.com/paritytech/polkadot-sdk/blob/88d900afbff7ebe600dfe5e3ee9f87fe52c93d1f/cumulus/pallets/aura-ext/src/consensus_hook.rs#L35) (which ideally should be equal to this `PARENT_SEARCH_DEPTH`).

For 12 cores, a value of 24 should be enough, but I bumped it to 30 to
have some extra buffer.

There are two other potential ways of fixing this:
- remove this constant altogether, as the parachain runtime already makes those guarantees. I chose not to do this, as it can't hurt to have an extra safeguard
- set this value to be equal to the unincluded segment size. This value, however, is not exposed to the node side and would require a new runtime API, which seems overkill for a redundant check.

---------

Co-authored-by: Javier Viola <javier@parity.io>
@github-project-automation github-project-automation bot moved this from In Progress to Completed in parachains team board Jan 28, 2025