-
Notifications
You must be signed in to change notification settings - Fork 998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict best LC update collection to canonical blocks #3553
Conversation
Currently, the best LC update for a sync committee period may refer to blocks that have later been orphaned, if they rank better than canonical blocks according to `is_better_update`. This was done because the most important task of the light client sync protocol is to track the correct `next_sync_committee`. However, practical implementation is quite tricky because existing infrastructure such as fork choice modules can only be reused in limited form when collecting light client data. Furthermore, it becomes impossible to deterministically obtain the absolute best LC update available for any given sync committee period, because orphaned blocks may become unavailable. For these reasons, `LightClientUpdate` should only be served if they refer to data from the canonical chain as selected by fork choice. This also assists efforts for a reliable backward sync in the future.
Simplify best `LightClientUpdate` collection by tracking only canonical data instead of tracking the best update across all branches within the sync committee period. - ethereum/consensus-specs#3553
Simplify best `LightClientUpdate` collection by tracking only canonical data instead of tracking the best update across all branches within the sync committee period. - ethereum/consensus-specs#3553
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This complicates implementations for little to no benefit, for that reason I oppose this somewhat. However I see the need for canonical updates to have a backfill protocol
Could you please elaborate on the complication aspect? It led to quite a bit of simplification in case of Nimbus: One notable aspect is, that even with the old system, you'd need to track separate branches in a proper implementation, they are just on a per-period basis. As in, track the best With the new system, that remains the same, but you track the best So, similar to regular fork choice (which is already present):
Regarding "little to no benefit", I think having canonical data made available on the network allows better reasoning.
|
From offline chat, would be great to define a direction for a backfill spec to make the motivation for this PR stronger |
|
Introduce a test runner for upcoming EF test suites related to canonical light client data collection. - ethereum/consensus-specs#3553
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this restriction is sensible to unlock backfilling in the future. SHOULD NOT
lang is okay to give time to Lodestar to migrate and Lighthouse to catch up.
Thanks for the thorough test! Def helpful to develop this
minimal.zip |
Beacon nodes can only compute light client data locally if they have the corresponding `BeaconState` available. This is not the case for blocks before the initially synced checkpoint state. The p2p-interface defines endpoints to sync light client data, but it only supports forward sync. To enable beacon nodes to backfill light client data, we must ensure that a malicious peer cannot convince us of fraudulent data. While it is possible to verify light client data against the locally backfilled blocks, blocks are not necessarily available anymore on libp2p as they are subject to `MIN_EPOCHS_FOR_BLOCK_REQUESTS`. Light client data stays relevant for more than 5 months, and without validating it against local block data it is impossible to distinguish canonical light client data from fraudulent light client data that eventually culminates in a shared history; the old periods in that case could still be manipulated. Furthermore, agreeing on canonical data improves caching performance and is relevant, e.g., for the portal network. To support efficient proof that a `LightClientUpdate` is canonical, it is proposed to minimally extend the `BeaconState` to track the best `SyncAggregate` of the current and previous sync committee period, according to an implementation-independent ranking function. The proposed ranking function is compatible with what consensus nodes implementing ethereum#3553 are already making available across libp2p and REST transports. It is based on and compatible with the `is_better_update` function in `specs/altair/light-client/sync-protocol.md`. There are three minor differences to `is_better_update`: 1. `is_better_update` runs in the LC, so runs without fork choice. It needs extra conditions to prefer older data over newer data. The `BeaconState` ranking function can use simpler logic. 2. The LC is always initialized from a post-Altair finalized checkpoint. This assumption does not hold in theoretical edge cases, requiring an extra guard for `ALTAIR_FORK_EPOCH` in the `BeaconState` function. 3. `is_better_update` has to deal with BNs serving incomplete data while they are still backfilling. This is not the case with `BeaconState`. Once the data is available in the `BeaconState`, a light client data backfill protocol could be defined that serves, for past periods: 1. A `LightClientUpdate` from requested `period` + 1 that proves that the entirety of `period` is finalized. 2. `BeaconState.historical_summaries[period].block_summary_root` at (1)'s `attested_header.beacon.state_root` + Merkle proof. 3. For each epoch's slot 0 block within requested `period`, the corresponding `LightClientHeader` + Merkle multi-proof for the block's inclusion into (2)'s `block_summary_root`. 4. For each of the entries from (3) with `beacon.slot` within `period`, the `current_sync_committee_branch` + Merkle proof for constructing `LightClientBootstrap`. 5. If (4) is not empty, the requested `period`'s `current_sync_committee`. 6. The best `LightClientUpdate` from `period`, if one exists, + Merkle proof that its `sync_aggregate` + `signature_slot` is selected as the canonical best one in (1)'s `attested_header.beacon.state_root`. Only the proof in (6) depends on `BeaconState` tracking the best light client data. This modification would enshrine the logic of a subset of `is_better_update`, but does not require adding any `LightClientXyz` data structures to the `BeaconState`.
✅ Nimbus 24.2.2 passing the additional tests. |
Beacon nodes can only compute light client data locally if they have the corresponding `BeaconState` available. This is not the case for blocks before the initially synced checkpoint state. The p2p-interface defines endpoints to sync light client data, but it only supports forward sync. To enable beacon nodes to backfill light client data, we must ensure that a malicious peer cannot convince us of fraudulent data. While it is possible to verify light client data against the locally backfilled blocks, blocks are not necessarily available anymore on libp2p as they are subject to `MIN_EPOCHS_FOR_BLOCK_REQUESTS`. Light client data stays relevant for more than 5 months, and without validating it against local block data it is impossible to distinguish canonical light client data from fraudulent light client data that eventually culminates in a shared history; the old periods in that case could still be manipulated. Furthermore, agreeing on canonical data improves caching performance and is relevant, e.g., for the portal network. To support efficient proof that a `LightClientUpdate` is canonical, it is proposed to minimally extend the `BeaconState` to track the best `SyncAggregate` of the current and previous sync committee period, according to an implementation-independent ranking function. The proposed ranking function is compatible with what consensus nodes implementing ethereum#3553 are already making available across libp2p and REST transports. It is based on and compatible with the `is_better_update` function in `specs/altair/light-client/sync-protocol.md`. There are three minor differences to `is_better_update`: 1. `is_better_update` runs in the LC, so runs without fork choice. It needs extra conditions to prefer older data over newer data. The `BeaconState` ranking function can use simpler logic. 2. The LC is always initialized from a post-Altair finalized checkpoint. This assumption does not hold in theoretical edge cases, requiring an extra guard for `ALTAIR_FORK_EPOCH` in the `BeaconState` function. 3. `is_better_update` has to deal with BNs serving incomplete data while they are still backfilling. This is not the case with `BeaconState`. Once the data is available in the `BeaconState`, a light client data backfill protocol could be defined that serves, for past periods: 1. A `LightClientUpdate` from requested `period` + 1 that proves that the entirety of `period` is finalized. 2. `BeaconState.historical_summaries[period].block_summary_root` at (1)'s `attested_header.beacon.state_root` + Merkle proof. 3. For each epoch's slot 0 block within requested `period`, the corresponding `LightClientHeader` + Merkle multi-proof for the block's inclusion into (2)'s `block_summary_root`. 4. For each of the entries from (3) with `beacon.slot` within `period`, the `current_sync_committee_branch` + Merkle proof for constructing `LightClientBootstrap`. 5. If (4) is not empty, the requested `period`'s `current_sync_committee`. 6. The best `LightClientUpdate` from `period`, if one exists, + Merkle proof that its `sync_aggregate` + `signature_slot` is selected as the canonical best one in (1)'s `attested_header.beacon.state_root`. Only the proof in (6) depends on `BeaconState` tracking the best light client data. This modification would enshrine the logic of a subset of `is_better_update`, but does not require adding any `LightClientXyz` data structures to the `BeaconState`.
Beacon nodes can only compute light client data locally if they have the corresponding `BeaconState` available. This is not the case for blocks before the initially synced checkpoint state. The p2p-interface defines endpoints to sync light client data, but it only supports forward sync. To enable beacon nodes to backfill light client data, we must ensure that a malicious peer cannot convince us of fraudulent data. While it is possible to verify light client data against the locally backfilled blocks, blocks are not necessarily available anymore on libp2p as they are subject to `MIN_EPOCHS_FOR_BLOCK_REQUESTS`. Light client data stays relevant for more than 5 months, and without validating it against local block data it is impossible to distinguish canonical light client data from fraudulent light client data that eventually culminates in a shared history; the old periods in that case could still be manipulated. Furthermore, agreeing on canonical data improves caching performance and is relevant, e.g., for the portal network. To support efficient proof that a `LightClientUpdate` is canonical, it is proposed to minimally extend the `BeaconState` to track the best `SyncAggregate` of the current and previous sync committee period, according to an implementation-independent ranking function. The proposed ranking function is compatible with what consensus nodes implementing ethereum#3553 are already making available across libp2p and REST transports. It is based on and compatible with the `is_better_update` function in `specs/altair/light-client/sync-protocol.md`. There are three minor differences to `is_better_update`: 1. `is_better_update` runs in the LC, so runs without fork choice. It needs extra conditions to prefer older data over newer data. The `BeaconState` ranking function can use simpler logic. 2. The LC is always initialized from a post-Altair finalized checkpoint. This assumption does not hold in theoretical edge cases, requiring an extra guard for `ALTAIR_FORK_EPOCH` in the `BeaconState` function. 3. `is_better_update` has to deal with BNs serving incomplete data while they are still backfilling. This is not the case with `BeaconState`. Once the data is available in the `BeaconState`, a light client data backfill protocol could be defined that serves, for past periods: 1. A `LightClientUpdate` from requested `period` + 1 that proves that the entirety of `period` is finalized. 2. `BeaconState.historical_summaries[period].block_summary_root` at (1)'s `attested_header.beacon.state_root` + Merkle proof. 3. For each epoch's slot 0 block within requested `period`, the corresponding `LightClientHeader` + Merkle multi-proof for the block's inclusion into (2)'s `block_summary_root`. 4. For each of the entries from (3) with `beacon.slot` within `period`, the `current_sync_committee_branch` + Merkle proof for constructing `LightClientBootstrap`. 5. If (4) is not empty, the requested `period`'s `current_sync_committee`. 6. The best `LightClientUpdate` from `period`, if one exists, + Merkle proof that its `sync_aggregate` + `signature_slot` is selected as the canonical best one in (1)'s `attested_header.beacon.state_root`. Only the proof in (6) depends on `BeaconState` tracking the best light client data. This modification would enshrine the logic of a subset of `is_better_update`, but does not require adding any `LightClientXyz` data structures to the `BeaconState`.
Introduce a test runner for upcoming EF test suites related to canonical light client data collection. - ethereum/consensus-specs#3553
@hwwhww anything still blocking this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, as @etan-status 's usual solid work 🙏
only some suggestions on the test format and naming.
tests/core/pyspec/eth2spec/test/altair/light_client/test_data_collection.py
Outdated
Show resolved
Hide resolved
@with_phases(phases=[CAPELLA], other_phases=[DENEB, ELECTRA]) | ||
@spec_test | ||
@with_config_overrides({ | ||
'DENEB_FORK_EPOCH': 1 * 8 + 4, # SyncCommitteePeriod 1 (+ 4 epochs) | ||
'ELECTRA_FORK_EPOCH': 3 * 8 + 4, # SyncCommitteePeriod 3 (+ 4 epochs) | ||
}, emit=False) | ||
@with_state | ||
@with_matching_spec_config(emitted_fork=ELECTRA) | ||
@with_presets([MINIMAL], reason="too slow") | ||
def test_deneb_electra_reorg_unaligned(spec, phases, state): | ||
yield from run_test_multi_fork(spec, phases, state, DENEB, ELECTRA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be moved to capella/light_client/
folder, as well as test test_sync.py
ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's handle this in a separate PR to be included in the next release. I started to make these changes but realized it wasn't as straightforward as I thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have applied the changes to both data collection and test_sync as requested.
tests/core/pyspec/eth2spec/test/altair/light_client/test_data_collection.py
Outdated
Show resolved
Hide resolved
#### `new_head` execution step | ||
|
||
The given block (previously imported) should become head, leading to potential updates to: | ||
|
||
- The best `LightClientUpdate` for non-finalized sync committee periods. | ||
- The latest `LightClientFinalityUpdate` and `LightClientOptimisticUpdate`. | ||
- The latest finalized `Checkpoint` (across all branches). | ||
- The available `LightClientBootstrap` instances for newly finalized `Checkpoint`s. | ||
|
||
```yaml | ||
{ | ||
head_block_root: Bytes32 -- string, hex encoded, with 0x prefix | ||
checks: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about aligning with fork choice test format more?
- Add "Checks step" description.
- Make
head_block_root
as one of the checking items.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The head_block_root
is the actual operation that is done here (that's the new head to select, and the checks describe the result of updating the head).
This reverts commit aff4e34.
@hwwhww Have split up the test_sync and test_data_collection into per-fork files. I'll cross-check the PR against Nimbus test runner to see if all's well and then update with another comment. |
Updated generated files in PR description for 09e8f01 (alpha.9 base) and confirmed to pass in Nimbus ✅ |
Updated for
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. This is quite a large PR, so I'm unable to thoroughly review everything here but from what I can tell it all seems reasonable. I see no obvious issues. Thank you @etan-status!
Currently, the best LC update for a sync committee period may refer to blocks that have later been orphaned, if they rank better than canonical blocks according to
is_better_update
. This was done because the most important task of the light client sync protocol is to track the correctnext_sync_committee
. However, practical implementation is quite tricky because existing infrastructure such as fork choice modules can only be reused in limited form when collecting light client data. Furthermore, it becomes impossible to deterministically obtain the absolute best LC update available for any given sync committee period, because orphaned blocks may become unavailable.For these reasons,
LightClientUpdate
should only be served if they refer to data from the canonical chain as selected by fork choice. This also assists efforts for a reliable backward sync in the future.v1.5.0-alpha.10
)