Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate UnknownTargetRoot slasher errors #4972

Closed
michaelsproul opened this issue Dec 4, 2023 · 4 comments
Closed

Investigate UnknownTargetRoot slasher errors #4972

michaelsproul opened this issue Dec 4, 2023 · 4 comments
Labels

Comments

@michaelsproul
Copy link
Member

Description

We have quite a few minor errors on the slasher like:

Dec 03 23:37:38.417 DEBG Unable to obtain indexed form of attestation for slasher, error: UnknownTargetRoot(0x72eff6e4116a060b103f0731aa7590c96a3b8cd23441144aa6d9d214019e04d9), attestation_root: 0x2aa4ab96fd4d42386c23a5988f9d6955b98edc9d804026f451a783c5b3675d8f, service: beacon

I think this error means:

  • We processed an attestation for which we did know the head block (beacon_block_root).
  • However, we didn't know the target_root. This is an invalid type of attestation, as it implies the target isn't an ancestor of the head.
  • Because the target is unknown, the beacon node fails to load the beacon state for the target block, which the slasher needs to verify the signature. Therefore the attestation gets discarded without being checked for slashability (a potential blind spot!).

Steps to resolve

  1. Confirm that this is what's happening, e.g. invalid attestations where the beacon_block_root does not descend from the target_root.
  2. Decide on a mitigation strategy:
    a) Use the attestation's head state to verify the signatures when the target block is unknown. This should not result in any errors, as attestations are only sent for processing once their head block is known (otherwise they get requeued).
    b) Add a different queue for the slasher which waits for the target root to be known. From the logs it seems that many of the target blocks are actually valid (and canonical), so it might just be a race where a peer attests head=N - 1 and target=N at the start of slot N before we've had time to import block N.
  3. Work out which peers are sending these attestations, as it probably indicates a bug in some client software, or (less likely) someone experimenting with attack strategies.

I think I prefer 2a for its simplicity. We just have to make sure we don't get DoS'd by state loads, as attestation head states likely don't lie on epoch boundaries and are more expensive to fetch than target states in general. Tree states will help with this ;)

Version

Lighthouse v4.5.0

@michaelsproul michaelsproul added enhancement New feature or request slasher labels Dec 4, 2023
@michaelsproul
Copy link
Member Author

How to test

  1. Run a slasher on Goerli. Use the flags --network goerli --checkpoint-sync-url "https://beaconstate-goerli.chainsafe.io" --slasher --slasher-history-length 256 .
  2. Look for slasher-related logs with tail -f ~/.lighthouse/goerli/beacon/logs/beacon.log | grep slasher.
  3. Wait for the error message to appear (seems to happen once every ~15 minutes)

How to confirm the issue

Edit the log message for Unable to obtain indexed form of attestation for slasher so that it prints out the beacon_block_root and the target.root for the attestation. Re-compile Lighthouse (using make), re-run it and wait for the error to appear. When the error shows up again with the extra information, look up the block roots on the Goerli block explorer, like this:

If my theory is correct then we should have target_slot > head_slot.

@michaelsproul
Copy link
Member Author

^ @armaganyildirak

@michaelsproul
Copy link
Member Author

How to fix

I think the attestation probably fails verification here:

verify_attestation_target_root::<T::EthSpec>(&head_block, attestation)?;

It then gets passed to the slasher via process_slash_info, which uses obtain_indexed_attestation_and_committees_per_slot to convert the attestation into an IndexedAttestation:

let (indexed_attestation, check_signature, err) = match slash_info {
SignatureNotChecked(attestation, err) => {
match obtain_indexed_attestation_and_committees_per_slot(chain, attestation) {
Ok((indexed, _)) => (indexed, true, err),

The IndexedAttestation has the aggregation_bitfield from the original attestation replaced by the list of validators who signed the attestation. The slasher needs an IndexedAttestation because it does all its calculations in terms of validator indices.

The process of converting the aggregation_bitfield to a list of validators is handled by map_attestation_committee which looks at the attester shuffling from a BeaconState (in each epoch the validators are shuffled into committees. We use the aggregation bits to select the validators from a specific committee in a specific epoch + state).

Presently the map_attestation_committee function uses the target.root from the attestation to look up the committee and fails if that target root is not known, here:

// Attestation target must be for a known block.
//
// We use fork choice to find the target root, which means that we reject any attestation
// that has a `target.root` earlier than our latest finalized root. There's no point in
// processing an attestation that does not include our latest finalized block in its chain.
//
// We do not delay consideration for later, we simply drop the attestation.
if !chain
.canonical_head
.fork_choice_read_lock()
.contains_block(&target.root)
&& !chain.early_attester_cache.contains_block(target.root)
{
return Err(Error::UnknownTargetRoot(target.root));
}

To fix the issue we want to prevent this error from being returned, either by:

  1. Always using the attestation.data.beacon_block_root instead of attestation.data.target.root when looking up the committees. This might be the simplest fix, but would have worse performance in case of a cache miss in with_committee_cache.
  2. Using the attestation.data.beacon_block_root only if the attestation.data.target.root is not found in fork choice. This should fix the issue while not regressing performance for any existing cases.

@michaelsproul
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant