Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid empty attestation signatures consensus split #1713

Closed
protolambda opened this issue Apr 8, 2020 · 18 comments
Closed

Invalid empty attestation signatures consensus split #1713

protolambda opened this issue Apr 8, 2020 · 18 comments
Labels
general:bug Something isn't working phase0 scope:BLS

Comments

@protolambda
Copy link
Contributor

Some context:

  • v0.11.1 had a test helper function change an empty signature from 0xc000... to 0x0000. Completely zero, instead of valid but empty.
  • Up to v0.11.0 attestations with no participants were considered valid for some reason, and we had a testcase covering it. (I don't agree with it being a valid case though)
  • v0.11.1 also misses the post-state file for some reason, making the empty aggregation bits an anvalid case. And clients pass it, not because they added the rule, but because the 0x0000 signature is invalid.
  • I stumbled on this after testing it in a 3rd implementation (ZRNT, Go still, but with different state representation, more like remerkleable)

The scope of the bug:

  • The BLS spec has a pre-condition that states that verifying with 0 pubkeys are undefined behavior. Precondition: n >= 1, otherwise return INVALID. (Thanks @hwwhww for retrieving this from the specs)
  • The previous Eth2 spec versions considered empty attestations included in blocks valid. There was an empty_aggregation_bits test case that described it as valid: https://github.com/ethereum/eth2.0-spec-tests/tree/v0.11.0/tests/minimal/phase0/operations/attestation/pyspec_tests/empty_aggregation_bits
  • Py-ecc previously handled the 0 pubkeys case as a valid case; the 0xc00000.... signature is valid for no pubkeys.
  • The network spec requires attestations on gossip to have exactly 1 bit set, so the network spec is not affected
  • The beacon spec just verifies the empty signature. Not all empty signatures are valid. 0x000... is not. 0xc000... is.
  • Prysm and ZRNT ignore the signature if no bits are set. (ZRNT with a TODO statement): Prysm ZRNT
  • Lighthouse puts the attestation signatures in a SignatureSet, and then verifies it as a whole. This would allow the 0xc000 signature to pass without pubkeys I think. But not other invalid signatures.
  • Other clients have to be checked, there could be more edge cases. If we can agree on what it should be, I will write new spec tests to cover it.

Action points:

  • Empty signatures are a dangerous new edge case to attestation processing. Requiring the BLS library to handle the undefined behavior well, or the client to implement the edge case correctly (not like Prysm and ZRNT). We need to be more explicit about this behavior, and possible make it invalid. There should be no reason to include no-participant attestations on chain. Size is not the problem as proposers can already fill their block with duplicates, but just having the empty-signature case at all is bad.
  • A chain split could be caused between Prysm and Lighthouse if someone mines a block with an attestation containing an empty bitfield, but an invalid signature. (Which could be a valid looking one still, but not valid for empty set of pubkeys). Prysm will stay on that chain, lighthouse will split away from that chain.

Labeling it as a bug, although the spec is not necessarily wrong, it is definitely a case that should get attention to avoid clients splitting chains in the future.

@djrtwo @CarlBeek

@protolambda protolambda added the general:bug Something isn't working label Apr 8, 2020
@vbuterin
Copy link
Contributor

vbuterin commented Apr 8, 2020

There are going to be places in the protocol in the future where there will be a single aggregate signature, and not a list of aggregates; for example even in phase 1 there's the light client aggregate. And so in such a situation, the degenerate case is not going to be an empty list of attestations as it is elsewhere, it will be a single aggregate signature that is empty. To avoid requiring higher-layer extra logic around that case, it seems to me reasonable to make zero a valid signature for an empty list of pubkeys and messages. This case would get actually triggered when eg. somehow no one from the light client committee shows up to get included in a block.

We could add a separate rule banning empty attestations from phase 0 if desired; seems pointless to allow them.

@djrtwo
Copy link
Contributor

djrtwo commented Apr 8, 2020

We could add a separate rule banning empty attestations from phase 0 if desired; seems pointless to allow them.

But leaving them in does not break anything, and introducing breaking changes at this point should imo only be done to fix serious issues.


Preventing 0 attestations barely affects the scope of behavior and attack surface.

A rational actor would find more valuable attestations anyway, and not want to pack the block overly full, increasing the size of the block and time to compute.

As for the argument of “packing” the block with crap, the proposer could easily just put duplicate attestations (of high participation and thus more computational load) or just put a bunch of single bit attestations.

@protolambda
Copy link
Contributor Author

protolambda commented Apr 8, 2020

I am not concerned about the spam as much, but about it being a weird edge case that already has clients to disagree now, and likely in the future if we leave it as-is. Being more explicit about validity conditions would be good, especially because the BLS spec itself has its own requirements (the >= 1 case). And many libraries, such as the two Go version, simply crash or give undefined answers on empty pubkey list inputs. Having clients improvise to interpret the spec is what caused the current potential chain split.

@vbuterin
Copy link
Contributor

vbuterin commented Apr 8, 2020

But leaving them in does not break anything, and introducing breaking changes at this point should imo only be done to fix serious issues.

That's fair. Especially with compression, an empty attestation is close to zero overhead anyway. So I'd be happy with specifying that BLS aggregate signatures with 0 pubkeys are in general allowed.

@paulhauner
Copy link
Contributor

Giving some feedback as per @protolambda's request :)

it seems to me reasonable to make zero a valid signature for an empty list of pubkeys and messages

This also seems reasonable to me. I assume the intention is that only 0x00..00 is valid for 0 pubkeys? I prefer this otherwise we have some weird form of signature malleability.

I am not concerned about the spam

Same for me.

Being more explicit about validity conditions would be good

Agreed.

@nisdas
Copy link
Contributor

nisdas commented Apr 15, 2020

Being more explicit about validity conditions would be good, especially because the BLS spec itself has its own requirements (the >= 1 case).

This would make sense and I would agree with this, being explicit here is a definite plus. I would be ok with verifying that a signature was 0xc00... if no bits were set in the accompanying bitfield.

@mratsim
Copy link
Contributor

mratsim commented Apr 23, 2020

There are 2 layers where invalid signatures are handled:

  1. Low-level: Milagro, MCL
  2. High-level: The client

Currently due to testing, fuzzing (and invalid signatures in deposits?) we have a wrapper to handle valid signatures at a low-level (i.e. point on the G2 curve) and invalid signatures but that we accept because of protocol or test/fuzzing requirements. We may add in the future another case with lazy loaded signatures (i.e. we stored it in a database on shutdown and reload it in the future).

type
  BlsValueType* = enum
    Real
    OpaqueBlob

  BlsValue*[N: static int, T] = object
    # N = 48 for public keys and N = 96 for Signature
    case kind*: BlsValueType
    of Real:
      blsValue*: T
    of OpaqueBlob:
      blob*: array[N, byte]

In the low-level layer, it is very easy to check if a signature is valid or not: it's not a point on the curve and the underlying library will not parse it. For example 0x0000... (full-zero) is invalid while 0xc000... is valid and a point at infinity. This is derived from elliptic curve maths and so does not need to be specified.

At the high-level layer, there is a huge potential for divergence and as @nisdas said, being explicit is important as are test cases.
In particular what is the zero signature:

  • a point at infinity 0xc000
  • a binary zero blob 0x0000
  • in which case is it valid?
    • when accompanied by an empty aggregation bitset?

@kirk-baird
Copy link
Contributor

Have we reached a conclusion yet?

From my point of view I think we have two options which I prefer option 1:

Option 1

Request a change to the BLS specs such that if n = 0 i.e. PublicKeys = [] then [] aggregates to the identity. As it makes aggregation simpler i.e.

aggregate_public_key = 0
for key in public_keys:
    aggregate_public_key += key

Rather than specifically handling the n = 0 case.

If this were the case we would simply have

  • AggregateSignature = 0x00... always return false as it is not a valid point on curve.
  • AggregateSignature = 0xc00... return true for PublicKeys [] and [0xc00...] (i.e. PublicKey at point 0)

Option 2

Leave BLS spec as is and handle in the Ethereum layer the case where AggregateSignature = 0x00 and publickeys = [] specifically to return true before interacting with the BLS layer.

@protolambda
Copy link
Contributor Author

I am opposed to Option 1 as it requires changes in the BLS spec (big or not, it's friction), and it doesn't solve for the real issue: clients have to get this right and not rely on libraries to work-around. If one library accepts 0 pubkeys, the other doesn't, and yet another is somewhere in between, then clients have to track those edge cases to stay in consensus. And properly implement work arounds.

Option 2 is better, as it does not require anything from the BLS spec, and is super straigtforward to implement and test:

  • 0 participants is invalid case, unless signature is 0x000...
  • > 0 participants is valid only if the signature is valid for those pubkeys.

And better (IMHO) would be an Option 3, to remove one branch of decision making:

  • 0 participants is an invalid case, no exceptions
  • > 0 participants is valid only if the signature is valid for those pubkeys.

@kirk-baird
Copy link
Contributor

I am opposed to Option 1 as it requires changes in the BLS spec (big or not, it's friction), and it doesn't solve for the real issue: clients have to get this right and not rely on libraries to work-around. If one library accepts 0 pubkeys, the other doesn't, and yet another is somewhere in between, then clients have to track those edge cases to stay in consensus. And properly implement work arounds.

From this option all logic would now be in the BLS layer and none in the Ethereum layer which is appealing. However, as you mentioned you would like public keys [] to return false which this will not.

I like this option mostly as I think the BLS spec is not implementation friendly in its current form and I would like to change it anyway.

And better (IMHO) would be an Option 3, to remove one branch of decision making:

0 participants is an invalid case, no exceptions
> 0 participants is valid only if the signature is valid for those pubkeys.

I do see the appeal of option 3 over option 2 in avoiding changes to BLS spec. And in that AggregateSignature 0x00.. will always be false and as the BLS spec is PublicKeys [] will also always be false.

@nisdas
Copy link
Contributor

nisdas commented Apr 29, 2020

From this option all logic would now be in the BLS layer and none in the Ethereum layer which is appealing. However, as you mentioned you would like public keys [] to return false which this will not.

I like this option mostly as I think the BLS spec is not implementation friendly in its current form and I would like to change it anyway.

Do you anticipate that it would be easy enough to get this included in the spec ? I would prefer to have everything under our control if possible,and the best way to do that would be to handle this in the application layer.

I would prefer option 3 as it is mostly straightforward to implement and I do not see any valid case for having an attestation with 0 participants. Its effectively just junk in the block that client's will have to process and verify.

@kirk-baird
Copy link
Contributor

Do you anticipate that it would be easy enough to get this included in the spec ? I would prefer to have everything under our control if possible,and the best way to do that would be to handle this in the application layer.

I'm confident at this stage if we wished to change the BLS spec it would get done. Though we are looking to freeze phase 0 and BLS spec is a bottleneck so maybe it's best to go with option 3 which as you say will also give us more control.

Additionally, if we go with option 3 and handle this case without interacting with BLS then it will be irrelevant if BLS makes any changes in this regard.

@protolambda
Copy link
Contributor Author

Ok, I'm making a PR for option 3 (empty set of participants is invalid as attestation to go onchain). If we need it to behave differently for some unforeseen phase 1+ case, we can simply start allowing the type of attestation we need, with backwards compatibility in regards to attestations already on-chain. Also, we indeed avoid being affected by BLS changes and library issues in any way, which is great.

@djrtwo
Copy link
Contributor

djrtwo commented May 4, 2020

What about the issue brought up by vitalik early in the thread? We know of cases where empty signatures will have to be included on chain.

In such a case, I suppose we have to be explicit -- e.g. if light_sig == 0x00: return at the top of process light sigs function.

EDIT: The fix in #1780 is specific to this one issue. If we go Option 3, we'll need to address each type of potentially empty aggregate specifically. This is probably fine

djrtwo added a commit that referenced this issue May 7, 2020
Handle empty aggregation bits as discussed in #1713
@hwwhww
Copy link
Contributor

hwwhww commented May 14, 2020

Copy-paste the discussion on #1799 https://github.com/ethereum/eth2.0-specs/pull/1799/files#r424421674

@mratsim:
Do we define the empty signature somewhere? Is that 0xc000...

@protolambda:
The python here would mean all zeroes (every ssz type is all zeroes by default). But agree that this is way to vague and can be confused with an empty signature.

@hwwhww:
I remember we had EMPTY_SIGNATURE constant in 2019. I would be happy to add it back.

@mratsim:
We can have, NO_SIGNATURE(0x0000...) and EMPTY_SIGNATURE (0xc000...) for clarity

@kirk-baird:
I've been treating EMPTY_SIGNATURE as (0x0000...) as I believe that's what we called in it the spec in 2019. I'm happy to do a re-name and make

  • NO_SIGNATURE (0x0000...) (always verifies false)
  • EMPTY_SIGNATURE (0xc000...)

However, we have to be careful here as AggregatePublicKeys([]) will always verify false according to the BLS Standard.

So EMPTY_SIGNATURE will only verify true against a list of EMPTY_PUBLIC_KEYS. We must have 1..* EMPTY_PUBLIC_KEYS verifying true. (Note: or some other combination of public keys that add to 0 mod r).

@mratsim:
Or EMPTY_SIGNATURE(0x0000) and ZERO_SIGNATURE (0xc000) I'm not married to any name.


So far, we have two non-signature cases in phase 1:

  1. BeaconBlockBody.light_client_signature, when len(signer_pubkeys) == 0
        if len(signer_pubkeys) == 0:
            assert block_body.light_client_signature == BLSSignature()
            return
        else:
            assert bls.FastAggregateVerify(signer_pubkeys, signing_root, signature=block_body.light_client_signature)
  2. ShardTransition.proposer_signature_aggregate, when len(shard_blocks) == 0
    • It’s currently assert bls.AggregateVerify(pubkeys, signing_roots, signature=transition.proposer_signature_aggregate)

My 2 cents

Optional 3 is good for attestation aggregation, however, for phase 1 cases, IMO I'm inclined to apply option 2 for these cases.

  • It's clear that AggregateVerify(PKs, messages, signature)/FastAggregateVerify(PKs, message, signature) is invalid when PKs = [] according to the BLS standard, so we cannot set the non-signature to 0xc0... and make it pass the verification.
  • 0x00.. is an invalid signature in BLS verification, and it can remind us to explicitly take care of the non-signature case. We can define some Eth2-only helpers:
def optional_fast_aggregate_verify(pubkeys, message, signature) -> bool:
    if len(pubkeys) == 0:
        return signature == NO_SIGNATURE
    else:
        return bls.FastAggregateVerify(pubkeys, message, signature)
def optional_aggregate_verify(pubkeys, messages, signature) -> bool:
    if len(pubkeys) == 0:
        return signature == NO_SIGNATURE
    else:
        return bls.AggregateVerify(pubkeys, messages, signature)

So it would be crystal clear that PKs == [] is possible. 🙂

@protolambda
Copy link
Contributor Author

Yes, being explicit like that is great, and avoids weird edge cases between implementations in the BLS spec, as readers have many different assumptions about 0 pubkeys, and the BLS spec does not cover it. With phase0 I prefer option 3, as it is easier to open it back up, than restrict it (since existing signatures still would need to be valid). For phase 1 that approach looks good.

mratsim added a commit to status-im/nim-blscurve that referenced this issue May 28, 2020
* dup the draft 5 to allow having both at the same time

* Move to a stashe folder

* implement expand_message_xmd

* implement the new hashToField

* And done? (besides tests)

* Add Eth2 suite

* Change pseudo_random_bytes to uniform_bytes - https://github.com/cfrg/draft-irtf-cfrg-hash-to-curve/pull/263/files

* For now use Py-ECC as reference (cfrg/draft-irtf-cfrg-hash-to-curve#265)

* Add expandMessage official test suite + fix termination detection cfrg/draft-irtf-cfrg-hash-to-curve#266

* Introduce official vectors in H2C

* Fix hash2curve sign0

* Pass all H2C test vectors current and futures

* Setup eth v12 test vectors

* Pass tests and properly handle aggregation with zero signature - ethereum/consensus-specs#1713

* Introduce a switch to toggle v0.11 or v0.12 signatures

* Fix domain sepratation tag for v0.11.x

* Re-skip the buggy v0.11.x aggregation tests

* change default to 0.11.x
@hwwhww
Copy link
Contributor

hwwhww commented Jul 20, 2020

@protolambda I think it's okay to close this issue?

@hwwhww hwwhww closed this as completed Jul 20, 2020
@protolambda
Copy link
Contributor Author

@hwwhww yes, this was resolved. There are still surprises with the infinity-point pubkey/signature for implementers though, especially when changing to new BLS libraries (BLST). @paulhauner was looking into this. If you've any edge cases or things that should be documented/tested, please make a new issue and we'll look into it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:bug Something isn't working phase0 scope:BLS
Projects
None yet
Development

No branches or pull requests

9 participants