Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP4844 precompile: Strict checks when parsing scalar field elements from the wire #3138

Merged
merged 4 commits into from
Dec 1, 2022

Conversation

asn-d6
Copy link
Contributor

@asn-d6 asn-d6 commented Nov 29, 2022

verify_kzg_proof() currently reduces inputs to be below the modulus. This can end up in surprising side effects.

In this PR we make bytes_to_bls_field() assert out of its inputs are not canonical.

For the purposes of Fiat-Shamir and hash-to-field, we introduce a new function hash_to_bls_field() which does the modular reduction on its own.

Commit messages include explanation on what's happening.

bytes_to_bls_field() will be used in the precompile and hence it should error out when provided with malicious inputs.
The previous commit made bytes_to_bls_field() be strict about its inputs. However in compute_challenges() we are
dealing with Fiat-Shamir and hash outputs that could be innocuously higher than the modulus. For this reason we add the
hash_to_bls_field() helper for use in compute_challenges().
@kevaundray
Copy link
Contributor

I think you can also bytes_to_bls_field in blob_to_polynomial :

value = int.from_bytes(blob[i * BYTES_PER_FIELD_ELEMENT: (i + 1) * BYTES_PER_FIELD_ELEMENT], ENDIANNESS)
assert value < BLS_MODULUS

The output is not uniform over the BLS field.
"""
hashed_data = hash(data)
return int.from_bytes(hashed_data, ENDIANNESS) % BLS_MODULUS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this:

BLSFieldElement(int.from_bytes(hashed_data, ENDIANNESS) % BLS_MODULUS)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 69bafc0

return int.from_bytes(b, ENDIANNESS) % BLS_MODULUS
field_element = int.from_bytes(b, ENDIANNESS)
assert field_element < BLS_MODULUS
return field_element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, can you make this BLSFieldElement(field_element)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 69bafc0

@asn-d6
Copy link
Contributor Author

asn-d6 commented Nov 29, 2022

Thanks for the review. I pushed fixes to your suggestions.

@asn-d6
Copy link
Contributor Author

asn-d6 commented Nov 29, 2022

I think you can also bytes_to_bls_field in blob_to_polynomial :

value = int.from_bytes(blob[i * BYTES_PER_FIELD_ELEMENT: (i + 1) * BYTES_PER_FIELD_ELEMENT], ENDIANNESS)
assert value < BLS_MODULUS

Done in 69bafc0

Copy link
Contributor

@kevaundray kevaundray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kevaundray
Copy link
Contributor

Minor note:

In compute_challenges a comment about hashed_data = hash(data) might help future readers, as its possible to not hash the data first and just call hash_to_bls_field on the full data twice.

For reference: the reason we do the initial hash first is to reduce the amount of memory needed to compute the subsequent hashes. Without it we would need to copy the whole data to make the two hash_to_bls_field calls, whereas with this approach, we only need to copy the hashed_data which is only 32 bytes. Noting that data grows with the number of blobs.

@asn-d6 asn-d6 force-pushed the strict_modulus_check branch from 69bafc0 to e4df4ba Compare November 30, 2022 14:59
@asn-d6
Copy link
Contributor Author

asn-d6 commented Nov 30, 2022

Minor note:

In compute_challenges a comment about hashed_data = hash(data) might help future readers, as its possible to not hash the data first and just call hash_to_bls_field on the full data twice.

For reference: the reason we do the initial hash first is to reduce the amount of memory needed to compute the subsequent hashes. Without it we would need to copy the whole data to make the two hash_to_bls_field calls, whereas with this approach, we only need to copy the hashed_data which is only 32 bytes. Noting that data grows with the number of blobs.

I agree this is useful info but not sure if the spec code is the best place to document it. There are so many design decisions taken during writing this spec that are not documented in the spec, and that makes a comment with this particular piece of info feel out of place.

I don't have a very strong opinion on this, so feel free to take a stab at it and push to this PR. If not, let's move forward with the current state of this PR.

@asn-d6 asn-d6 requested review from hwwhww and dankrad December 1, 2022 10:05
Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the conflict. LGTM!

@asn-d6 asn-d6 merged commit 23d3aee into ethereum:dev Dec 1, 2022
@hwwhww hwwhww added the Deneb was called: eip-4844 label Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deneb was called: eip-4844
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants