-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOS style layout #47
SOS style layout #47
Conversation
dc446a7
to
8072a1b
Compare
pytest.ini
Outdated
@@ -1,5 +1,5 @@ | |||
[pytest] | |||
addopts= --showlocals --durations 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to revert this.
ssz/sedes/base.py
Outdated
) | ||
|
||
TSerializable = TypeVar("TSerializable") | ||
TDeserialized = TypeVar("TDeserialized") | ||
TCanonical = TypeVar("TCanonical") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to revert this
@jannikluhn can you give this a preliminary review. Not really a merge candidate until ethereum/consensus-specs#787 has been merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like that Container
, Vector
, and List
have separate implementations of deserialize
, even though they all more or less do the same thing. Is there a reason why we can't just put it in CompositeSedes
, maybe relying on some abstract methods implemented in the subclasses such as _iter_sedes
?
ssz/utils.py
Outdated
def read_exact(num_bytes: int, stream: IO[bytes]) -> bytes: | ||
data = stream.read(num_bytes) | ||
if len(data) != num_bytes: | ||
raise DeserializationError(f"Tried to read {num_bytes}. Only got {len(data)} bytes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise DeserializationError(f"Tried to read {num_bytes}. Only got {len(data)} bytes") | |
raise DeserializationError(f"Tried to read {num_bytes}. Only got {len(data)} bytes") |
ssz/utils.py
Outdated
f"Content size is too large to be encoded in a {SIZE_PREFIX_SIZE} byte prefix", | ||
) | ||
def encode_offset(offset: int) -> bytes: | ||
return offset.to_bytes(4, 'little') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return offset.to_bytes(4, 'little') | |
return offset.to_bytes(OFFSET_SIZE, 'little') |
ssz/sedes/list.py
Outdated
if len(value): | ||
raise ValueError("Cannot compute trie hash for non-empty value using `EmptyList` sedes") | ||
# TODO: This can just be a constant value, no need to compute at runtime | ||
return merkleize(pack([])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return merkleize(pack([])) | |
return mix_in_length(merkleize(pack([])), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this resolved on purpose?
ssz/sedes/base.py
Outdated
# | ||
# Tree hashing | ||
# | ||
def hash_tree_root(self, value: Sequence[Any]) -> bytes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is only suitable for Vector
, but not for List
and it would make more sense to have this in Vector
directly.
22bc497
to
dd709d6
Compare
@jannikluhn I addressed everything with the following exceptions.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 looks good
There seem to be a few comments from the previous review that have been marked as resolved, presumably by accident? I reopened them just for double checking, most of them are minor, but at least the mix_in_length
one for empty_list
seems to be worth fixing.
ssz/sedes/container.py
Outdated
def _validate_serializable(self, value: Sequence[Any]) -> bytes: | ||
if len(value) != len(self.field_sedes): | ||
raise SerializationError( | ||
f"Not enough elements. Expected {len(self.field_sedes)}, Got {len(value)}" |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
ssz/sedes/list.py
Outdated
if len(value): | ||
raise ValueError("Cannot compute trie hash for non-empty value using `EmptyList` sedes") | ||
# TODO: This can just be a constant value, no need to compute at runtime | ||
return merkleize(pack([])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this resolved on purpose?
I'll update those. I think I force pushed over top of the commits that were made by the github UI in accepting your recommended changes. |
5293899
to
34a330c
Compare
@jannikluhn I've revived the fixes that were squashed in my force pushes. I believe this is ready for final review and merge once any issues have been addressed. |
For syncing the spec versioning, the procedure we need:
|
ssz/sedes/container.py
Outdated
if not offset_pairs: | ||
return fixed_size_values | ||
|
||
dynamic_size_values = self.deserialize_section_1(offset_pairs, stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variable
v.s. dynamic
I'm fine with both! But we can sync it by either changing the spec or changing py-ssz
side.
ssz/sedes/container.py
Outdated
if next_field_start_index <= field_start_index: | ||
raise Exception("Invariant: must always make progress") | ||
field_start_index = next_field_start_index | ||
# Verify that bothe iterables have been fully consumed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo?
# Verify that bothe iterables have been fully consumed. | |
# Verify that both iterables have been fully consumed. |
ssz/sedes/base.py
Outdated
pass | ||
|
||
|
||
def _compute_section_0_size(element_sedes: Iterable[TSedes]) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO section_0
and section_1
are confusing names. fixed_parts
and variable_parts
from the current ethereum/consensus-specs#787 PR are better.
ssz/sedes/list.py
Outdated
raise Exception("Invariant: must always make progress") | ||
element_start_index = next_element_start_index | ||
num_remaining_offset_bytes = first_offset - stream.tell() | ||
if num_remaining_offset_bytes % 4 != 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @jannikluhn mentioned: https://github.com/ethereum/py-ssz/pull/47/files#r272979440
Should we use OFFSET_SIZE
instead of 4
?
if num_remaining_offset_bytes % 4 != 0: | |
if num_remaining_offset_bytes % OFFSET_SIZE != 0: |
ssz/sedes/list.py
Outdated
num_remaining_offset_bytes = first_offset - stream.tell() | ||
if num_remaining_offset_bytes % 4 != 0: | ||
raise DeserializationError( | ||
f"Offset bytes was not a multiple of 4. Got {num_remaining_offset_bytes}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
f"Offset bytes was not a multiple of 4. Got {num_remaining_offset_bytes}" | |
f"Offset bytes was not a multiple of {OFFSET_SIZE}. Got {num_remaining_offset_bytes}" |
ssz/sedes/list.py
Outdated
|
||
yield element | ||
num_remaining_offsets = num_remaining_offset_bytes // 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
num_remaining_offsets = num_remaining_offset_bytes // 4 | |
num_remaining_offsets = num_remaining_offset_bytes // OFFSET_SIZE |
34a330c
to
4ee2470
Compare
THis has now been rebased on top of #58 |
ea4e18c
to
69f09ea
Compare
This PR should be ready to merge once the two dependent PR's are merged. |
69f09ea
to
8574b7b
Compare
8574b7b
to
3a871a4
Compare
@jannikluhn I've rebased. All tests are passing. This should get one more review pass to be sure everything is fine with you and/or @hwwhww as I had to resolve a number of non-trivial merge conflicts and then resulting test failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nitpicks and a potential bug in list deserialization, other than that LGTM.
ssz/constants.py
Outdated
@@ -9,3 +9,9 @@ | |||
MAX_CONTENT_SIZE = 2 ** (SIZE_PREFIX_SIZE * 8) - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like SIZE_PREFIX_SIZE
and MAX_CONTENT_SIZE
could be removed.
ssz/sedes/base.py
Outdated
TDeserializedElement = TypeVar("TDeserializedElement") | ||
|
||
|
||
class HomogenousSequence(CompositeSedes[Sequence[Any], Tuple[Any]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class HomogenousSequence(CompositeSedes[Sequence[Any], Tuple[Any]]): | |
class HomogenousSequence(CompositeSedes[Sequence[Any], Tuple[Any, ...]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this class doesn't seem to be used (Vector
, List
, and EmptyList
are just CompositeSedes
).
ssz/sedes/container.py
Outdated
raise ValidationError( | ||
f"The following fields are duplicated {','.join(sorted(duplicate_field_names))}" | ||
) | ||
from ssz.sedes.base import BaseSedes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover?
ssz/sedes/list.py
Outdated
) | ||
|
||
TSerializable = TypeVar("TSerializable") | ||
TDeserialized = TypeVar("TDeserialized") | ||
|
||
EMPTY_LIST_HASH_TRIES_ROOT = mix_in_length(merkleize(pack([])), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EMPTY_LIST_HASH_TRIES_ROOT = mix_in_length(merkleize(pack([])), 0) | |
EMPTY_LIST_HASH_TREE_ROOT = mix_in_length(merkleize(pack([])), 0) |
As it's just a normal Merkle tree, not a trie.
Also, this could be moved to constants.py
(but I guess here would be fine too as it's very specific).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keeping it here because I don't want to import merkleize
or pack
from within constants.py
ssz/sedes/list.py
Outdated
|
||
def hash_tree_root(self, value: Sequence[TSerializable]) -> bytes: | ||
if len(value): | ||
raise ValueError("Cannot compute trie hash for non-empty value using `EmptyList` sedes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise ValueError("Cannot compute trie hash for non-empty value using `EmptyList` sedes") | |
raise ValueError("Cannot compute tree hash for non-empty value using `EmptyList` sedes") |
ssz/sedes/list.py
Outdated
element_size = self.element_sedes.get_fixed_size() | ||
data = stream.read() | ||
if len(data) % element_size != 0: | ||
raise DeserializationError("TODO: INVALID LENGTH") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover TODO
yield self.element_sedes.deserialize(segment) | ||
else: | ||
try: | ||
first_offset = s_decode_offset(stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would also fail if there are more than zero but less than 4 bytes in which case we should not return the empty list but raise an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed and added a test for this case.
b5849cc
to
2cb7311
Compare
2cb7311
to
9cafc75
Compare
@hwwhww I just realized I merged when this had "DO NOT MERGE" label on it... feel free to revert or force push master back to previous state. |
Better error if bump missing in make notes/release
WIP with the SOS style layout as specified in ethereum/consensus-specs#787
Depends on #58
What was wrong?
The spec for SSZ has been updated to a new serialization layout.
How was it fixed?
Updated the serialization and deserialization code for all Composite types to comply with the spec.
Some API updates:
Container
sedes no longer operate on dictionaries or have any concept of fields namesio.ByteIO
streams for some decoding rather than tracking the current content index and end index.Cute Animal Picture