Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Spec update #46

Closed
wants to merge 39 commits into from
Closed

Conversation

jannikluhn
Copy link
Contributor

Implement changes in the latest spec update.

Main changes:

  • remove LengthPrefixedSedes and FixedSizedSedes base classes in favor of BasicSedes and CompositeSedes (note that they don't overlap fully) and update sub classes accordingly.
  • Add Bytes and Tuple class.
  • Get rid of lots of UInts as they are explicitly not part of the spec anymore
  • Make BytesSedes a list of bytes and BytesN a tuple of bytes (but with an optimized implementation that doesn't unnecessarily encodes every single byte separately)
  • use the opportunity and rename bytes_sedes to byte_list and BytesN to ByteTuple
  • some small cleanups I found on the way

What's left to do:

  • fix, organize, and add some tests
  • update test runner
  • tree hashing

Cute Animal Picture

put a cute animal picture link inside the parentheses

@hwwhww
Copy link
Contributor

hwwhww commented Mar 12, 2019

Also, SSZ_CHUNK_SIZE was renamed to BYTES_PER_CHUNK and changed to 32.

@jannikluhn
Copy link
Contributor Author

Implemented tree hashing. Not really convinced that it's correct yet, as tests are still missing.

ssz/sedes/tuple.py Outdated Show resolved Hide resolved
ssz/sedes/tuple.py Outdated Show resolved Hide resolved
ssz/utils.py Outdated Show resolved Hide resolved
ssz/utils.py Outdated Show resolved Hide resolved
ssz/utils.py Show resolved Hide resolved
@pipermerriam
Copy link
Member

I think there's still an error with containers that contain dynamic sized elements:

../py-ssz/ssz/codec.py:71: in decode
    value = sedes.deserialize(ssz)
../py-ssz/ssz/sedes/serializable.py:350: in deserialize
    deserialized_field_dict = cls._meta.container_sedes.deserialize(data)
../py-ssz/ssz/sedes/base.py:59: in deserialize
    value, end_index = self.deserialize_segment(data, 0)
../py-ssz/ssz/sedes/base.py:173: in deserialize_segment
    return self.deserialize_content(content), continuation_index
../../python-environments/bimini/lib/python3.6/site-packages/eth_utils/functional.py:46: in inner
    return callback(fn(*args, **kwargs))
../py-ssz/ssz/sedes/container.py:89: in deserialize_content
    field_start_index,
../py-ssz/ssz/sedes/serializable.py:358: in deserialize_segment
    start_index,
../py-ssz/ssz/sedes/base.py:166: in deserialize_segment
    content_size = self.get_static_size()
../py-ssz/ssz/sedes/container.py:57: in get_static_size
    raise ValueError("Container contains dynamically sized elements")
E   ValueError: Container contains dynamically sized elements

@jannikluhn
Copy link
Contributor Author

Do you have an example? I tried the following and it seemed to work:

In [3]: class Test(ssz.Serializable): 
   ...:     fields = (("a", List(uint8)),) 
   ...:                                                

In [4]: ssz.encode(Test([1, 2, 3]))                    
Out[4]: b'\x07\x00\x00\x00\x03\x00\x00\x00\x01\x02\x03'

In [5]: ssz.decode(Out[4], Test)                       
Out[5]: <__main__.Test at 0x7fa8fa3007b8>

@pipermerriam
Copy link
Member

Example here:

https://github.com/ethereum/bimini/blob/master/tests/test_sss_vs_rlp.py

You probably have to manually install some of those dependencies like numpy, tabulate, rlp, and your local version of ssz

ssz/hash.py Outdated
@@ -33,18 +33,18 @@ def merkle_hash(input_items: Sequence[Any]) -> Hash32:
data_length = len(input_items).to_bytes(32, "little")
if len(input_items) == 0:
# Handle empty list case
chunks = (b'\x00' * SSZ_CHUNK_SIZE,)
elif len(input_items[0]) < SSZ_CHUNK_SIZE:
chunks = (b'\x00' * CHUNK_SIZE,)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use EMPTY_CHUNK here?

ssz/hash.py Outdated
@@ -54,7 +54,7 @@ def merkle_hash(input_items: Sequence[Any]) -> Hash32:
# Tree-hash
while len(chunks) > 1:
if len(chunks) % 2 == 1:
chunks += (b'\x00' * SSZ_CHUNK_SIZE, )
chunks += (b'\x00' * CHUNK_SIZE, )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto EMPTY_CHUNK.

prefix, content_start_index = self.consume_bytes(data, start_index, SIZE_PREFIX_SIZE)
length = int.from_bytes(prefix, "little")
content, continuation_index = self.consume_bytes(data, content_start_index, length)
return self.deserialize_content(content), continuation_index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 173 is identical to line 168, maybe just one return at the end of the function?

@jannikluhn
Copy link
Contributor Author

Example here:

https://github.com/ethereum/bimini/blob/master/tests/test_sss_vs_rlp.py

You probably have to manually install some of those dependencies like numpy, tabulate, rlp, and your local version of ssz

Sorry for the delay, only got round to test it now. This takes a while but passes on my machine. The latest relevant commit should be this one: 6f6e3bb

Are you sure you're on the latest version?

@ChihChengLiang
Copy link
Contributor

This might also relevant ethereum/consensus-specs#794 (review)

@jannikluhn
Copy link
Contributor Author

@pipermerriam Found another bug while writing tests, so likely that was the one you encountered as well. Fixed now.

@pipermerriam
Copy link
Member

@jannikluhn I actually built #47 off of your branch. I suspect we have overlapping changes but I made sufficient modifications that they're likely to be heavily obscured or just completely overwritten. Anyways, I think this branch and #47 might need to be merged in some fashion if the SOS+SSZ spec update gets accepted.

@jannikluhn
Copy link
Contributor Author

This got out of hand a little, sorry for this. Wanted to finish this to finally move on to more important stuff. Most noteworthy changes are the added tests in tests/sedes and tests/tree_hashing.

I also got rid of lots of old tests and cleaned up test_serializable, even though it's unrelated to the rest of the PR.

Some minor fixes in the tree hashing functions (the helpers that contain most of the complexity haven't changed though).

Finally, I removed the YAML tests as they are outdated according to the latest spec (no support for uint24, etc.).

For some reason, pytest and isort don't work on CircleCI anymore (even though they pass locally). Will look into this tomorrow, but it shouldn't affect the rest of the PR.

@pipermerriam
Copy link
Member

ONce you get everything green I'd advocate for breaking this up into smaller pieces. Largely unreviewable as-is.

Without them, pytest will fail as we've got test files with the same
name in different directories.
@jannikluhn
Copy link
Contributor Author

As suggested, I'm going to try to split this up into smaller pieces to make it reviewable.

Note to self: The tree hashing benchmark fails by a mile. I think (hope) this is because tree hashing of byte lists and vectors works with the bytes individually, but this should be fairly easy to optimize.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants