Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New pyln modules. #3733

Merged
merged 18 commits into from
Jun 12, 2020
Merged

Conversation

rustyrussell
Copy link
Contributor

These will almost certainly see more work as I port the protocol tests across, but since I've never published a python module before, and have no idea what I'm doing, it's probably best to get some review!

Help!

This runs flake8 and the python tests.  Helps me, at least!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Copy link
Collaborator

@darosior darosior left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just quickly went through the code, this looks really nice !

Aside question : I once gave the protocol tests a shot (ElementsProject/lightning-rfc-protocol-test#13) which then seem discontinued, are we going to merge them here ?

contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/message.py Outdated Show resolved Hide resolved
@rustyrussell
Copy link
Contributor Author

Just quickly went through the code, this looks really nice !

Aside question : I once gave the protocol tests a shot (ElementsProject/lightning-rfc-protocol-test#13) which then seem discontinued, are we going to merge them here ?

My final step will be to port the tests to this framework, then I will go through those PRs and merge. Could be a while yet though!

This supports infrasructure for creating messages.  In particular, it
can be fed CSV from the spec's `tools/extract-formats.py` and then convert
them all to and from strings and binary formats.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: pyln: new module pyln.proto.message
@rustyrussell
Copy link
Contributor Author

Minor rebase:

  1. Cleaned up commit msgs and added Changelog-Added lines.
  2. Export SubtypeType and fix length check in case of missing fields.

Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this PR is pretty meta :-) Apart from a few C-isms, some things that could be more pythonic and a couple of nits, it looks good to me.

  • I'd prefer having a file-like interface on the serialization and deserialization primitives, since that allows us not to care about returning the remainder of the parsed string all over the place.
  • I'm curious on how this continues in the next couple of PRs, but if it generates the type classes from the bolts I'd suggest adding them to the __all__ field which describes the exports. That'd just make the types importable by name. There's a bit of a roundabout way to do it, but in user ergonomics it works quite nicely:
import sys

class Message(object):
    def __init__(self, messagetype, **kwargs):
        self.messagetype = messagetype
        self.fields = {}

## Get the current module so we can append newly generated classes to it
mod = sys.modules[Message.__module__]

for name in ['A', 'B', 'C']:
    setattr(mod, name, Message(name))

This will create an instance of Message and register it as an importable object with the correct name. From the user of the module you can then just do the following:

from test import A, B, C

print(A)

contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/fundamental_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/fundamental_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/fundamental_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/tests/test_array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/tests/test_array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/tests/test_array_types.py Outdated Show resolved Hide resolved
contrib/pyln-proto/pyln/proto/message/bolts/bolts.py Outdated Show resolved Hide resolved
@rustyrussell
Copy link
Contributor Author

rustyrussell commented Jun 3, 2020

Wow, this PR is pretty meta :-) Apart from a few C-isms, some things that could be more pythonic and a couple of nits, it looks good to me.

  • I'd prefer having a file-like interface on the serialization and deserialization primitives, since that allows us not to care about returning the remainder of the parsed string all over the place.

That works well for the binary ones (done), but it's less of a win for the string ops: it's more convenient to have them return a str. They're mainly used for debugging. Using a io.TextIOBase() for string input would be mainly an internal change, but probably even less efficient given how we split fields.

  • I'm curious on how this continues in the next couple of PRs, but if it generates the type classes from the bolts I'd suggest adding them to the __all__ field which describes the exports. That'd just make the types importable by name. There's a bit of a roundabout way to do it, but in user ergonomics it works quite nicely:

Hmm, that's actually a nice trick. It's a bit tricky because message types exist in a "namespace" for good reasons (users, like protocol tests, can add/remove/modify the message types).

So it actually makes sense then to have boltX as individual modules, containing all the Message() types by name.

The problem is, for lnprototest, we actually replace the 'signature' type: since bolts are simply a csv which gets interpreted, we do the following:

ns = MessageNamespace()
# Replace fundamental type
ns.subtypes['signature'] = Sig
ns.add_csv(bolt1_csv + bolt2_csv + bolt7_csv)

I guess we could override messages.signature before importing... let me try.

@cdecker
Copy link
Member

cdecker commented Jun 3, 2020

Forgot to mention this yesterday, but this is rather impressive code 👍

That works well for the binary ones (done), but it's less of a win for the string ops: it's more convenient to have them return a str. They're mainly used for debugging. Using a io.TextIOBase() for string input would be mainly an internal change, but probably even less efficient given how we split fields.

Good point, I was mainly thinking about the binary representations. The string and hex representations can then be wrappers around the binary ones, e.g., hex-decoding a given string and wrapping it in a BytesIO instance before passing it to the binary deserializer.

The problem is, for lnprototest, we actually replace the 'signature' type: since bolts are simply a csv which gets interpreted, we do the following:

ns = MessageNamespace()
# Replace fundamental type
ns.subtypes['signature'] = Sig
ns.add_csv(bolt1_csv + bolt2_csv + bolt7_csv)

I guess we could override messages.signature before importing... let me try.

Getting deep into the monkey-patching weeds here, but it is possible to replace after importing the module quite easily as well, as long as we get all the aliasing right (import the dependency, and then monkey patch its imported fields, hoping they didn't stash away an alias to the original and are using it somewhere).

Instead of val_to_bin/val_from_bin which deal with bytes, we implement
read and write which use streams.  This simplifies the API. 

Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Suggested-by: Christian Decker
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
These are (probably) going away soon, but just tag them for now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will be useful for the next patch, which introduces per-bolt
modules.  This makes it easier for them generate variables for each
field type they parse (they don't want to export u16, for example)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This contains the CSVs for the current bolts (autogenerated).  It's a
separate module because I expect it to be updated alongside the spec.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Added: pyln: new module pyln.proto.message.bolts
These are autogenerated, but now they export their own
MessageNamespace, as well as the raw csv.

They also expose their SubtypeTypes, MessageTypes and TlvStreamTypes,
though in theory these could clash (they don't for now, and it'd be
kinda awkward if they did).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They must not have duplicate names!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
FieldType lets you make new field types, and split_field helps with
parsing.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Other changes along the way:

1. In a couple of places we passed None as a dummy for for
   `otherfields` where {} is just as good.
2. Turned bytes into hex for errors.
3. Remove nonsensical (unused) get_tlv_by_number() function from MessageNamespace
4. Renamed unrelated-but-overlapping `field_from_csv` and
   `type_from_csv` static methods, since mypy thought they should have
   the same type.
5. Unknown tlv fields are placed in dict as strings, not ints, for
   type simplicity.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Exposing the array types is required for our dummyrunner in the lnprototest suite, since
it wants to be able to generate fake fields.

The set_field is similarly useful.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
If they don't exist, that's OK. These will eventually be going away
from the spec, but there are still some in gossip messages for now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanups are very nice, thanks for the fixups. I just noted some minor stylistic things, that can also be addressed at a later point in time.

One thing that threw me off in the first review round is that I usually use
the class as the type, doing things like the following to create a new
instance from a serialized string:

class Pubkey:
    def __init__(self, raw: bytes) -> None:
        self.raw: bytes = raw

    @classmethod
    def from_hex(cls: Type[T], h: Union[str, bytes]) -> T:
        if isinstance(h, str):
            h = h.encode('ASCII')

        return cls(
            raw=binascii.unhexlify(h),
        )

    def __str__(self) -> str:
        return binascii.hexlify(self.raw).decode('ASCII')

    def __repr__(self) -> str:
        return self.__str__()

Notice the from_hex method which is a class method, i.e., it receives the
class and not an instance as its first argument. This allows writing generic
methods that return the desired type, without having to copy-paste the
implementation every time.

In your code there is an explicit distinction between type, expressed as a
distinct class, and the instance class. This is fine, I was just totally
confused by it :-)

Comment on lines +12 to +19
If need_all, never return None, otherwise returns None if EOF."""
b = io_out.read(struct.calcsize(structfmt))
if len(b) == 0 and empty_ok:
return None
elif len(b) < struct.calcsize(structfmt):
raise ValueError("{}: not enough bytes", name)

return struct.unpack(structfmt, b)[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage over the following?

try:
    struct.unpack_from(structfmt, io)
except struct.error:
    if empty_ok:
        return None
    else:
        raise

Presumably you want to avoid raising the exception at all when it can be avoided?

@@ -26,6 +27,10 @@ def add_subtype(self, t):
return ValueError('Already have {}'.format(prev))
self.subtypes[t.name] = t

def add_fundamentaltype(self, t):
assert not self.get_type(t.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be weakened a bit to be

assert(self.get_type(t.name) is None or self.get_type(t.name) == t)

Which would no longer fail outright if we happen to add the same type multiple times.

Comment on lines +60 to +62
if name in self.fundamentaltypes:
return self.fundamentaltypes[name]
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idiomatic way of doing this is with the getter and default value:

return self.fundamentaltypes.get(name, None)

contrib/pyln-proto/pyln/proto/message/bolts/Makefile Outdated Show resolved Hide resolved
Comment on lines 296 to 298
if val is None:
raise ValueError("{}.{}: short read".format(self, field))
# Might only exist with certain options available
if field.fieldtype.option is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two if-statements can be joined.

@@ -8,11 +12,11 @@ class ArrayType(FieldType):
wants an array of some type.

"""
def __init__(self, outer, name, elemtype):
def __init__(self, outer: 'SubtypeType', name: str, elemtype: FieldType):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you were far quicker than me finding out about recursive type definitions being possible via string annotations 😉

I got quite lost when trying to figure out how to annotate something with a type that'd be defined later...

Comment on lines +14 to +16
"SizedArrayType",
"DynamicArrayType",
"EllipsisArrayType",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these only required for unit tests? In that case we could also not expose them here, and have the unit tests reach directly into .array_types

from pyln.proto.message.array_types import SizedArrayType, DynamicArrayType, EllipsisArrayType

If that helps keep the interface clean for the actual protocol tests.

Comment on lines 547 to 548
for field in kwargs:
f = self.messagetype.find_field(field)
if f is None:
raise ValueError("Unknown field {}".format(field))

v = kwargs[field]
if isinstance(v, str):
v, remainder = f.fieldtype.val_from_str(v)
if remainder != '':
raise ValueError('Unexpected {} at end of initializer for {}'.format(remainder, field))
self.fields[field] = v
self.set_field(field, kwargs[field])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be achieved with this:

for key, val in kwargs.items():
    self.set_field(key, val)

@cdecker
Copy link
Member

cdecker commented Jun 12, 2020

After going through the comments once more I think we're actually good with this PR.

ACK 5dbec8f

@cdecker cdecker merged commit aaefbe2 into ElementsProject:master Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants