New pyln modules. #3733

rustyrussell · 2020-05-18T07:12:04Z

These will almost certainly see more work as I port the protocol tests across, but since I've never published a python module before, and have no idea what I'm doing, it's probably best to get some review!

Help!

This runs flake8 and the python tests. Helps me, at least! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

darosior

Just quickly went through the code, this looks really nice !

Aside question : I once gave the protocol tests a shot (ElementsProject/lightning-rfc-protocol-test#13) which then seem discontinued, are we going to merge them here ?

contrib/pyln-proto/pyln/proto/message/array_types.py

contrib/pyln-proto/pyln/proto/message/message.py

rustyrussell · 2020-05-19T01:58:48Z

Just quickly went through the code, this looks really nice !

Aside question : I once gave the protocol tests a shot (ElementsProject/lightning-rfc-protocol-test#13) which then seem discontinued, are we going to merge them here ?

My final step will be to port the tests to this framework, then I will go through those PRs and merge. Could be a while yet though!

This supports infrasructure for creating messages. In particular, it can be fed CSV from the spec's `tools/extract-formats.py` and then convert them all to and from strings and binary formats. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Added: pyln: new module pyln.proto.message

rustyrussell · 2020-05-28T00:09:18Z

Minor rebase:

Cleaned up commit msgs and added Changelog-Added lines.
Export SubtypeType and fix length check in case of missing fields.

cdecker

Wow, this PR is pretty meta :-) Apart from a few C-isms, some things that could be more pythonic and a couple of nits, it looks good to me.

I'd prefer having a file-like interface on the serialization and deserialization primitives, since that allows us not to care about returning the remainder of the parsed string all over the place.
I'm curious on how this continues in the next couple of PRs, but if it generates the type classes from the bolts I'd suggest adding them to the __all__ field which describes the exports. That'd just make the types importable by name. There's a bit of a roundabout way to do it, but in user ergonomics it works quite nicely:

import sys

class Message(object):
    def __init__(self, messagetype, **kwargs):
        self.messagetype = messagetype
        self.fields = {}

## Get the current module so we can append newly generated classes to it
mod = sys.modules[Message.__module__]

for name in ['A', 'B', 'C']:
    setattr(mod, name, Message(name))

This will create an instance of Message and register it as an importable object with the correct name. From the user of the module you can then just do the following:

from test import A, B, C

print(A)

contrib/pyln-proto/pyln/proto/message/array_types.py

contrib/pyln-proto/pyln/proto/message/fundamental_types.py

contrib/pyln-proto/pyln/proto/message/array_types.py

contrib/pyln-proto/tests/test_array_types.py

contrib/pyln-proto/pyln/proto/message/bolts/bolts.py

rustyrussell · 2020-06-03T04:19:32Z

Wow, this PR is pretty meta :-) Apart from a few C-isms, some things that could be more pythonic and a couple of nits, it looks good to me.

I'd prefer having a file-like interface on the serialization and deserialization primitives, since that allows us not to care about returning the remainder of the parsed string all over the place.

That works well for the binary ones (done), but it's less of a win for the string ops: it's more convenient to have them return a str. They're mainly used for debugging. Using a io.TextIOBase() for string input would be mainly an internal change, but probably even less efficient given how we split fields.

I'm curious on how this continues in the next couple of PRs, but if it generates the type classes from the bolts I'd suggest adding them to the __all__ field which describes the exports. That'd just make the types importable by name. There's a bit of a roundabout way to do it, but in user ergonomics it works quite nicely:

Hmm, that's actually a nice trick. It's a bit tricky because message types exist in a "namespace" for good reasons (users, like protocol tests, can add/remove/modify the message types).

So it actually makes sense then to have boltX as individual modules, containing all the Message() types by name.

The problem is, for lnprototest, we actually replace the 'signature' type: since bolts are simply a csv which gets interpreted, we do the following:

ns = MessageNamespace()
# Replace fundamental type
ns.subtypes['signature'] = Sig
ns.add_csv(bolt1_csv + bolt2_csv + bolt7_csv)

I guess we could override messages.signature before importing... let me try.

cdecker · 2020-06-03T13:48:12Z

Forgot to mention this yesterday, but this is rather impressive code 👍

That works well for the binary ones (done), but it's less of a win for the string ops: it's more convenient to have them return a str. They're mainly used for debugging. Using a io.TextIOBase() for string input would be mainly an internal change, but probably even less efficient given how we split fields.

Good point, I was mainly thinking about the binary representations. The string and hex representations can then be wrappers around the binary ones, e.g., hex-decoding a given string and wrapping it in a BytesIO instance before passing it to the binary deserializer.

The problem is, for lnprototest, we actually replace the 'signature' type: since bolts are simply a csv which gets interpreted, we do the following:
ns = MessageNamespace()
# Replace fundamental type
ns.subtypes['signature'] = Sig
ns.add_csv(bolt1_csv + bolt2_csv + bolt7_csv)
I guess we could override messages.signature before importing... let me try.

Getting deep into the monkey-patching weeds here, but it is possible to replace after importing the module quite easily as well, as long as we get all the aliasing right (import the dependency, and then monkey patch its imported fields, hoping they didn't stash away an alias to the original and are using it somewhere).

Instead of val_to_bin/val_from_bin which deal with bytes, we implement read and write which use streams. This simplifies the API. Suggested-by: Christian Decker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Suggested-by: Christian Decker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

These are (probably) going away soon, but just tag them for now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

This will be useful for the next patch, which introduces per-bolt modules. This makes it easier for them generate variables for each field type they parse (they don't want to export u16, for example) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

This contains the CSVs for the current bolts (autogenerated). It's a separate module because I expect it to be updated alongside the spec. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Added: pyln: new module pyln.proto.message.bolts

These are autogenerated, but now they export their own MessageNamespace, as well as the raw csv. They also expose their SubtypeTypes, MessageTypes and TlvStreamTypes, though in theory these could clash (they don't for now, and it'd be kinda awkward if they did). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

They must not have duplicate names! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

FieldType lets you make new field types, and split_field helps with parsing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Other changes along the way: 1. In a couple of places we passed None as a dummy for for `otherfields` where {} is just as good. 2. Turned bytes into hex for errors. 3. Remove nonsensical (unused) get_tlv_by_number() function from MessageNamespace 4. Renamed unrelated-but-overlapping `field_from_csv` and `type_from_csv` static methods, since mypy thought they should have the same type. 5. Unknown tlv fields are placed in dict as strings, not ints, for type simplicity. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Exposing the array types is required for our dummyrunner in the lnprototest suite, since it wants to be able to generate fake fields. The set_field is similarly useful. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

If they don't exist, that's OK. These will eventually be going away from the spec, but there are still some in gossip messages for now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

cdecker

The cleanups are very nice, thanks for the fixups. I just noted some minor stylistic things, that can also be addressed at a later point in time.

One thing that threw me off in the first review round is that I usually use
the class as the type, doing things like the following to create a new
instance from a serialized string:

class Pubkey:
    def __init__(self, raw: bytes) -> None:
        self.raw: bytes = raw

    @classmethod
    def from_hex(cls: Type[T], h: Union[str, bytes]) -> T:
        if isinstance(h, str):
            h = h.encode('ASCII')

        return cls(
            raw=binascii.unhexlify(h),
        )

    def __str__(self) -> str:
        return binascii.hexlify(self.raw).decode('ASCII')

    def __repr__(self) -> str:
        return self.__str__()

Notice the from_hex method which is a class method, i.e., it receives the
class and not an instance as its first argument. This allows writing generic
methods that return the desired type, without having to copy-paste the
implementation every time.

In your code there is an explicit distinction between type, expressed as a
distinct class, and the instance class. This is fine, I was just totally
confused by it :-)

cdecker · 2020-06-12T12:32:47Z

contrib/pyln-proto/pyln/proto/message/fundamental_types.py

+If need_all, never return None, otherwise returns None if EOF."""
+    b = io_out.read(struct.calcsize(structfmt))
+    if len(b) == 0 and empty_ok:
+        return None
+    elif len(b) < struct.calcsize(structfmt):
+        raise ValueError("{}: not enough bytes", name)
+
+    return struct.unpack(structfmt, b)[0]


What's the advantage over the following?

try: struct.unpack_from(structfmt, io) except struct.error: if empty_ok: return None else: raise

Presumably you want to avoid raising the exception at all when it can be avoided?

cdecker · 2020-06-12T12:54:52Z

contrib/pyln-proto/pyln/proto/message/message.py

@@ -26,6 +27,10 @@ def add_subtype(self, t):
            return ValueError('Already have {}'.format(prev))
        self.subtypes[t.name] = t

+    def add_fundamentaltype(self, t):
+        assert not self.get_type(t.name)


This could be weakened a bit to be

assert(self.get_type(t.name) is None or self.get_type(t.name) == t)

Which would no longer fail outright if we happen to add the same type multiple times.

cdecker · 2020-06-12T12:56:23Z

contrib/pyln-proto/pyln/proto/message/message.py

+        if name in self.fundamentaltypes:
+            return self.fundamentaltypes[name]
+        return None


An idiomatic way of doing this is with the getter and default value:

return self.fundamentaltypes.get(name, None)

contrib/pyln-proto/pyln/proto/message/bolts/Makefile

cdecker · 2020-06-12T13:17:33Z

contrib/pyln-proto/pyln/proto/message/message.py

            if val is None:
-                raise ValueError("{}.{}: short read".format(self, field))
+                # Might only exist with certain options available
+                if field.fieldtype.option is None:


These two if-statements can be joined.

cdecker · 2020-06-12T13:21:33Z

contrib/pyln-proto/pyln/proto/message/array_types.py

@@ -8,11 +12,11 @@ class ArrayType(FieldType):
 wants an array of some type.

    """
-    def __init__(self, outer, name, elemtype):
+    def __init__(self, outer: 'SubtypeType', name: str, elemtype: FieldType):


Seems you were far quicker than me finding out about recursive type definitions being possible via string annotations 😉

I got quite lost when trying to figure out how to annotate something with a type that'd be defined later...

cdecker · 2020-06-12T13:26:31Z

contrib/pyln-proto/pyln/proto/message/__init__.py

+    "SizedArrayType",
+    "DynamicArrayType",
+    "EllipsisArrayType",


Are these only required for unit tests? In that case we could also not expose them here, and have the unit tests reach directly into .array_types

from pyln.proto.message.array_types import SizedArrayType, DynamicArrayType, EllipsisArrayType

If that helps keep the interface clean for the actual protocol tests.

cdecker · 2020-06-12T13:27:55Z

contrib/pyln-proto/pyln/proto/message/message.py

        for field in kwargs:
-            f = self.messagetype.find_field(field)
-            if f is None:
-                raise ValueError("Unknown field {}".format(field))
-
-            v = kwargs[field]
-            if isinstance(v, str):
-                v, remainder = f.fieldtype.val_from_str(v)
-                if remainder != '':
-                    raise ValueError('Unexpected {} at end of initializer for {}'.format(remainder, field))
-            self.fields[field] = v
+            self.set_field(field, kwargs[field])


This can also be achieved with this:

for key, val in kwargs.items(): self.set_field(key, val)

cdecker · 2020-06-12T13:35:26Z

After going through the comments once more I think we're actually good with this PR.

ACK 5dbec8f

pyln: add Makefile

ca71845

This runs flake8 and the python tests. Helps me, at least! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell requested a review from cdecker as a code owner May 18, 2020 07:12

darosior reviewed May 18, 2020

View reviewed changes

contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved

contrib/pyln-proto/pyln/proto/message/array_types.py Outdated Show resolved Hide resolved

contrib/pyln-proto/pyln/proto/message/message.py Outdated Show resolved Hide resolved

rustyrussell force-pushed the guilt/pyln-messages branch from f810f47 to f94213d Compare May 28, 2020 00:08

cdecker reviewed Jun 2, 2020

View reviewed changes

patch message-export-types.patch

16297af

rustyrussell added 7 commits June 4, 2020 12:09

pyln.proto.message: use BufferedIOBase instead of bytes for binary ops.

1bff330

Instead of val_to_bin/val_from_bin which deal with bytes, we implement read and write which use streams. This simplifies the API. Suggested-by: Christian Decker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln.proto.message: expose fundamental MessageTypes as variables.

46151ed

Suggested-by: Christian Decker Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln: add (undocumented) u8 fundamental type.

e274226

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

message: support option fields.

59e2064

These are (probably) going away soon, but just tag them for now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell force-pushed the guilt/pyln-messages branch from c797870 to e847beb Compare June 4, 2020 03:53

rustyrussell added 5 commits June 4, 2020 13:43

pyln.proto.message: support adding two namespaces.

3ed6831

They must not have duplicate names! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln.proto.message: python-fluency feedback from @darosior

360bcaf

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln.proto.message: export more.

4a336db

FieldType lets you make new field types, and split_field helps with parsing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln.proto.message: allow fields with options to be missing.

efd38a4

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

pyln.proto.message.*: Add Makefile to do mypy checks.

bef68d3

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell force-pushed the guilt/pyln-messages branch from e847beb to dd1aab8 Compare June 11, 2020 06:48

rustyrussell added 3 commits June 12, 2020 13:23

pyln.proto.message: fix handling of missing optional fields.

5dbec8f

If they don't exist, that's OK. These will eventually be going away from the spec, but there are still some in gossip messages for now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell force-pushed the guilt/pyln-messages branch from dd1aab8 to 5dbec8f Compare June 12, 2020 03:56

cdecker approved these changes Jun 12, 2020

View reviewed changes

cdecker merged commit aaefbe2 into ElementsProject:master Jun 12, 2020

cdecker mentioned this pull request Jun 13, 2020

Missing pyln/proto/message/bolts in pyln-proto #3767

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New pyln modules. #3733

New pyln modules. #3733

rustyrussell commented May 18, 2020

darosior left a comment

rustyrussell commented May 19, 2020

rustyrussell commented May 28, 2020

cdecker left a comment •

edited

Loading

rustyrussell commented Jun 3, 2020 •

edited

Loading

cdecker commented Jun 3, 2020

cdecker left a comment

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker Jun 12, 2020

cdecker commented Jun 12, 2020

New pyln modules. #3733

New pyln modules. #3733

Conversation

rustyrussell commented May 18, 2020

darosior left a comment

Choose a reason for hiding this comment

rustyrussell commented May 19, 2020

rustyrussell commented May 28, 2020

cdecker left a comment • edited Loading

Choose a reason for hiding this comment

rustyrussell commented Jun 3, 2020 • edited Loading

cdecker commented Jun 3, 2020

cdecker left a comment

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker Jun 12, 2020

Choose a reason for hiding this comment

cdecker commented Jun 12, 2020

cdecker left a comment •

edited

Loading

rustyrussell commented Jun 3, 2020 •

edited

Loading