-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add to/from_pydict methods #203
Conversation
Hi, sorry for the slow response on this. I like this idea, however. I'm not a fan of the large amounts of duplication this brings, for which I may have a solution. Would it be possible to add an internal generator method that yields the keys and values for to/from_(py)dict to then process in their own ways? |
Voted +1 |
If you'd be willing to pick this up with the feedback I've suggested, I'd be very grateful. |
This does in fact satisfy my request from #316 (after the addition of a @Gobot1234, in regards to minimizing the code duplication... I took a stab at your generator idea. I don't think it was as clean as you imagined (or maybe I'm missing something). I also tried another approach of merging the two to/from_dict methods together into one and conditionally doing things depending on whether or not it is a to/from_pydict. The generator approach is here: https://github.com/tcamise-gpsw/python-betterproto/commit/e1cb903f882446124a68bb7e54f234dda614504d Do either of these look acceptable? If so, i can open a PR and try to set up the poetry environment for more testing. |
Hi, sorry for the late reply, either looks fine to me. Thank you for looking into this |
Any ETA on when this might be released? |
I'm going to try and get this into b5, but no promises. |
# Conflicts: # src/betterproto/__init__.py # tests/test_features.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! |
This PR adds
to_pydict()
andfrom_pydict()
to parse and serialize messages into python dictionaries, alternative to the currentto/from_dict()
methods which cater to JSON serialization.In addition to defining messages, our protobuf definitions double as spark table schemas. Ingesting messages requires declaring the spark schema explicitly, and so we do that by mapping the betterproto message dataclass field types into spark sql types. The ingestion itself is done by decoding into a betterproto message, calling
to_dict()
, and then adding additional business logic to transform the JSON-prepared datetime strings into datetime objects. To do this we have to write a bit of extra code that looks like this:With this change our code would look like this, and the output dictionary value types match those declared in the dataclasses:
An example copied from the test:
This relates to some of the discussion happening in #36 with respect to datetimes. Personally this is the behavior that I expected for the
to_dict()
function, expecting it to act more likedataclasses.asdict
but omitting defaults and handling casing.Thanks a ton for
betterproto
. I'm happy to iterate on this if there are any requested changes.