-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata API: Store signatures as dict #1424
Conversation
The reasons this is a draft is that it's a prime example of the clash between trying to to keep the API "close to the file format" and making it safe and easy to use:
I think I'll manage to live with this design: At least static type check will complain about appending to signatures: The alternative I guess would be to just have |
Similar design is very likely to be a solution to #1426 (delegation roles should be unique WRT role name). So the decision on "do we want to preserve List-type access to attributes that have a matching json array in the file format" is a generic one. I do wonder, does the List-access really help anyone or are we just making things more complex for no benefit... |
Based on feedback I was bold and just changed the API keeping it simple: signatures is now a dict of keyid to Signature: this ensures we will always only have one signature per keyid (and makes signature lookups much nicer looking). It does mean a slight difference between file format and the API but I believe positives outweigh that. This will slightly conflict with #1423 but in the end they will fit nicely together: the resulting |
760d453
to
388d296
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -89,7 +92,7 @@ def from_dict(cls, metadata: Dict[str, Any]) -> "Metadata": | |||
raise ValueError(f'unrecognized metadata type "{_type}"') | |||
|
|||
# Make sure signatures are unique | |||
signatures: Dict[str, Signature] = {} | |||
signatures: "OrderedDict[str, Signature]" = OrderedDict() | |||
for sig_dict in metadata.pop("signatures"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not really related to my commit here but if we really want to be deterministic, I think we 'd have to handle the original json->dict case as well (I don't think anything is currently guaranteeing in python 3.6 that metadata["signatures"] will iterate in order). json does allow that with
json.loads(data, object_pairs_hook=collections.OrderedDict)
(of course in practice It Just Works because cpython does order dict items in 3.6)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it's likely worth being explicit here and changing how we load the JSON in the deserialiser.
As indicated, this somewhat dilutes the mapping from the metadata API to the specification and on-disk format. The container format in the specification and on-disk have signatures as a list of dictionaries, which is how we have implemented I think the trade-off here makes sense, and think it's the same decision @trishankatdatadog came to for tuf-on-a-plane, but would appreciate hearing from others cc @mnm678 @trishankatdatadog |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I'm in favour of this change but would like to hear from others too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this choice. Using a dictionary will make some of the mistakes with unique signatures a lot easier to avoid. This is something I'd eventually like to see in the specification as well (ie theupdateframework/specification#155).
store signatures in a Dict of keyid to Signature. This ensures signature uniqueness. Raise in from_dict() if input contains multiple different signatures for a keyid. This changes Metadata object API, and makes it slightly different from the file format: this is justified by making the API safer to use and easier to validate. Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Dict ordering is part of regular Dict from Python 3.7: Use OrderedDict for signatures to make sure signatures are serialized in a reproducible order even on 3.6. The added benefit is that reader will immediately understand that the order has some significance. The actual type annotations are a bit convoluted because: * typing does not include OrderedDict before 3.7 so can't use that * Annotating inner types does not work for collections.OrderedDict in older pythons (so have to use the "stringified annotations") Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
rebased on develop (the key lookup is now much more readable after the keyid change in #1417) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebase LGTM, thanks.
Clearing the OrderedDict makes it easier to see what happens and avoids having to call OrderedDict() again. Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Added a small tweak which makes code easier to read and avoids calling OrderedDict constructor twice. This is very minor and the appending / not appending is tested quite well so I'm not waiting for yet another review: will merge when tests are good. |
store signatures in a Dict of keyid to Signature. This ensures
signature uniqueness. Raise in from_dict() if input contains multiple
different signatures for a keyid.
This changes Metadata object API, and makes it slightly different from
the file format: this is justified by making the API safer to use and
easier to validate.
Signed-off-by: Jussi Kukkonen jkukkonen@vmware.com
Fixes #1422