Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata API: Simplify unrecognized fields testing #1466

Merged
merged 1 commit into from
Jul 7, 2021

Conversation

MVrachev
Copy link
Collaborator

Fixes #1464

Description of the changes being introduced by the pull request:

We have merged ADR 8 allowing for unrecognized fields everywhere.
After a discussion with Jussi, we realized that there are a couple of
places where we don't want to allow unrecognized fields because they
are sensitive or there are limited acceptable values for them.
The places where we don't want to allow unrecognized fields are
keys, roles, meta, hashes or targets.
If we allow unrecognized fields in them we won't follow the spec or
even open the door for possible security vulnerability.

Second, the test we added for unrecognized fields
added tests for that which are too specific and not scalable.
Now, I use table testing which we have used initially in #1416
to test unrecognized fields support in a cleaner and much more readable
way.

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@trishankatdatadog
Copy link
Member

Directly relevant to theupdateframework/specification#169

@asraa @mnm678 wdyt?

@mnm678
Copy link
Contributor

mnm678 commented Jun 23, 2021

I think there's a difference between unrecognized and unauthorized fields. Unrecognized fields in the signature wrapper (which are also unauthorized) could be dangerous as they could add alg=none style vulnerabilities. However, I think there is less danger in unrecognized fields within signed metadata. Yes, and attacker could add something malicious to one of these fields, but if they can do so in the signed metadata, they could also just replace keys, etc in the existing fields of the metadata.

Therefore, I prefer the simplicity of allowing these unrecognized fields anywhere within the signed metadata to allow for extensions to the specification to be backwards compatible.

@MVrachev
Copy link
Collaborator Author

Therefore, I prefer the simplicity of allowing these unrecognized fields anywhere within the signed metadata to allow for extensions to the specification to be backwards compatible.

We cannot allow unrecognized fields everywhere if we follow the spec.
For example, ROLE in root has a limited number of allowed values: https://theupdateframework.github.io/specification/latest/index.html#root-role
The same can be said for meta in timestamp.
For hashes there are concrete requirements

A dictionary that specifies one or more hashes of the target file at TARGETPATH, 
with a string describing the cryptographic hash function as key and HASH as defined for METAFILES.

and if a in hashes is a dictionary following the format described above we don't consider it as an unrecognized field.
Same argument for keys.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the ADR change could be another PR? I'm not as invested in tweaking the ADR text but would like to see the tests land...

Anyway for ADR text:

  • I wouldn't explain the reasons further than "don't store unrecognized fields when it is not allowed by specification: keys, roles, meta, hashes and targets are actual dictionaries (vs json objects that most structures in the format are) where 'unrecognised field' is not meaningful concept". This seems to cover it?
  • there's a lot of added text that is IMO not necessary and that would require review: security claims (like "possibility to inject keys, compromise trust") do not seem useful when they refer to metadata that is signed and thus by definition able to add keys and compromise trust.

For the tests:

  • I think this is a good way to do this (better than writing code anyway), I have no major comments
  • Question: Have you reviewed the file format and made sure we are testing every possible dictionary?
  • nitpick: Always adding a trailing comma after the last item in the test dataset would make the next PR easier to read

@@ -51,8 +51,11 @@ def wrapper(test_cls: "TestSerialization"):
class TestSerialization(unittest.TestCase):

valid_keys: DataSet = {
# Do we want to add unrecognized fields in the Key class?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Do we want to add unrecognized fields in the Key class?

this may be a valid question but maybe not useful in test code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it, but wanted to ask raise that question.
Do we want to allow unrecognized fields in the Key class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a desire to be able to store unrecognised fields in a keyval, see theupdateframework/specification#163 and theupdateframework/go-tuf#133

@MVrachev MVrachev force-pushed the unrecognized_fields branch 3 times, most recently from f90184f to d0a25bf Compare June 30, 2021 15:59
@MVrachev
Copy link
Collaborator Author

I updated this pr to include only the changes related to the tests.

After the @joshuagl comment, I included a test that we do support unrecognized fields in keyval.

  • Question: Have you reviewed the file format and made sure we are testing every possible dictionary?

Yes, I reviewed the file formats yet again and it seems okay. The only places where we don't allow unrecognized fields is where I described above.

  • nitpick: Always adding a trailing comma after the last item in the test dataset would make the next PR easier to read

Fixed that.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the commented question is still in the test data?

anyway, looks fine, thanks

@MVrachev MVrachev changed the title Metadata API: Simplify unrecognized fields testing and change ADR 8 Metadata API: Simplify unrecognized fields testing Jul 1, 2021
We have merged ADR 8 allowing for unrecognized fields and we have
added tests for that which are too specific and not scalable.

Now, I use table testing which we have used initially in theupdateframework#1416
to test unrecognized fields support in a cleaner and much more readable
way.

Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
@MVrachev
Copy link
Collaborator Author

MVrachev commented Jul 1, 2021

the commented question is still in the test data?

anyway, looks fine, thanks

Removed the question, but still - it's okay to have unrecognized fields there, right?

@jku
Copy link
Member

jku commented Jul 1, 2021

Removed the question, but still - it's okay to have unrecognized fields there, right?

I don't see why not -- keyval is even designed for that, AFAICT (as the contents are not defined for keys in general).

@jku jku merged commit a91a1ab into theupdateframework:develop Jul 7, 2021
@MVrachev MVrachev deleted the unrecognized_fields branch July 7, 2021 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metadata API: Test unrecognized fields
5 participants