-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add hash and length verification to MetaFile and TargetFile #1437
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was left wondering about two things:
do we actually want to raise on missing length/hashes? and related, do we need separate checks at all? I'm asking because it seems that the repository decides whether there are lengths/hashes in the metadata or not, and (assuming the spec doesn't demand them in this situation) it seems to me the client shouldn't really do anything about missing length/hashes. So if client shouldn't do anything, why do we raise?
Sure, I can imagine someone implementing TUF in a way that requires hashes for metadata files... but that's not what the spec says so I don't see why the metadata API should implement that case.
Second thing is, I wonder if you experimented with a common parent class instead of a Mixin? It would probably work better with type annotations (defining a class variable of same name seems like a hack that has some runtime effects as well, I think the proper way to do Mixins with type annotations is Protocols: https://mypy.readthedocs.io/en/latest/more_types.html#mixin-classes, but no idea if they're worth the trouble)
tuf/api/metadata.py
Outdated
data.seek(0, io.SEEK_END) | ||
observed_length = data.tell() | ||
|
||
if self.length != observed_length: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this uses self.length but it's marked Optional (see other comments about related things though)... not a bug I guess but a bit weird
tuf/api/metadata.py
Outdated
if self.hashes is None: | ||
raise ValueError("Missing expected hashes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_verify_hashes() does this already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to add one of those checks to silence mypy
.. but I was planning to revise them :)
tuf/exceptions.py
Outdated
# TODO: LengthMismatchError fully duplicates DownloadLengthMismatchError | ||
# except that it is not derived from Download. Revise and remove duplicates | ||
# with the progress of the client refactor. | ||
class LengthMismatchError(Error): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see why you're doing this but let's see if the other questions change the situation at all, maybe we can figure out something else... If I call verify_length_and_hashes() I don't think I would ever want to handle the two errors separately: that implies this maybe should be just a single error.
Let's at least not add the functions: I have never seen the data in TUF custom errors being used for anything, I don't expect anyone to use these either -- so a plain message should be fine
Sounds right that the repository makes the decision and the client cannot do anything about it. Based on this I think an API like this should look neater and it'd be easier to use:
|
I agree that I started including too many "hackish" lines of code because of type checking and the mixin. I guess I have to define the BaseClass as an ABC to help mypy? I will try and see if the members being optional in only one of the classes causes troubles again. |
On the other hand, I'm sure the Python annotation story still has some holes in it (we're definitely seeing them in python < 3.7): so if annotation is preventing a good design, let's also consider disabling annotation for that bit... But in this case I think the baseclass would probably work fine. Some options:
|
I hesitated because the first option reduces duplicated code shared between the two classes ( Same for declaring length/hashes in the base class as optional. It feels misleading for
I think I will try this one which is also close to my initial idea but now the verification methods can be just class/static methods of the base class. |
befbab5
to
587b12f
Compare
I believe I took into account all comments from the discussion and it should be ready for a review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it. I think you could have squeezed the exceptions to a line shorter but that's so minor I'm not asking to change that.
So the only real question is about validation: since you made the mistake of adding some, I'm of course asking for more... I'm fine with filing an issue on that though, and getting this merged without solving hashes validation.
# Do some basic input validation | ||
if version <= 0: | ||
raise ValueError(f"Metafile version must be > 0, got {version}") | ||
if length is not None and length <= 0: | ||
raise ValueError(f"Metafile length must be > 0, got {length}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is where we get to decide what to do with input data that we don't have a "natural" check for: length and version get validated "enough" like this IMO but hashes could still be anything...
For the values we could run
if not all(isinstance(h, str) for h in hashes.values()):
raise ValueError("Invalid hash value")
But maybe we don't need to? Is it enough that the hash verification will fail when it's tried on completely bogus hash value?
For the dict keys: Can we make SSLib check that the algorithms are known to SSLib?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah ... I avoided intentionally the downward validation spiral here ...
Hash values: there is even a regex about correct hash value in formats.py now but I wasn't sure if it isn't an overkill.
Dict keys: Probably we can pre-define somewhere supported algorithms by sslib because now the check is done runtime during hash calculation.
I will file an issue about it, especially hash values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it is: #1441
there's a conflict here with my changes to verify_signature()... In particular I ended making the import Otherwise LGTM |
Extend MetaFile and TargetFile classes with methods for length and hash verification. The common functionality is implemented as static methods of the base class while MetaFile and TargetFile implement the user API based on it. Define LengthOrHasheMismathError. Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add tests for hash and length verification. Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add basic checks for allowed input values during objects' serialization. Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
587b12f
to
dcdd332
Compare
Rebased on develop and changed the import to |
So I approved this already but I'm now thinking if we should handle the unsupported algorithm and format error cases in this PR as well... I guess a separate issue works too? Basically as discussed in #1438 and #1441: I don't think we should try to deeply validate e.g. our hash algorithms at deserialize-time. This means when they are used (in this PR), we need to handle every error, and turn the errors into the hash mismatch errors |
I'm making an executive decision and merging, work continues in: |
Please fill in the fields below to submit a pull request. The more information
that is provided, the better.
Fixes #1361
Description of the changes being introduced by the pull request:
MetaFile
andTargetFile
need exactly the same code for length and hash verification but with different exceptionsfor the user API.
HashVerificationMixIn
class implementing the common functionality.MetaFile
andTargetFile
Please verify and check that the pull request fulfills the following
requirements: