Add TAP for keyid flexibility #112

Merged
merged 10 commits into from
Apr 21, 2020
208 changes: 208 additions & 0 deletions candidate-keyid-tap.md
@@ -0,0 +1,208 @@
* TAP: TBD
* Title: Improving keyid flexibility
* Version: 1.0.0
* Last-Modified: 19-03-2020
* Author: Marina Moore
* Status: Draft
* Content-Type: markdown
* Created: 18-03-2020
* TUF-Version: 1.1.0
* Post-History: <dates of postings to the TUF mailing list>

# Abstract

Keyids are used in TUF metadata as shorthand references to identify keys. They
are used in place of keys in metadata to assign keys to roles and to identify
them in signature headers. The TUF specification requires that every keyid used
in TUF metadata be calculated using a SHA2-256 hash of the public key it
represents. This algorithm is used elsewhere in the TUF specification and so
provides an existing method for calculating unique keyids. Yet, such a rigid
requirement does not allow for the deprecation of SHA2-256. A security flaw in
SHA2-256 may be discovered, so TUF implementers may choose to deprecate this
algorithm. If SHA2-256 is deprecated in TUF, it should no longer be used to
calculate keyids. Therefore TUF should allow more flexibility in how keyids are
determined. To this end, this TAP proposes a change to the TUF specification
that would remove the requirement that all keyids be calculated using SHA2-256.
Instead, the specification will allow metadata owners to use any method for
calculating keyids as long as each one is unique within the metadata file in
which it is defined to ensure a fast lookup of trusted signing keys. This
change will allow for the deprecation of SHA2-256 and will give metadata owners
flexibility in how they determine keyids.


# Motivation

Currently, the TUF specification requires that keyids must be the SHA2-256 hash
of the public key they represent. This algorithm ensures that keyids are unique
within a metadata file (and indeed, throughout the implementation) and creates a
short, space-saving representation. SHA2-256 also offers a number of secure
hashing properties, though these are not necessary for these purposes. In this
case SHA2-256 is simply a way to calculate a unique identifier employing an
algorithm that is already in use by the system.

The specification sets the following requirements for keyid calculation:
1. The KEYID of a key is the hexdigest of the SHA2-256 hash of the canonical JSON form of the key.
Member

Not the biggest thing, but we likely should be consistent with keyid vs KEYID capitalization. If you're quoting, that's fine, perhaps be more explicit about that in your formatting though...

2. Clients MUST calculate each KEYID to verify this is correct for the associated key.
3. Clients MUST ensure that for any KEYID only one unique key has that KEYID.
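As a rough illustration of requirements 1 and 2, the sketch below derives a keyid from a key and rechecks it the way a client would. It is not taken from any TUF implementation: `legacy_keyid`, `check_legacy_keyid`, and the example key are hypothetical names, and `json.dumps` with sorted keys and compact separators is only an approximation of the canonical JSON encoding the specification refers to.

```python
import hashlib
import json


def legacy_keyid(key: dict) -> str:
    """Hex digest of SHA2-256 over a canonical-style JSON encoding of the key."""
    # Assumption: sorted keys and compact separators stand in for the
    # canonical JSON form; a real implementation would use a proper encoder.
    encoded = json.dumps(key, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def check_legacy_keyid(keyid: str, key: dict) -> bool:
    """Requirement 2: recompute the keyid and compare it to the listed one."""
    return keyid == legacy_keyid(key)


# Hypothetical key entry; the public value is a placeholder, not a real key.
example_key = {
    "keytype": "ed25519",
    "scheme": "ed25519",
    "keyval": {"public": "abc123"},
}
example_keyid = legacy_keyid(example_key)
```

Under this rule a keyid chosen by any other method, such as a short index, fails `check_legacy_keyid`, which is the rigidity this TAP sets out to remove.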

## Problems with this requirement
Mandating that keyids be calculated using SHA2-256 has created a number of issues
for some implementations, such as:
* Lack of consistency in implementations that use other hash algorithms for
calculating file hashes and would prefer not to introduce SHA2-256 for this one
instance. For example, the PEP 458 implementation (https://python.zulipchat.com/#narrow/stream/223926-pep458-implementation)
Member

Minor nit: I'd either point directly to that conversation https://python.zulipchat.com/#narrow/stream/223926-pep458-implementation/topic/Timeline/near/188666946 (given that the link is permanent-ish, which I don't know), or leave the link out altogether.

will use the BLAKE2 hashing algorithm throughout the implementation.
* Incompatibility with some smart cards and PGP implementations that have their
own way of calculating keyids.
* Inability to adapt if SHA2-256 should be deprecated. In such a case, metadata
owners may decide that maintaining a deprecated algorithm for use in keyid
calculation does not make sense.
* Space concerns may call for identifiers even shorter than a SHA2-256 digest,
such as an index.
In these and other cases, TUF should provide a metadata file owner with the
flexibility to use keyids that are not calculated using SHA2-256.

# Rationale

TUF uses keyids as shorthand references to identify which keys are trusted to
sign each metadata file. As they eliminate the need to list the full key every
time, they take up less space in metadata signatures than the actual signing
key, reducing bandwidth usage and download times.

The most important quality of keyids used in TUF is their uniqueness. To be
effective identifiers, all keyids defined within a metadata file must be unique.
For example, a root file that delegates trust to root, snapshot, timestamp, and
top-level targets should provide unique keyids for each key trusted to sign
metadata for these roles. By doing so, a client may check metadata signatures
in O(1) time by looking up the proper key for verification.

Failing to provide unique keyids can have consequences for both functionality
and security. These are a few attacks that are possible when keyids are not unique:
* **Invalid Signature Verification**: A client may look up the wrong key to use
in signature verification leading to an invalid signature verification error,
even if the signature used the correct key.
* **Keyid collision**: If root metadata listed the same keyid K for different
snapshot and root keys, an attacker with access to the snapshot key would also
be able to sign valid root metadata. Using the snapshot key to sign root
metadata, the attacker could then list the signature in the header with K. A
client verifying the signature of this root metadata file would use K to
look up a key trusted to sign root, and would find the snapshot key and
continue the update with the malicious root metadata. To prevent this
privilege escalation attack, metadata file owners should ensure that
every keyid is associated with a single key in each metadata file.
One attack that does not need to be considered is a hash collision. Though an
attacker who is able to exploit a hash collision against the function used to
calculate the keyid will be able to identify another key that hashes to the
value of the keyid, the client will only use a key that is listed in the
metadata. The attacker would not be able to put a malicious key into the
metadata without the metadata signing key, so a hash collision cannot be used
to maliciously sign files.
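The uniqueness requirement described above could be checked with something like the following sketch. The function name `find_keyid_conflicts` and the `(keyid, public_key)` pair layout are assumptions made for illustration; they are not part of the TUF specification or any existing library.

```python
def find_keyid_conflicts(entries):
    """Return the keyids that are listed with more than one distinct key.

    `entries` is a list of (keyid, public_key) pairs as they might appear
    in a single delegating metadata file (hypothetical layout).
    """
    seen = {}
    conflicts = set()
    for keyid, public_key in entries:
        if keyid in seen and seen[keyid] != public_key:
            conflicts.add(keyid)
        # Remember the first key seen for each keyid.
        seen.setdefault(keyid, public_key)
    return conflicts


# The keyid collision scenario: root metadata lists K for two different keys.
root_entries = [
    ("K", "root-public-key"),
    ("K", "snapshot-public-key"),
    ("T", "timestamp-public-key"),
]
conflicting = find_keyid_conflicts(root_entries)
```

A metadata owner could run a check of this kind before publishing and refuse to sign a file for which the result is non-empty.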

# Specification

With just a few minor changes to the current TUF specification process, we can
remove the requirement that keyids must be calculated using SHA2-256. First, the
specification wording should be updated to allow the metadata owner to calculate
keyids using any method that produces a unique identifier within the metadata
file. This means replacing requirements 1 and 2 above with a description of
required keyid properties, i.e., “The KEYID is an identifier for the key that is
determined by the metadata owner and MUST be unique within the root or
delegating targets metadata file.” Once this keyid is determined by the metadata

The KEYID is an identifier for the key that is determined by the metadata owner and MUST be unique within the root or delegating targets metadata file.

Now that there wouldn't be a cryptographically derived keyid for each key, it's now possible to list the same key multiple times, each with a different keyid. Can we instead impose another rule that the key must also be unique? Without this rule, an attacker could duplicate a signature to reduce the effective threshold of keys needed to validate the metadata. We do this in go-tuf and rust-tuf to avoid double counting keys due to our (temporary) support of python-tuf's keyid_hash_algorithms field.

Member

This is a good point. If there are multiple delegations to the same KEYID as part of a threshold, is this allowed? If I delegate to both Alice and Bob, who both delegate to Charlie, is Charlie's approval enough? (I think I know the answer in both cases and this is mostly tangential to this issue, but think we should discuss if we think TUF is doing the right thing.)

Contributor Author

I added a description of this in the scenarios below. As currently written, in this situation Charlie's approval would be enough. I think that this should be sufficient to trust the metadata as a threshold of signatures using unique keys must be reached to obtain Charlie's approval.

Member

I have less practical experience with tuf than the other folks in this discussion, but I would be surprised that Charlie's approval would be enough. My understanding of the specification is that thresholds are a defence against key compromise. In the scenario described we only have to compromise a single key in order to meet a threshold that is > 1.

Member

This sounds like the "Deadly Diamond of TUF" 😜

Jokes aside, I also think that Charlie's approval is enough. But in reality it required Justin's, Alice's and Bob's approval to even end up in a situation, where it only requires Charlie's approval.

I agree that it doesn't sound ideal. So either:

  • Justin reconsiders delegating trust to Alice and Bob, seeing them void the threshold,
  • or he at least does not allow them to further delegate trust (IIUC, this is what the "terminating" flag in delegations is for).
  • Alternatively, we could add a feature to the TUF specification, to disable such signature threshold regression in delegation graphs. But I think above tools are enough.

Do others agree? @JustinCappos, you really got me curious about your answers to these cases.

Member

requirement: Justin requires two roles in agreement about a given target
branch search a): Justin asks Alice, Alice asks Charlie, Charlie has target, back up to Justin
branch search b): Justin asks Bob, Bob asks Charlie, Charlie has target, back up to Justin
success: Justin has two roles in agreement and releases the target

Shouldn't it be enough to, along with Charlie's knowledge about the target, also pass up the public key and let Justin only release the target if two roles with different keys were in agreement?

The current algorithm returns with the first match. So suppose that Alice (or Bob) also delegated trust to Daniela, who also approved the target. You should approve it in this case, but the current algorithm would not do so. You'd need to match all parties that agree. Then you'd need to be sure that Alice and Bob each have at least one person that was not yet chosen who they could select to fulfill this need. Also, what if Alice and Bob have threshold delegations to Charlie and Daniela in this case? It's just a lot messier to deal with, in my view.

Contributor Author

I think that this issue pertains more to the existing delegation resolution algorithm than to the keyid calculation, so I propose we open a new issue to discuss this issue further. As @lukpueh mentioned, TAP 3 is not currently included in the specification, so this issue does not exist with the current specification.

Contributor Author

On a closer look at this issue, it appears that the visited flag ensures that Charlie's role will not be visited more than once in the DFS as described in 4.4.1 of the client workflow. Does this solve the issue @JustinCappos @lukpueh?

Member

I don't believe this solves it. I don't think it would handle the case above with Daniela, for example.

I do think it would be fine to move this elsewhere, but I think the most important question is what should the system actually do? I think we could code it to work however, but I think we should definitely strive to do what a user would expect first and foremost.

Contributor Author

Ok, I created a new issue to continue this discussion.

Member

within the root or delegating targets metadata file

Just for clarity, can we generalize this early on?

A role that authorizes the signing keys (by keyid) for another role and has its public keys == a role delegating trust to another role == a delegating role == root or delegating targets role (including top-level targets).

You do generalize it in the next line ...

will be listed in the delegating metadata file

where, I think, you are referring to root and delegating targets, right?

Anyway, it's not a big thing. I just want to make sure everyone is on the same page.

owner using their chosen method, it will be listed in the delegating metadata
file and in all signatures that use the corresponding key. When parsing metadata
signatures, the client would use the keyid(s) listed in the signature header to
find the key(s) that are trusted for that role in the delegating metadata. This
should be described in the specification by replacing requirement 3 above with
“Clients MUST use the keyids from the delegating role to look up trusted signing
keys to verify signatures.” All metadata definitions would remain the same, but
the client’s verification process would track keyids within each metadata file
instead of globally.

In order for TUF clients to adhere to these specification changes, they may have
to change the way they store and process keyids. Clients will use the keyids
from a metadata file only for all delegations defined in that metadata file. So
if a targets metadata file T delegates to A and B, the client should verify the
signatures of A and B using the trusted keyids from T. When verifying
signatures, clients should try all signatures that match their trusted keyid(s).
If T trusts keyid K to sign A’s metadata, the client should check all
signatures in A that list a keyid of K. This means that if another metadata file
M delegates to A, it would be able to use the same keyid with a different key.
Once the signatures for A and B have been checked, the client no longer needs to
store the keyid mapping listed in T. During the preorder depth-first search of
targets metadata, the keyids from each targets metadata file should be used in
only that stage of the depth-first search.
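The client-side lookup described above might be sketched as follows. This is an illustration under several assumptions, not the reference implementation: `verify_delegated` and the dictionary layouts are hypothetical, HMAC is used purely as a runnable stand-in for a real public-key signature scheme, and counting each distinct key once toward the threshold follows the review discussion rather than the TAP text itself.

```python
import hashlib
import hmac


def toy_sign(key: bytes, payload: bytes) -> str:
    # Stand-in only: HMAC is symmetric, a real client verifies public-key signatures.
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def toy_verify(key: bytes, payload: bytes, sig: str) -> bool:
    return hmac.compare_digest(toy_sign(key, payload), sig)


def verify_delegated(delegation, payload, signatures):
    """Check `signatures` over `payload` using only the delegator's keyid mapping.

    delegation: {"keys": {keyid: key}, "keyids": [keyid, ...], "threshold": int}
    signatures: list of {"keyid": str, "sig": str} from the delegated file
    """
    verified_keys = set()
    for keyid in delegation["keyids"]:
        key = delegation["keys"].get(keyid)
        if key is None:
            continue
        # Try every signature that lists this keyid, since another delegator
        # may use the same keyid for a different key.
        for entry in signatures:
            if entry["keyid"] == keyid and toy_verify(key, payload, entry["sig"]):
                verified_keys.add(key)  # each distinct key counts once
                break
    return len(verified_keys) >= delegation["threshold"]


# T and M both delegate to A with keyid "K", but map it to different keys.
payload_a = b"metadata for A"
sigs_a = [
    {"keyid": "K", "sig": toy_sign(b"key-e", payload_a)},
    {"keyid": "K", "sig": toy_sign(b"key-d", payload_a)},
]
delegation_from_t = {"keys": {"K": b"key-d"}, "keyids": ["K"], "threshold": 1}
delegation_from_m = {"keys": {"K": b"key-e"}, "keyids": ["K"], "threshold": 1}
ok_from_t = verify_delegated(delegation_from_t, payload_a, sigs_a)
ok_from_m = verify_delegated(delegation_from_m, payload_a, sigs_a)
```

Because the mapping is passed in per delegation, nothing needs to be stored once the signatures for the delegated file have been checked.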

These changes to the specification would allow the repository to use any scheme
to determine keyids without needing to communicate it to clients. By making this
Member

I don't understand how or why the repository is determining KEYIDs. Can you clarify?

scheme independent of the client implementation, root and targets metadata may
use different methods to determine keyids, especially if they are managed by
different people (i.e., TAP 5). In addition, the repository may update the scheme
at any time to deprecate a hash algorithm or change to a different keyid
calculation method.

## Keyid Deprecation
Member

I don't really understand why this is needed anymore. Just generate a new file with whatever new KEYID scheme you want. All consumers didn't understand how you got the KEYIDs anyways, right?

Contributor Author

The process is pretty straightforward, but the metadata owner should ensure that the keyids remain unique throughout any transition. This might not need a whole section, but I think that it's important to note.

With the proposed specification changes, the method used to determine keyids
is not only more flexible, but it may be deprecated using the following process
for each key D and keyid K in the root or delegating targets metadata file:
* The owner of the metadata file determines a new keyid L for D using the new method.
* In the next version of the metadata file, the metadata owner replaces K with L
in the keyid definition for D.
* Any files previously signed by D should list L as the keyid instead of K.
These files do not need to be re-signed as only the signature header will be updated.
Once this process is complete, the metadata owner is using a new method to
determine the keyids used by that metadata file.

As keyid deprecation is executed, it is important that keyids within each
metadata file remain unique. Metadata owners should only publish metadata that
contains a unique keyid-to-key mapping.
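The deprecation steps above could look roughly like this in code. Everything here is a hypothetical sketch: `replace_keyid` and the dictionary layouts are invented for illustration, and the function simply renames matching header entries in the delegated roles' files, which assumes no other delegator relies on the same keyid for those files.

```python
import copy


def replace_keyid(delegating, signed_files, old_keyid, new_keyid):
    """Return copies of the inputs with `old_keyid` replaced by `new_keyid`.

    delegating: {"keys": {keyid: key}, "roles": {role: {"keyids": [...]}}}
    signed_files: {role: [{"keyid": str, "sig": str}, ...]}
    """
    # Keyids must stay unique throughout the transition.
    if new_keyid in delegating["keys"]:
        raise ValueError(f"keyid {new_keyid!r} is already in use")
    new_delegating = copy.deepcopy(delegating)
    new_files = copy.deepcopy(signed_files)

    # Replace K with L in the key definition and in each role's keyid list.
    new_delegating["keys"][new_keyid] = new_delegating["keys"].pop(old_keyid)
    affected_roles = []
    for name, role in new_delegating["roles"].items():
        if old_keyid in role["keyids"]:
            affected_roles.append(name)
            role["keyids"] = [new_keyid if k == old_keyid else k for k in role["keyids"]]

    # Update signature headers only; the signature values are left untouched.
    for name in affected_roles:
        for entry in new_files.get(name, []):
            if entry["keyid"] == old_keyid:
                entry["keyid"] = new_keyid
    return new_delegating, new_files


old_delegating = {"keys": {"K": "pub-D"}, "roles": {"C": {"keyids": ["K"], "threshold": 1}}}
old_files = {"C": [{"keyid": "K", "sig": "1234"}]}
new_delegating, new_files = replace_keyid(old_delegating, old_files, "K", "L")
```

Returning copies rather than mutating in place keeps the previous version of the metadata available until the new version is published.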

## Implications for complex delegation trees

You touch upon this in the specification, but it might be helpful to also include an example here for the case where the targets metadata file A delegates to C using the key D and the keyid K, as well as another targets metadata B delegates to C using the key E and the keyid K. From what I understand, this should result in C being signed with:

{
    "sigs": [
        {
             "keyid": "K",
             "sig": "1234..."
        },
        {
             "keyid": "K",
             "sig": "abcd..."
        },
        ...
    ],
    ...
}

Both signatures will be checked if the delegation search ever ends up at A or B.

Contributor Author

Added, thanks for the suggestion!

Although keyids need to be unique within each metadata file, they do not need to
be unique for each delegated role. It is possible for different keyids to
represent the same key in different metadata files, even if both metadata files
delegate to the same role. Consider two delegated targets metadata files A and B
that delegate to the same targets metadata file C. If A delegates to C with
key D with keyid K and B delegates to C with key D with keyid L, both of these
delegations can be processed during the preorder depth-first search of targets
metadata as follows:
* When the search reaches A, it will look for a signature with a keyid of K in C.
If it finds this and validates it, the search will continue if a threshold of
signatures has not been reached.
* When the search reaches B, it will look for a signature with a keyid of L in C.
If it finds this and validates it, the search will continue if a threshold of
signatures has not been reached.
Once the search is complete, if a threshold of signatures is reached the
metadata in C will be used to continue the update process. Therefore, K and L
may be used as keyids for D in different metadata files. So that clients can
validate signatures under either keyid, C’s signature header must list a valid
signature by D under both K and L. As clients store keyids only
for use in the current delegation, this should not require a change to the
client process described in this document.
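To make the A, B, C example concrete, here is a small sketch of how a client might select candidate signatures from C's header depending on which delegator the search is currently visiting. `signatures_for` and the data layout are hypothetical, the key and signature values are placeholders, and the sketch only pairs signatures with keys; it does not verify them.

```python
def signatures_for(delegator_keys, header):
    """Pair each header entry with the key the delegator maps its keyid to.

    delegator_keys: {keyid: key} from the metadata file currently being searched
    header: list of {"keyid": str, "sig": str} from the delegated file
    """
    return [
        (delegator_keys[entry["keyid"]], entry["sig"])
        for entry in header
        if entry["keyid"] in delegator_keys
    ]


# A knows key D as "K"; B knows the same key D as "L".
a_keys = {"K": "pub-D"}
b_keys = {"L": "pub-D"}

# C lists the signature made with D under both keyids.
c_header = [
    {"keyid": "K", "sig": "sig-by-D"},
    {"keyid": "L", "sig": "sig-by-D"},
]

from_a = signatures_for(a_keys, c_header)
from_b = signatures_for(b_keys, c_header)
```

Both lookups resolve to the same key and signature, so either branch of the search can validate C without the two delegators agreeing on a keyid.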

# Security Analysis

TUF clients only trust keys that are defined in signed metadata files. For this
reason, the method of calculating keyids does not allow an attacker to add
new trusted keys to the system. However, a bad keyid scheme could allow a
privilege escalation in which the client verifies one metadata file with a
key from a role not trusted to sign that metadata file. This proposal prevents
privilege escalation attacks by requiring that metadata owners use unique keyids
within each metadata file, as described in the rationale.

# Backwards Compatibility

Metadata files that are generated using SHA2-256 will be compatible with clients
that implement this change. However, clients that continue to check that
keyids are generated using SHA2-256 will not be compatible with metadata that
uses a different method for calculating keyids.

For backwards compatibility, metadata owners may choose to continue to use
SHA2-256 to calculate keyids.

# Augmented Reference Implementation

TODO
Member

This needs to be finished, ofc 🙂


# Copyright

This document has been placed in the public domain.