Signature specification #214
Merged: 1 commit into sigstore:main, Apr 14, 2021
Conversation

@dlorenc (Member) commented Apr 3, 2021

This mostly reflects the reality today. Some things I'd like to do:

  • Comply with the simple signing format a little more closely to reduce our extra semantics. Maybe reuse "atomic container signature"?
  • Register mediaTypes with OCI artifacts and IANA?
  • Better naming for certificate and chain
  • We don't really have a spot for hash algorithm today. Do we need one?
  • Drop the .cosign suffix and replace with .sig?

Signed-off-by: Dan Lorenc <dlorenc@google.com>

@dlorenc (Member Author) commented Apr 3, 2021

Ref #212

@dlorenc (Member Author) commented Apr 3, 2021

cc @jonjohnsonjr please help me

@dlorenc force-pushed the spec branch 2 times, most recently from e733d38 to 5e3f519 on April 3, 2021 13:30
@cpanato added this to the 0.3.0 milestone Apr 3, 2021
Review thread on SPEC.md (outdated):

No information about the signature scheme is included in the object.
Clients must determine the hash algorithm and signature scheme out-of-band during the verification process.
This is an intentional decision to reduce the risk of [algorithm-confusion attacks](https://news.ycombinator.com/item?id=24346317).

@OR13 commented Apr 5, 2021

I don't agree with this decision...

The cited issue is alg=none... which is a real JOSE issue... but that does not make JOSE by itself bad, just many, many implementations that blindly rely on it.

Above you say "ECDSA-P256 with the SHA256 Hash Algorithm" is required to be supported... why not say the signature format will be detached JWS with unencoded payload and one of the following alg values: "ES256", "EdDSA", etc...?

There are a lot of different signature standards, the only thing worse is no way to tell which one is used, when someone hands you a signed payload and no context.

Consider the more popular signature suite specs you might base this spec on:

https://tools.ietf.org/html/rfc4880#section-5.2.3

https://tools.ietf.org/html/rfc7515#section-4.1.1

^ notorious...

The JWS Signature value is not valid if the "alg" value does not represent a supported algorithm or if there is not a key for use with that algorithm associated with the party that digitally signed or MACed the content.

Somehow from that sentence ^ we got to "just turn off security with alg:none"...

The commenters on the thread are correct... if you are building a security library, don't give your callers a footgun.

IMO, the solution is:

  • throw if algorithm is not defined
  • throw if algorithm is not from allow list
  • update the allow list regularly based on security best practices ( https://safecurves.cr.yp.to/, etc)

This way you can know if software is claiming to be signed with RSA-512/MD5... and not need to negotiate with someone out of band to learn that you should not trust that package.

We should assume that software will be published with crypto suites that are later broken, and make it easy to warn users.
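
A minimal sketch of that allow-list check (the function name and the particular suite set are illustrative, not from cosign):

ALLOWED_ALGORITHMS = {'ES256', 'ES384', 'EdDSA'}  # reviewed and updated regularly

def check_algorithm(alg):
    # Throw if the algorithm is not declared at all...
    if not alg:
        raise ValueError('signature does not declare an algorithm')
    # ...or not on the allow list (e.g. something claiming RSA-512/MD5).
    if alg not in ALLOWED_ALGORITHMS:
        raise ValueError(f'algorithm {alg!r} is not on the allow list')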

@dlorenc (Member Author) replied:

Thanks for the feedback! I think there are two parts to it, and we actually agree on the important parts. This spec was written to describe how things are today, not necessarily how they should be. I want to fix some of it before merging.

For the first part: we only support ECDSA-P256 with SHA256 right now. We should support more, especially for verification. I'm still leaning towards this implementation only supporting one type of key for signing, but we should be able to verify signatures of other types.

I'm not sure about this part though:

There are a lot of different signature standards, the only thing worse is no way to tell which one is used, when someone hands you a signed payload and no context.

The signature is definitely meaningless if all you have is the signed payload and signature. You need a public key to verify against. I think what I'm trying to say is that we should include the information on the signing algorithm used with the public key, not the payload.

We have a few components we need overall for the signature to be meaningful:

  • Payload bytes
  • Signature bytes
  • Hash algorithm (sha256, etc.)
  • Signature algorithm (ecdsa-p256, etc.)
  • Public Key Bytes

In the current implementation, we have the following breakdown:

  • Payload bytes (stored in registry, next to target)
  • Signature bytes (stored in registry, next to target)
  • Hash algorithm (sha256, etc.) (hardcoded to sha256)
  • Signature algorithm (ecdsa-p256, etc.) (stored in PKIX public key, delivered out of band)
  • Public Key Bytes (stored in PKIX public key, delivered out of band)

I agree the JOSE spec isn't bad, but I don't think it actually buys us anything. We don't need anything to be URL safe, and we don't need to concatenate everything into one token. It also would require us to duplicate the algorithm info (outside of the hash function), which is my real concern. If you have a public key you want to verify things against, you already know the algorithm.

So I guess the main question is - do we need to support multiple hash algorithms? If we do, let's figure out a place to include that info.
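
For reference, the five components above as a data structure (a sketch; the field names are mine, not cosign's):

from dataclasses import dataclass

@dataclass
class VerificationInputs:
    payload: bytes            # stored in the registry, next to the target
    signature: bytes          # stored in the registry, next to the target
    hash_algorithm: str       # hardcoded to 'sha256' today
    signature_algorithm: str  # implied by the PKIX public key, out of band
    public_key: bytes         # PKIX-encoded, delivered out of band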

@OR13 replied:

While it's not common, there are scenarios where having the public key is not enough, for example: secp256k1 with ECDSA or Schnorr, or BLS12-381 with BLAKE2b or SHAKE256... or Ed25519-to-Curve25519 conversions, etc.

Most signature schemes assume a hash algorithm, but it's also pretty common to hash with something different before signing.

My recommendation would be to provide a data structure that allows you to specify everything needed to verify a signature without guessing.

Building on an existing signature suite and serialization scheme obviously helps make the software easier to implement in other languages with off the shelf tooling, in an interoperable way.

I would recommend building on JOSE, COSE, or PGP, and not inventing a new signature format.

But if you are inventing a new signature suite, it should cover the algorithmic agility, and assume more than one signature, hash and public key representation.

I think test vectors and a registry would be essential for this, something like:

0: ES256 (ECDSA with P-256 and SHA-256) on sha256(raw binary)
1: ES384 (ECDSA with P-384 and SHA-384) on sha256(raw binary)

^ Sadly, these will not produce stable test vectors due to non-deterministic signatures.

You may also want to cite FIPS / NIST directly, since you can't rely on the IANA JOSE / COSE registries.

@dlorenc (Member Author) commented Apr 6, 2021

To throw another signature spec into the mix: https://github.com/secure-systems-lab/signing-spec/blob/master/protocol.md

I think that one is closest philosophically to what I want to do here. It has support for "hinting" at the key id and algorithm alongside the payload, but it's very clear that this information can't be trusted and users are expected to obtain this information out of band.

The only thing I don't really like about that spec is the usage of PAE.

@dlorenc (Member Author) commented:

The only real difference between the current cosign implementation and the SSL spec is whether or not the payload type is "protected". Cosign uses the mediaType element to store the payload type (simple signing), and this media type is not protected by a signature.

If we used the SSL spec, we would instead set a media type corresponding to the SSL spec. Then the signed component would be:

PAE(UTF8(SimpleSigning),$JSON)
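
Concretely, the difference in what gets signed would be something like this (a sketch; simple_signing_json stands for the raw payload bytes, the media type string follows cosign's simple signing convention, and PAE is the DSSE function quoted in Python later in this thread):

# Today: the payload type rides in the unsigned mediaType field of the descriptor.
signed_bytes = simple_signing_json
# Under the SSL spec: the payload type is folded into the signed message itself.
signed_bytes = PAE('application/vnd.dev.cosign.simplesigning.v1+json',
                   simple_signing_json)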

I'm not sure I understand the attack/threat model on putting this payload type into the signature.

Contributor commented:

Note: I am an author of the SSL Signing Spec, and I just filed #259 related to this topic.

The only thing I don't really like about that spec is the usage of PAE.

@dlorenc Could you please explain your reservations?

I'm not sure I understand the attack/threat model on putting this payload type into the signature.

If you have a signing oracle that uses the same key for multiple message types, then an attacker could potentially get the oracle to sign a message as one type and then have a verifier interpret it as another. Here's a very detailed example showing how to do that in a CI/CD scenario:
https://colab.research.google.com/github/secure-systems-lab/signing-spec/blob/master/hypothetical_signature_attack.ipynb#scrollTo=CIF11AmzXJuQ

Happy to explain more if that's not clear.

@dlorenc (Member Author) replied:

My main concern with PAE is that there's very limited library, language, and tooling support for serialization and deserialization. I haven't actually been able to find any for Golang, for example.

Contributor replied:

PAE itself doesn't need a library. It's just a one-statement concatenation. You never deserialize it. I don't know Go, but here it is in Python:

import struct

def PAE(payloadType: str, payload: bytes) -> bytes:
    # Concatenate the field count and each length-prefixed field,
    # with every integer packed as a little-endian unsigned 64-bit value.
    return b''.join([struct.pack('<Q', 2),                 # number of fields
                     struct.pack('<Q', len(payloadType)),  # type length
                     payloadType.encode('utf-8'),
                     struct.pack('<Q', len(payload)),      # payload length
                     payload])
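
For example, the framing binds the payload type into the signed bytes, so identical payloads framed under different types never produce the same message to sign (the type strings here are made up):

assert PAE('application/vnd.a', b'{}') != PAE('application/vnd.b', b'{}')
PAE('foo', b'bar')
# == b'\x02' + 7 * b'\x00'           (field count: 2)
#  + b'\x03' + 7 * b'\x00' + b'foo'  (length-prefixed type)
#  + b'\x03' + 7 * b'\x00' + b'bar'  (length-prefixed payload)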

Now, I agree that the SSL Signing Spec needs a reference implementation, but I wanted to make sure there are no security concerns with PAE. It sounds like not.

@dlorenc (Member Author) replied:

No security concerns, just interoperability. It's important that we can do all these verifications with as much standard tooling as possible. This includes things like bash/openssl.

Contributor replied:

Thanks for the feedback. I filed secure-systems-lab/dsse#27 to consider a simpler PAE.

@dekkagaijin (Member) commented Apr 6, 2021

We don't really have a spot for hash algorithm today. Do we need one?

This'd probably be good, if only as a signal for implementers. For example, SHA-256 is the standard, but SHA-512 (and SHA-384, SHA-512/256) is not only "more secure" but actually faster than SHA-256 on 64-bit architectures, and has more-or-less universal support at this point.

That said, I'm not sure we want to "meet people where they are" as a goal, since where they are doesn't involve signing anything. If we can't offer a simple, easily-adoptable general-case end-to-end solution, there's no point to this exercise.

IMHO it's also better for both security and (sup)portability to be opinionated about algorithms: lower barriers to entry, easier maintainability, and little/no possibility of downgrade attacks. If a "practical" attack surfaces, then just increment the spec and adopt whatever is best practice at the time.
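
The SHA-512 speed claim above is easy to check on a given machine with a micro-benchmark like the following (illustrative, nothing cosign-specific; results can flip on CPUs with SHA-256 hardware extensions):

import hashlib, timeit

blob = b'\x00' * (1 << 24)  # 16 MiB of input
for name in ('sha256', 'sha512'):
    secs = timeit.timeit(lambda: hashlib.new(name, blob).digest(), number=20)
    print(name, f'{secs:.2f}s')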

@dlorenc (Member Author) commented Apr 7, 2021

OK, I think I figured out the hash algorithm bit - we can lean on the registries!

By storing the raw payload as a blob, the registries already perform hashing to enter the blob into their CAS. The blobs are referenced by their sha256:$digest from the OCI manifest already, which also happens to be the algorithm we use. Registries also store the size of this object, which is what we use to reference it via descriptor. So by leaning on the registry spec, we get server- and client-side validation of the digest/size of the payload already.

If we specify that signatures of registry objects should use the same algorithm the registry uses for references, we eliminate another moving part and get improved validation because registries and clients already understand that field.
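
The check this leans on is the one OCI clients already perform for any blob; roughly (a sketch, with field names following the OCI descriptor schema):

import hashlib

def verify_descriptor(blob: bytes, descriptor: dict) -> None:
    # Registries and clients validate both fields on every push/pull.
    algo, _, expected = descriptor['digest'].partition(':')
    if hashlib.new(algo, blob).hexdigest() != expected:
        raise ValueError('payload digest mismatch')
    if len(blob) != descriptor['size']:
        raise ValueError('payload size mismatch')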

@dlorenc (Member Author) commented Apr 7, 2021

Just to make that last comment clear, we have the following hierarchy of objects:

  • The actual "target" we are signing is any object in an OCI registry
  • The "payload" we sign, which includes a reference to that object by digest (sha256:...), is a Red Hat Simple Signing object
  • All "signatures" for the "target" object are stored in a single, detached OCI object, referenced via naming convention
  • Each individual "signature" is embedded into that OCI object as a "layer", formatted as a "descriptor".
    • The "payload" that we sign is stored raw as "blob" in the registry
    • The "payload" is then hashed using the standard registry algorithms (today sha256) and embedded back into the "layer descriptor" by that digest (and importantly, size)
  • The hash algorithm used for the signature should match that of the registry (today sha256 but this could change in the future)
  • The rest of the signature scheme is embedded in the PKIX public key (ecdsa, rsa, ed25519, etc)
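
Put together, one signature "layer descriptor" might look roughly like this (a hand-written illustration of the hierarchy above; the annotation key matches what cosign used at the time, but treat the exact values as approximate):

signature_layer = {
    'mediaType': 'application/vnd.dev.cosign.simplesigning.v1+json',
    'digest': 'sha256:<digest of the raw simple signing payload blob>',
    'size': 123,  # payload blob size, validated by registry and client
    'annotations': {
        # The signature bytes ride along as an annotation on the descriptor.
        'dev.cosignproject.cosign/signature': '<base64-encoded signature>',
    },
}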

@dekkagaijin (Member) commented:

The hash algorithm used for the signature should match that of the registry (today sha256 but this could change in the future)

I like it, makes sense

Review thread on SPEC.md:

* ECDSA-P256

No information about the signature scheme is included in the object.
Clients must determine the signature scheme out-of-band during the verification process.

Contributor commented:

Uh, how are they going to actually do that?

(In a single public key root of trust, I agree that supported schemes/algorithms are a property of that root of trust along with the raw public key, and should be part of the same root-of-trust configuration set up by users. But with certificate chains that gets more difficult; X.509 certificates AFAICS include a public key signature algorithm, but not really a specific signature form/encoding scheme.)

I don’t have a better proposal, just noting that this is difficult.

Essentially this hinges on the hash algorithm choice; given that, things like the OpenPGP design include the algorithms inside the signed data, so maliciously modifying that would require breaking the signer-chosen public key algorithm or hash (both trusted by assumption), substituting the public key algorithm for a weaker one (fixed in client configuration for simple roots of trust, or included in X.509 certificates), or substituting the hash algorithm for a weaker one (no defense?).

@dlorenc (Member Author) replied:

In practice the scheme is as you described: it's either stored as a property of the public key itself or somewhere adjacent to it.

Review thread on SPEC.md (outdated):

### Hashing Algorithms

Signers and verifiers must know the hash algorithm used in addition to the signature scheme.
In an attempt to avoid specifying a particular hashing algorithm, we require that digest be calculated using the SAME algorithm as the OCI registry.
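
In code, "the same algorithm as the registry" amounts to deriving the hash function from the digest reference itself, e.g. (a sketch; as the comments below point out, this is also what makes the choice registry-influenced):

import hashlib

def hasher_for(digest_ref: str):
    # 'sha256:abc...' -> (a hashlib object for 'sha256', the expected hex digest)
    algo, _, expected = digest_ref.partition(':')
    return hashlib.new(algo), expected
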
Contributor commented:

That’s practical short-term but problematic longer-term if we ever need to transition from SHA256:

  • This makes it impossible to use signatures with a stronger scheme to mitigate SHA256 weaknesses
  • If we copy an image from a SHA256 registry to a post-SHA256 registry, or vice versa, even if the manifest+layers+config were exactly identical (I guess with the manifest listing multiple hash algorithms during the transition period), a signature would only be valid in one of the registries.

@dlorenc (Member Author) replied:

I think I agree - we should think through and document what the migration from sha256 to something else might look like.

My understanding though is that copying an image from a sha256 registry to a post-sha256 one would result in a new image. Since there are unique images for each registry, we would need unique signatures for each registry. But exactly what that copy/migration process would look like is really a guess at this point.

Contributor replied:

My hypothesis is that, in order to avoid a flag day and breaking digest references entirely, there will be some mechanism to represent manifests that contain both the SHA256 digest and the new digest of each blob (e.g. by using SHA256 in the designated descriptor field, and the new digest in an annotation). So it should in principle be possible to have a byte-for-byte identical image in two different registries, accessible using the same @sha256:… digest on both, and using some @new-digest:… on the updated registry — and it would be nice to keep signatures working when copying the image across.

OTOH it might be possible for the signer to create two separate signatures using the two hash algorithms, and publish both along with the image, so this would not be a deal-breaker.


Another concern in such a situation is that the hash algorithm is essentially registry-chosen: given a repo:tag reference, I understand “digest be calculated using the SAME algorithm as the OCI registry” to mean “use the algorithm returned by the registry in the Docker-Content-Digest header of https://github.com/distribution/distribution/blob/main/docs/spec/api.md#existing-manifests ”, or something like that. The signature scheme should protect against malicious registries, and giving registries this point of influence feels risky.

@dlorenc (Member Author) replied:

Another concern in such a situation is that the hash algorithm is essentially registry-chosen [...] giving registries this point of influence feels risky.

This is a very good point. I was imagining the algorithm would actually be user-controlled in this scenario, but this reinforces that we need to think it through with the OCI folks a bit more. Responsibility for digest calculation is sort of split between users and registries today: the header from the registry is one place that is registry-controlled, but I was imagining it would be the one set by clients on upload: https://docs.docker.com/registry/spec/api/#pushing-an-image

Contributor replied:

Quickly skimming docker/distribution, it seems that the implementation can't look up by arbitrary digests (that would be hard); it essentially chooses its own digest algorithm during blob upload, and clients must then use that one for references.

But that’s not very relevant to the future migration scenarios; many implementations probably hard-code SHA256 in various places, and we don’t know for sure which components will be responsible for which choices in the future.

More to the point, even if the process will end up as uploader-controlled, a malicious registry can just lie and return a different digest (or an attacker with unwanted write access to the repo can upload an image with the uploader-chosen different digest).

Contributor replied:

can you think of any monkey business that a registry could get up to that wouldn't be transparent to clients with a simple pull-after-push verification?

Note that a malicious registry can return different content to different victims.

@dekkagaijin (Member) commented Apr 12, 2021

Note that a malicious registry can return different content to different victims.

This is mitigated by pulling by digest, which is an independent UX issue. Even pulling by tag, assuming clients verify signatures, we'd only need to ensure that the content being pushed to the registry is what we're actually signing.

Contributor replied:

Neither the signatures nor their metadata are protected by the image digest. (And any end-user of the signed image doesn’t know the digest already, otherwise signatures wouldn’t be necessary.)

@dlorenc (Member Author) replied:

(And any end-user of the signed image doesn’t know the digest already, otherwise signatures wouldn’t be necessary.)

I don't think I agree with this part - signatures are useful even when pulling by digest.

Contributor replied:

My mistake, the non-repudiation property of the signature (and perhaps any associated metadata) could still be useful, apart from the minimal (signer, name/purpose/identity, binding) essential to the signature.

@dlorenc (Member Author) commented Apr 13, 2021

Dropping the WIP. I've opened tracking issues for all the current major points of discussion. This is ready for review as an initial spec that describes how things work today, with the improvements we want to make.

Signed-off-by: Dan Lorenc <dlorenc@google.com>

@dekkagaijin (Member) left a review:

lgtm

@dlorenc merged commit ec3a071 into sigstore:main Apr 14, 2021
@dlorenc deleted the spec branch April 14, 2021 18:51