Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce that identities are of an expected format #716

Closed
Tracked by #766
haydentherapper opened this issue Aug 1, 2022 · 14 comments · Fixed by #802
Closed
Tracked by #766

Enforce that identities are of an expected format #716

haydentherapper opened this issue Aug 1, 2022 · 14 comments · Fixed by #802
Assignees
Labels
enhancement New feature or request ga-candidate Proposed blocking issue for GA release

Comments

@haydentherapper
Copy link
Contributor

Description

Dex - Look like email, K8S - Look like a service account, Username - Doesn't look like an email, etc

Context: https://docs.google.com/document/d/1o8_bXIygufgiohJGlmBzqF4_BnXCTfgh4ILgJFJxYRs/edit?resourcekey=0-YEar3v67uoT31kj83dCVvA#heading=h.oiw6nn1ucgaq

@haydentherapper haydentherapper added enhancement New feature or request ga-candidate Proposed blocking issue for GA release labels Aug 1, 2022
@znewman01
Copy link
Contributor

I'll attempt to summarize the (very long) Google Doc comment thread linked in "Context" above.

Fulcio issues certs with the identity of the issuer in the Subject Alternative Name field. In X.509, the value in this field is a union type (see the definition of "GeneralName" in RFC 5280, 4.2.1.6) indicating whether the value is, for instance, an email (rfc822Name), a URI (uniformResourceIdentifier), or an IP (iPAddress).

If we have clients provide a simple "identity" field, along with an "issuer," do they need to indicate the type (for instance, writing uri:spiffe://domain/... or email:foo@bar.com)? This is pretty cumbersome.

If users don't indicate the type, we have to worry about whether this might confuse a user into accepting a value that they otherwise wouldn't. There are two ways this could happen:

  1. "Disjunction confusion:" if Fulcio issues certificates from the same issuer with different SAN types but the same string value (e.g., uri:foo and email:foo), then the client would accept either if they specify foo as the identity.

  2. "Type confusion:" can an attacker generate a cert with a SAN of one type and get the victim to verify it with another type implicitly in mind?

We don't have to worry about "disjunction confusion!" Fulcio already restricts each issuer to only one "issuer type", which in turn will only ever produce one SAN type.

That leaves "type confusion." Such attacks have two preconditions: the attacker can fool the CA into emitting a SAN whose value is "wrong" for the type (e.g. is an email instead of a URI), and that value must be controllable to a value the victim thinks they're verifying.

At the moment, Fulcio can entirely prevent the first precondition: there is no string that is both a valid URI and a valid email address1. So if Fulcio checks that URI SANs are, in fact, URIs, and the same for emails, we can eliminate these attacks. This issue tracks adding such validation (and possibly additional validation, such as "SPIFFE SANs should be URIs with the spiffe scheme").

One wrinkle: this prevents "type confusion" attacks between SAN types, but not necessarily between "Fulcio issuer types." For instance, the username type is currently a SAN email, and we write <username>@<domain that we pull out from the issuer URI>. This feels dangerously close to an email address, and would open the door to a softer "type confusion" attack: "emails" and "URIs" are the same SAN type, but very different identities: it might be fine for someone to register the username ceo on a site, but that probably shouldn't enable them to sign as ceo@example.com.

We should consider changing the format of the username type; if we do so, it should have a set of valid values that doesn't overlap with either URIs or emails (and, in general, we should try to have the set of valid values for a given issuer type be as small as possible, and as non-overlapping as possible).

Footnotes

  1. Let's imagine we had such a string. It's an email, so it must have an @ in it. It's a URI, so it must begin with some "scheme," followed by :. The @ must precede that :, as : is a disallowed character on the left side of the email address. which means that the scheme must contain @. However, @ is a disallowed character in a URI scheme; this is a contradiction.

@woodruffw
Copy link
Member

Thanks for summarizing @znewman01!

We should consider changing the format of the username type; if we do so, it should have a set of valid values that doesn't overlap with either URIs or emails (and, in general, we should try to have the set of valid values for a given issuer type be as small as possible, and as non-overlapping as possible).

Strong +1.

What do we think of a "bang path"-style username syntax, something like service.com!user? That avoids some potential ambiguity with email addresses by using the actual user ID as a suffix rather than a prefix, and means that a service where usernames are emails (but is not attesting to ownership) would have usernames like this: foo.com!joe@foo.com.

@znewman01
Copy link
Contributor

(the below might be totally misguided)

Do we need to stick the service name in the SAN at all? It should be clear from the "issuer" field in the cert. My understanding is that we did user@domain.com just to make the SAN more "email-like" when we were using the email SAN type (though @haydentherapper may have more context).

This does mean that something like (SAN=joe@foo.com, issuer=bar.com) is possible. To me, that has a sensible interpretation: "this user logged into bar.com using their email, joe@foo.com."

This potentially opens up the door to some kinds of type confusion attacks (does the above imply ownership of the joe@foo.com email?), but I'm not that concerned overall. We want it to be fairly rare that a user has to eyeball a cosign CLI invocation and decide whether it looks trustworthy.

However, it is important that we avoid the URI type confusion by performing some validation. Perhaps a "username" has an email SAN type, but we allow "invalid" emails that only have the username part, not the "@Domain" part.

@woodruffw
Copy link
Member

Do we need to stick the service name in the SAN at all? It should be clear from the "issuer" field in the cert. My understanding is that we did user@domain.com just to make the SAN more "email-like" when we were using the email SAN type (though @haydentherapper may have more context).

Hmm, maybe not. I think I was underthinking it 🙂

However, it is important that we avoid the URI type confusion by performing some validation. Perhaps a "username" has an email SAN type, but we allow "invalid" emails that only have the username part, not the "@Domain" part.

I think we want to avoid marking non-email SANs with the email type, since a strict x509v3 parser would be correct to reject "email" SANs that aren't consistent with RFC822:

When the subjectAltName extension contains an Internet mail address,
the address MUST be stored in the rfc822Name. The format of an
rfc822Name is a "Mailbox" as defined in Section 4.1.2 of [RFC2821].
A Mailbox has the form "Local-part@Domain". Note that a Mailbox has
no phrase (such as a common name) before it, has no comment (text
surrounded in parentheses) after it, and is not surrounded by "<" and
">". Rules for encoding Internet mail addresses that include
internationalized domain names are specified in Section 7.5.

(From https://datatracker.ietf.org/doc/html/rfc5280#section-4.2.1.6)

Instead, I think we probably want username SANs to be encoded as "Other Names", with a custom (Sigstore-specific) OID indicating that they're usernames.

From the same RFC:

   GeneralName ::= CHOICE {
        otherName                       [0]     OtherName,
        rfc822Name                      [1]     IA5String,
        dNSName                         [2]     IA5String,
        x400Address                     [3]     ORAddress,
        directoryName                   [4]     Name,
        ediPartyName                    [5]     EDIPartyName,
        uniformResourceIdentifier       [6]     IA5String,
        iPAddress                       [7]     OCTET STRING,
        registeredID                    [8]     OBJECT IDENTIFIER }

   OtherName ::= SEQUENCE {
        type-id    OBJECT IDENTIFIER,
        value      [0] EXPLICIT ANY DEFINED BY type-id }

@znewman01
Copy link
Contributor

Instead, I think we probably want username SANs to be encoded as "Other Names", with a custom (Sigstore-specific) OID indicating that they're usernames.

+1, I somehow overlooked that the first time around

@priyawadhwa
Copy link
Contributor

Hey @haydentherapper ! I brought this to the TSC and they requested it be broken down into smaller issues for each thing that needs to be done. Then they can prioritize into which should be GA blockers and which can be done post-GA.

Could you break this down into smaller issues?

@znewman01
Copy link
Contributor

Most of the above is just context/motivation; the actual work is pretty bite-sized.

Basically, for each type of identity we issue, just add a quick validity check on the format (e.g. regex match .*@.* for emails).

@priyawadhwa
Copy link
Contributor

Gotcha, so we just need to add in a regex for the following?

  • emails
  • service accounts
  • usernames

@haydentherapper
Copy link
Contributor Author

haydentherapper commented Sep 1, 2022

Thanks Zach, yep, that's pretty much it!

The other thing would be reading all entries in the log to see if there's any that would be affected by this change. You could consider this a blocker first to make sure no one is submitting requests that would be broken by this change, but it's highly unlikely at this point.

@dlorenc
Copy link
Member

dlorenc commented Sep 21, 2022

Are there any updates here? Are we still considering this as a blocker for 1.0?

@haydentherapper
Copy link
Contributor Author

Yes, because it's a breaking change. I'm not concerned about its impact on GA, just making a post 1.0 change. Planning on doing this very soon.

@dlorenc
Copy link
Member

dlorenc commented Sep 21, 2022

IMO its debatable on whether this is breaking or not, we're not changing the API and only disabling things that shouldn't happen anyway.

I'm -1 on this being a 1.0 stopper.

@znewman01
Copy link
Contributor

I just grabbed all the Fulcio certs (1m, about 5 GB) using this fulcio_dump.sh script so we can pretty easily tell if it's breaking. tl;dr it doesn't look like it for the obvious regexes.

We can pull out the SANs:

#!/usr/bin/env bash
tmpdir=$(mktemp -d)
# you want to do this in parallel, trust me
ls . | xargs -L 1 -P 32 bash -c "
    cat \$0/*.crt \
        | step certificate inspect --bundle --format json \
        | jq -c 'map(.extensions.subject_alt_name | if . == null then {} else . end)[]' \
        > $tmpdir/sans-\$0
    "
cat $tmpdir/sans > /tmp/sans.jsonl
rm -rf "$tmpdir"

And then do some ad hoc analysis:

$ cat /tmp/sans.jsonl | jq -c 'try keys catch "null"' | sort | uniq -c
   4 "null"
491387 ["email_addresses"]
740609 ["uniform_resource_identifiers"]

And test out regexes:

$ cat /tmp/sans.jsonl \
        | jq -r 'select(has("email_addresses")) | .email_addresses[0]' \
        | grep -Ev '.*@.*' \
        || echo 'everything matched this regex'
everything matched this regex
$ cat /tmp/sans.jsonl \
        | jq -r 'select(has("uniform_resource_locators")) | .uniform_resource_locators[0]' \
        | grep -Ev '.*://.*' \
        || echo 'everything matched this regex'
everything matched this regex

@haydentherapper
Copy link
Contributor Author

Working on this now, and making the change to the username format.

haydentherapper added a commit to haydentherapper/fulcio that referenced this issue Sep 26, 2022
This updates the username type to avoid the username subject format
looking like an email. Fulcio will now specify the subject in the
OtherName SAN, and the format will use a ! instead of @.

This required some custom ASN.1 marshalling and unmarshalling, since
crypto/x509 does not support the OtherName SAN.

This also adds enforcement that email subjects match a basic email
regex format, and that other types do not look like emails.

Fixes sigstore#716

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
haydentherapper added a commit that referenced this issue Sep 28, 2022
* Change username format, enforce identity format

This updates the username type to avoid the username subject format
looking like an email. Fulcio will now specify the subject in the
OtherName SAN, and the format will use a ! instead of @.

This required some custom ASN.1 marshalling and unmarshalling, since
crypto/x509 does not support the OtherName SAN.

This also adds enforcement that email subjects match a basic email
regex format, and that other types do not look like emails.

Fixes #716

Signed-off-by: Hayden Blauzvern <hblauzvern@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request ga-candidate Proposed blocking issue for GA release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants