Skip to content
This repository has been archived by the owner on Jul 12, 2023. It is now read-only.

implement token signing key rotation #348

Merged
merged 1 commit into from
Aug 27, 2020

Conversation

mikehelmick
Copy link
Contributor

@mikehelmick mikehelmick commented Aug 25, 2020

Fixes #94

Proposed Changes

  • Allow the token signing key to be rotated by the server operator. This is totally independent from the certificate signing keys

Release Note

Verification server operators can rotate their token signing key. `TOKEN_SIGNING_KEY` and `TOKEN_SIGNING_KEY_ID` are now array based env vars. They must be the same length. The first items in the lists represents the active key/kid and the remaining entries are allowed to validate.

@google-cla google-cla bot added the cla: yes Auto: added by CLA bot when all committers have signed a CLA. label Aug 25, 2020
VerificationTokenDuration time.Duration `env:"VERIFICATION_TOKEN_DURATION,default=24h"`
TokenSigningKey string `env:"TOKEN_SIGNING_KEY,required"`
TokenSigningKeyID string `env:"TOKEN_SIGNING_KEY_ID,default=v1"`
TokenIssuer string `env:"TOKEN_ISSUER,default=diagnosis-verification-example"`
// In order to rotate keys, one sets these values to the 'old' key ID allowing it to still verify
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This backs us into a world where there can only ever be two valid keys. While I understand that's our current situation, given the frequency of change in this project, I wonder if it makes sense to have an array instead where the "first" element is the newest key and all other ones are "old"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then you have to have parallel arrays w/ kid and key ref.

The other option would be to bring this up one level where we take a cryptoKey and allow all non-destroyed versions for validation and the primary for signing.
But - I then we have to support that across all our KMS implementations

Needing more than 2 valid keys means that you're rotating twice in a 24 hour period which hopefully isn't ever necessary.

If we want this to be more flexible, I think we can push it into the database and manage it like the per-realm signing keys.

We do need to have a general 'admin' page in the UI still anyway, to allow for realm creation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you. I just know the moment we do this, we'll need to support a third...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will switch to arrays in the config

@mikehelmick
Copy link
Contributor Author

/hold

@mikehelmick
Copy link
Contributor Author

/unhold

rebased + moved to array based implementation

// This represents the keys that are allowed to be used to verify tokens,
// the TokenSigningKey/TokenSigningKeyID.
func (c *APIServerConfig) AllowedTokenPublicKeys() map[string]string {
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why the curlies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

visual scoping of the read lock section

func (c *APIServerConfig) AllowedTokenPublicKeys() map[string]string {
{
c.mu.RLock()
if len(c.allowedTokenPublicKeys) != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I feel like > 0 conveys better

return c.allowedTokenPublicKeys
}

c.allowedTokenPublicKeys = make(map[string]string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c.allowedTokenPublicKeys = make(map[string]string)
c.allowedTokenPublicKeys = make(map[string]string, len(c.TokenSigning.TokenSigningKeyID))

@@ -79,6 +111,10 @@ func (c *APIServerConfig) Validate() error {
}
}

if err := c.TokenSigning.Validate(); err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return err
return fmt.Errorf("failed to validate signing tokens: %w", err)

TokenSigningKeyID string `env:"TOKEN_SIGNING_KEY_ID, default=v1"`
TokenIssuer string `env:"TOKEN_ISSUER, default=diagnosis-verification-example"`
TokenSigningKey []string `env:"TOKEN_SIGNING_KEY, required"`
TokenSigningKeyID []string `env:"TOKEN_SIGNING_KEY_ID, default=v1"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keys? and IDS?

}

func (t *TokenSigningConfig) ActiveKey() string {
return t.TokenSigningKey[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if TokenSigningKey is empty

}

func (t *TokenSigningConfig) ActiveKeyID() string {
return t.TokenSigningKeyID[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Panic


func (t *TokenSigningConfig) Validate() error {
if len(t.TokenSigningKey) != len(t.TokenSigningKeyID) {
return fmt.Errorf("TOKEN_SIGNING_KEY and TOKEN_SIGNING_KEY_ID must be lists of the same length")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys, ids := len(t.TokenSigningKey), len(t.TokenSigningKeyID); keys != ids {
  return fmt.Errorf("TOKEN_SIGNING_KEY (%d) and TOKEN_SIGNING_KEY_ID (%d) must be the same length", keys, ids)
}

func (c *Controller) validateToken(ctx context.Context, verToken string, publicKey crypto.PublicKey) (string, *database.Subject, error) {
// A map of valid 'kid' values is supported.
// If the token is valid the token id (`tid') and subject (`sub`) claims are returned.
func (c *Controller) validateToken(ctx context.Context, verToken string, publicKeys map[string]crypto.PublicKey) (string, *database.Subject, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably fine for now, but this looks awfully like the publicKeyCache

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's populated from the public key cache - see certificate.go, line 63

stats.Record(ctx, c.metrics.CertificateErrors.M(1))
c.h.RenderJSON(w, http.StatusInternalServerError, api.InternalError())
return
allowedPublicKeys := make(map[string]crypto.PublicKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
allowedPublicKeys := make(map[string]crypto.PublicKey)
allowedPublicKeys := make(map[string]crypto.PublicKey, len(c.config.AllowedTokenPublicKeys()))

Copy link
Contributor

@jeremyfaller jeremyfaller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine w/ this if Seth is.

func (c *APIServerConfig) AllowedTokenPublicKeys() map[string]string {
{
c.mu.RLock()
if len(c.allowedTokenPublicKeys) != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RLock is not necessarily released. If, for example, c=nil, you panic, and don't release the lock. More robust is:

func() {
c.mu.RLock()
defer c.mu.RUnlock()
if .....
}()

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jeremyfaller, mikehelmick

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jeremyfaller,mikehelmick]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot google-oss-robot merged commit a3954b3 into google:main Aug 27, 2020
@jeremyfaller
Copy link
Contributor

Shoot, I might have inadvertently caused an early merge. My mistake. I will clean up if Mike's too busy. (Sorry, for the noob mistake.)

@mikehelmick mikehelmick deleted the issue94 branch August 27, 2020 16:18
@google google locked and limited conversation to collaborators Oct 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes Auto: added by CLA bot when all committers have signed a CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Token key rotation
4 participants