From 820ad9cef77091bf71516c8941d387df1add7acf Mon Sep 17 00:00:00 2001
From: Daniel Goldstein
Date: Wed, 24 May 2023 14:26:52 -0400
Subject: [PATCH] [auth] IDP access tokens over hail-minted tokens

---
 rfc/0000-hailctl-oauth.rst | 210 +++++++++++++++++++++++++++++++++++++
 1 file changed, 210 insertions(+)
 create mode 100644 rfc/0000-hailctl-oauth.rst

diff --git a/rfc/0000-hailctl-oauth.rst b/rfc/0000-hailctl-oauth.rst
new file mode 100644
index 0000000..eb1a3c3
--- /dev/null
+++ b/rfc/0000-hailctl-oauth.rst
@@ -0,0 +1,210 @@
+===========================================
+OAuth 2.0 Authorization in the Hail Service
+===========================================
+
+.. author:: Daniel Goldstein
+.. date-accepted:: Leave blank. This will be filled in when the proposal is accepted.
+.. ticket-url:: Leave blank. This will eventually be filled with the
+                ticket URL which will track the progress of the
+                implementation of the feature.
+.. implemented:: Leave blank. This will be filled in with the first Hail version which
+                 implements the described feature.
+.. header:: This proposal is `discussed at this pull request `_.
+            **After creating the pull request, edit this file again, update the
+            number in the link, and delete this bold sentence.**
+.. sectnum::
+.. contents::
+.. role:: python(code)
+
+Motivation
+==========
+
+This proposal focuses on the way by which users of Hail Batch and Hail Query-on-Batch
+(from now on referred to as the Hail Service) authorize programmatic access.
+
+The Hail Service authenticates users using the OAuth2 protocol, relying on either
+GCP IAM or Azure AD as the identity providers. However, while the Hail Service
+relies on these identity providers for authentication, it does *not* use them
+to authorize access to Hail APIs. The Hail ``auth`` service acts as an Authorization
+Server, minting long-lived tokens during the OAuth2 flow that are persisted
+on user machines. Minting our own tokens imposes a maintenance and security burden
+on the Hail team and any operators of a Hail Service.
+
+This proposal deprecates the use of hail-minted tokens in favor of using
+access tokens from the identity providers listed above to authorize API access.
+This removes the security burden of minting and protecting our own authorization
+tokens while reducing code complexity, as cloud access tokens are already
+widespread within the hail codebase.
+
+Proposed Change Specification
+=============================
+
+Currently, requests to the Hail APIs send a hail-minted bearer token in the
+``Authorization`` header of HTTP requests. This token is stored in a well-known
+location on the user's disk.
+For user machines, this file is persisted during the login flow ``hailctl auth login``.
+For use in Batch jobs, the tokens are stored in Kubernetes secrets and delivered
+to the Batch Worker as part of the job spec.
+
+This proposal adds the ability for HTTP requests from hail clients to send
+OAuth2 access tokens in the ``Authorization`` header instead of hail-minted
+tokens. The ``auth`` service will:
+
+- Assert the validity, expiration, and audience of access tokens, and associate
+  them with users of the system. The ``auth`` service will need to broaden its
+  supported audience from just its own OAuth2 clients to include the IDs of
+  owned service accounts.
+- Support hail-minted bearer tokens for backwards compatibility with old clients.
+
+
+Hail clients will be updated to use access tokens in requests to Hail APIs. How
+they do so is described in the following subsections.
+
+
+User Machine Configuration
+----------------------------------
+
+Instead of depositing a ``tokens.json`` file during the login flow,
+``hailctl auth login`` will place the following file in the
+user's configuration directory.
+
+..
+    {
+        "idp": "Google" | "Microsoft",
+        "oauth_config_file": String | null,
+    }
+
+This file contains the identity provider the user used to log in to the Hail
+Service and a local OAuth2 client credential file issued by the Hail Service
+at that identity provider. This client credential will be used in future requests
+by the client to obtain scoped access tokens from the identity provider that are
+intended for the Hail Service.
+
+For further information on the details of the OAuth2 flow, see the `User Login Flow`_ section.
+
+If a user does not reauthenticate after updating their hail version,
+the client will continue to use the extant ``tokens.json`` file.
+
+
+Batch Job Configuration
+-------------------------------
+Batch jobs do not authenticate through an OAuth2 flow in the way that human users do.
+The service account keys and the metadata server available in batch jobs both provide
+ways to easily obtain access tokens. All that the job needs to know is which identity
+provider it should use. Batch jobs will therefore be provided with the ``HAIL_IDENTITY_PROVIDER``
+environment variable, which the client application interprets as the following
+identity config: ``{"idp": "$HAIL_IDENTITY_PROVIDER", "oauth_config_file": null}``.
+
+
+User Login Flow
+------------------
+
+Currently, ``hailctl auth login`` performs a hybrid of the desktop and server
+OAuth2 login flows, which occurs in the following sequence:
+
+1. The user executes ``hailctl auth login`` via the command line.
+2. The user's machine prompts the hail ``auth`` service to initiate a login flow
+   by making a request to ``/api/v1alpha/login``. The ``auth`` service responds
+   with an authorization URL that ``hailctl`` then opens in a browser.
+3. The user provides authentication and user consent.
+4. The OAuth2 provider authenticates the user and sends a callback to ``localhost``
+   with an authorization code.
+5. ``hailctl`` sends that authorization code to the ``auth`` service, which uses
+   it to complete the OAuth flow, receiving an ID token, an access token, and a refresh token.
+6. The ``auth`` service uses the ID token to identify the user and assert that the
+   user has an account with the Service.
+7. The ``auth`` service mints a token that it sends in the response to ``hailctl``.
+8. ``hailctl`` persists the token for future authorization of API calls to the Service.
+
+In order for ``hailctl`` to request access tokens for the Hail Service, it must have
+local knowledge of:
+
+1. The client ID of the OAuth client with which it completed the OAuth2 flow.
+2. The client secret of the OAuth client with which it completed the OAuth2 flow.
+3. The ``refresh_token`` returned in the final step of the OAuth2 flow.
+
+Steps 7 and 8 of the above flow can be adapted to present ``hailctl`` with these
+values, which it can then persist to obtain future access tokens. This would
+replace the hail-minted token on the user's disk with the three values above.
+
+The programmatic OAuth2 flow will use a different OAuth2 client than that used
+in the typical Web flow, as client secrets stored on user devices are *not considered secret*.
+Flows that initiate from user devices can use PKCE to compensate for the decreased security.
+(Citations for the above are still needed.)
+
+Effect and Interactions
+-----------------------
+
+It is worth comparing the privileges obtained in both the current and proposed scenarios
+to determine if there are any increased risks under the new regime.
+
+For hail-minted access tokens:
+
+- An attacker who obtains a token can fully impersonate a user to the Hail Service.
+- The token is *only* authorized to access the Hail Service.
+- Tokens can be explicitly revoked by the user by executing ``hailctl auth logout``.
+
+For the hail-audience client secret:
+
+- An attacker can just as easily access the client secret as they can the hail tokens.
+  The attacker can then generate access tokens.
+- The audience claim of these access tokens will be the Hail Service, so these
+  tokens can only be used against the Hail Service.
+- Unlike the hail-minted tokens, the Bearer tokens in the requests are short-lived
+  access tokens, so any access token that leaks poses a security risk only
+  until it expires.
+- The client can dynamically configure the validity period for access tokens it
+  generates.
+- The credentials can be invalidated by the user revoking the refresh token. This
+  will be a side effect of ``hailctl auth logout``.
+
+
+Alternatives
+------------
+
+An alternative to persisting a hail-owned client secret on the user's machine
+is to use the latent credentials from ``gcloud`` Application Default Credentials.
+However, this is seen as an abuse of the OAuth2 model. Using Application Default
+Credentials would require that the ``auth`` service accept tokens with the
+``gcloud`` audience claim. This would obviate the need to authenticate with the
+Hail Service at all, and any entity with a gcloud-generated user access token
+would be able to impersonate the user to the Hail Service. Additionally, the
+Hail Service, if compromised, could impersonate the user to Google APIs or any
+other API that accepted the ``gcloud`` audience claim.
+
+Another alternative is to simply not change our authorization model. Doing nothing
+would leave Hail Service operators with the management of token secrets. It would
+also make it more difficult to integrate hail services into other
+environments that use access-token based authentication, such as the Terra platform.
+
+
+Unresolved Questions
+--------------------
+
+It is as yet unclear whether the ``hailctl`` OAuth flow should occur exclusively
+on the client device or whether it should be performed in tandem with the
+``auth`` service. Either scenario, at least in GCP, requires persistence of the
+client ID/secret on the user's machine anyway to use the refresh token. The
+benefit of using the ``auth`` service is that it can provide early feedback on whether the
+user account is registered with the Hail Service. I am not aware of security
+vulnerabilities involved with this flow, aside from the concern that the Hail Service
+receiving the authorization code means it is then privileged to generate access tokens
+on behalf of the user. This is not against the OAuth2 design, and these OAuth2
+tokens would not be authorized to do much other than obtain information about the
+user's identity, but it is not strictly necessary in our identity model.
+
+It is as yet unclear whether regular rotation of client secrets stored on
+client devices should be performed. If that should be the case, we could do so
+without much effort because the Hail Service distributes the client secrets in
+the first place. We would simply need to configure the ``hailctl`` client to reinitiate
+a login flow when the credential expires.
+
+
+Endorsements
+-------------
+(Optional) This section provides an opportunity for any third parties to express their
+support for the proposal, and to say why they would like to see it adopted.
+It is not mandatory to have any endorsements at all, but the more substantial
+the proposal is, the more desirable it is to offer evidence that there is
+significant demand from the community. This section is one way to provide
+such evidence.