Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Use MSAL for managed identity authentication #25959

Draft
wants to merge 9 commits into
base: dev
Choose a base branch
from

Conversation

jiasli
Copy link
Member

@jiasli jiasli commented Mar 28, 2023

Related command
az login --identity

Description

Changes:

  1. Use msal.ManagedIdentity for managed identity authentication.

  2. Use PublicClientApplication.acquire_token_interactive(prompt="none") for Cloud Shell authentication.

  3. Limit the functionality of azure.cli.core._profile.Profile._create_credential and azure.cli.core.auth.identity.Identity to only user and service principal, because their implementation are very different from that of managed identity and Cloud Shell, which only talks to the local machine's metadata endpoint.

  4. Rename MSI in source code to managed identity. However, the text in az account list result is still preserved until further discussion - MSI, MSIClient-{id}, MSIObject-{id}, MSIResource-{id}.

Testing Guide

az account get-access-token --debug
az vm list --debug

@azure-client-tools-bot-prd
Copy link

azure-client-tools-bot-prd bot commented Mar 28, 2023

❌AzureCLI-FullTest
️✔️acr
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️acs
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️advisor
️✔️latest
️✔️3.12
️✔️3.9
️✔️ams
️✔️latest
️✔️3.12
️✔️3.9
️✔️apim
️✔️latest
️✔️3.12
️✔️3.9
️✔️appconfig
️✔️latest
️✔️3.12
️✔️3.9
️✔️appservice
️✔️latest
️✔️3.12
️✔️3.9
️✔️aro
️✔️latest
️✔️3.12
️✔️3.9
️✔️backup
️✔️latest
️✔️3.12
️✔️3.9
️✔️batch
️✔️latest
️✔️3.12
️✔️3.9
️✔️batchai
️✔️latest
️✔️3.12
️✔️3.9
️✔️billing
️✔️latest
️✔️3.12
️✔️3.9
️✔️botservice
️✔️latest
️✔️3.12
️✔️3.9
️✔️cdn
️✔️latest
️✔️3.12
️✔️3.9
️✔️cloud
️✔️latest
️✔️3.12
️✔️3.9
️✔️cognitiveservices
️✔️latest
️✔️3.12
️✔️3.9
️✔️compute_recommender
️✔️latest
️✔️3.12
️✔️3.9
️✔️computefleet
️✔️latest
️✔️3.12
️✔️3.9
️✔️config
️✔️latest
️✔️3.12
️✔️3.9
️✔️configure
️✔️latest
️✔️3.12
️✔️3.9
️✔️consumption
️✔️latest
️✔️3.12
️✔️3.9
️✔️container
️✔️latest
️✔️3.12
️✔️3.9
️✔️containerapp
️✔️latest
️✔️3.12
️✔️3.9
️✔️core
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️cosmosdb
️✔️latest
️✔️3.12
️✔️3.9
️✔️databoxedge
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️dls
️✔️latest
️✔️3.12
️✔️3.9
️✔️dms
️✔️latest
️✔️3.12
️✔️3.9
️✔️eventgrid
️✔️latest
️✔️3.12
️✔️3.9
️✔️eventhubs
️✔️latest
️✔️3.12
️✔️3.9
️✔️feedback
️✔️latest
️✔️3.12
️✔️3.9
️✔️find
️✔️latest
️✔️3.12
️✔️3.9
️✔️hdinsight
️✔️latest
️✔️3.12
️✔️3.9
️✔️identity
️✔️latest
️✔️3.12
️✔️3.9
️✔️iot
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️keyvault
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️lab
️✔️latest
️✔️3.12
️✔️3.9
️✔️managedservices
️✔️latest
️✔️3.12
️✔️3.9
️✔️maps
️✔️latest
️✔️3.12
️✔️3.9
️✔️marketplaceordering
️✔️latest
️✔️3.12
️✔️3.9
️✔️monitor
️✔️latest
️✔️3.12
️✔️3.9
️✔️mysql
️✔️latest
️✔️3.12
️✔️3.9
️✔️netappfiles
️✔️latest
️✔️3.12
️✔️3.9
️✔️network
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️policyinsights
️✔️latest
️✔️3.12
️✔️3.9
️✔️privatedns
️✔️latest
️✔️3.12
️✔️3.9
❌profile
❌latest
❌3.12
Type Test Case Error Message Line
Failed test_get_login self = <azure.cli.command_modules.profile.tests.latest.test_profile_custom.ProfileCommandTest testMethod=test_get_login>
profile_mock = <MagicMock name='Profile' spec='Profile' id='139714897818448'>

    @mock.patch('azure.cli.command_modules.profile.custom.Profile', autospec=True)
    def test_get_login(self, profile_mock):
        invoked = []
    
        def test_login(msi_port, identity_id=None):
            invoked.append(True)
    
        # mock the instance
        profile_instance = mock.MagicMock()
        profile_instance.login_with_managed_identity = test_login
        # mock the constructor
        profile_mock.return_value = profile_instance
    
        # action
        cmd = mock.MagicMock()
>       login(cmd, identity=True)

src/azure-cli/azure/cli/command_modules/profile/tests/latest/test_profile_custom.py:105: 
                                        

cmd = <MagicMock id='139714894488896'>, username = None, password = None
tenant = None, scopes = None, allow_no_subscriptions = False
use_device_code = False, service_principal = None, certificate = None
use_cert_sn_issuer = None, client_assertion = None, identity = True
client_id = None, object_id = None, resource_id = None

    def login(cmd, username=None, password=None, tenant=None, scopes=None, allow_no_subscriptions=False,
              # Device code flow
              use_device_code=False,
              # Service principal
              service_principal=None, certificate=None, use_cert_sn_issuer=None, client_assertion=None,
              # Managed identity
              identity=False, client_id=None, object_id=None, resource_id=None):
        """Log in to access Azure subscriptions"""
    
        # quick argument usage check
        if any([password, service_principal, tenant]) and identity:
            raise CLIError("usage error: '--identity' is not applicable with other arguments")
        if any([password, service_principal, username, identity]) and use_device_code:
            raise CLIError("usage error: '--use-device-code' is not applicable with other arguments")
        if use_cert_sn_issuer and not service_principal:
            raise CLIError("usage error: '--use-sn-issuer' is only applicable with a service principal")
        if service_principal and not username:
            raise CLIError('usage error: --service-principal --username NAME --password SECRET --tenant TENANT')
        if username and not service_principal and not identity:
            logger.warning(USERNAME_PASSWORD_DEPRECATION_WARNING)
    
        interactive = False
    
        profile = Profile(cli_ctx=cmd.cli_ctx)
    
        if identity:
            if in_cloud_console():
                return profile.login_in_cloud_shell()
>           return profile.login_with_managed_identity(client_id=client_id, object_id=object_id, resource_id=resource_id,
                                                       allow_no_subscriptions=allow_no_subscriptions)
E           TypeError: ProfileCommandTest.test_get_login..test_login() got an unexpected keyword argument 'client_id'

src/azure-cli/azure/cli/command_modules/profile/custom.py:146: TypeError
azure/cli/command_modules/profile/tests/latest/test_profile_custom.py:89
❌3.9
Type Test Case Error Message Line
Failed test_get_login self = <azure.cli.command_modules.profile.tests.latest.test_profile_custom.ProfileCommandTest testMethod=test_get_login>
profile_mock = <MagicMock name='Profile' spec='Profile' id='140484186223856'>

    @mock.patch('azure.cli.command_modules.profile.custom.Profile', autospec=True)
    def test_get_login(self, profile_mock):
        invoked = []
    
        def test_login(msi_port, identity_id=None):
            invoked.append(True)
    
        # mock the instance
        profile_instance = mock.MagicMock()
        profile_instance.login_with_managed_identity = test_login
        # mock the constructor
        profile_mock.return_value = profile_instance
    
        # action
        cmd = mock.MagicMock()
>       login(cmd, identity=True)

src/azure-cli/azure/cli/command_modules/profile/tests/latest/test_profile_custom.py:105: 
                                        

cmd = <MagicMock id='140484185912848'>, username = None, password = None
tenant = None, scopes = None, allow_no_subscriptions = False
use_device_code = False, service_principal = None, certificate = None
use_cert_sn_issuer = None, client_assertion = None, identity = True
client_id = None, object_id = None, resource_id = None

    def login(cmd, username=None, password=None, tenant=None, scopes=None, allow_no_subscriptions=False,
              # Device code flow
              use_device_code=False,
              # Service principal
              service_principal=None, certificate=None, use_cert_sn_issuer=None, client_assertion=None,
              # Managed identity
              identity=False, client_id=None, object_id=None, resource_id=None):
        """Log in to access Azure subscriptions"""
    
        # quick argument usage check
        if any([password, service_principal, tenant]) and identity:
            raise CLIError("usage error: '--identity' is not applicable with other arguments")
        if any([password, service_principal, username, identity]) and use_device_code:
            raise CLIError("usage error: '--use-device-code' is not applicable with other arguments")
        if use_cert_sn_issuer and not service_principal:
            raise CLIError("usage error: '--use-sn-issuer' is only applicable with a service principal")
        if service_principal and not username:
            raise CLIError('usage error: --service-principal --username NAME --password SECRET --tenant TENANT')
        if username and not service_principal and not identity:
            logger.warning(USERNAME_PASSWORD_DEPRECATION_WARNING)
    
        interactive = False
    
        profile = Profile(cli_ctx=cmd.cli_ctx)
    
        if identity:
            if in_cloud_console():
                return profile.login_in_cloud_shell()
>           return profile.login_with_managed_identity(client_id=client_id, object_id=object_id, resource_id=resource_id,
                                                       allow_no_subscriptions=allow_no_subscriptions)
E           TypeError: test_login() got an unexpected keyword argument 'client_id'

src/azure-cli/azure/cli/command_modules/profile/custom.py:146: TypeError
azure/cli/command_modules/profile/tests/latest/test_profile_custom.py:89
️✔️rdbms
️✔️latest
️✔️3.12
️✔️3.9
️✔️redis
️✔️latest
️✔️3.12
️✔️3.9
️✔️relay
️✔️latest
️✔️3.12
️✔️3.9
️✔️resource
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️role
️✔️latest
️✔️3.12
️✔️3.9
️✔️search
️✔️latest
️✔️3.12
️✔️3.9
️✔️security
️✔️latest
️✔️3.12
️✔️3.9
️✔️servicebus
️✔️latest
️✔️3.12
️✔️3.9
️✔️serviceconnector
️✔️latest
️✔️3.12
️✔️3.9
️✔️servicefabric
️✔️latest
️✔️3.12
️✔️3.9
️✔️signalr
️✔️latest
️✔️3.12
️✔️3.9
️✔️sql
️✔️latest
️✔️3.12
️✔️3.9
️✔️sqlvm
️✔️latest
️✔️3.12
️✔️3.9
️✔️storage
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️synapse
️✔️latest
️✔️3.12
️✔️3.9
️✔️telemetry
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9
️✔️util
️✔️latest
️✔️3.12
️✔️3.9
️✔️vm
️✔️2018-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.12
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.12
️✔️3.9
️✔️latest
️✔️3.12
️✔️3.9

@azure-client-tools-bot-prd
Copy link

Hi @jiasli,
Since the current milestone time is less than 7 days, this pr will be reviewed in the next milestone.

@ghost ghost added Auto-Assign Auto assign by bot Core CLI core infrastructure labels Mar 28, 2023
@ghost ghost requested a review from yonzhan March 28, 2023 09:24
@ghost ghost assigned jiasli Mar 28, 2023
Comment on lines 108 to 129
def _normalize_expires_on(expires_on):
"""
The expires_on field returned by managed identity differs on Azure VM (epoch str) and App Service (datetime str).
Normalize to epoch int.
"""
try:
# Treat as epoch string "1605238724"
expires_on_epoch_int = int(expires_on)
except ValueError:
import datetime

# Python 3.6 doesn't recognize timezone as +00:00.
# These lines can be dropped after Python 3.6 is dropped.
# https://stackoverflow.com/questions/30999230/how-to-parse-timezone-with-colon
if expires_on[-3] == ":":
expires_on = expires_on[:-3] + expires_on[-2:]

# Treat as datetime string "11/05/2021 15:18:31 +00:00"
expires_on_epoch_int = int(datetime.datetime.strptime(expires_on, '%m/%d/%Y %H:%M:%S %z').timestamp())

logger.debug("Normalize expires_on: %r -> %r", expires_on, expires_on_epoch_int)
return expires_on_epoch_int
Copy link
Member Author

@jiasli jiasli Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are now calling the new API on App Service: https://learn.microsoft.com/en-us/azure/app-service/overview-managed-identity?tabs=portal%2Chttp#connect-to-azure-services-in-app-code

So I assume expires_on is now an epoch int, but more test is still required to verify if this logic is still needed.

Comment on lines 71 to 95
def set_token(self):
import traceback
from azure.cli.core.azclierror import AzureConnectionError, AzureResponseError
try:
super().set_token()
except requests.exceptions.ConnectionError as err:
logger.debug('throw requests.exceptions.ConnectionError when doing MSIAuthentication: \n%s',
traceback.format_exc())
raise AzureConnectionError('Failed to connect to MSI. Please make sure MSI is configured correctly '
'and check the network connection.\nError detail: {}'.format(str(err)))
except requests.exceptions.HTTPError as err:
logger.debug('throw requests.exceptions.HTTPError when doing MSIAuthentication: \n%s',
traceback.format_exc())
try:
raise AzureResponseError('Failed to connect to MSI. Please make sure MSI is configured correctly.\n'
'Get Token request returned http error: {}, reason: {}'
.format(err.response.status, err.response.reason))
except AttributeError:
raise AzureResponseError('Failed to connect to MSI. Please make sure MSI is configured correctly.\n'
'Get Token request returned: {}'.format(err.response))
except TimeoutError as err:
logger.debug('throw TimeoutError when doing MSIAuthentication: \n%s',
traceback.format_exc())
raise AzureConnectionError('MSI endpoint is not responding. Please make sure MSI is configured correctly.\n'
'Error detail: {}'.format(str(err)))
Copy link
Member Author

@jiasli jiasli Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep such error handler?

Comment on lines 683 to 781
system_assigned = 'MSI' # Not necessarily system-assigned. It merely means no ID is provided.
user_assigned_client_id = 'MSIClient'
user_assigned_object_id = 'MSIObject'
user_assigned_resource_id = 'MSIResource'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to rename these attributes.

return build_sdk_access_token(result)


class CloudShellCredential:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't use managed identity for Cloud Shell authentication

Comment from MSAL team:

The old implementation groups Cloud Shell and other managed identity in similar API only because their wire protocol happened to be similar. But that should be an implementation detail.

The meanings of them are actually quite different. Other managed identity are fundamentally confidential clients such as service principal. The Cloud Shell identity is a user account. Cloud Shell merely acts as a "broker" to obtain token for the user account. For what it's worth, the windows broker (WAM) in MSAL Python supports the same acquire_token_interactive(..., prompt="none") usage to obtain a token for the already-signed-in user without prompt.

The new MSAL API is designed this way so that existing apps building on top of acquire_token_interactive(...) could smoothly utilize Cloud Shell or WAM, with no/few source code change. Just as Azure CLI needed minimal change to pick up WAM.

It is just unfortunate that Azure CLI had that az login --identity usage pattern and now stuck with it, so you can't fully reap the benefit, but that is not enough a reason for MSAL Python to revert to the old API.

@jiasli jiasli requested review from dcaro and Jacekey23 March 28, 2023 09:42
Comment on lines 218 to 219
user = _USER_ASSIGNED_IDENTITY if '/Microsoft.ManagedIdentity/userAssignedIdentities' in resource_id \
else _SYSTEM_ASSIGNED_IDENTITY
Copy link
Member Author

@jiasli jiasli Mar 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to test if xms_mirid exists in all types of managed identities' access tokens. (#13188)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why would you need to know whether the token acquired is for a system-assigned managed identity or a user-assigned one?

From MSAL's perspective, it just use the input parameters (which would indeed contain a client_id/resource_id/etc when and only when using a user-assigned identity). Once a token is returned, the job is done. Why would a caller need to care whether it is a system-assigned or user-assigned based on the token outcome?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Azure CLI has been showing system-assigned managed identity or user-assigned one since managed identity was initially supported. I don't tend to change its behavior. However, the logic for determining system-assigned or user-assigned is incorrect.

See

@yonzhan
Copy link
Collaborator

yonzhan commented Mar 28, 2023

MSAL adoption

Comment on lines 570 to 652
if not msi_info:
msi_info = account[_SUBSCRIPTION_NAME] # fall back to old persisting way
Copy link
Member Author

@jiasli jiasli Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In versions <= 2.0.50 (November 6, 2018), _SUBSCRIPTION_NAME is used to denote the managed identity ID type. This is super confusing, so _ASSIGNED_IDENTITY_INFO was added in #7744 and these lines are added as an adaptor. However, such logic is still difficult to maintain and can easily leads to unwanted code path. Even its creator admits:

the code is a bit messy here to support both old and new styles.
#7744 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove this adaptor, we can add such history notes:

[BREAKING CHANGE] No long be compatible with login profile created by az login --identity of Azure CLI versions <= 2.0.50. If you are updating from those versions, please run az login --identity again to refresh the login profile.

Comment on lines 558 to 563
# We no longer support login profile created by versions < 2.0.51, which uses _SUBSCRIPTION_NAME as
# _ASSIGNED_IDENTITY_INFO.
# https://github.com/Azure/azure-cli/pull/7744
if not identity_info:
raise CLIError(f'{_ASSIGNED_IDENTITY_INFO} property is missing from the current account. '
'Please run `az login --identity`.')
Copy link
Member Author

@jiasli jiasli Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have replaced the adaptor with a more general not-None check.

Copy link

azure-client-tools-bot-prd bot commented Nov 12, 2024

⚠️AzureCLI-BreakingChangeTest
⚠️profile
rule cmd_name rule_message suggest_message
⚠️ 1006 - ParaAdd login cmd login added parameter client_id
⚠️ 1006 - ParaAdd login cmd login added parameter object_id
⚠️ 1006 - ParaAdd login cmd login added parameter resource_id

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auto-Assign Auto assign by bot Core CLI core infrastructure
Projects
None yet
3 participants