We newly introduced online endpoints and batch endpoints as v2 concepts. There are several deployment funnels such as managed online endpoints, kubernetes online endpoints (including Azure Kubernetes Services and Arc-enabled Kubernetes) in v2, and Azure Container Instances (ACI) and Kubernetes Services (AKS) webservices in v1. In this article, we'll focus on the comparison of deploying to ACI webservices (v1) and managed online endpoints (v2).
Examples in this article show how to:
- Deploy your model to Azure
- Score using the endpoint
- Delete the webservice/endpoint
- SDK v1
-
Configure a model, an environment, and a scoring script:
# configure a model. example for registering a model from azureml.core.model import Model model = Model.register(ws, model_name="bidaf_onnx", model_path="./model.onnx") # configure an environment from azureml.core import Environment env = Environment(name='myenv') python_packages = ['nltk', 'numpy', 'onnxruntime'] for package in python_packages: env.python.conda_dependencies.add_pip_package(package) # configure an inference configuration with a scoring script from azureml.core.model import InferenceConfig inference_config = InferenceConfig( environment=env, source_directory="./source_dir", entry_script="./score.py", )
-
Configure and deploy an ACI webservice:
from azureml.core.webservice import AciWebservice # defince compute resources for ACI deployment_config = AciWebservice.deploy_configuration( cpu_cores=0.5, memory_gb=1, auth_enabled=True ) # define an ACI webservice service = Model.deploy( ws, "myservice", [model], inference_config, deployment_config, overwrite=True, ) # create the service service.wait_for_deployment(show_output=True)
-
For more information on registering models, see Register a model from a local file.
-
SDK v2
-
Configure a model, an environment, and a scoring script:
from azure.ai.ml.entities import Model # configure a model model = Model(path="../model-1/model/sklearn_regression_model.pkl") # configure an environment from azure.ai.ml.entities import Environment env = Environment( conda_file="../model-1/environment/conda.yml", image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1", ) # configure an inference configuration with a scoring script from azure.ai.ml.entities import CodeConfiguration code_config = CodeConfiguration( code="../model-1/onlinescoring", scoring_script="score.py" )
-
Configure and create an online endpoint:
import datetime from azure.ai.ml.entities import ManagedOnlineEndpoint # create a unique endpoint name with current datetime to avoid conflicts online_endpoint_name = "endpoint-" + datetime.datetime.now().strftime("%m%d%H%M%f") # define an online endpoint endpoint = ManagedOnlineEndpoint( name=online_endpoint_name, description="this is a sample online endpoint", auth_mode="key", tags={"foo": "bar"}, ) # create the endpoint: ml_client.begin_create_or_update(endpoint)
-
Configure and create an online deployment:
from azure.ai.ml.entities import ManagedOnlineDeployment # define a deployment blue_deployment = ManagedOnlineDeployment( name="blue", endpoint_name=online_endpoint_name, model=model, environment=env, code_configuration=code_config, instance_type="Standard_F2s_v2", instance_count=1, ) # create the deployment: ml_client.begin_create_or_update(blue_deployment) # blue deployment takes 100 traffic endpoint.traffic = {"blue": 100} ml_client.begin_create_or_update(endpoint)
-
For more information on concepts for endpoints and deployments, see What are online endpoints?
-
SDK v1
import json data = { "query": "What color is the fox", "context": "The quick brown fox jumped over the lazy dog.", } data = json.dumps(data) predictions = service.run(input_data=data) print(predictions)
-
SDK v2
# test the endpoint (the request will route to blue deployment as set above) ml_client.online_endpoints.invoke( endpoint_name=online_endpoint_name, request_file="../model-1/sample-request.json", ) # test the specific (blue) deployment ml_client.online_endpoints.invoke( endpoint_name=online_endpoint_name, deployment_name="blue", request_file="../model-1/sample-request.json", )
-
SDK v1
service.delete()
-
SDK v2
ml_client.online_endpoints.begin_delete(name=online_endpoint_name)
For more information, see
v2 docs:
v1 docs: