Azure ML deployment: setting low memory request not taking effect #27672

Open
kristofpanna opened this issue Oct 24, 2023 · 2 comments
Labels: Auto-Assign, bug, customer-reported, Machine Learning, az ml, Service Attention

Comments


kristofpanna commented Oct 24, 2023

Describe the bug

Even if I set a memory request smaller than 500 MB for the Azure ML deployment, at least 500 MB is always requested (as I understand it, because of the storageinitializer init container).

Related command

az ml online-deployment create

Errors

(There is no error, but it does not work as expected.)

Issue script & Debug output

In Azure Machine Learning, there is an inference cluster named "reco-inference", which is an Azure Kubernetes Service (AKS) cluster.

There is a custom instance type named "smallmemoryinstancetype" with a 100Mi memory request, created like this:

kubectl apply -f smallmemory_instancetype.yaml

where the smallmemory_instancetype.yaml file contains:

apiVersion: amlarc.azureml.com/v1alpha1
kind: InstanceType
metadata:
  name: smallmemoryinstancetype
spec:
  resources:
    limits:
      cpu: "1"
      memory: "2Gi"
    requests:
      cpu: "10m"
      memory: "100Mi"

There is an Azure ML environment "machine-learning-recommendation-environment:12" (Linux, Python 3.8).
There is a previously registered model named "modelname", version 1 (model artifact binary size: 78 MB).
There is an endpoint named "endpointname", created like this:

az ml online-endpoint create --name endpointname --set compute=azureml:reco-inference
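
(The endpoint can also be checked from the CLI; this assumes the workspace and resource group defaults are already configured for az:)

az ml online-endpoint show --name endpointname --query provisioning_state -o tsv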

We deploy like this:

$azuremlModelId = "azureml:modelname:1"
az ml online-deployment create --name deploymentname -f deploymentConfigTest.yaml --set endpoint_name=endpointname --set model=$azuremlModelId --set environment=azureml:machine-learning-recommendation-environment:12

where deploymentConfigTest.yaml is:

type: kubernetes
app_insights_enabled: true
code_configuration:
  code: .
  scoring_script: score.py
request_settings:
  request_timeout_ms: 3000
  max_queue_wait_ms: 3000
instance_type: smallmemoryinstancetype
instance_count: 1
scale_settings:
  type: default
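
(The instance type attached to the deployment can also be confirmed from the CLI, again assuming the workspace defaults are configured:)

az ml online-deployment show --name deploymentname --endpoint-name endpointname --query instance_type -o tsv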

The deployment is successful and shows up as a pod in kubectl describe node. I can verify that the instance type is set for the deployment (on the endpoint in Azure Machine Learning Studio).

(Screenshot: the instance type shown for the deployment in Azure Machine Learning Studio.)

When I inspect the node with kubectl describe node, the memory request for my deployment's pod is exactly 500Mi:

Non-terminated Pods:          (13 in total)
  Namespace                   Name                                         CPU Requests  CPU Limits    Memory Requests  Memory Limits  Age
  ---------                   ----                                         ------------  ----------    ---------------  -------------  ---
  default                     deploymentname-endpointname-54d8bf5d5w9dz    110m (5%)     1100m (57%)   500Mi (10%)      2098Mi (45%)   18h
  ...

(I also verified that if I use an instance type with a memory request above 500Mi, then more than 500Mi is requested, so the instance type setting itself does take effect on memory requests.)

As I understand it, this is because of the storageinitializer init container (which runs in the same pod as my inference server), whose memory request is fixed at 500Mi. As far as I know, Kubernetes computes a pod's effective memory request as the maximum of the largest init container request and the sum of the app container requests, so the 500Mi init container sets the floor regardless of my instance type.
For example, in the pod spec inspected in Lens, I see these settings for the init container:

initContainers:
- name: storageinitializer-modeldata
  ...
  resources:
    limits:
      cpu: 100m
      memory: 500Mi
    requests:
      cpu: 100m
      memory: 500Mi
  ...
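
(The same resources block can be read without Lens; the jsonpath below is just one way to list the memory request of each init container and app container in the pod, using the pod name from the node listing above:)

kubectl get pod deploymentname-endpointname-54d8bf5d5w9dz -n default -o jsonpath='{range .spec.initContainers[*]}{.name}: {.resources.requests.memory}{"\n"}{end}{range .spec.containers[*]}{.name}: {.resources.requests.memory}{"\n"}{end}'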

Expected behavior

I would expect the memory request for my deployment's pod to be smaller than 500Mi, in line with the 100Mi request of the instance type.

Environment Summary

OS: Azure DevOps windows-latest agent (Windows Server 2022 with Visual Studio 2022)

azure-cli 2.53.1

core 2.53.1
telemetry 1.1.0

Extensions:
azure-devops 0.26.0
connectedk8s 1.5.2
k8s-extension 1.5.1
ml 2.21.1

Dependencies:
msal 1.24.0b2
azure-mgmt-resource 23.1.0b2

Python location 'C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe'
Extensions directory 'C:\Users\PannaKristof.azure\cliextensions'

Python (Windows) 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:05:00) [MSC v.1929 32 bit (Intel)]

Additional context

My question is: why is the memory request I set on my deployment not applied to the init container as well?
Is there any other (maybe completely different) way to achieve a smaller memory request for the init container? (For example, it would be ideal if I could also set the init container's request size dynamically when running the online-deployment create command.)
The reason for my question: we would like to run several small deployments, and reserving 500 MB of memory for each of them is very wasteful (when e.g. 65Mi would be sufficient).

(Or is it possible that the init container actually needs this much memory to work, and I should not try to lower its request?)

Thank you in advance for your help!

kristofpanna added the bug label on Oct 24, 2023
The microsoft-github-policy-service bot added the customer-reported, Auto-Assign, Service Attention, Machine Learning, and az ml labels on Oct 24, 2023
yonzhan (Collaborator) commented Oct 24, 2023

Thank you for opening this issue, we will look into it.

microsoft-github-policy-service bot (Contributor) commented

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.
