Azure ML deployment: setting low memory request not taking effect #27672

Open
kristofpanna opened this issue Oct 24, 2023 · 2 comments
Labels: Auto-Assign, bug, customer-reported, Machine Learning, az ml, Service Attention

Comments


kristofpanna commented Oct 24, 2023

Describe the bug

Even if I set a memory request smaller than 500 MB for the Azure ML deployment, at least 500 MB is always requested (as I understand it, because of the storageinitializer init container).

Related command

az ml online-deployment create

Errors

(There is no error, but it does not work as expected.)

Issue script & Debug output

In Azure Machine Learning, there is an inference cluster named "reco-inference", which is an Azure Kubernetes Service (AKS) cluster.

There is a custom instance type named "smallmemoryinstancetype" with a 100Mi memory request, created like this:

kubectl apply -f smallmemory_instancetype.yaml

where the smallmemory_instancetype.yaml file contains:

apiVersion: amlarc.azureml.com/v1alpha1
kind: InstanceType
metadata:
  name: smallmemoryinstancetype
spec:
  resources:
    limits:
      cpu: "1"
      memory: "2Gi"
    requests:
      cpu: "10m"
      memory: "100Mi"

There is an Azure ML environment "machine-learning-recommendation-environment:12" (Linux, Python 3.8).
There is a previously registered model named "modelname", version 1 (model artifact binary size: 78 MB).
There is an endpoint named "endpointname", created like this:

az ml online-endpoint create --name endpointname --set compute=azureml:reco-inference
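
(The endpoint can also be checked from the CLI; this assumes the workspace and resource group defaults are already configured for az:)

az ml online-endpoint show --name endpointname --query provisioning_state -o tsv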

We deploy like this:

$azuremlModelId = "azureml:modelname:1"
az ml online-deployment create --name deploymentname -f deploymentConfigTest.yaml --set endpoint_name=endpointname --set model=$azuremlModelId --set environment=azureml:machine-learning-recommendation-environment:12

where deploymentConfigTest.yaml is:

type: kubernetes
app_insights_enabled: true
code_configuration:
  code: .
  scoring_script: score.py
request_settings:
  request_timeout_ms: 3000
  max_queue_wait_ms: 3000
instance_type: smallmemoryinstancetype
instance_count: 1
scale_settings:
  type: default
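
(The instance type attached to the deployment can also be confirmed from the CLI, again assuming the workspace defaults are configured:)

az ml online-deployment show --name deploymentname --endpoint-name endpointname --query instance_type -o tsv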

The deployment is successful and shows up as a pod in kubectl describe node. I can verify that the instance type is set for the deployment (on the endpoint in Azure Machine Learning Studio).

(Screenshot: the instance type shown for the deployment in Azure Machine Learning Studio.)

When I inspect the node with kubectl describe node, the memory request for my deployment's pod is exactly 500Mi:

Non-terminated Pods:          (13 in total)
  Namespace                   Name                                         CPU Requests  CPU Limits    Memory Requests  Memory Limits  Age
  ---------                   ----                                         ------------  ----------    ---------------  -------------  ---
  default                     deploymentname-endpointname-54d8bf5d5w9dz    110m (5%)     1100m (57%)   500Mi (10%)      2098Mi (45%)   18h
  ...

(I also verified that if I use an instance type with a memory request above 500Mi, then more than 500Mi is requested, so the instance type setting itself does take effect on memory requests.)

As I understand it, this is because of the storageinitializer init container (which runs in the same pod as my inference server), whose memory request is fixed at 500Mi. As far as I know, Kubernetes computes a pod's effective memory request as the maximum of the largest init container request and the sum of the app container requests, so the 500Mi init container sets the floor regardless of my instance type.
For example, in the pod spec inspected in Lens, I see these settings for the init container:

initContainers:
- name: storageinitializer-modeldata
  ...
  resources:
    limits:
      cpu: 100m
      memory: 500Mi
    requests:
      cpu: 100m
      memory: 500Mi
  ...
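
(The same resources block can be read without Lens; the jsonpath below is just one way to list the memory request of each init container and app container in the pod, using the pod name from the node listing above:)

kubectl get pod deploymentname-endpointname-54d8bf5d5w9dz -n default -o jsonpath='{range .spec.initContainers[*]}{.name}: {.resources.requests.memory}{"\n"}{end}{range .spec.containers[*]}{.name}: {.resources.requests.memory}{"\n"}{end}'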

Expected behavior

I would expect the memory request for my deployment's pod to be smaller than 500Mi, in line with the 100Mi request of the instance type.

Environment Summary

OS: Azure DevOps windows-latest agent (Windows Server 2022 with Visual Studio 2022)

azure-cli 2.53.1

core 2.53.1
telemetry 1.1.0

Extensions:
azure-devops 0.26.0
connectedk8s 1.5.2
k8s-extension 1.5.1
ml 2.21.1

Dependencies:
msal 1.24.0b2
azure-mgmt-resource 23.1.0b2

Python location 'C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe'
Extensions directory 'C:\Users\PannaKristof.azure\cliextensions'

Python (Windows) 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:05:00) [MSC v.1929 32 bit (Intel)]

Additional context

My question is: why is the memory request I set on my deployment not applied to the init container as well?
Is there any other (maybe completely different) way to achieve a smaller memory request for the init container? (For example, it would be ideal if I could also set the init container's request size dynamically when running the online-deployment create command.)
The reason for my question: we would like to run several small deployments, and reserving 500 MB of memory for each of them is very wasteful (when e.g. 65Mi would be sufficient).

(Or is it possible that the init container actually needs this much memory to work, and I should not try to lower its request?)

Thank you in advance for your help!

kristofpanna added the bug label on Oct 24, 2023
The microsoft-github-policy-service bot added the customer-reported, Auto-Assign, Service Attention, Machine Learning, and az ml labels on Oct 24, 2023
yonzhan (Collaborator) commented Oct 24, 2023

Thank you for opening this issue, we will look into it.

microsoft-github-policy-service bot (Contributor) commented

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github.
