Azure ML deployment: setting low memory request not taking effect #27672
Labels
Auto-Assign, bug, customer-reported, Machine Learning, az ml, Service Attention
Describe the bug
Even if I set a memory request smaller than 500 MB for the Azure ML deployment, at least 500 MB is always requested (as I understand it, because of the storageinitializer init container).
Related command
az ml online-deployment create
Errors
(There is no error, but it does not work as expected.)
Issue script & Debug output
In Azure Machine Learning, there is an inference cluster named "reco-inference", which is an Azure Kubernetes cluster.
There is a custom instance type named "smallmemoryinstancetype" with a 100Mi memory request, created like this:
where the smallmemory_instancetype.yaml file contains:
There is an Azure ML environment: "machine-learning-recommendation-environment:12" (Linux, Python version: 3.8).
There is a previously registered model: name: "modelname", version: 1. (Model artifact binary size: 78 MB.)
There is an endpoint named "endpointname" created like this:
We deploy like this:
where deploymentConfigTest.yaml is:
The deployment is successful. It appears in the output of kubectl describe node as a pod. I can verify that the instance type is successfully set for the deployment (at the endpoint in Azure Machine Learning Studio).
When inspected with kubectl describe node, I can see the Memory Requests for the pod of my deployment. It is exactly 500Mi.
(I also verified that if I use an instance type with a memory request above 500Mi, then more than 500Mi is requested, so the instance type setting itself does affect memory requests.)
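To check which container in the pod carries the 500Mi request, the per-container values can be read from the pod manifest. The following is a minimal sketch, assuming the manifest has been exported with kubectl get pod <pod-name> -o json and loaded into a dict. The sample manifest is hypothetical: the container name "inference-server" and the layout of one init container plus one regular container are assumptions made for illustration, not taken from the actual deployment.

```python
def memory_requests(pod: dict) -> dict:
    """Return the memory request of each container in a pod manifest.

    Keys are prefixed with 'init:' or 'app:' to tell init containers apart
    from regular containers; the value is None when no request is set.
    """
    spec = pod.get("spec", {})
    result = {}
    for kind, key in (("init", "initContainers"), ("app", "containers")):
        for container in spec.get(key, []):
            requests = container.get("resources", {}).get("requests", {})
            result[f"{kind}:{container['name']}"] = requests.get("memory")
    return result


# Hypothetical manifest (assumption): values mirror the numbers in this issue,
# but the real pod may contain additional containers.
sample_pod = {
    "spec": {
        "initContainers": [
            {"name": "storageinitializer",
             "resources": {"requests": {"memory": "500Mi"}}},
        ],
        "containers": [
            {"name": "inference-server",
             "resources": {"requests": {"memory": "100Mi"}}},
        ],
    }
}

if __name__ == "__main__":
    for name, request in memory_requests(sample_pod).items():
        print(name, request)
```

With a real manifest, json.load on the exported file would replace sample_pod.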
As I understand it, this is because of the storageinitializer init container (which is in the same pod as my inference server), whose memory request is fixed at 500Mi.
For example, in the pod settings inspected in Lens, I see these settings for the init container:
Expected behavior
I would expect the Memory Requests for the pod of my deployment to be set to a smaller amount than 500Mi.
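For context, Kubernetes computes a pod's effective request for a resource as the larger of two values: the highest request among its init containers, and the sum of the requests of its regular containers. The sketch below applies that rule to the figures from this issue. It assumes a pod with one init container and one regular container, ignores sidecar containers and pod overhead, and parses only integer quantities with Ki, Mi, or Gi suffixes. The 800Mi value and the 100Mi init request in the last case are hypothetical inputs.

```python
MI = 1024 ** 2
_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def to_bytes(quantity: str) -> int:
    """Convert an integer quantity such as '500Mi' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


def effective_request_mi(init_requests, app_requests) -> int:
    """Effective pod memory request in Mi.

    max(highest init container request, sum of regular container requests)
    """
    highest_init = max((to_bytes(q) for q in init_requests), default=0)
    app_total = sum(to_bytes(q) for q in app_requests)
    return max(highest_init, app_total) // MI


if __name__ == "__main__":
    # Observed case: 500Mi init container, 100Mi instance type.
    print(effective_request_mi(["500Mi"], ["100Mi"]))  # 500
    # Hypothetical larger instance type: the regular container dominates.
    print(effective_request_mi(["500Mi"], ["800Mi"]))  # 800
    # Hypothetical: init container request lowered to match the instance type.
    print(effective_request_mi(["100Mi"], ["100Mi"]))  # 100
```

Under these assumptions, the pod-level request cannot drop below 500Mi while the init container requests 500Mi, regardless of the instance type.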
Environment Summary
OS: Azure DevOps windows-latest agent (Windows Server 2022 with Visual Studio 2022)
azure-cli 2.53.1
core 2.53.1
telemetry 1.1.0
Extensions:
azure-devops 0.26.0
connectedk8s 1.5.2
k8s-extension 1.5.1
ml 2.21.1
Dependencies:
msal 1.24.0b2
azure-mgmt-resource 23.1.0b2
Python location 'C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe'
Extensions directory 'C:\Users\PannaKristof.azure\cliextensions'
Python (Windows) 3.10.10 (tags/v3.10.10:aad5f6a, Feb 7 2023, 17:05:00) [MSC v.1929 32 bit (Intel)]
Additional context
My question is: why does the memory request I set on my deployment not also take effect on the init container?
Is there any other (maybe completely different) way to achieve a smaller memory request for the init container? (For example, it would be ideal if I could also set the request size for the init container dynamically when running the online-deployment create command.)
The reason for my question: we would like to deploy several small deployments, but it is very wasteful to use 500 MB of memory for each of them (when e.g. 65Mi would be sufficient).
(Or is it possible that the init container actually needs this much space to work and I should not try to set the memory request?)
Thank you in advance for your help!