
[ML] Warn the user when the model memory limit is higher than the memory available in the ML node #63942

Closed
romain-chanu opened this issue Apr 20, 2020 · 2 comments · Fixed by #65652
Assignees
Labels
enhancement New value added to drive a business result :ml v7.8.0

Comments

@romain-chanu

Describe the feature: As a user, I can configure an anomaly detection job with a model memory limit (model_memory_limit) higher than the memory available on the ML node. The available memory is currently governed by max_machine_memory_percent (by default 30% of the machine's total memory). The model memory limit may also be bounded by max_model_memory_limit.

For example: given an ML node with 16 GB of memory and max_machine_memory_percent set to 30%, the available memory is 4.8 GB. Saving an anomaly detection job with model_memory_limit set to 6 GB results in no warning.

Describe a specific use case for the feature: users should be warned or informed when an anomaly detection job configuration (e.g. the model memory limit) exceeds the ML node's memory capacity or configuration.
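The requested check could be sketched as follows. This is a hypothetical illustration of the validation logic, not Elasticsearch or Kibana code; the function names (parse_byte_size, check_model_memory_limit) and the byte-size parsing are assumptions for the example.

```python
# Hypothetical sketch of the proposed warning, not actual Elasticsearch code.

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def parse_byte_size(value: str) -> int:
    """Parse an Elasticsearch-style byte size string such as '6gb' or '512mb'."""
    s = value.strip().lower()
    for suffix in sorted(UNITS, key=len, reverse=True):
        if s.endswith(suffix):
            return int(float(s[: -len(suffix)]) * UNITS[suffix])
    return int(s)  # bare number is taken as bytes

def check_model_memory_limit(model_memory_limit: str,
                             node_memory_bytes: int,
                             max_machine_memory_percent: int = 30):
    """Return a warning string if the job's limit exceeds the usable ML memory."""
    available = node_memory_bytes * max_machine_memory_percent // 100
    requested = parse_byte_size(model_memory_limit)
    if requested > available:
        return (f"model_memory_limit ({model_memory_limit}) exceeds the "
                f"memory usable for ML on this node ({available} bytes)")
    return None

# The scenario from the issue: 16 GB node, 30% usable -> 4.8 GB; a 6 GB job warns.
warning = check_model_memory_limit("6gb", 16 * 1024**3)
```

With these assumptions, a 6 GB limit on a 16 GB node (4.8 GB usable) produces a warning, while a 4 GB limit does not.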

@romain-chanu romain-chanu added enhancement New value added to drive a business result :ml labels Apr 20, 2020
@elasticmachine
Contributor

Pinging @elastic/ml-ui (:ml)

droberts195 added a commit to droberts195/elasticsearch that referenced this issue Apr 21, 2020
The ML info endpoint returns the max_model_memory_limit setting
if one is configured.  However, it is still possible to create
a job that cannot run anywhere in the current cluster because
no node in the cluster has enough memory to accommodate it.

This change adds an extra piece of information,
limits.current_effective_max_model_memory_limit, to the ML info
response that returns the biggest model memory limit that could
be run in the current cluster assuming no other jobs were
running.

The idea is that the ML UI will be able to warn users who try to
create jobs with higher model memory limits that their jobs will
not be able to start unless they add a bigger ML node to their
cluster.

Relates elastic/kibana#63942
@droberts195
Contributor

The backend support for this change is elastic/elasticsearch#55529

droberts195 added a commit to elastic/elasticsearch that referenced this issue Apr 22, 2020
The ML info endpoint returns the max_model_memory_limit setting
if one is configured.  However, it is still possible to create
a job that cannot run anywhere in the current cluster because
no node in the cluster has enough memory to accommodate it.

This change adds an extra piece of information,
limits.effective_max_model_memory_limit, to the ML info
response that returns the biggest model memory limit that could
be run in the current cluster assuming no other jobs were
running.

The idea is that the ML UI will be able to warn users who try to
create jobs with higher model memory limits that their jobs will
not be able to start unless they add a bigger ML node to their
cluster.

Relates elastic/kibana#63942
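Based on the commit message above, a UI could compare the job's requested limit against limits.effective_max_model_memory_limit from the ML info response. The sketch below assumes that response shape; the helper names and the abridged response dict are illustrative, not the actual Kibana implementation.

```python
# Hypothetical sketch of how a UI might consume the new field.

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

def to_bytes(size: str) -> int:
    """Parse an Elasticsearch-style byte size string such as '4gb'."""
    s = size.strip().lower()
    for suffix in sorted(UNITS, key=len, reverse=True):
        if s.endswith(suffix):
            return int(float(s[: -len(suffix)]) * UNITS[suffix])
    return int(s)

def warn_if_unschedulable(model_memory_limit: str, ml_info: dict):
    """Warn when no node in the cluster could run a job with this limit."""
    effective = ml_info.get("limits", {}).get("effective_max_model_memory_limit")
    if effective and to_bytes(model_memory_limit) > to_bytes(effective):
        return (f"This job requests {model_memory_limit}, but no ML node can "
                f"currently provide more than {effective}; the job will not "
                "start unless a bigger ML node joins the cluster.")
    return None

# Abridged example of a GET _ml/info response (field names per the commit message).
ml_info = {"limits": {"effective_max_model_memory_limit": "4gb"}}
warning = warn_if_unschedulable("6gb", ml_info)
```

A job asking for 6 GB against an effective maximum of 4 GB yields the warning; anything at or below 4 GB passes silently.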