Memory-based autoscaling for the Broker deployments may not be reachable in certain circumstances #1520
Labels: area/broker, area/performance, kind/bug, priority/1, release/2
For the Broker deployments (Fanout, Retry, Ingress) there is a considerable gap between the requested memory and the memory limit, e.g. "500Mi" and "3000Mi" respectively for Fanout. At the same time, when the HPA is initialized it includes a memory-based autoscaling metric, and its threshold is currently set to half of the memory limit, e.g. "1500Mi" in this case. Note that "1500Mi" is three times the initially requested memory: under increased load, a pod's memory usage would first have to grow well past its request before the HPA scales the deployment out. When a pod is scheduled onto a node, the requested memory is what it is guaranteed to get; however, there seems to be no guarantee that the pod will be allowed to grab all the resources up to the limit at an arbitrary point in time. For instance, imagine that the pod was scheduled onto a node that only has "800Mi" of memory left.
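To make the gap concrete, here is a minimal sketch of the arithmetic, using the Fanout values quoted above and assuming only the `k8s.io/apimachinery` resource package (the package and the "800Mi" free-node scenario are illustrative, not taken from the actual HPA setup code):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	// Values from the issue for the Fanout deployment.
	request := resource.MustParse("500Mi") // memory the scheduler guarantees
	limit := resource.MustParse("3000Mi")  // upper bound, not guaranteed

	// The HPA memory threshold is currently chosen as half of the limit.
	threshold := resource.NewQuantity(limit.Value()/2, resource.BinarySI) // 1500Mi

	fmt.Printf("request=%s limit=%s hpa-threshold=%s\n",
		request.String(), limit.String(), threshold.String())

	// The threshold is 3x the request: a pod must grow well beyond its
	// guaranteed memory before the HPA would add replicas. On a node with
	// only 800Mi of memory left, that growth may never be possible, so the
	// pod risks being OOM-killed before the scale-out threshold is reached.
	nodeFree := resource.MustParse("800Mi")
	if nodeFree.Cmp(*threshold) < 0 {
		fmt.Println("scale-out threshold exceeds the node's free memory")
	}
}
```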
As a result, the following might be possible:
Action items
Additional context
It seems that occasional OOM issues were previously observed on the Broker.