Update autoscale docs (#3905)
* added info about scaling multiple containers

* removed heapster prereq, minor refactoring

* minor refactoring
jacobmalmberg authored Jan 31, 2022
1 parent d7ed535 commit 5b32737
Showing 2 changed files with 8 additions and 7 deletions.
10 changes: 6 additions & 4 deletions doc/source/graph/scaling.md
@@ -103,10 +103,10 @@ For more details you can follow [a worked example of scaling](../examples/scale.

## Autoscaling Seldon Deployments

To autoscale your Seldon Deployment resources you can add Horizontal Pod Template Specifications to the Pod Template Specifications you create. There are three steps:
To autoscale your Seldon Deployment resources you can add Horizontal Pod Autoscaler (HPA) specifications to the Pod Template Specifications you create. There are two steps:

1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory.
1. Add a HPA Spec referring to this Deployment. (We presently support v2beta2 version of k8s HPA Metrics spec)
1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory. This has to be done for every container in the SeldonDeployment, except for the seldon-container-image and the storage initializer. Some combinations of protocol and server type may spawn additional support containers; resource requests have to be added to those containers as well.
2. Add an HPA spec referring to this Deployment. (We presently support the v2beta2 version of the Kubernetes HPA metrics spec.)

To illustrate this we have an example Seldon Deployment below:

@@ -121,12 +121,12 @@ spec:
- componentSpecs:
- hpaSpec:
maxReplicas: 3
minReplicas: 1
metrics:
- resource:
name: cpu
targetAverageUtilization: 70
type: Resource
minReplicas: 1
spec:
containers:
- image: seldonio/mock_classifier_rest:1.3
@@ -150,5 +150,7 @@ The key points here are:
* We define a CPU request for our container. This is required to use CPU autoscaling in Kubernetes, and the same applies to every other user container in the componentSpec (see the multi-container sketch below).
* We define an HPA associated with our componentSpec which scales on CPU when the average CPU is above 70%, up to a maximum of 3 replicas.
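To make the multi-container point concrete, below is a minimal sketch of a SeldonDeployment in which every user container carries a CPU request alongside the HPA spec. The second container (`my-transformer`), its image, and the resource values are hypothetical placeholders rather than part of the original example; adapt them to your own graph.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-with-hpa          # hypothetical name
spec:
  predictors:
  - name: default
    componentSpecs:
    - hpaSpec:
        minReplicas: 1
        maxReplicas: 3
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 70
      spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier_rest:1.3
          resources:
            requests:
              cpu: '0.5'
        # hypothetical second container: it needs its own request too
        - name: my-transformer
          image: example-org/my-transformer:0.1   # placeholder image
          resources:
            requests:
              cpu: '0.1'
    graph:
      name: my-transformer
      type: TRANSFORMER
      children:
      - name: classifier
        type: MODEL
        children: []
```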
Once deployed, the HPA resource may take a few minutes to start up. To check the status of the HPA resource, `kubectl describe hpa -n <namespace>` may be used.
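As a concrete illustration (the namespace `seldon` and the HPA name are placeholders, not taken from the example above):

```bash
# list HPAs in the namespace; --watch keeps printing as utilization and replica counts change
kubectl get hpa -n seldon --watch

# show targets, current metrics, and scaling events for one HPA
kubectl describe hpa <hpa-name> -n seldon
```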


For a worked example see [this notebook](../examples/autoscaling_example.html).
5 changes: 2 additions & 3 deletions examples/models/autoscaling/autoscaling_example.ipynb
@@ -13,13 +13,12 @@
"source": [
"## Prerequisites\n",
" \n",
"- The cluster should have `heapster` and `metric-server` running in the `kube-system` namespace\n",
"- The cluster should have `metric-server` running in the `kube-system` namespace\n",
"- For Kind install `../../testing/scripts/metrics.yaml` See https://github.com/kubernetes-sigs/kind/issues/398\n",
"- For Minikube run:\n",
" \n",
" ```\n",
" minikube addons enable metrics-server\n",
" minikube addons enable heapster\n",
" ```\n",
" "
]
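Before running the notebook, it can be worth confirming that resource metrics are actually being served. A quick check, assuming the standard metrics-server deployment in `kube-system`, is:

```bash
# metrics-server should be deployed and available
kubectl -n kube-system get deployment metrics-server

# if the metrics API is working, this prints CPU/memory usage per node
kubectl top nodes
```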
@@ -90,12 +89,12 @@
"```\n",
" - hpaSpec:\n",
" maxReplicas: 3\n",
" minReplicas: 1\n",
" metrics:\n",
" - resource:\n",
" name: cpu\n",
" targetAverageUtilization: 10\n",
" type: Resource\n",
" minReplicas: 1\n",
"\n",
"```\n",
"\n",
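With `targetAverageUtilization` set to 10, even light traffic should push average CPU over the target and make the HPA scale out. A rough way to generate such traffic from inside the cluster is sketched below; this is not necessarily what the notebook itself does, and the service hostname, namespace, port, and payload are placeholders to adjust.

```bash
# Stream prediction requests at the model to drive CPU usage above the 10% target.
# <seldon-service> and <namespace> are placeholders; the port and path depend on
# how the deployment is exposed and which protocol it uses.
kubectl run load-generator --rm -it --restart=Never --image=curlimages/curl --command -- \
  sh -c 'while true; do curl -s -H "Content-Type: application/json" -d "{\"data\":{\"ndarray\":[[1.0,2.0,5.0]]}}" http://<seldon-service>.<namespace>:8000/api/v1.0/predictions > /dev/null; done'
```

In a second terminal, `kubectl get hpa -n <namespace> --watch` should show the replica count rising towards maxReplicas and falling back once the load stops.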
