Update autoscale docs (#3905)
* added info about scaling multiple containers

* removed heapster prereq, minor refactoring

* minor refactoring
jacobmalmberg authored Jan 31, 2022
1 parent d7ed535 commit 5b32737
Showing 2 changed files with 8 additions and 7 deletions.
10 changes: 6 additions & 4 deletions doc/source/graph/scaling.md
@@ -103,10 +103,10 @@ For more details you can follow [a worked example of scaling](../examples/scale.

## Autoscaling Seldon Deployments

To autoscale your Seldon Deployment resources you can add Horizontal Pod Template Specifications to the Pod Template Specifications you create. There are three steps:
To autoscale your Seldon Deployment resources you can add Horizontal Pod Autoscaler (HPA) specifications to the Pod Template Specifications you create. There are two steps:

1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory.
1. Add a HPA Spec referring to this Deployment. (We presently support v2beta2 version of k8s HPA Metrics spec)
1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as cpu or memory. This has to be done for every container in the SeldonDeployment, except for the seldon-container-image and the storage initializer. Some combinations of protocol and server type may spawn additional support containers; resource requests have to be added to those containers as well.
2. Add an HPA spec referring to this Deployment. (We presently support the v2beta2 version of the Kubernetes HPA metrics spec.)

To illustrate this we have an example Seldon Deployment below:

@@ -121,12 +121,12 @@ spec:
- componentSpecs:
- hpaSpec:
maxReplicas: 3
minReplicas: 1
metrics:
- resource:
name: cpu
targetAverageUtilization: 70
type: Resource
minReplicas: 1
spec:
containers:
- image: seldonio/mock_classifier_rest:1.3
@@ -150,5 +150,7 @@ The key points here are:
* We define a CPU request for our container. This is required to use CPU autoscaling in Kubernetes, and the same applies to every other user container in the componentSpec (see the multi-container sketch below).
* We define an HPA associated with our componentSpec which scales on CPU when the average CPU is above 70%, up to a maximum of 3 replicas.
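To make the multi-container point concrete, below is a minimal sketch of a SeldonDeployment in which every user container carries a CPU request alongside the HPA spec. The second container (`my-transformer`), its image, and the resource values are hypothetical placeholders rather than part of the original example; adapt them to your own graph.

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-with-hpa          # hypothetical name
spec:
  predictors:
  - name: default
    componentSpecs:
    - hpaSpec:
        minReplicas: 1
        maxReplicas: 3
        metrics:
        - type: Resource
          resource:
            name: cpu
            targetAverageUtilization: 70
      spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier_rest:1.3
          resources:
            requests:
              cpu: '0.5'
        # hypothetical second container: it needs its own request too
        - name: my-transformer
          image: example-org/my-transformer:0.1   # placeholder image
          resources:
            requests:
              cpu: '0.1'
    graph:
      name: my-transformer
      type: TRANSFORMER
      children:
      - name: classifier
        type: MODEL
        children: []
```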
Once deployed, the HPA resource may take a few minutes to start up. To check the status of the HPA resource, `kubectl describe hpa -n <namespace>` may be used.
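As a concrete illustration (the namespace `seldon` and the HPA name are placeholders, not taken from the example above):

```bash
# list HPAs in the namespace; --watch keeps printing as utilization and replica counts change
kubectl get hpa -n seldon --watch

# show targets, current metrics, and scaling events for one HPA
kubectl describe hpa <hpa-name> -n seldon
```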


For a worked example see [this notebook](../examples/autoscaling_example.html).
5 changes: 2 additions & 3 deletions examples/models/autoscaling/autoscaling_example.ipynb
@@ -13,13 +13,12 @@
"source": [
"## Prerequisites\n",
" \n",
"- The cluster should have `heapster` and `metric-server` running in the `kube-system` namespace\n",
"- The cluster should have `metric-server` running in the `kube-system` namespace\n",
"- For Kind install `../../testing/scripts/metrics.yaml` See https://github.com/kubernetes-sigs/kind/issues/398\n",
"- For Minikube run:\n",
" \n",
" ```\n",
" minikube addons enable metrics-server\n",
" minikube addons enable heapster\n",
" ```\n",
" "
]
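Before running the notebook, it can be worth confirming that resource metrics are actually being served. A quick check, assuming the standard metrics-server deployment in `kube-system`, is:

```bash
# metrics-server should be deployed and available
kubectl -n kube-system get deployment metrics-server

# if the metrics API is working, this prints CPU/memory usage per node
kubectl top nodes
```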
@@ -90,12 +89,12 @@
"```\n",
" - hpaSpec:\n",
" maxReplicas: 3\n",
" minReplicas: 1\n",
" metrics:\n",
" - resource:\n",
" name: cpu\n",
" targetAverageUtilization: 10\n",
" type: Resource\n",
" minReplicas: 1\n",
"\n",
"```\n",
"\n",
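With `targetAverageUtilization` set to 10, even light traffic should push average CPU over the target and make the HPA scale out. A rough way to generate such traffic from inside the cluster is sketched below; this is not necessarily what the notebook itself does, and the service hostname, namespace, port, and payload are placeholders to adjust.

```bash
# Stream prediction requests at the model to drive CPU usage above the 10% target.
# <seldon-service> and <namespace> are placeholders; the port and path depend on
# how the deployment is exposed and which protocol it uses.
kubectl run load-generator --rm -it --restart=Never --image=curlimages/curl --command -- \
  sh -c 'while true; do curl -s -H "Content-Type: application/json" -d "{\"data\":{\"ndarray\":[[1.0,2.0,5.0]]}}" http://<seldon-service>.<namespace>:8000/api/v1.0/predictions > /dev/null; done'
```

In a second terminal, `kubectl get hpa -n <namespace> --watch` should show the replica count rising towards maxReplicas and falling back once the load stops.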
