diff --git a/vertical-pod-autoscaler/README.md b/vertical-pod-autoscaler/README.md index 6ae688a51474..d64097275a51 100644 --- a/vertical-pod-autoscaler/README.md +++ b/vertical-pod-autoscaler/README.md @@ -1,34 +1,15 @@ # Vertical Pod Autoscaler ## Contents + - [Contents](#contents) - [Intro](#intro) -- [Installation](#installation) - - [Compatibility](#compatibility) - - [Notice on deprecation of v1beta2 version (>=0.13.0)](#notice-on-deprecation-of-v1beta2-version-0130) - - [Notice on removal of v1beta1 version (>=0.5.0)](#notice-on-removal-of-v1beta1-version-050) - - [Prerequisites](#prerequisites) - - [Install command](#install-command) - - [Quick start](#quick-start) - - [Test your installation](#test-your-installation) - - [Example VPA configuration](#example-vpa-configuration) - - [Troubleshooting](#troubleshooting) - - [Components of VPA](#components-of-vpa) - - [Tear down](#tear-down) -- [Limits control](#limits-control) -- [Examples](#examples) - - [Keeping limit proportional to request](#keeping-limit-proportional-to-request) - - [Capping to Limit Range](#capping-to-limit-range) - - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range) - - [Starting multiple recommenders](#starting-multiple-recommenders) - - [Using CPU management with static policy](#using-cpu-management-with-static-policy) - - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource) - - [Limiting which namespaces are used](#limiting-which-namespaces-are-used) - - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy) -- [Known limitations](#known-limitations) +- [Getting started](#getting-started) +- [Components and Architecture](#components-and-architecture) +- [Features and Known limitations](#features-and-known-limitations) - [Related links](#related-links) -# Intro +## Intro Vertical Pod Autoscaler (VPA) frees users from the necessity of setting up-to-date resource requests for the containers in their pods. When @@ -50,402 +31,22 @@ resource recommendations are applied. To enable vertical pod autoscaling on your cluster please follow the installation procedure described below. -# Installation - -The current default version is Vertical Pod Autoscaler 1.2.1 - -### Compatibility - -| VPA version | Kubernetes version | -|-----------------|--------------------| -| 1.2.1 | 1.27+ | -| 1.2.0 | 1.27+ | -| 1.1.2 | 1.25+ | -| 1.1.1 | 1.25+ | -| 1.0 | 1.25+ | -| 0.14 | 1.25+ | -| 0.13 | 1.25+ | -| 0.12 | 1.25+ | -| 0.11 | 1.22 - 1.24 | -| 0.10 | 1.22+ | -| 0.9 | 1.16+ | -| 0.8 | 1.13+ | -| 0.4 to 0.7 | 1.11+ | -| 0.3.X and lower | 1.7+ | - -### Notice on CRD update (>=1.0.0) - -**NOTE:** In version 1.0.0, we have updated the CRD definition and added RBAC for the -status resource. If you are upgrading from version (<=0.14.0), you must update the CRD -definition and RBAC. -```shell -kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml -kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-rbac.yaml -``` -Another method is to re-execute the ./hack/vpa-process-yamls.sh script. 
-```shell -git clone https://github.com/kubernetes/autoscaler.git -cd autoscaler/vertical-pod-autoscaler -git checkout origin/vpa-release-1.0 -REGISTRY=registry.k8s.io/autoscaling TAG=1.0.0 ./hack/vpa-process-yamls.sh apply -``` - -If you need to roll back to version (<=0.14.0), please check out the release for your -rollback version and execute ./hack/vpa-process-yamls.sh. For example, to rollback to 0.14.0: -```shell -git checkout origin/vpa-release-0.14 -REGISTRY=registry.k8s.io/autoscaling TAG=0.14.0 ./hack/vpa-process-yamls.sh apply -kubectl delete clusterrole system:vpa-status-actor -kubectl delete clusterrolebinding system:vpa-status-actor -``` - -### Notice on deprecation of v1beta2 version (>=0.13.0) -**NOTE:** In 0.13.0 we deprecate `autoscaling.k8s.io/v1beta2` API. We plan to -remove this API version. While for now you can continue to use `v1beta2` API we -recommend using `autoscaling.k8s.io/v1` instead. `v1` and `v1beta2` APIs are -almost identical (`v1` API has some fields which are not present in `v1beta2`) -so simply changing which API version you're calling should be enough in almost -all cases. - -### Notice on removal of v1beta1 version (>=0.5.0) - -**NOTE:** In 0.5.0 we disabled the old version of the API - `autoscaling.k8s.io/v1beta1`. -The VPA objects in this version will no longer receive recommendations and -existing recommendations will be removed. The objects will remain present though -and a ConfigUnsupported condition will be set on them. - -This doc is for installing latest VPA. For instructions on migration from older versions see [Migration Doc](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/MIGRATE.md). - -### Prerequisites - -- `kubectl` should be connected to the cluster you want to install VPA. -- The metrics server must be deployed in your cluster. Read more about [Metrics Server](https://github.com/kubernetes-sigs/metrics-server). -- If you are using a GKE Kubernetes cluster, you will need to grant your current Google - identity `cluster-admin` role. Otherwise, you won't be authorized to grant extra - privileges to the VPA system components. - - ```console - $ gcloud info | grep Account # get current google identity - Account: [myname@example.org] - - $ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user=myname@example.org - Clusterrolebinding "myname-cluster-admin-binding" created - ``` - -- If you already have another version of VPA installed in your cluster, you have to tear down - the existing installation first with: - - ```console - ./hack/vpa-down.sh - ``` - -### Install command - -To install VPA, please download the source code of VPA (for example with `git clone https://github.com/kubernetes/autoscaler.git`) -and run the following command inside the `vertical-pod-autoscaler` directory: - -```console -./hack/vpa-up.sh -``` - -Note: the script currently reads environment variables: `$REGISTRY` and `$TAG`. -Make sure you leave them unset unless you want to use a non-default version of VPA. - -Note: If you are seeing following error during this step: -``` -unknown option -addext -``` -please upgrade openssl to version 1.1.1 or higher (needs to support -addext option) or use ./hack/vpa-up.sh on the [0.8 release branch](https://github.com/kubernetes/autoscaler/tree/vpa-release-0.8). 
- -The script issues multiple `kubectl` commands to the -cluster that insert the configuration and start all needed pods (see -[architecture](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md#architecture-overview)) -in the `kube-system` namespace. It also generates -and uploads a secret (a CA cert) used by VPA Admission Controller when communicating -with the API server. - -To print YAML contents with all resources that would be understood by -`kubectl diff|apply|...` commands, you can use - -```console -./hack/vpa-process-yamls.sh print -``` - -The output of that command won't include secret information generated by -[pkg/admission-controller/gencerts.sh](pkg/admission-controller/gencerts.sh) script. - -### Quick start - -After [installation](#installation) the system is ready to recommend and set -resource requests for your pods. -In order to use it, you need to insert a *Vertical Pod Autoscaler* resource for -each controller that you want to have automatically computed resource requirements. -This will be most commonly a **Deployment**. -There are four modes in which *VPAs* operate: - -- `"Auto"`: VPA assigns resource requests on pod creation as well as updates - them on existing pods using the preferred update mechanism. Currently, this is - equivalent to `"Recreate"` (see below). Once restart free ("in-place") update - of pod requests is available, it may be used as the preferred update mechanism by - the `"Auto"` mode. -- `"Recreate"`: VPA assigns resource requests on pod creation as well as updates - them on existing pods by evicting them when the requested resources differ significantly - from the new recommendation (respecting the Pod Disruption Budget, if defined). - This mode should be used rarely, only if you need to ensure that the pods are restarted - whenever the resource request changes. Otherwise, prefer the `"Auto"` mode which may take - advantage of restart-free updates once they are available. -- `"Initial"`: VPA only assigns resource requests on pod creation and never changes them - later. -- `"Off"`: VPA does not automatically change the resource requirements of the pods. - The recommendations are calculated and can be inspected in the VPA object. - -### Test your installation - -A simple way to check if Vertical Pod Autoscaler is fully operational in your -cluster is to create a sample deployment and a corresponding VPA config: - -```console -kubectl create -f examples/hamster.yaml -``` - -The above command creates a deployment with two pods, each running a single container -that requests 100 millicores and tries to utilize slightly above 500 millicores. -The command also creates a VPA config pointing at the deployment. -VPA will observe the behaviour of the pods, and after about 5 minutes, they should get -updated with a higher CPU request -(note that VPA does not modify the template in the deployment, but the actual requests -of the pods are updated). To see VPA config and current recommended resource requests run: - -```console -kubectl describe vpa -``` - -*Note: if your cluster has little free capacity these pods may be unable to schedule. 
-You may need to add more nodes or adjust examples/hamster.yaml to use less CPU.* - -### Example VPA configuration - -```yaml -apiVersion: autoscaling.k8s.io/v1 -kind: VerticalPodAutoscaler -metadata: - name: my-app-vpa -spec: - targetRef: - apiVersion: "apps/v1" - kind: Deployment - name: my-app - updatePolicy: - updateMode: "Auto" -``` - -### Troubleshooting - -To diagnose problems with a VPA installation, perform the following steps: - -- Check if all system components are running: - -```console -kubectl --namespace=kube-system get pods|grep vpa -``` - -The above command should list 3 pods (recommender, updater and admission-controller) -all in state Running. - -- Check if the system components log any errors. - For each of the pods returned by the previous command do: - -```console -kubectl --namespace=kube-system logs [pod name] | grep -e '^E[0-9]\{4\}' -``` - -- Check that the VPA Custom Resource Definition was created: - -```console -kubectl get customresourcedefinition | grep verticalpodautoscalers -``` - -### Components of VPA - -The project consists of 3 components: - -- [Recommender](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/recommender/README.md) - monitors the current and past resource consumption and, based on it, - provides recommended values for the containers' cpu and memory requests. - -- [Updater](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/updater/README.md) - checks which of the managed pods have correct resources set and, if not, - kills them so that they can be recreated by their controllers with the updated requests. - -- [Admission Plugin](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/admission-controller/README.md) - sets the correct resource requests on new pods (either just created - or recreated by their controller due to Updater's activity). - -More on the architecture can be found [HERE](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md). - -### Tear down - -Note that if you stop running VPA in your cluster, the resource requests -for the pods already modified by VPA will not change, but any new pods -will get resources as defined in your controllers (i.e. deployment or -replicaset) and not according to previous recommendations made by VPA. - -To stop using Vertical Pod Autoscaling in your cluster: - -- If running on GKE, clean up role bindings created in [Prerequisites](#prerequisites): - -```console -kubectl delete clusterrolebinding myname-cluster-admin-binding -``` - -- Tear down VPA components: - -```console -./hack/vpa-down.sh -``` - -# Limits control - -When setting limits VPA will conform to -[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103). -It will maintain limit to request ratio specified for all containers. - -VPA will try to cap recommendations between min and max of -[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If limit range conflicts -with VPA resource policy, VPA will follow VPA policy (and set values outside the limit -range). - -To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`. - -## Examples - -### Keeping limit proportional to request - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. 
The template also -specifies resource limit of 2 GB RAM. VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA -applies the recommendation, it will also set the memory limit to 4 GB. - -### Capping to Limit Range - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also -specifies resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. -VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will -set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB ( -to maintain a 2:1 limit/request ratio from the template). - -### Resource Policy Overriding Limit Range - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also -specifies a resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. -VPAs Container Resource Policy requires VPA to set containers request to at least 750 milli CPU and -2 GB RAM. VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation, -VPA will set RAM request to 2 GB (following the resource policy) and RAM limit to 4 GB (to maintain -the 2:1 limit/request ratio from the template). - -### Starting multiple recommenders - -It is possible to start one or more extra recommenders in order to use different percentile on different workload profiles. -For example you could have 3 profiles: [frugal](deploy/recommender-deployment-low.yaml), -[standard](deploy/recommender-deployment.yaml) and -[performance](deploy/recommender-deployment-high.yaml) which will -use different TargetCPUPercentile (50, 90 and 95) to calculate their recommendations. - -Please note the usage of the following arguments to override default names and percentiles: - -- --recommender-name=performance -- --target-cpu-percentile=0.95 - -You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec. - -### Custom memory bump-up after OOMKill - -After an OOMKill event was observed, VPA increases the memory recommendation based on the observed memory usage in the event according to this formula: `recommendation = memory-usage-in-oomkill-event + max(oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`. -You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender: -`oom-bump-up-ratio` specifies the memory bump up ratio when OOM occurred, default is `1.2`. This means, memory will be increased by 20% after an OOMKill event. -`oom-min-bump-up-bytes` specifies minimal increase of memory after observing OOM. Defaults to `100 * 1024 * 1024` (=100MiB) - -Usage in recommender deployment - -```yaml - containers: - - name: recommender - args: - - --oom-bump-up-ratio=2.0 - - --oom-min-bump-up-bytes=524288000 -``` - -### Using CPU management with static policy - -If you are using the [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers, -you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can perform a round up on the CPU recommendation. Recommendation capping still applies after the round up. -To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender. -The pre-processor only acts on containers having a specific configuration. 
This configuration consists in an annotation on your VPA object for each impacted container.
-The annotation format is the following:
-
-```
-vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
-```
-
-### Controlling eviction behavior based on scaling direction and resource
-
-To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container
-
-Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when they both are scaled down
-
-```yaml
-  updatePolicy:
-    evictionRequirements:
-      - resources: ["cpu", "memory"]
-        changeRequirement: TargetHigherThanRequests
-```
-
-Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.
-
-### Limiting which namespaces are used
-
- By default the VPA will run against all namespaces. You can limit that behaviour by setting the following options:
+## Getting Started

-1. `ignored-vpa-object-namespaces` - A comma separated list of namespaces to ignore
-1. `vpa-object-namespace` - A single namespace to monitor
+See [Installation](./docs/installation.md) for an installation guide, followed by the [Quick start](./docs/quickstart.md) guide.

-These options cannot be used together and are mutually exclusive.
+Also refer to the [FAQ](./docs/faq.md) for answers to common questions.

-### Setting the webhook failurePolicy
+## Components and Architecture

-It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
-Please use this option with caution as it may be possible to break Pod creation if there is a failure with the VPA.
-Using it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce risk.
+The Vertical Pod Autoscaler consists of three components: the recommender, the updater and the admission-controller. Read more about them on the [components](./docs/components.md) page.

-# Known limitations
+## Features and Known limitations

-- Whenever VPA updates the pod resources, the pod is recreated, which causes all
-  running containers to be recreated. The pod may be recreated on a different
-  node.
-- VPA cannot guarantee that pods it evicts or deletes to apply recommendations
-  (when configured in `Auto` and `Recreate` modes) will be successfully
-  recreated. This can be partly
-  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
-- VPA does not update resources of pods which are not run under a controller.
-- Vertical Pod Autoscaler **should not be used with the [Horizontal Pod
-  Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics)
-  (HPA) on the same resource metric (CPU or memory)** at this moment. However, you can use [VPA with
-  HPA on separate resource metrics](https://github.com/kubernetes/autoscaler/issues/6247) (e.g.
VPA
-on memory and HPA on CPU) as well as with [HPA on custom and external
-metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-custom-metrics).
-- The VPA admission controller is an admission webhook. If you add other admission webhooks
-to your cluster, it is important to analyze how they interact and whether they may conflict
-with each other. The order of admission controllers is defined by a flag on API server.
-- VPA reacts to most out-of-memory events, but not in all situations.
-- VPA performance has not been tested in large clusters.
-- VPA recommendation might exceed available resources (e.g. Node size, available
-  size, available quota) and cause **pods to go pending**. This can be partly
-  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
-- Multiple VPA resources matching the same pod have undefined behavior.
-- Running the vpa-recommender with leader election enabled (`--leader-elect=true`) in a GKE cluster
-  causes contention with a lease called `vpa-recommender` held by the GKE system component of the
-  same name. To run your own VPA in GKE, make sure to specify a different lease name using
-  `--leader-elect-resource-name=vpa-recommender-lease` (or specify your own lease name).
+You can also read about the [features](./docs/features.md) and [known limitations](./docs/known-limitations.md) of the VPA.

-# Related links
+## Related links

-- [FAQ](FAQ.md)
 - [Design proposal](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md)
 - [API
diff --git a/vertical-pod-autoscaler/docs/components.md b/vertical-pod-autoscaler/docs/components.md
new file mode 100644
index 000000000000..74fbceb9ae01
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/components.md
@@ -0,0 +1,135 @@
+# Components
+
+## Contents
+
+- [Components](#components)
+  - [Introduction](#introduction)
+  - [Recommender](#recommender)
+    - [Running](#running-the-recommender)
+    - [Implementation](#implementation-of-the-recommender)
+  - [Updater](#updater)
+    - [Current implementation](#current-implementation)
+    - [Missing parts](#missing-parts)
+  - [Admission Controller](#admission-controller)
+    - [Running](#running-the-admission-controller)
+    - [Implementation](#implementation-of-the-admission-controller)
+
+## Introduction
+
+The VPA project consists of 3 components:
+
+- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it,
+  provides recommended values for the containers' CPU and memory requests.
+
+- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not,
+  kills them so that they can be recreated by their controllers with the updated requests.
+
+- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created
+  or recreated by their controller due to the Updater's activity).
+
+More on the architecture can be found [here](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md).
+
+## Recommender
+
+The Recommender is the core binary of the Vertical Pod Autoscaler system.
+It computes the recommended resource requests for pods based on
+historical and current usage of the resources.
+The current recommendations are put in the status of the VPA resource, where they
+can be inspected.
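+
+For illustration, a recommendation surfaced in a VPA object's status looks roughly like the sketch below (the container name and values are hypothetical; inspect your own objects with `kubectl describe vpa`):
+
+```yaml
+status:
+  recommendation:
+    containerRecommendations:
+    - containerName: my-container   # hypothetical container name
+      lowerBound:                   # minimum resources the container should run with
+        cpu: 25m
+        memory: 262144k
+      target:                       # the recommended request
+        cpu: 50m
+        memory: 262144k
+      uncappedTarget:               # recommendation before capping by resource policy
+        cpu: 50m
+        memory: 262144k
+      upperBound:                   # maximum recommended resources
+        cpu: 200m
+        memory: 500Mi
+```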
+
+### Running the recommender
+
+- In order to have historical data pulled in by the recommender, install
+  Prometheus in your cluster and pass its address through a flag.
+- Create RBAC configuration from `../deploy/vpa-rbac.yaml`.
+- Create a deployment with the recommender pod from
+  `../deploy/recommender-deployment.yaml`.
+- The recommender will start running and pushing its recommendations to VPA
+  object statuses.
+
+### Implementation of the recommender
+
+The recommender is based on a model of the cluster that it builds in its memory.
+The model contains Kubernetes resources: *Pods* and *VerticalPodAutoscalers*, with
+their configuration (e.g. labels) as well as other information, e.g. usage data for
+each container.
+
+After starting the binary, the recommender reads the history of running pods and
+their usage from Prometheus into the model.
+It then runs in a loop and at each step performs the following actions:
+
+- update the model with recent information on resources (using listers based on
+  watch),
+- update the model with fresh usage samples from the Metrics API,
+- compute a new recommendation for each VPA,
+- put any changed recommendations into the VPA resources.
+
+## Updater
+
+The Updater component of the Vertical Pod Autoscaler is described in the [Vertical Pod Autoscaler design proposal](https://github.com/kubernetes/community/pull/338).
+
+The Updater runs in the Kubernetes cluster and decides which pods should be restarted
+based on the resource allocation recommendations calculated by the Recommender.
+If a pod should be updated, the Updater will try to evict it.
+It respects the Pod Disruption Budget by using the Eviction API to evict pods.
+The Updater does not perform the actual resource update; it relies on the Vertical Pod Autoscaler admission plugin
+to update pod resources when the pod is recreated after eviction.
+
+### Current implementation
+
+The Updater runs in a loop. In each iteration it performs the following:
+
+- Fetching Vertical Pod Autoscaler configuration using a lister implementation.
+- Fetching live pod information with the pods' current resource allocation.
+- For each group of replicated pods, calculating whether a pod update is required and how many replicas can be evicted.
+The Updater will always allow the eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag.
+- Evicting pods whose recommended resources differ significantly from their actual resource allocation.
+The threshold for evicting pods is specified by the recommended min/max values from the VPA resource.
+The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources
+(e.g. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted
+before a pod with a 20% memory increase and no change in CPU).
+
+### Missing parts
+
+- Recommendation API for fetching data from the Vertical Pod Autoscaler Recommender.
+
+## Admission Controller
+
+This is a binary that registers itself as a Mutating Admission Webhook
+and is therefore on the path of all pod creation.
+For each pod creation, it will get a request from the apiserver and it will
+either decide that there is no matching VPA configuration or find the corresponding
+one and use its current recommendation to set resource requests in the pod.
+
+### Running the admission controller
+
+1. You should make sure your API server supports Mutating Webhooks.
+Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of
+the values on the list and its `--runtime-config` flag should include
+`admissionregistration.k8s.io/v1beta1=true`.
+To change those flags, ssh to your API Server instance, edit
+`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick
+up the changes: ```sudo systemctl restart kubelet.service```
+1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create
+   a secret in your cluster with the certs.
+1. Create RBAC configuration for the admission controller pod by running
+   `kubectl create -f ../deploy/admission-controller-rbac.yaml`.
+1. Create the pod:
+   `kubectl create -f ../deploy/admission-controller-deployment.yaml`.
+   The first thing the pod will do is register itself with the apiserver as a
+   Mutating Admission Webhook and start changing resource requirements
+   for pods on their creation and updates.
+1. You can specify a path for it to register as a part of the installation process
+   by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`.
+1. You can specify a minimum TLS version with `--min-tls-version`, with acceptable values being `tls1_2` (default) or `tls1_3`.
+1. You can also specify a comma or colon separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`.
+1. You can specify a comma separated list to set webhook labels with `--webhook-labels`, example format: key1:value1,key2:value2.
+
+### Implementation of the Admission Controller
+
+All VPA configurations in the cluster are watched with a lister.
+In the context of pod creation, there is an incoming HTTPS request from
+the apiserver.
+The logic to serve that request involves finding the appropriate VPA, retrieving
+the current recommendation from it, and encoding the recommendation as a JSON patch to
+the Pod resource.
diff --git a/vertical-pod-autoscaler/docs/examples.md b/vertical-pod-autoscaler/docs/examples.md
new file mode 100644
index 000000000000..ed5d5108601b
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/examples.md
@@ -0,0 +1,110 @@
+# Examples
+
+## Contents
+
+- [Examples](#examples)
+  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
+  - [Capping to Limit Range](#capping-to-limit-range)
+  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
+  - [Starting multiple recommenders](#starting-multiple-recommenders)
+  - [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill)
+  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
+  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
+  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
+  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)
+
+## Keeping limit proportional to request
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA
+applies the recommendation, it will also set the memory limit to 4 GB.
+
+## Capping to Limit Range
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
+The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will
+set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB
+(to maintain the 2:1 limit/request ratio from the template).
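+
+For reference, a LimitRange producing the 3 GB cap assumed above could look like this minimal sketch (the object name is illustrative):
+
+```yaml
+apiVersion: v1
+kind: LimitRange
+metadata:
+  name: memory-cap        # illustrative name
+spec:
+  limits:
+  - type: Container
+    max:
+      memory: 3Gi         # per-container maximum limit that VPA will respect
+```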
+
+## Resource Policy Overriding Limit Range
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
+The VPA's Container Resource Policy requires VPA to set the container's request to at least 750 milli CPU and
+2 GB of RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation,
+VPA will set the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain
+the 2:1 limit/request ratio from the template).
+
+## Starting multiple recommenders
+
+It is possible to start one or more extra recommenders in order to use different percentiles for different workload profiles.
+For example, you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml),
+[standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and
+[performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which
+use different target CPU percentiles (50, 90 and 95) to calculate their recommendations.
+
+Please note the usage of the following arguments to override the default names and percentiles:
+
+- --recommender-name=performance
+- --target-cpu-percentile=0.95
+
+You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec.
+
+## Custom memory bump-up after OOMKill
+
+After an OOMKill event is observed, VPA increases the memory recommendation based on the memory usage observed in the event, according to this formula: `recommendation = max(memory-usage-in-oomkill-event + oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`.
+You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:
+`oom-bump-up-ratio` specifies the memory bump-up ratio when an OOM occurs; the default is `1.2`. This means memory will be increased by 20% after an OOMKill event.
+`oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM. It defaults to `100 * 1024 * 1024` (=100MiB).
+
+Usage in the recommender deployment:
+
+```yaml
+  containers:
+  - name: recommender
+    args:
+      - --oom-bump-up-ratio=2.0
+      - --oom-min-bump-up-bytes=524288000
+```
+
+## Using CPU management with static policy
+
+If you are using [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers,
+you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can perform a round-up on the CPU recommendation. Recommendation capping still applies after the round-up.
+To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
+The pre-processor only acts on containers having a specific configuration. This configuration consists of an annotation on your VPA object for each impacted container.
+The annotation format is the following:
+
+```yaml
+vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
+```
+
+## Controlling eviction behavior based on scaling direction and resource
+
+To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec.
An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container.
+
+Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when they both are scaled down:
+
+```yaml
+  updatePolicy:
+    evictionRequirements:
+      - resources: ["cpu", "memory"]
+        changeRequirement: TargetHigherThanRequests
+```
+
+Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.
+
+## Limiting which namespaces are used
+
+By default, the VPA will run against all namespaces. You can limit that behaviour by setting the following options:
+
+1. `ignored-vpa-object-namespaces` - A comma separated list of namespaces to ignore
+1. `vpa-object-namespace` - A single namespace to monitor
+
+These options are mutually exclusive and cannot be used together.
+
+## Setting the webhook failurePolicy
+
+It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
+Please use this option with caution, as it may be possible to break Pod creation if there is a failure with the VPA.
+Use it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce the risk.
diff --git a/vertical-pod-autoscaler/FAQ.md b/vertical-pod-autoscaler/docs/faq.md
similarity index 97%
rename from vertical-pod-autoscaler/FAQ.md
rename to vertical-pod-autoscaler/docs/faq.md
index 53a83ff489b8..80f1c8774af1 100644
--- a/vertical-pod-autoscaler/FAQ.md
+++ b/vertical-pod-autoscaler/docs/faq.md
@@ -2,7 +2,7 @@

 ## Contents

-- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-CPU-or-memory-settings)
+- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-cpu-or-memory-settings)
 - [How can I apply VPA to my Custom Resource?](#how-can-i-apply-vpa-to-my-custom-resource)
 - [How can I use Prometheus as a history provider for the VPA recommender?](#how-can-i-use-prometheus-as-a-history-provider-for-the-vpa-recommender)
 - [I get recommendations for my single pod replicaSet, but they are not applied. Why?](#i-get-recommendations-for-my-single-pod-replicaset-but-they-are-not-applied)
@@ -135,7 +135,7 @@ spec:
         - --v=4
         - --storage=prometheus
         - --prometheus-address=http://prometheus.default.svc.cluster.local:9090
-    ```
+```

 In this example, Prometheus is running in the default namespace.

@@ -148,9 +148,9 @@ Here you should see the flags that you set for the VPA recommender and you shoul

 This means that the VPA recommender is now using Prometheus as the history provider.

-### I get recommendations for my single pod replicaSet but they are not applied
+### I get recommendations for my single pod replicaset but they are not applied

-By default, the [`--min-replicas`](pkg/updater/main.go#L56) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](deploy/updater-deployment.yaml) file:
+By default, the [`--min-replicas`](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/updater/main.go#L44) flag on the updater is set to 2.
To change this, you can supply the arg in the [deploy/updater-deployment.yaml](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/deploy/updater-deployment.yaml) file:

 ```yaml
 spec:
@@ -179,7 +179,7 @@ election with the `--leader-elect=true` parameter.
 The following startup parameters are supported for VPA recommender:

 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `recommendation-margin-fraction` | Float64 | Fraction of usage added as the safety margin to the recommended request | 0.15
 `pod-recommendation-min-cpu-millicores` | Float64 | Minimum CPU recommendation for a pod | 25
 `pod-recommendation-min-memory-mb` | Float64 | Minimum memory recommendation for a pod | 250
@@ -230,7 +230,7 @@ Name | Type | Description | Default
 The following startup parameters are supported for VPA updater:

 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `pod-update-threshold` | Float64 | Ignore updates that have priority lower than the value of this flag | 0.1
 `in-recommendation-bounds-eviction-lifetime-threshold` | Duration | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range | time.Hour*12
 `evict-after-oom-threshold` | Duration | Evict pod that has OOMed in less than evict-after-oom-threshold since start. | 10*time.Minute
diff --git a/vertical-pod-autoscaler/docs/features.md b/vertical-pod-autoscaler/docs/features.md
new file mode 100644
index 000000000000..ff8ced24041b
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/features.md
@@ -0,0 +1,18 @@
+# Features
+
+## Contents
+
+- [Limits control](#limits-control)
+
+## Limits control
+
+When setting limits, VPA will conform to
+[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103).
+It will maintain the limit-to-request ratio specified for all containers.
+
+VPA will try to cap recommendations between the min and max of
+[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If a limit range conflicts
+with the VPA resource policy, VPA will follow the VPA policy (and set values outside the limit
+range).
+
+To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`.
diff --git a/vertical-pod-autoscaler/docs/installation.md b/vertical-pod-autoscaler/docs/installation.md
new file mode 100644
index 000000000000..69b53d5d1fac
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/installation.md
@@ -0,0 +1,161 @@
+# Installation
+
+## Contents
+
+- [Installation](#installation)
+  - [Compatibility](#compatibility)
+  - [Notice on CRD update (>=1.0.0)](#notice-on-crd-update-100)
+  - [Notice on deprecation of v1beta2 version (>=0.13.0)](#notice-on-deprecation-of-v1beta2-version-0130)
+  - [Notice on removal of v1beta1 version (>=0.5.0)](#notice-on-removal-of-v1beta1-version-050)
+  - [Prerequisites](#prerequisites)
+  - [Install command](#install-command)
+  - [Tear down](#tear-down)
+
+The current default version is Vertical Pod Autoscaler 1.2.1.
+
+## Compatibility
+
+| VPA version     | Kubernetes version |
+|-----------------|--------------------|
+| 1.2.1           | 1.27+              |
+| 1.2.0           | 1.27+              |
+| 1.1.2           | 1.25+              |
+| 1.1.1           | 1.25+              |
+| 1.0             | 1.25+              |
+| 0.14            | 1.25+              |
+| 0.13            | 1.25+              |
+| 0.12            | 1.25+              |
+| 0.11            | 1.22 - 1.24        |
+| 0.10            | 1.22+              |
+| 0.9             | 1.16+              |
+| 0.8             | 1.13+              |
+| 0.4 to 0.7      | 1.11+              |
+| 0.3.X and lower | 1.7+               |
+
+## Notice on CRD update (>=1.0.0)
+
+**NOTE:** In version 1.0.0, we have updated the CRD definition and added RBAC for the
+status resource.
If you are upgrading from version (<=0.14.0), you must update the CRD
+definition and RBAC.
+
+```shell
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-rbac.yaml
+```
+
+Another method is to re-execute the `./hack/vpa-process-yamls.sh` script.
+
+```shell
+git clone https://github.com/kubernetes/autoscaler.git
+cd autoscaler/vertical-pod-autoscaler
+git checkout origin/vpa-release-1.0
+REGISTRY=registry.k8s.io/autoscaling TAG=1.0.0 ./hack/vpa-process-yamls.sh apply
+```
+
+If you need to roll back to version (<=0.14.0), please check out the release for your
+rollback version and execute `./hack/vpa-process-yamls.sh`. For example, to roll back to 0.14.0:
+
+```shell
+git checkout origin/vpa-release-0.14
+REGISTRY=registry.k8s.io/autoscaling TAG=0.14.0 ./hack/vpa-process-yamls.sh apply
+kubectl delete clusterrole system:vpa-status-actor
+kubectl delete clusterrolebinding system:vpa-status-actor
+```
+
+## Notice on deprecation of v1beta2 version (>=0.13.0)
+
+**NOTE:** In 0.13.0 we deprecated the `autoscaling.k8s.io/v1beta2` API. We plan to
+remove this API version. While you can continue to use the `v1beta2` API for now, we
+recommend using `autoscaling.k8s.io/v1` instead. The `v1` and `v1beta2` APIs are
+almost identical (the `v1` API has some fields which are not present in `v1beta2`),
+so simply changing which API version you're calling should be enough in almost
+all cases.
+
+## Notice on removal of v1beta1 version (>=0.5.0)
+
+**NOTE:** In 0.5.0 we disabled the old version of the API - `autoscaling.k8s.io/v1beta1`.
+The VPA objects in this version will no longer receive recommendations, and
+existing recommendations will be removed. The objects will remain present, though,
+and a ConfigUnsupported condition will be set on them.
+
+This doc is for installing the latest VPA. For instructions on migrating from older versions, see the [Migration Doc](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/MIGRATE.md).
+
+## Prerequisites
+
+- `kubectl` should be connected to the cluster you want to install VPA in.
+- The metrics server must be deployed in your cluster. Read more about [Metrics Server](https://github.com/kubernetes-sigs/metrics-server).
+- If you are using a GKE Kubernetes cluster, you will need to grant your current Google
+  identity the `cluster-admin` role. Otherwise, you won't be authorized to grant extra
+  privileges to the VPA system components.
+
+  ```console
+  $ gcloud info | grep Account # get current google identity
+  Account: [myname@example.org]
+
+  $ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user=myname@example.org
+  Clusterrolebinding "myname-cluster-admin-binding" created
+  ```
+
+- If you already have another version of VPA installed in your cluster, you have to tear down
+  the existing installation first with:
+
+  ```console
+  ./hack/vpa-down.sh
+  ```
+
+## Install command
+
+To install VPA, please download the source code of VPA (for example with `git clone https://github.com/kubernetes/autoscaler.git`)
+and run the following command inside the `vertical-pod-autoscaler` directory:

+```console
+./hack/vpa-up.sh
+```
+
+Note: the script currently reads the environment variables `$REGISTRY` and `$TAG`.
+Make sure you leave them unset unless you want to use a non-default version of VPA.
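+
+For example, to install the 1.2.0 release instead of the current default (the tag shown is illustrative; see the [compatibility table](#compatibility) above for supported combinations):
+
+```console
+REGISTRY=registry.k8s.io/autoscaling TAG=1.2.0 ./hack/vpa-up.sh
+```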
+
+Note: If you see the following error during this step:
+
+```console
+unknown option -addext
+```
+
+please upgrade OpenSSL to version 1.1.1 or higher (it needs to support the -addext option), or use `./hack/vpa-up.sh` on the [0.8 release branch](https://github.com/kubernetes/autoscaler/tree/vpa-release-0.8).
+
+The script issues multiple `kubectl` commands to the
+cluster that insert the configuration and start all needed pods (see
+[architecture](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md#architecture-overview))
+in the `kube-system` namespace. It also generates
+and uploads a secret (a CA cert) used by the VPA Admission Controller when communicating
+with the API server.
+
+To print the YAML contents with all the resources that would be understood by
+`kubectl diff|apply|...` commands, you can use:
+
+```console
+./hack/vpa-process-yamls.sh print
+```
+
+The output of that command won't include the secret information generated by the
+[pkg/admission-controller/gencerts.sh](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/admission-controller/gencerts.sh) script.
+
+## Tear down
+
+Note that if you stop running VPA in your cluster, the resource requests
+for the pods already modified by VPA will not change, but any new pods
+will get resources as defined in your controllers (e.g. a Deployment or
+ReplicaSet) and not according to previous recommendations made by VPA.
+
+To stop using Vertical Pod Autoscaling in your cluster:
+
+- If running on GKE, clean up the role bindings created in [Prerequisites](#prerequisites):
+
+```console
+kubectl delete clusterrolebinding myname-cluster-admin-binding
+```
+
+- Tear down VPA components:
+
+```console
+./hack/vpa-down.sh
+```
diff --git a/vertical-pod-autoscaler/docs/known-limitations.md b/vertical-pod-autoscaler/docs/known-limitations.md
new file mode 100644
index 000000000000..a6e08c849016
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/known-limitations.md
@@ -0,0 +1,29 @@
+# Known limitations
+
+- Whenever VPA updates the pod resources, the pod is recreated, which causes all
+  running containers to be recreated. The pod may be recreated on a different
+  node.
+- VPA cannot guarantee that pods it evicts or deletes to apply recommendations
+  (when configured in `Auto` and `Recreate` modes) will be successfully
+  recreated. This can be partly
+  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
+- VPA does not update resources of pods which are not run under a controller.
+- Vertical Pod Autoscaler **should not be used with the [Horizontal Pod
+  Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics)
+  (HPA) on the same resource metric (CPU or memory)** at this moment. However, you can use [VPA with
+  HPA on separate resource metrics](https://github.com/kubernetes/autoscaler/issues/6247) (e.g. VPA
+  on memory and HPA on CPU) as well as with [HPA on custom and external
+  metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-custom-metrics).
+- The VPA admission controller is an admission webhook. If you add other admission webhooks
+  to your cluster, it is important to analyze how they interact and whether they may conflict
+  with each other. The order of admission controllers is defined by a flag on the API server.
+- VPA reacts to most out-of-memory events, but not in all situations.
+- VPA performance has not been tested in large clusters.
+- VPA recommendations might exceed available resources (e.g. Node size, available
+  size, available quota) and cause **pods to go pending**. This can be partly
+  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
+- Multiple VPA resources matching the same pod have undefined behavior.
+- Running the vpa-recommender with leader election enabled (`--leader-elect=true`) in a GKE cluster
+  causes contention with a lease called `vpa-recommender` held by the GKE system component of the
+  same name. To run your own VPA in GKE, make sure to specify a different lease name using
+  `--leader-elect-resource-name=vpa-recommender-lease` (or specify your own lease name).
diff --git a/vertical-pod-autoscaler/docs/quickstart.md b/vertical-pod-autoscaler/docs/quickstart.md
new file mode 100644
index 000000000000..7ef784d30009
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/quickstart.md
@@ -0,0 +1,97 @@
+# Quick start
+
+## Contents
+
+- [Quick start](#quick-start)
+  - [Test your installation](#test-your-installation)
+  - [Example VPA configuration](#example-vpa-configuration)
+  - [Troubleshooting](#troubleshooting)
+
+After [installation](./installation.md) the system is ready to recommend and set
+resource requests for your pods.
+To use it, you need to insert a *Vertical Pod Autoscaler* resource for
+each controller whose resource requirements you want automatically computed.
+This will most commonly be a **Deployment**.
+There are four modes in which *VPAs* operate:
+
+- `"Auto"`: VPA assigns resource requests on pod creation as well as updates
+  them on existing pods using the preferred update mechanism. Currently, this is
+  equivalent to `"Recreate"` (see below). Once restart-free ("in-place") updates
+  of pod requests are available, they may be used as the preferred update mechanism in
+  the `"Auto"` mode.
+- `"Recreate"`: VPA assigns resource requests on pod creation as well as updates
+  them on existing pods by evicting them when the requested resources differ significantly
+  from the new recommendation (respecting the Pod Disruption Budget, if defined).
+  This mode should be used rarely, only if you need to ensure that the pods are restarted
+  whenever the resource request changes. Otherwise, prefer the `"Auto"` mode, which may take
+  advantage of restart-free updates once they are available.
+- `"Initial"`: VPA only assigns resource requests on pod creation and never changes them
+  later.
+- `"Off"`: VPA does not automatically change the resource requirements of the pods.
+  The recommendations are calculated and can be inspected in the VPA object.
+
+## Test your installation
+
+A simple way to check if Vertical Pod Autoscaler is fully operational in your
+cluster is to create a sample deployment and a corresponding VPA config:
+
+```console
+kubectl create -f examples/hamster.yaml
+```
+
+The above command creates a deployment with two pods, each running a single container
+that requests 100 millicores and tries to utilize slightly above 500 millicores.
+The command also creates a VPA config pointing at the deployment.
+VPA will observe the behaviour of the pods, and after about 5 minutes, they should get
+updated with a higher CPU request
+(note that VPA does not modify the template in the deployment, but the actual requests
+of the pods are updated).
To see VPA config and current recommended resource requests run: + +```console +kubectl describe vpa +``` + +*Note: if your cluster has little free capacity these pods may be unable to schedule. +You may need to add more nodes or adjust examples/hamster.yaml to use less CPU.* + +## Example VPA configuration + +```yaml +apiVersion: autoscaling.k8s.io/v1 +kind: VerticalPodAutoscaler +metadata: + name: my-app-vpa +spec: + targetRef: + apiVersion: "apps/v1" + kind: Deployment + name: my-app + updatePolicy: + updateMode: "Auto" +``` + +## Troubleshooting + +To diagnose problems with a VPA installation, perform the following steps: + +- Check if all system components are running: + +```console +kubectl --namespace=kube-system get pods|grep vpa +``` + +The above command should list 3 pods (recommender, updater and admission-controller) +all in state Running. + +- Check if the system components log any errors. + For each of the pods returned by the previous command do: + +```console +kubectl --namespace=kube-system logs [pod name] | grep -e '^E[0-9]\{4\}' +``` + +- Check that the VPA Custom Resource Definition was created: + +```console +kubectl get customresourcedefinition | grep verticalpodautoscalers +``` diff --git a/vertical-pod-autoscaler/pkg/admission-controller/README.md b/vertical-pod-autoscaler/pkg/admission-controller/README.md deleted file mode 100644 index 1f11552cad66..000000000000 --- a/vertical-pod-autoscaler/pkg/admission-controller/README.md +++ /dev/null @@ -1,46 +0,0 @@ -# VPA Admission Controller - -- [Intro](#intro) -- [Running](#running) -- [Implementation](#implementation) - -## Intro - -This is a binary that registers itself as a Mutating Admission Webhook -and because of that is on the path of creating all pods. -For each pod creation, it will get a request from the apiserver and it will -either decide there's no matching VPA configuration or find the corresponding -one and use current recommendation to set resource requests in the pod. - -## Running - -1. You should make sure your API server supports Mutating Webhooks. -Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of -the values on the list and its `--runtime-config` flag should include -`admissionregistration.k8s.io/v1beta1=true`. -To change those flags, ssh to your API Server instance, edit -`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick -up the changes: ```sudo systemctl restart kubelet.service``` -1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create - a secret in your cluster with the certs. -1. Create RBAC configuration for the admission controller pod by running - `kubectl create -f ../deploy/admission-controller-rbac.yaml` -1. Create the pod: - `kubectl create -f ../deploy/admission-controller-deployment.yaml`. - The first thing this will do is it will register itself with the apiserver as - Webhook Admission Controller and start changing resource requirements - for pods on their creation & updates. -1. You can specify a path for it to register as a part of the installation process - by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`. -1. You can specify a minimum TLS version with `--min-tls-version` with acceptable values being `tls1_2` (default), or `tls1_3`. -1. You can also specify a comma or colon separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`. -1. 
You can specify a comma separated list to set webhook labels with `--webhook-labels`, example format: key1:value1,key2:value2. - -## Implementation - -All VPA configurations in the cluster are watched with a lister. -In the context of pod creation, there is an incoming https request from -apiserver. -The logic to serve that request involves finding the appropriate VPA, retrieving -current recommendation from it and encodes the recommendation as a json patch to -the Pod resource. diff --git a/vertical-pod-autoscaler/pkg/recommender/README.md b/vertical-pod-autoscaler/pkg/recommender/README.md deleted file mode 100644 index 9b3c73b9945a..000000000000 --- a/vertical-pod-autoscaler/pkg/recommender/README.md +++ /dev/null @@ -1,40 +0,0 @@ -# VPA Recommender - -- [Intro](#intro) -- [Running](#running) -- [Implementation](#implementation) - -## Intro - -Recommender is the core binary of Vertical Pod Autoscaler system. -It computes the recommended resource requests for pods based on -historical and current usage of the resources. -The current recommendations are put in status of the VPA resource, where they -can be inspected. - -## Running - -- In order to have historical data pulled in by the recommender, install - Prometheus in your cluster and pass its address through a flag. -- Create RBAC configuration from `../deploy/vpa-rbac.yaml`. -- Create a deployment with the recommender pod from - `../deploy/recommender-deployment.yaml`. -- The recommender will start running and pushing its recommendations to VPA - object statuses. - -## Implementation - -The recommender is based on a model of the cluster that it builds in its memory. -The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with -their configuration (e.g. labels) as well as other information, e.g. usage data for -each container. - -After starting the binary, recommender reads the history of running pods and -their usage from Prometheus into the model. -It then runs in a loop and at each step performs the following actions: - -- update model with recent information on resources (using listers based on - watch), -- update model with fresh usage samples from Metrics API, -- compute new recommendation for each VPA, -- put any changed recommendations into the VPA resources. diff --git a/vertical-pod-autoscaler/pkg/updater/README.md b/vertical-pod-autoscaler/pkg/updater/README.md deleted file mode 100644 index 6d783ffd2b0e..000000000000 --- a/vertical-pod-autoscaler/pkg/updater/README.md +++ /dev/null @@ -1,34 +0,0 @@ -# Vertical Pod Autoscaler - Updater - -- [Introduction](#introduction) -- [Current implementation](current-implementation) -- [Missing parts](#missing-parts) - -# Introduction - -Updater component for Vertical Pod Autoscaler described in https://github.com/kubernetes/community/pull/338 - -Updater runs in Kubernetes cluster and decides which pods should be restarted -based on resources allocation recommendation calculated by Recommender. -If a pod should be updated, Updater will try to evict the pod. -It respects the pod disruption budget, by using Eviction API to evict pods. -Updater does not perform the actual resources update, but relies on Vertical Pod Autoscaler admission plugin -to update pod resources when the pod is recreated after eviction. - -# Current implementation - -Runs in a loop. On one iteration performs: - -- Fetching Vertical Pod Autoscaler configuration using a lister implementation. -- Fetching live pods information with their current resource allocation. 
-- For each replicated pods group calculating if pod update is required and how many replicas can be evicted. -Updater will always allow eviction of at least one pod in replica set. Maximum ratio of evicted replicas is specified by flag. -- Evicting pods if recommended resources significantly vary from the actual resources allocation. -Threshold for evicting pods is specified by recommended min/max values from VPA resource. -Priority of evictions within a set of replicated pods is proportional to sum of percentages of changes in resources -(i.e. pod with 15% memory increase 15% cpu decrease recommended will be evicted -before pod with 20% memory increase and no change in cpu). - -# Missing parts - -- Recommendation API for fetching data from Vertical Pod Autoscaler Recommender.