VPA admission controller can't admit pods in kube-system namespace #7392

Closed
umagnus opened this issue Oct 15, 2024 · 21 comments
Labels
area/vertical-pod-autoscaler kind/support Categorizes issue or PR as a support question.

Comments

@umagnus
Contributor

umagnus commented Oct 15, 2024

Which component are you using?:
vertical-pod-autoscaler

What version of the component are you using?:

Component version: v1.2.1

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.8

What environment is this in?:

AKS

What did you expect to happen?:

Pods in the kube-system namespace should have their CPU and memory requests changed by the VPA.

What happened instead?:

The pods just restart continuously. The admission controller does not admit pods in the kube-system namespace, but the recommender and the updater do watch them. As a result, the updater keeps evicting the pods, yet their requests never change.
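The failure mode described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not the VPA's actual logic: the real updater compares requests against recommendations using thresholds and priorities rather than strict equality, and `simulate`, `original_request`, and the value "20Mi" are hypothetical names and numbers chosen for illustration. The target "43691k" is the value that appears in the updater logs later in this report.

```python
def simulate(original_request, target, webhook_called, rounds=3):
    """Toy model of the updater/admission-controller interplay.

    Returns (number of evictions, request seen after each updater pass).
    """
    request = original_request
    evictions = 0
    history = []
    for _ in range(rounds):
        if request != target:
            # The updater evicts the pod so that it can be recreated.
            evictions += 1
            # The replacement pod comes from the controller's pod template.
            # Only the admission webhook can rewrite its requests on CREATE.
            request = target if webhook_called else original_request
        history.append(request)
    return evictions, history


# Hypothetical original request; target taken from the updater logs.
print(simulate("20Mi", "43691k", webhook_called=True))
print(simulate("20Mi", "43691k", webhook_called=False))
```

With the webhook in the path, the model converges after one eviction. Without it, every pass evicts the pod again and the request stays at its original value, which matches the restart loop reported here.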

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Admission controller logs:

I1014 02:55:54.373976       1 flags.go:57] FLAG: --add-dir-header="false"
I1014 02:55:54.374255       1 flags.go:57] FLAG: --address=":8944"
I1014 02:55:54.374365       1 flags.go:57] FLAG: --alsologtostderr="false"
I1014 02:55:54.374444       1 flags.go:57] FLAG: --client-ca-file="/etc/tls-certs/caCert.pem"
I1014 02:55:54.374517       1 flags.go:57] FLAG: --ignored-vpa-object-namespaces=""
I1014 02:55:54.374588       1 flags.go:57] FLAG: --kube-api-burst="10"
I1014 02:55:54.374656       1 flags.go:57] FLAG: --kube-api-qps="5"
I1014 02:55:54.374741       1 flags.go:57] FLAG: --kubeconfig=""
I1014 02:55:54.374804       1 flags.go:57] FLAG: --log-backtrace-at=":0"
I1014 02:55:54.374884       1 flags.go:57] FLAG: --log-dir=""
I1014 02:55:54.374952       1 flags.go:57] FLAG: --log-file=""
I1014 02:55:54.375021       1 flags.go:57] FLAG: --log-file-max-size="1800"
I1014 02:55:54.375084       1 flags.go:57] FLAG: --logtostderr="true"
I1014 02:55:54.375174       1 flags.go:57] FLAG: --min-tls-version="tls1_2"
I1014 02:55:54.375236       1 flags.go:57] FLAG: --one-output="false"
I1014 02:55:54.375312       1 flags.go:57] FLAG: --port="8000"
I1014 02:55:54.375370       1 flags.go:57] FLAG: --register-by-url="false"
I1014 02:55:54.375446       1 flags.go:57] FLAG: --register-webhook="true"
I1014 02:55:54.375515       1 flags.go:57] FLAG: --reload-cert="true"
I1014 02:55:54.375585       1 flags.go:57] FLAG: --skip-headers="false"
I1014 02:55:54.375654       1 flags.go:57] FLAG: --skip-log-headers="false"
I1014 02:55:54.375739       1 flags.go:57] FLAG: --stderrthreshold="0"
I1014 02:55:54.375798       1 flags.go:57] FLAG: --tls-cert-file="/etc/tls-certs/serverCert.pem"
I1014 02:55:54.375873       1 flags.go:57] FLAG: --tls-ciphers=""
I1014 02:55:54.375947       1 flags.go:57] FLAG: --tls-private-key="/etc/tls-certs/serverKey.pem"
I1014 02:55:54.376024       1 flags.go:57] FLAG: --v="4"
I1014 02:55:54.376103       1 flags.go:57] FLAG: --vmodule=""
I1014 02:55:54.376190       1 flags.go:57] FLAG: --vpa-object-namespace=""
I1014 02:55:54.376265       1 flags.go:57] FLAG: --webhook-address=""
I1014 02:55:54.376332       1 flags.go:57] FLAG: --webhook-port=""
I1014 02:55:54.376459       1 flags.go:57] FLAG: --webhook-service="vpa-webhook"
I1014 02:55:54.376555       1 flags.go:57] FLAG: --webhook-timeout-seconds="30"
I1014 02:55:54.376638       1 main.go:87] Vertical Pod Autoscaler 1.2.1 Admission Controller
I1014 02:55:54.377264       1 reflector.go:289] Starting reflector *v1.VerticalPodAutoscaler (1h0m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa/api.go:90
I1014 02:55:54.377390       1 reflector.go:325] Listing and watching *v1.VerticalPodAutoscaler from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa/api.go:90
I1014 02:55:54.477146       1 shared_informer.go:341] caches populated
I1014 02:55:54.477195       1 api.go:94] Initial VPA synced successfully
I1014 02:55:54.478093       1 reflector.go:289] Starting reflector *v1.ReplicaSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.478119       1 reflector.go:325] Listing and watching *v1.ReplicaSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.578388       1 shared_informer.go:341] caches populated
I1014 02:55:54.578418       1 fetcher.go:99] Initial sync of ReplicaSet completed
I1014 02:55:54.578782       1 reflector.go:289] Starting reflector *v1.StatefulSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.578804       1 reflector.go:325] Listing and watching *v1.StatefulSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.679135       1 shared_informer.go:341] caches populated
I1014 02:55:54.679168       1 fetcher.go:99] Initial sync of StatefulSet completed
I1014 02:55:54.679409       1 reflector.go:289] Starting reflector *v1.ReplicationController (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.679504       1 reflector.go:325] Listing and watching *v1.ReplicationController from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.779295       1 shared_informer.go:341] caches populated
I1014 02:55:54.779483       1 fetcher.go:99] Initial sync of ReplicationController completed
I1014 02:55:54.779832       1 reflector.go:289] Starting reflector *v1.Job (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.779936       1 reflector.go:325] Listing and watching *v1.Job from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.880319       1 shared_informer.go:341] caches populated
I1014 02:55:54.880359       1 fetcher.go:99] Initial sync of Job completed
I1014 02:55:54.880643       1 reflector.go:289] Starting reflector *v1.CronJob (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.880665       1 reflector.go:325] Listing and watching *v1.CronJob from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.981339       1 shared_informer.go:341] caches populated
I1014 02:55:54.981367       1 fetcher.go:99] Initial sync of CronJob completed
I1014 02:55:54.981583       1 reflector.go:289] Starting reflector *v1.DaemonSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:54.981607       1 reflector.go:325] Listing and watching *v1.DaemonSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:55.082136       1 shared_informer.go:341] caches populated
I1014 02:55:55.082176       1 fetcher.go:99] Initial sync of DaemonSet completed
I1014 02:55:55.082358       1 reflector.go:289] Starting reflector *v1.Deployment (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:55.082372       1 reflector.go:325] Listing and watching *v1.Deployment from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94
I1014 02:55:55.183287       1 shared_informer.go:341] caches populated
I1014 02:55:55.183324       1 fetcher.go:99] Initial sync of Deployment completed
I1014 02:55:55.183717       1 shared_informer.go:341] caches populated
I1014 02:55:55.183735       1 controller_fetcher.go:141] Initial sync of DaemonSet completed
I1014 02:55:55.183766       1 shared_informer.go:341] caches populated
I1014 02:55:55.183862       1 controller_fetcher.go:141] Initial sync of Deployment completed
I1014 02:55:55.183886       1 shared_informer.go:341] caches populated
I1014 02:55:55.183893       1 controller_fetcher.go:141] Initial sync of ReplicaSet completed
I1014 02:55:55.183903       1 shared_informer.go:341] caches populated
I1014 02:55:55.183910       1 controller_fetcher.go:141] Initial sync of StatefulSet completed
I1014 02:55:55.183919       1 shared_informer.go:341] caches populated
I1014 02:55:55.183929       1 controller_fetcher.go:141] Initial sync of ReplicationController completed
I1014 02:55:55.183938       1 shared_informer.go:341] caches populated
I1014 02:55:55.183945       1 controller_fetcher.go:141] Initial sync of Job completed
I1014 02:55:55.183954       1 shared_informer.go:341] caches populated
I1014 02:55:55.183963       1 controller_fetcher.go:141] Initial sync of CronJob completed
W1014 02:55:55.184074       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184095       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184108       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184161       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184175       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184185       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
W1014 02:55:55.184383       1 shared_informer.go:459] The sharedIndexInformer has started, run more than once is not allowed
I1014 02:55:55.184413       1 reflector.go:289] Starting reflector *v1.LimitRange (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/limitrange/limit_range_calculator.go:60
I1014 02:55:55.184566       1 reflector.go:325] Listing and watching *v1.LimitRange from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/limitrange/limit_range_calculator.go:60
I1014 02:55:55.284351       1 shared_informer.go:341] caches populated
I1014 02:55:55.285002       1 certs.go:41] Successfully read 1168 bytes from /etc/tls-certs/caCert.pem
I1014 02:56:05.310003       1 config.go:174] Self registration as MutatingWebhook succeeded.
I1014 03:01:38.785099       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.Job total 6 items received
I1014 03:02:07.098453       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.Deployment total 14 items received
I1014 03:03:52.190782       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/limitrange/limit_range_calculator.go:60: Watch close - *v1.LimitRange total 7 items received
I1014 03:04:16.496959       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.ReplicaSet total 15 items received
I1014 03:04:42.888530       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.CronJob total 3 items received
I1014 03:04:55.687499       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.ReplicationController total 5 items received
I1014 03:05:44.001605       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.DaemonSet total 34 items received
I1014 03:05:46.416523       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa/api.go:90: Watch close - *v1.VerticalPodAutoscaler total 0 items received
I1014 03:05:52.586648       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.StatefulSet total 5 items received

Recommender logs:

I1014 03:20:52.194654       1 recommender.go:155] Recommender Run
I1014 03:20:52.194682       1 cluster_feeder.go:321] Start selecting the vpaCRDs.
I1014 03:20:52.194689       1 cluster_feeder.go:362] Fetched 1 VPAs.
I1014 03:20:52.194792       1 cluster_feeder.go:372] Using selector app=csi-azuredisk-controller for VPA kube-system/csi-azuredisk-controller
I1014 03:20:52.198076       1 cluster_feeder.go:411] Deleting Pod kube-system/csi-azuredisk-controller-77fd5ccddb-ns2pj
I1014 03:20:52.198189       1 cluster_feeder.go:411] Deleting Pod kube-system/csi-azuredisk-controller-77fd5ccddb-j4lph
I1014 03:20:52.391707       1 metrics_client.go:74] 1436 podMetrics retrieved for all namespaces
I1014 03:20:52.482781       1 cluster_feeder.go:450] ClusterSpec fed with #4726 ContainerUsageSamples for #2363 containers. Dropped #0 samples.
I1014 03:20:52.483024       1 recommender.go:165] ClusterState is tracking 1436 PodStates and 1 VPAs
I1014 03:20:52.690068       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for csi-snapshotter
I1014 03:20:52.699407       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for azuredisk
I1014 03:20:52.710316       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for csi-provisioner
I1014 03:20:52.786718       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for csi-attacher
I1014 03:20:52.797884       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for liveness-probe
I1014 03:20:52.808952       1 checkpoint_writer.go:114] Saved VPA kube-system/csi-azuredisk-controller checkpoint for csi-resizer
I1014 03:20:52.808983       1 recommender.go:175] ClusterState is tracking 449 aggregated container states
I1014 03:21:35.009970       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/controller_fetcher/controller_fetcher.go:136: Watch close - *v1.Job total 7 items received

Updater logs:

I1014 03:08:51.225357       1 update_priority_calculator.go:146] pod accepted for update kube-system/csi-azuredisk-controller-77fd5ccddb-cnc28 with priority 1.0833333015441895 - processed recommendations:
azuredisk: target: 43691k uncappedTarget: 43691k 
csi-attacher: target: 43691k uncappedTarget: 43691k 
csi-provisioner: target: 43691k uncappedTarget: 43691k 
csi-resizer: target: 43691k uncappedTarget: 43691k 
csi-snapshotter: target: 43691k uncappedTarget: 43691k 
liveness-probe: target: 43691k uncappedTarget: 43691k 
I1014 03:08:51.225619       1 update_priority_calculator.go:146] pod accepted for update kube-system/csi-azuredisk-controller-77fd5ccddb-mft2v with priority 1.0833333015441895 - processed recommendations:
azuredisk: target: 43691k uncappedTarget: 43691k 
csi-attacher: target: 43691k uncappedTarget: 43691k 
csi-provisioner: target: 43691k uncappedTarget: 43691k 
csi-resizer: target: 43691k uncappedTarget: 43691k 
csi-snapshotter: target: 43691k uncappedTarget: 43691k 
liveness-probe: target: 43691k uncappedTarget: 43691k 
I1014 03:08:51.225698       1 updater.go:228] evicting pod kube-system/csi-azuredisk-controller-77fd5ccddb-cnc28
I1014 03:08:51.254218       1 event.go:298] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"csi-azuredisk-controller-77fd5ccddb-cnc28", UID:"b1a7a872-08f9-4613-80d7-fd968c8959b9", APIVersion:"v1", ResourceVersion:"5463334", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
I1014 03:09:23.415340       1 reflector.go:790] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/fetcher.go:94: Watch close - *v1.StatefulSet total 4 items received
@umagnus umagnus added the kind/bug Categorizes issue or PR as related to a bug. label Oct 15, 2024
@Shubham82
Contributor

/area vertical-pod-autoscaler

@adrianmoisey
Member

Can you paste the output of the webhook: kubectl get MutatingWebhookConfiguration vpa-webhook-config -o yaml

Do other webhooks work in AKS? (ie: is this AKS blocking changes to kube-system?)

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Here is the webhook

$ kubectl get MutatingWebhookConfiguration aks-node-mutating-webhook -o yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"admissionregistration.k8s.io/v1","kind":"MutatingWebhookConfiguration","metadata":{"annotations":{},"labels":{"admissions.enforcer/disabled":"true","kubernetes.azure.com/controlPlaneWebhook":"Reconcile"},"name":"aks-node-mutating-webhook"},"webhooks":[{"admissionReviewVersions":["v1beta1"],"clientConfig":{"caBundle":"...","url":"https://ccp-webhook.....svc.cluster.local.:8443/mutate-nodes"},"failurePolicy":"Fail","matchPolicy":"Equivalent","name":"aks-node-mutating-webhook.azmk8s.io","rules":[{"apiGroups":[""],"apiVersions":["v1"],"operations":["CREATE"],"resources":["nodes"]}],"sideEffects":"NoneOnDryRun","timeoutSeconds":5}]}
  creationTimestamp: "2024-10-14T07:40:52Z"
  generation: 1
  labels:
    admissions.enforcer/disabled: "true"
    kubernetes.azure.com/controlPlaneWebhook: Reconcile
  name: aks-node-mutating-webhook
  resourceVersion: "1898374"
  uid: 1358a827-c00b-47ef-9e37-ba20461bdc0f
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle: ...
    url: https://ccp-webhook....svc.cluster.local.:8443/mutate-nodes
  failurePolicy: Fail
  matchPolicy: Equivalent
  name: aks-node-mutating-webhook.azmk8s.io
  namespaceSelector: {}
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - nodes
    scope: '*'
  sideEffects: NoneOnDryRun
  timeoutSeconds: 5

I checked the other webhooks in AKS, but they do not seem to be related to this.

@adrianmoisey
Member

That isn't a webhook for the VPA.
Based on the logs you pasted, the webhook was created:

I1014 02:56:05.310003       1 config.go:174] Self registration as MutatingWebhook succeeded.

But the one you listed isn't it.

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Sorry I pasted the wrong one:

$ k get MutatingWebhookConfiguration vpa-webhook-config -o yaml
WARNING: version difference between client (1.31) and server (1.29) exceeds the supported minor version skew of +/-1
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  creationTimestamp: "2024-10-16T06:22:28Z"
  generation: 2
  name: vpa-webhook-config
  resourceVersion: "2086036"
  uid: 1bf667f1-e275-4bcd-aa1d-ca0969a1072c
webhooks:
- admissionReviewVersions:
  - v1
  clientConfig:
    caBundle: ..
    service:
      name: vpa-webhook
      namespace: kube-system
      port: 443
  failurePolicy: Ignore
  matchPolicy: Equivalent
  name: vpa.k8s.io
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - ""
    - key: kubernetes.azure.com/managedby
      operator: NotIn
      values:
      - aks
    - key: control-plane
      operator: NotIn
      values:
      - "true"
  objectSelector: {}
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods
    scope: '*'
  - apiGroups:
    - autoscaling.k8s.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - verticalpodautoscalers
    scope: '*'
  sideEffects: None
  timeoutSeconds: 30

@adrianmoisey
Member

adrianmoisey commented Oct 16, 2024

It seems like AKS adds some specific AKS config there:

  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - ""
    - key: kubernetes.azure.com/managedby
      operator: NotIn
      values:
      - aks
    - key: control-plane
      operator: NotIn
      values:
      - "true"

Can you show the details of the namespace? kubectl get namespace kube-system -o yaml ?

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

It seems that a label needs to be added to the MutatingWebhookConfiguration in AKS. But when I edit vpa-webhook-config and restart the vpa-admission-controller pod, the vpa-webhook-config MutatingWebhookConfiguration reverts to the default. Is there any way to add a label to it?

@adrianmoisey
Member

I'm not sure what label you're talking about. I assume AKS is specifically designed to not allow VPA to modify kube-system resources.
Can you paste the output of the previous command I sent?

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Here is the output:

$ kubectl get namespace kube-system -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","control-plane":"true","kubernetes.azure.com/managedby":"aks","kubernetes.io/cluster-service":"true"},"name":"kube-system"}}
  creationTimestamp: "2024-10-14T07:40:51Z"
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    control-plane: "true"
    kubernetes.azure.com/managedby: aks
    kubernetes.io/cluster-service: "true"
    kubernetes.io/metadata.name: kube-system
  name: kube-system
  resourceVersion: "523"
  uid: c857ad6a-613c-42f9-bf20-31722ba90d88
spec:
  finalizers:
  - kubernetes
status:
  phase: Active

@adrianmoisey
Member

It seems like the VPA is being configured to avoid mutating pods in the kube-system namespace.
I would guess that AKS is specifically designed this way to protect that workload.
I'd recommend asking AKS support for help here.
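To make the reasoning concrete, here is a minimal sketch that evaluates the namespaceSelector from the vpa-webhook-config pasted above against the labels of the kube-system namespace pasted above. It assumes standard Kubernetes label-selector semantics (NotIn is satisfied when the key is absent or its value is not in the list, and all expressions must hold). The function names are hypothetical helpers, not part of any Kubernetes client library.

```python
def expression_matches(labels, expr):
    """Evaluate one matchExpressions entry against a label dict."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        # Satisfied when the key is absent or its value is not listed.
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {op}")


def blocking_expressions(labels, match_expressions):
    """Return the keys of the expressions that the labels fail."""
    return [e["key"] for e in match_expressions
            if not expression_matches(labels, e)]


def namespace_selected(labels, match_expressions):
    """All expressions are ANDed; the webhook is called only if all hold."""
    return not blocking_expressions(labels, match_expressions)


# namespaceSelector copied from the vpa-webhook-config in this thread.
VPA_NAMESPACE_SELECTOR = [
    {"key": "kubernetes.io/metadata.name", "operator": "NotIn", "values": [""]},
    {"key": "kubernetes.azure.com/managedby", "operator": "NotIn", "values": ["aks"]},
    {"key": "control-plane", "operator": "NotIn", "values": ["true"]},
]

# Labels copied from the kube-system namespace in this thread.
KUBE_SYSTEM_LABELS = {
    "addonmanager.kubernetes.io/mode": "Reconcile",
    "control-plane": "true",
    "kubernetes.azure.com/managedby": "aks",
    "kubernetes.io/cluster-service": "true",
    "kubernetes.io/metadata.name": "kube-system",
}

print(namespace_selected(KUBE_SYSTEM_LABELS, VPA_NAMESPACE_SELECTOR))
print(blocking_expressions(KUBE_SYSTEM_LABELS, VPA_NAMESPACE_SELECTOR))
```

Under these assumptions the selector does not match kube-system, because the namespace carries both kubernetes.azure.com/managedby=aks and control-plane=true, so the API server would not send pod CREATE requests from that namespace to the VPA webhook.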

@adrianmoisey
Member

/remove-kind bug
/kind support

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Oct 16, 2024
@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Yes, AKS has a managed VPA, and it needs a label added to the webhook. Can the upstream VPA add a label to the MutatingWebhookConfiguration? Currently a default one is recreated whenever the admission-controller pod is recreated.

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

like this:

  generation: 1
  labels:
    admissions.enforcer/disabled: "true"
    app.kubernetes.io/managed-by: aks
  name: vpa-webhook-config

@adrianmoisey
Member

Yes, AKS has a managed VPA, and it needs a label added to the webhook. Can the upstream VPA add a label to the MutatingWebhookConfiguration? Currently a default one is recreated whenever the admission-controller pod is recreated.

Which variant of the VPA are you running? One installed using AKS or did you install this yourself?
The webhook that you pasted is already non-standard, so AKS is possibly adding their own labels already.

@adrianmoisey
Member

Ah, I see you're referring to https://learn.microsoft.com/en-us/azure/aks/faq#can-admission-controller-webhooks-impact-kube-system-and-internal-aks-namespaces-

I guess you can make a PR to include a flag to toggle that

@adrianmoisey
Member

Another option is to stop the admission controller from creating the webhook by passing --register-webhook=false, and to add the webhook yourself.
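As a rough sketch of that second option, the snippet below builds a MutatingWebhookConfiguration that mirrors the fields of the self-registered vpa-webhook-config shown earlier in this thread, but with the admissions.enforcer/disabled label from the earlier comment and without the AKS-injected namespaceSelector. It rests on several assumptions: that the label has the effect described in the AKS FAQ linked above, that the admission controller runs with --register-webhook=false, and that you supply the real CA bundle yourself (it is elided in this thread, so the value below is a placeholder). `build_vpa_webhook_config` is a hypothetical helper name.

```python
import json


def build_vpa_webhook_config(ca_bundle, labels=None):
    """Return a MutatingWebhookConfiguration as a plain dict.

    ca_bundle: base64-encoded CA certificate (placeholder here).
    labels:    metadata labels to set on the webhook configuration.
    """
    return {
        "apiVersion": "admissionregistration.k8s.io/v1",
        "kind": "MutatingWebhookConfiguration",
        "metadata": {
            "name": "vpa-webhook-config",
            "labels": dict(labels or {}),
        },
        "webhooks": [{
            "name": "vpa.k8s.io",
            "admissionReviewVersions": ["v1"],
            "clientConfig": {
                "caBundle": ca_bundle,
                "service": {
                    "name": "vpa-webhook",
                    "namespace": "kube-system",
                    "port": 443,
                },
            },
            "failurePolicy": "Ignore",
            "matchPolicy": "Equivalent",
            "reinvocationPolicy": "Never",
            # Same rules as the self-registered webhook in this thread.
            "rules": [
                {"apiGroups": [""], "apiVersions": ["v1"],
                 "operations": ["CREATE"], "resources": ["pods"],
                 "scope": "*"},
                {"apiGroups": ["autoscaling.k8s.io"], "apiVersions": ["*"],
                 "operations": ["CREATE", "UPDATE"],
                 "resources": ["verticalpodautoscalers"], "scope": "*"},
            ],
            "sideEffects": "None",
            "timeoutSeconds": 30,
        }],
    }


if __name__ == "__main__":
    config = build_vpa_webhook_config(
        "<base64-encoded CA bundle>",  # placeholder, not a real value
        {"admissions.enforcer/disabled": "true"},
    )
    print(json.dumps(config, indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be saved to a file and applied with kubectl apply -f once the placeholder is replaced. Whether AKS then leaves the selector untouched is something to verify on the cluster.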

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Yes, AKS has a managed VPA, and it needs a label added to the webhook. Can the upstream VPA add a label to the MutatingWebhookConfiguration? Currently a default one is recreated whenever the admission-controller pod is recreated.

Which variant of the VPA are you running? One installed using AKS or did you install this yourself? The webhook that you pasted is already non-standard, so AKS is possibly adding their own labels already.

Yes, the config with that label is from the AKS-managed VPA. I will try --register-webhook=false and look into making a PR to add the label. Thanks!

@umagnus
Contributor Author

umagnus commented Oct 16, 2024

Hi @adrianmoisey, I have made a PR for it: #7402. Could you please take a look? Thanks!

@voelzmo
Contributor

voelzmo commented Oct 28, 2024

/close as #7402 has been merged. Thanks!

@voelzmo
Contributor

voelzmo commented Oct 28, 2024

/close

@k8s-ci-robot
Contributor

@voelzmo: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

adrianmoisey added a commit to adrianmoisey/autoscaler that referenced this issue Jan 5, 2025