Kubernetes v1.18.8 - Startup Probes missing - Pod is missing attributes #84

Closed
Xyaren opened this issue Oct 16, 2020 · 21 comments

Comments

@Xyaren

Xyaren commented Oct 16, 2020

Basic Questions answered here:
kubernetes/kubernetes#95604 (comment)

I think the webhook might be the issue.

Environment:

  • AWS Region: eu-central-1
  • EKS Platform version: eks.1
  • Kubernetes version: v1.18.8-eks-7c9bda
  • Webhook Version: How do I find this out?
@Xyaren
Author

Xyaren commented Oct 16, 2020

I managed to get the audit event for the modification by the webhook:
Used Deployment:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: "*********REDACTED*********"
  name: experiment-service-account
  namespace: dev
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: experiment-bug-chaser
  namespace: dev
spec:
  selector:
    matchLabels:
      app: experiment-bug-chaser-pod
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 50%
  template:
    metadata:
      labels:
        app: experiment-bug-chaser-pod
    spec:
      serviceAccountName: experiment-service-account
      containers:
        - name: app-container
          image: brndnmtthws/nginx-echo-headers:latest
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            timeoutSeconds: 1
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            timeoutSeconds: 1
            periodSeconds: 10
          startupProbe:
            httpGet:
              path: /
              port: 8080
            timeoutSeconds: 1
            periodSeconds: 5
            failureThreshold: 10

Event:


{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "6df7d8da-1ebd-494b-a155-e5df8440001e",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/dev/pods",
"verb": "create",
"user": {
"username": "system:serviceaccount:kube-system:replicaset-controller",
"uid": "0f7d7dbc-a47d-11e9-a50f-02ba487cec36",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:kube-system",
"system:authenticated"
]
},
"sourceIPs": [
"172.16.38.182"
],
"userAgent": "kube-controller-manager/v1.18.8 (linux/amd64) kubernetes/7c9bda5/system:serviceaccount:kube-system:replicaset-controller",
"objectRef": {
"resource": "pods",
"namespace": "dev",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 201
},
"requestObject": {
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"generateName": "experiment-bug-chaser-7c8fff648c-",
"creationTimestamp": null,
"labels": {
"app": "experiment-bug-chaser-pod",
"pod-template-hash": "7c8fff648c"
},
"ownerReferences": [
{
"apiVersion": "apps/v1",
"kind": "ReplicaSet",
"name": "experiment-bug-chaser-7c8fff648c",
"uid": "c8ced562-8446-4f26-bb8f-55df36be99d8",
"controller": true,
"blockOwnerDeletion": true
}
]
},
"spec": {
"containers": [
{
"name": "app-container",
"image": "brndnmtthws/nginx-echo-headers:latest",
"resources": {},
"livenessProbe": {
"httpGet": {
"path": "/",
"port": 8080,
"scheme": "HTTP"
},
"timeoutSeconds": 1,
"periodSeconds": 10,
"successThreshold": 1,
"failureThreshold": 3
},
"readinessProbe": {
"httpGet": {
"path": "/",
"port": 8080,
"scheme": "HTTP"
},
"timeoutSeconds": 1,
"periodSeconds": 10,
"successThreshold": 1,
"failureThreshold": 3
},
"startupProbe": {
"httpGet": {
"path": "/",
"port": 8080,
"scheme": "HTTP"
},
"timeoutSeconds": 1,
"periodSeconds": 5,
"successThreshold": 1,
"failureThreshold": 10
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "Always"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"serviceAccountName": "experiment-service-account",
"serviceAccount": "experiment-service-account",
"securityContext": {},
"schedulerName": "default-scheduler",
"enableServiceLinks": true
},
"status": {}
},
"responseObject": {
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"name": "experiment-bug-chaser-7c8fff648c-zh7s7",
"generateName": "experiment-bug-chaser-7c8fff648c-",
"namespace": "dev",
"selfLink": "/api/v1/namespaces/dev/pods/experiment-bug-chaser-7c8fff648c-zh7s7",
"uid": "56966970-c3bc-4bd0-aa77-f8ee54b03d4e",
"resourceVersion": "168927993",
"creationTimestamp": "2020-10-16T13:24:47Z",
"labels": {
"app": "experiment-bug-chaser-pod",
"pod-template-hash": "7c8fff648c"
},
"annotations": {
"kubernetes.io/psp": "eks.privileged"
},
"ownerReferences": [
{
"apiVersion": "apps/v1",
"kind": "ReplicaSet",
"name": "experiment-bug-chaser-7c8fff648c",
"uid": "c8ced562-8446-4f26-bb8f-55df36be99d8",
"controller": true,
"blockOwnerDeletion": true
}
],
"managedFields": [
{
"manager": "kube-controller-manager",
"operation": "Update",
"apiVersion": "v1",
"time": "2020-10-16T13:24:47Z",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:generateName": {},
"f:labels": {
".": {},
"f:app": {},
"f:pod-template-hash": {}
},
"f:ownerReferences": {
".": {},
"k:{\"uid\":\"c8ced562-8446-4f26-bb8f-55df36be99d8\"}": {
".": {},
"f:apiVersion": {},
"f:blockOwnerDeletion": {},
"f:controller": {},
"f:kind": {},
"f:name": {},
"f:uid": {}
}
}
},
"f:spec": {
"f:containers": {
"k:{\"name\":\"app-container\"}": {
".": {},
"f:image": {},
"f:imagePullPolicy": {},
"f:livenessProbe": {
".": {},
"f:failureThreshold": {},
"f:httpGet": {
".": {},
"f:path": {},
"f:port": {},
"f:scheme": {}
},
"f:periodSeconds": {},
"f:successThreshold": {},
"f:timeoutSeconds": {}
},
"f:name": {},
"f:readinessProbe": {
".": {},
"f:failureThreshold": {},
"f:httpGet": {
".": {},
"f:path": {},
"f:port": {},
"f:scheme": {}
},
"f:periodSeconds": {},
"f:successThreshold": {},
"f:timeoutSeconds": {}
},
"f:resources": {},
"f:startupProbe": {
".": {},
"f:failureThreshold": {},
"f:httpGet": {
".": {},
"f:path": {},
"f:port": {},
"f:scheme": {}
},
"f:periodSeconds": {},
"f:successThreshold": {},
"f:timeoutSeconds": {}
},
"f:terminationMessagePath": {},
"f:terminationMessagePolicy": {}
}
},
"f:dnsPolicy": {},
"f:enableServiceLinks": {},
"f:restartPolicy": {},
"f:schedulerName": {},
"f:securityContext": {},
"f:serviceAccount": {},
"f:serviceAccountName": {},
"f:terminationGracePeriodSeconds": {}
}
}
}
]
},
"spec": {
"volumes": [
{
"name": "aws-iam-token",
"projected": {
"sources": [
{
"serviceAccountToken": {
"audience": "sts.amazonaws.com",
"expirationSeconds": 86400,
"path": "token"
}
}
],
"defaultMode": 420
}
},
{
"name": "experiment-service-account-token-zggqp",
"secret": {
"secretName": "experiment-service-account-token-zggqp",
"defaultMode": 420
}
}
],
"containers": [
{
"name": "app-container",
"image": "brndnmtthws/nginx-echo-headers:latest",
"env": [
{
"name": "AWS_DEFAULT_REGION",
"value": "eu-central-1"
},
{
"name": "AWS_REGION",
"value": "eu-central-1"
},
{
"name": "AWS_ROLE_ARN",
"value": "Redacted"
},
{
"name": "AWS_WEB_IDENTITY_TOKEN_FILE",
"value": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
}
],
"resources": {},
"volumeMounts": [
{
"name": "experiment-service-account-token-zggqp",
"readOnly": true,
"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
},
{
"name": "aws-iam-token",
"readOnly": true,
"mountPath": "/var/run/secrets/eks.amazonaws.com/serviceaccount"
}
],
"livenessProbe": {
"httpGet": {
"path": "/",
"port": 8080,
"scheme": "HTTP"
},
"timeoutSeconds": 1,
"periodSeconds": 10,
"successThreshold": 1,
"failureThreshold": 3
},
"readinessProbe": {
"httpGet": {
"path": "/",
"port": 8080,
"scheme": "HTTP"
},
"timeoutSeconds": 1,
"periodSeconds": 10,
"successThreshold": 1,
"failureThreshold": 3
},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "Always"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"serviceAccountName": "experiment-service-account",
"serviceAccount": "experiment-service-account",
"securityContext": {},
"schedulerName": "default-scheduler",
"tolerations": [
{
"key": "node.kubernetes.io/not-ready",
"operator": "Exists",
"effect": "NoExecute",
"tolerationSeconds": 300
},
{
"key": "node.kubernetes.io/unreachable",
"operator": "Exists",
"effect": "NoExecute",
"tolerationSeconds": 300
}
],
"priorityClassName": "default",
"priority": 0,
"enableServiceLinks": true
},
"status": {
"phase": "Pending",
"qosClass": "BestEffort"
}
},
"requestReceivedTimestamp": "2020-10-16T13:24:47.047669Z",
"stageTimestamp": "2020-10-16T13:24:47.069906Z",
"annotations": {
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"system:controller:replicaset-controller\" of ClusterRole \"system:controller:replicaset-controller\" to ServiceAccount \"replicaset-controller/kube-system\"",
"mutation.webhook.admission.k8s.io/round_0_index_0": "{\"configuration\":\"pod-identity-webhook\",\"webhook\":\"iam-for-pods.amazonaws.com\",\"mutated\":true}",
"mutation.webhook.admission.k8s.io/round_0_index_1": "{\"configuration\":\"vpc-resource-mutating-webhook\",\"webhook\":\"mpod.vpc.k8s.aws\",\"mutated\":false}",
"patch.webhook.admission.k8s.io/round_0_index_0": "{\"configuration\":\"pod-identity-webhook\",\"webhook\":\"iam-for-pods.amazonaws.com\",\"patch\":[{\"op\":\"add\",\"path\":\"/spec/volumes/0\",\"value\":{\"name\":\"aws-iam-token\",\"projected\":{\"sources\":[{\"serviceAccountToken\":{\"audience\":\"sts.amazonaws.com\",\"expirationSeconds\":86400,\"path\":\"token\"}}]}}},{\"op\":\"add\",\"path\":\"/spec/containers\",\"value\":[{\"name\":\"app-container\",\"image\":\"brndnmtthws/nginx-echo-headers:latest\",\"env\":[{\"name\":\"AWS_DEFAULT_REGION\",\"value\":\"eu-central-1\"},{\"name\":\"AWS_REGION\",\"value\":\"eu-central-1\"},{\"name\":\"AWS_ROLE_ARN\",\"value\":\"Redacted\"},{\"name\":\"AWS_WEB_IDENTITY_TOKEN_FILE\",\"value\":\"/var/run/secrets/eks.amazonaws.com/serviceaccount/token\"}],\"resources\":{},\"volumeMounts\":[{\"name\":\"experiment-service-account-token-zggqp\",\"readOnly\":true,\"mountPath\":\"/var/run/secrets/kubernetes.io/serviceaccount\"},{\"name\":\"aws-iam-token\",\"readOnly\":true,\"mountPath\":\"/var/run/secrets/eks.amazonaws.com/serviceaccount\"}],\"livenessProbe\":{\"httpGet\":{\"path\":\"/\",\"port\":8080,\"scheme\":\"HTTP\"},\"timeoutSeconds\":1,\"periodSeconds\":10,\"successThreshold\":1,\"failureThreshold\":3},\"readinessProbe\":{\"httpGet\":{\"path\":\"/\",\"port\":8080,\"scheme\":\"HTTP\"},\"timeoutSeconds\":1,\"periodSeconds\":10,\"successThreshold\":1,\"failureThreshold\":3},\"terminationMessagePath\":\"/dev/termination-log\",\"terminationMessagePolicy\":\"File\",\"imagePullPolicy\":\"Always\"}]}],\"patchType\":\"JSONPatch\"}",
"podsecuritypolicy.policy.k8s.io/admit-policy": "eks.privileged",
"podsecuritypolicy.policy.k8s.io/validate-policy": "eks.privileged"
}
}

In the annotations you can see the modifications made by the webhook:

{
  "configuration": "pod-identity-webhook",
  "webhook": "iam-for-pods.amazonaws.com",
  "patch": [
    {
      "op": "add",
      "path": "/spec/volumes/0",
      "value": {
        "name": "aws-iam-token",
        "projected": {
          "sources": [
            {
              "serviceAccountToken": {
                "audience": "sts.amazonaws.com",
                "expirationSeconds": 86400,
                "path": "token"
              }
            }
          ]
        }
      }
    },
    {
      "op": "add",
      "path": "/spec/containers",
      "value": [
        {
          "name": "app-container",
          "image": "brndnmtthws/nginx-echo-headers:latest",
          "env": [
            {
              "name": "AWS_DEFAULT_REGION",
              "value": "eu-central-1"
            },
            {
              "name": "AWS_REGION",
              "value": "eu-central-1"
            },
            {
              "name": "AWS_ROLE_ARN",
              "value": "*********REDACTED*********"
            },
            {
              "name": "AWS_WEB_IDENTITY_TOKEN_FILE",
              "value": "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
            }
          ],
          "resources": {},
          "volumeMounts": [
            {
              "name": "experiment-service-account-token-zggqp",
              "readOnly": true,
              "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
            },
            {
              "name": "aws-iam-token",
              "readOnly": true,
              "mountPath": "/var/run/secrets/eks.amazonaws.com/serviceaccount"
            }
          ],
          "livenessProbe": {
            "httpGet": {
              "path": "/",
              "port": 8080,
              "scheme": "HTTP"
            },
            "timeoutSeconds": 1,
            "periodSeconds": 10,
            "successThreshold": 1,
            "failureThreshold": 3
          },
          "readinessProbe": {
            "httpGet": {
              "path": "/",
              "port": 8080,
              "scheme": "HTTP"
            },
            "timeoutSeconds": 1,
            "periodSeconds": 10,
            "successThreshold": 1,
            "failureThreshold": 3
          },
          "terminationMessagePath": "/dev/termination-log",
          "terminationMessagePolicy": "File",
          "imagePullPolicy": "Always"
        }
      ]
    }
  ],
  "patchType": "JSONPatch"
}

So basically the whole containers array gets replaced, but without the startupProbe.
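
A quick way to confirm which container attributes went missing is to compare the container from the requestObject with the container in the patch value. The following is only a minimal sketch using the Go standard library; the helper name `droppedKeys` and the shortened container documents are hypothetical, trimmed down from the audit event above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// droppedKeys returns the top-level keys that are present in the JSON object
// "before" but absent from the JSON object "after", in sorted order.
// (Hypothetical helper, not part of any Kubernetes or AWS library.)
func droppedKeys(before, after []byte) ([]string, error) {
	var b, a map[string]json.RawMessage
	if err := json.Unmarshal(before, &b); err != nil {
		return nil, fmt.Errorf("decode before: %w", err)
	}
	if err := json.Unmarshal(after, &a); err != nil {
		return nil, fmt.Errorf("decode after: %w", err)
	}
	var dropped []string
	for k := range b {
		if _, ok := a[k]; !ok {
			dropped = append(dropped, k)
		}
	}
	sort.Strings(dropped)
	return dropped, nil
}

func main() {
	// Shortened stand-ins for the container in requestObject and in the patch.
	requested := []byte(`{"name":"app-container","livenessProbe":{"periodSeconds":10},"readinessProbe":{"periodSeconds":10},"startupProbe":{"periodSeconds":5}}`)
	patched := []byte(`{"name":"app-container","env":[],"livenessProbe":{"periodSeconds":10},"readinessProbe":{"periodSeconds":10}}`)

	dropped, err := droppedKeys(requested, patched)
	if err != nil {
		panic(err)
	}
	fmt.Println(dropped) // prints [startupProbe]
}
```

Keys that the webhook adds (such as env) are ignored on purpose; only attributes that were requested and then lost are reported.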

@Xyaren changed the title from "Kubernetes v1.18.8 Startup Probes missing" to "Kubernetes v1.18.8 - Startup Probes missing - Pod is missing attributes" on Oct 16, 2020
@mcristina422

I think this is due to the old version of k8s.io/api being used in the webhook. The API for the startup probe was added more recently, so updating the k8s packages should fix this.

@smrutiranjantripathy

The version "v0.0.0-20190606204050-af9c91bd2759" of k8s.io/api that is pinned in go.mod [1] doesn't contain the startup probe in the "core/v1" package [2]. The latest package contains it [3]. Upgrading it will fix the issue.

References:

[1] https://github.com/aws/amazon-eks-pod-identity-webhook/blob/master/go.mod#L21
[2] https://pkg.go.dev/k8s.io/api@v0.0.0-20190606204050-af9c91bd2759/core/v1#Container
[3] https://pkg.go.dev/k8s.io/api@v0.19.2/core/v1#Container
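
The failure mode can be reproduced in isolation with the Go standard library: decoding JSON into a struct that has no field for startupProbe and encoding it again silently drops that field. This is only a minimal sketch of the mechanism; `oldContainer` and `probe` are hypothetical stand-ins for a Container type compiled from an older k8s.io/api, not the real API types and not the webhook's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// probe is a hypothetical, heavily reduced stand-in for a probe definition.
type probe struct {
	PeriodSeconds int `json:"periodSeconds,omitempty"`
}

// oldContainer is a hypothetical stand-in for a Container type that was
// compiled before the startupProbe field existed.
type oldContainer struct {
	Name           string `json:"name"`
	LivenessProbe  *probe `json:"livenessProbe,omitempty"`
	ReadinessProbe *probe `json:"readinessProbe,omitempty"`
}

// roundTrip decodes a container into the old type and encodes it again.
// encoding/json ignores unknown fields on decode, so they are lost here.
func roundTrip(in []byte) ([]byte, error) {
	var c oldContainer
	if err := json.Unmarshal(in, &c); err != nil {
		return nil, err
	}
	return json.Marshal(c)
}

func main() {
	in := []byte(`{"name":"app-container","livenessProbe":{"periodSeconds":10},"startupProbe":{"periodSeconds":5}}`)
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// prints {"name":"app-container","livenessProbe":{"periodSeconds":10}}
	fmt.Println(string(out))
}
```

If a patch is then built from the re-encoded object and replaces the whole containers array, the unknown field disappears from the created Pod, which matches what the audit event shows.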

@jqmichael
Contributor

jqmichael commented Nov 10, 2020

@shravan-achar
Contributor

@moonlsd

moonlsd commented Nov 23, 2020

We encountered a problem related to this in our production cluster today. After setting serviceAccountName on our Deployments, all newly created Pods lost their startupProbe, so they kept failing the livenessProbe and readinessProbe and got killed. Hopefully AWS can ship a fix soon.

@mcristina422

I think the API update in #92 should have fixed the issue.

@brunomanzo

Any plans on releasing this fix?

@mcristina422

Every commit is a release; check for the latest here: https://hub.docker.com/r/amazon/amazon-eks-pod-identity-webhook/tags?page=1&ordering=last_updated

@brunomanzo

@mcristina422 Nice, how can I change the Docker image version on EKS?

@mcristina422

That depends on how you deployed the webhook onto your EKS cluster. You probably just need to rerun the in-cluster installation with the IMAGE updated:
https://github.com/aws/amazon-eks-pod-identity-webhook#in-cluster

@Xyaren
Author

Xyaren commented Dec 1, 2020

I'm pretty sure the webhook runs on the AWS-provided EKS API server.
How do I update it there?

Isn't that something AWS has to do with a platform update?

@brunomanzo

That is the case for me too: the pod isn't managed by me, it is provided by the EKS cluster.
Otherwise I would just edit a deployment. I will open a support case with AWS.

@mcristina422

😕 I wasn't aware EKS considered this a managed service. The docs do not make that apparent; the only thing I could really find is this. On our EKS cluster we had to install it manually, so I'm not sure what the process is for getting it enabled or updated in the platform. A maintainer like @josselin-c will need to speak to that.

@jqmichael
Contributor

We merged a change that updates the client-go version to 1.18, which should support startupProbe:
#92

We will be releasing a new version of the eks-pod-identity webhook with this fix in EKS 1.19, and to all existing clusters in the coming months.

Patches are usually released to the latest EKS-supported k8s version first. Customers who want to get unblocked sooner are advised to upgrade to the latest EKS-supported k8s version.

@moonlsd

moonlsd commented Dec 2, 2020

@jqmichael Once the change is applied to existing clusters, can we expect that simply redeploying our Deployments will work? That is, on a redeploy, will the new webhook version inject the correct attributes into the Pods?

@lukasz-kaniowski

Is there any progress on this issue?

@Xyaren
Author

Xyaren commented Jan 5, 2021

> @jqmichael Once the change is applied to existing clusters, can we expect that simply redeploying our Deployments will work? That is, on a redeploy, will the new webhook version inject the correct attributes into the Pods?

I don't think a redeploy is actually necessary:
The webhook is called whenever a Pod is created for a Deployment, so deleting the Pod (or doing a rolling restart) should be sufficient.

@chengchengmu

Hello @jqmichael!

Thanks for your update!

How will we know when this fix is made available in existing clusters?

@Skaronator

The fix seems to be available in the EKS 1.19 update, which was released today.

@Xyaren
Author

Xyaren commented Feb 18, 2021

Yes, I'll close this ticket.
https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#kubernetes-1.19
