Feature Request: Support for EKS Pod Identities #493

Open
ricardo8990 opened this issue Jun 26, 2024 · 2 comments

@ricardo8990

If you want to see App Mesh implement this idea, please upvote with a 👍.

Tell us about your request
As far as I can tell, EKS Pod Identities are not currently supported by the Envoy containers that App Mesh injects into pods in EKS.

Which integration(s) is this request for?
EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I created an app in my EKS cluster and granted it permissions using EKS Pod Identities. I'm deploying a Node app with an AppConfig container. On its own this works fine and the permissions behave as expected. However, when I added the App Mesh integration, with the Envoy container injected automatically, I started receiving the following error:

[2024-06-26 16:00:34.205][21][error][aws] [source/extensions/common/aws/credentials_provider_impl.cc:302] Could not load AWS credentials document from the task role
[2024-06-26 16:00:34.208][15][warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:152] StreamAggregatedResources gRPC config stream to appmesh-envoy-management.us-west-2.amazonaws.com:443 closed: 16, Missing Authentication Token

This causes the AppConfig container to fail when it tries to fetch the parameters:

[appconfig agent] 2024/06/26 15:37:10 INFO AppConfig Agent 2.0.3896
[appconfig agent] 2024/06/26 15:37:10 INFO serving on localhost:2772
[appconfig agent] 2024/06/26 15:37:32 ERROR retrieve failure for 'APP:ENV:DEP': bad gateway: network error connecting to service (retry in 60s)

However, I can see that the environment variables that EKS Pod Identities inject are correctly set in the Envoy container:

        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: regional
        - name: AWS_CONTAINER_CREDENTIALS_FULL_URI
          value: 'http://169.254.170.23/v1/credentials'
        - name: AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE
          value: >-
            /var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
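
The variables being set does not by itself show that the token file is mounted and non-empty in the container. The following is a minimal Python sketch of those checks. The function name `check_pod_identity_env` is hypothetical, and the Envoy image may not ship a Python interpreter, so treat this as a description of what to verify rather than something guaranteed to run there as-is.

```python
import os
from typing import List, Mapping

# Variables that EKS Pod Identity injects, as shown in the manifest above.
REQUIRED_VARS = (
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
    "AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE",
)


def check_pod_identity_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    token_path = env.get("AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE")
    if token_path:
        if not os.path.isfile(token_path):
            problems.append(f"token file {token_path} does not exist")
        elif not os.access(token_path, os.R_OK):
            problems.append(f"token file {token_path} is not readable")
        elif os.path.getsize(token_path) == 0:
            problems.append(f"token file {token_path} is empty")
    return problems
```

Calling `check_pod_identity_env()` with no arguments inspects the current process environment. If it returns an empty list, the environment and the mounted token look fine, which points at the client not using them.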

This is the whole manifest for this particular container:

    - env:
        - name: APPMESH_PLATFORM_K8S_VERSION
          value: v1.29.4-eks-036c24b
        - name: APPNET_AGENT_ADMIN_UDS_PATH
          value: /tmp/agent.sock
        - name: APPMESH_PLATFORM_APP_MESH_CONTROLLER_VERSION
          value: v1.12.7-dirty
        - name: APPMESH_RESOURCE_ARN
          value: mesh/MESH/virtualNode/NODE_MESH
        - name: ENVOY_ADMIN_ACCESS_ENABLE_IPV6
          value: 'false'
        - name: APPMESH_FIPS_ENDPOINT
          value: '0'
        - name: ENVOY_LOG_LEVEL
          value: info
        - name: APPMESH_DUALSTACK_ENDPOINT
          value: '0'
        - name: APPMESH_PREVIEW
          value: '0'
        - name: ENVOY_ADMIN_ACCESS_LOG_FILE
          value: /tmp/envoy_admin_access.log
        - name: APPNET_AGENT_ADMIN_MODE
          value: uds
        - name: APPMESH_VIRTUAL_NODE_NAME
          value: mesh/MESH/virtualNode/NODE_MESH
        - name: AWS_REGION
          value: us-west-2
        - name: ENVOY_ADMIN_ACCESS_PORT
          value: '9901'
        - name: APPMESH_PLATFORM_K8S_POD_UID
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.uid
        - name: AWS_STS_REGIONAL_ENDPOINTS
          value: regional
        - name: AWS_CONTAINER_CREDENTIALS_FULL_URI
          value: 'http://169.254.170.23/v1/credentials'
        - name: AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE
          value: >-
            /var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
      image: >-
        840364872350.dkr.ecr.us-west-2.amazonaws.com/aws-appmesh-envoy:v1.27.2.0-prod
      imagePullPolicy: IfNotPresent
      lifecycle:
        preStop:
          exec:
            command:
              - sh
              - '-c'
              - sleep 20
      name: envoy
      ports:
        - containerPort: 9901
          name: stats
          protocol: TCP
      readinessProbe:
        exec:
          command:
            - sh
            - '-c'
            - >-
              curl -s http://localhost:9901/server_info | grep state | grep -q
              LIVE
        failureThreshold: 3
        initialDelaySeconds: 1
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      resources:
        requests:
          cpu: 10m
          memory: 32Mi
      securityContext:
        runAsUser: 1337
      terminationMessagePath: /dev/termination-log
      terminationMessagePolicy: File
      volumeMounts:
        - mountPath: /var/run/secrets/pods.eks.amazonaws.com/serviceaccount
          name: eks-pod-identity-token
          readOnly: true
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-8nkjm
          readOnly: true

I wonder whether EKS Pod Identities are simply not supported at this time, or whether I am missing something.

By the way, the app role already has permissions for appmesh:StreamAggregatedResources, with the resource set to the Virtual Node ARN.

@ricardo8990
Author

I set ENVOY_LOG_LEVEL to debug and found these logs:

[2024-06-27 03:40:49.620][22][debug][aws] [source/extensions/common/aws/credentials_provider_impl.cc:67] Getting AWS credentials from the environment
[2024-06-27 03:40:49.620][22][debug][aws] [source/extensions/common/aws/credentials_provider_impl.cc:288] Getting AWS credentials from the task role at URI: http://169.254.170.23/v1/credentials
[2024-06-27 03:40:49.621][22][debug][misc] [source/extensions/common/aws/utility.cc:300] Could not fetch AWS metadata: HTTP response code said error
[2024-06-27 03:40:50.281][17][debug][main] [source/server/server.cc:263] flushing stats
[2024-06-27 03:40:50.281][17][debug][main] [source/server/server.cc:273] Envoy is not fully initialized, skipping histogram merge and flushing stats
[2024-06-27 03:40:50.622][22][debug][misc] [source/extensions/common/aws/utility.cc:300] Could not fetch AWS metadata: HTTP response code said error
[2024-06-27 03:40:51.623][22][debug][misc] [source/extensions/common/aws/utility.cc:300] Could not fetch AWS metadata: HTTP response code said error
[2024-06-27 03:40:52.624][22][debug][misc] [source/extensions/common/aws/utility.cc:300] Could not fetch AWS metadata: HTTP response code said error
[2024-06-27 03:40:53.624][22][error][aws] [source/extensions/common/aws/credentials_provider_impl.cc:302] Could not load AWS credentials document from the task role
[2024-06-27 03:40:53.625][22][debug][aws] [source/extensions/common/aws/credentials_provider_impl.cc:442] No AWS credentials found, using anonymous credentials
[2024-06-27 03:40:53.627][17][debug][grpc] [source/common/grpc/google_async_client_impl.cc:379] Finish with grpc-status code 16
[2024-06-27 03:40:53.627][17][debug][grpc] [source/common/grpc/google_async_client_impl.cc:224] notifyRemoteClose 16 Missing Authentication Token
[2024-06-27 03:40:53.627][17][warning][config] [./source/extensions/config_subscription/grpc/grpc_stream.h:152] StreamAggregatedResources gRPC config stream to appmesh-envoy-management.us-west-2.amazonaws.com:443 closed: 16, Missing Authentication Token
[2024-06-27 03:40:53.627][17][debug][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:115] gRPC update for type.googleapis.com/envoy.config.cluster.v3.Cluster failed

Looking at the Pod Identity logs, I can see this repeated many times:

{"client-addr":"10.0.3.220:59826","cluster-name":"CLUSTER_NAME","level":"info","msg":"handling new request request from 10.0.3.220:59826","time":"2024-06-27T03:43:22Z"}
{"client-addr":"10.0.3.220:59826","cluster-name":"CLUSTER_NAME","level":"error","msg":"Error fetching credentials: Service account token cannot be empty","time":"2024-06-27T03:43:22Z"}
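
The "Service account token cannot be empty" error suggests that the request reached the Pod Identity endpoint without a token attached. For comparison, here is a minimal Python sketch of how a container-credentials client is generally expected to build that request: read the token from the file named by `AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE` and send it in the `Authorization` header. It is modeled on the behavior of the AWS SDKs, not on Envoy's code. The names `build_credentials_request` and `read_text` are hypothetical, and stripping surrounding whitespace from the token is an assumption of this sketch.

```python
import os
from typing import Callable, Dict, Mapping, Tuple


def read_text(path: str) -> str:
    """Read a whole text file; kept separate so it can be replaced in tests."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def build_credentials_request(
    env: Mapping[str, str] = os.environ,
    read_file: Callable[[str], str] = read_text,
) -> Tuple[str, Dict[str, str]]:
    """Return the (url, headers) a client would use to fetch container credentials."""
    url = env["AWS_CONTAINER_CREDENTIALS_FULL_URI"]
    headers: Dict[str, str] = {}

    token_file = env.get("AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE")
    if token_file:
        # The token file takes precedence; EKS Pod Identity uses this form.
        # Assumption: whitespace around the token is not significant.
        headers["Authorization"] = read_file(token_file).strip()
    elif env.get("AWS_CONTAINER_AUTHORIZATION_TOKEN"):
        # Fallback: token passed directly in an environment variable.
        headers["Authorization"] = env["AWS_CONTAINER_AUTHORIZATION_TOKEN"]

    return url, headers
```

A client that ignores the token file would send the request with no `Authorization` header, which would be consistent with both the Envoy "HTTP response code said error" lines and the error logged on the Pod Identity side.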

@AhmadMS1988

AhmadMS1988 commented Jul 17, 2024

Adding to this point: upstream Envoy has supported this since 1.30.0.
https://github.com/envoyproxy/envoy/blob/f79b881883e862bc0f7dc7f09d3bc811fb0944f6/changelogs/1.30.0.yaml#L483
Can we have an aws-appmesh-envoy image based on 1.30?
Thanks
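
To check which injected sidecars fall below that version, the image tag can be compared against 1.30.0. The Python sketch below assumes that the first three numeric components of an aws-appmesh-envoy tag (for example `v1.27.2.0-prod` from the manifest above) are the upstream Envoy version. That mapping is an assumption, and the function names are hypothetical.

```python
import re
from typing import Tuple

# First upstream Envoy release reported in this thread to support Pod Identities.
MIN_SUPPORTED: Tuple[int, int, int] = (1, 30, 0)


def envoy_version_from_tag(tag: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a tag such as 'v1.27.2.0-prod'."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"unrecognized image tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def image_meets_minimum(image: str) -> bool:
    """Return True if the image tag is at or above MIN_SUPPORTED."""
    tag = image.rsplit(":", 1)[-1]
    return envoy_version_from_tag(tag) >= MIN_SUPPORTED
```

For the image in this issue, `image_meets_minimum("840364872350.dkr.ecr.us-west-2.amazonaws.com/aws-appmesh-envoy:v1.27.2.0-prod")` returns `False`, since 1.27.2 is below 1.30.0.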
