To ensure Workload Identity is correctly configured for a Google Cloud Service Account (GSA), and a Kubernetes Service Account (KSA), you need to check:
-
If the Google Cloud Service Account has the desired
iam-policy-binding
.By running the following command:
gcloud iam service-accounts get-iam-policy gsa-name@$PROJECT_ID.iam.gserviceaccount.com
if the
iam-policy-binding
is correct, you'll find aniam-policy-binding
similar to:bindings: - members: - serviceAccount:$PROJECT_ID.svc.id.goog[ksa-namespace/ksa-name] role: roles/iam.workloadIdentityUser
-
If the Kubernetes Service Account has the desired annotation.
By running the following command:
kubectl get serviceaccount ksa-name -n ksa-namespace -o yaml
if the annotation exists, you'll find an annotation similar to:
metadata: annotations: iam.gke.io/gcp-service-account: gsa-name@$PROJECT_ID.iam.gserviceaccount.com
To ensure kubernetes Secrets correctly are configured in a namespace, you need to check if a Kubernetes Secret with the correct name is in the namespace.
By default, this secret is named google-cloud-key
, but it could be set to
something else either in the custom object itself (e.g. the Source's
spec.secret
) or in the config-gcp-auth
ConfigMap. Verify that a secret with
the correct name is in the same namespace as the custom object by running the
following command:
kubectl get secret -n namespace
If a resource instance is not READY due to an authentication configuration
problem, you are likely to get permission related error for
User not authorized to perform this action
(for Workload Identity, the error
might look like
IAM returned 403 Forbidden: The caller does not have permission
).
This error indicates that the Control Plane may not configure authentication properly. You can find it in a resource instance by:
kubectl describe resource-kind resource-instance-name -n resource-instance-namespace
Here is a detailed example checking a CloudAuditlogsSource test
in namespace
default
:
After running the following command, you will find a couple of Condition
s
under this CloudAuditLogsSource's Status
.
kubectl describe cloudauditlogssource test -n default
-
If this CloudAuditLogsSource failed due to a missing Kubernetes Secret (for Kubernetes Secret authentication configuration), or a missing Kubernetes Service Account annotation (for Workload Identity authentication configuration), the
Condition
Ready
would look like:Type: Ready Status: False Message: Failed to reconcile Pub/Sub topic: rpc error: code = PermissionDenied desc = User not authorized to perform this action. Reason: TopicReconcileFailed
-
If this CloudAuditLogsSource failed due to the JSON private key stored in the Kubernetes Secret having been revoked (for Kubernetes Secret authentication configuration), the
Condition
Ready
would look like:Type: Ready Status: False Message: Failed to reconcile Pub/Sub topic: rpc error: code = Unauthenticated desc = transport: oauth2: cannot fetch token: 400 Bad Request Response: {"error":"invalid_grant","error_description":"Invalid JWT Signature."} Reason: TopicReconcileFailed
-
If this CloudAuditLogsSource failed due to the GSA having insufficient permissions (for either Kubernetes Secret authentication configuration or Workload Identity authentication configuration), the
Condition
Ready
would look like:Type: Ready Status: False Message: Failed to ensure creation of logging sink: rpc error: code = PermissionDenied desc = The caller does not have permission Reason: SinkCreateFailed
-
If this CloudAuditLogsSource failed due to the
iam-policy-binding
being setup incorrectly on the Google Service Account (for Workload Identity authentication configuration), theCondition
Ready
would look like:Type: Ready Status: False Message: Failed to reconcile Pub/Sub topic: rpc error: code = Unauthenticated desc = transport: compute: Received 403 ` unable to generate token; IAM returned 403 Forbidden: The caller does not have permission his error could be caused by a missing IAM policy binding on the target IAM service account. Reason: TopicReconcileFailed
To solve this issue, you can:
-
Check the Google Cloud Service Account
cloud-run-events
for the Control Plane has all required permissions. -
Check authentication configuration is correct for the Control Plane.
- If you are using Workload Identity for the Control Plane, refer
here to
check the Google Cloud Service Account
cloud-run-events
, and the Kubernetes Service Accountcontroller
in namespacecloud-run-events
. - If you are using Kubernetes Secret for the Control Plane, refer
here to
check the Kubernetes Secret
google-cloud-key
in namespacecloud-run-events
.
- If you are using Workload Identity for the Control Plane, refer
here to
check the Google Cloud Service Account
Note: For Kubernetes Secret, if the JSON private key no longer exists
under your Google Cloud Service Account cloud-run-events
. Then, even the
Google Cloud Service Account cloud-run-events
has all required permissions,
and the corresponding Kubernetes Secret google-cloud-key
is in namespace
cloud-run-events
, you still get permission related error. To such case, you
have to re-download the JSON private key and re-create the Kubernetes Secret,
refer
here
for instructions.
Sometimes, a resource instance is READY, but it can't receive Events. It might
be an authentication configuration problem, if the underlying Deployment
doesn't have minimum availability.
Typically, the name of an underlying Deployment
for a resource instance could
be a truncated version of (prefix)-(resource-instance-name)-(uid)
.
- If the resource instance is a
Source
, the prefix iscre-src
. - If the resource instance is a
Channel
, the prefix iscre-chan
. - If the resource instance is a
Pullsubscription
, the prefix iscre-ps
You can use the following command to check if the underlying Deployment
's
available pod is zero.
kubectl get deployment -n resource-instance-namespace
Here is a detailed example checking a CloudAuditlogsSource test
's underlying
Deployment
in namespace default
:
After running the following command, you will find a Deployment
named as
cre-src-cloudauditlogssource-te(UID)
.
kubectl get deployment -n default
-
If this
Deployment
doesn't have minimum availability due to a missing Kubernetes Secret (for Kubernetes Secret authentication configuration), thePod
which belongs to thisDeployment
(Pod
's name will have the same prefixcre-src-cloudauditlogssource-
as theDeployment
's name) will block atContainerCreating
status. Usingkubectl describe pod pod-name -n namespace
, you can find a Warning Event like this:Warning FailedMount 27s (x2 over 2m40s) kubelet, gke-knative-1-default-pool-73d30583-c1hv Unable to mount volumes for pod "cre-src-cloudauditlogssource-tef1a55bfca8e1d5620e0c4199495mwk6k_default(f35883bd-ee66-4497-94fa-05196a6043fe)": timeout expired waiting for volumes to attach or mount for pod "default"/"cre-src-cloudauditlogssource-tef1a55bfca8e1d5620e0c4199495mwk6k". list of unmounted volumes=[google-cloud-key]. list of unattached volumes=[google-cloud-key default-token-dd9cd]
-
If this
Deployment
doesn't have minimum availability due to the JSON private key stored in the Kubernetes Secret having been revoked (for Kubernetes Secret authentication configuration), thePod
which belongs to thisDeployment
(Pod
's name will have the same prefixcre-src-cloudauditlogssource-
as theDeployment
's name) will block atError
status. Usingkubectl log pod-name -n namespace
, you can find a Log containing information like this:"msg":"failed to start adapter: ", "commit":"9e8388f", "error":"unable to create subscription \"cre-src_default_cloudauditlogssource-test_a2ca50b7-c951-4682-b8f0-e902a449dbb1\", rpc error: code = Unauthenticated desc = transport: oauth2: cannot fetch token: 400 Bad Request\nResponse: {\"error\":\"invalid_grant\",\"error_description\":\"Invalid JWT Signature.\"}"
-
If this
Deployment
doesn't have minimum availability due to the GSA having insufficient permissions (for either Kubernetes Secret authentication configuration or Workload Identity authentication configuration), thePod
which belongs to thisDeployment
(Pod
's name will have the same prefixcre-src-cloudauditlogssource-
as theDeployment
's name) will block atError
status. Usingkubectl log pod-name -n namespace
, you can find a Log containing information like this:"msg":"failed to start adapter: ", "commit":"9e8388f", "error":"unable to create subscription \"cre-src_default_cloudauditlogssource-test_663a7f95-8b82-46aa-85b8-8fee4e65b26f\", rpc error: code = PermissionDenied desc = User not authorized to perform this action.
-
If this
Deployment
doesn't have minimum availability due to a missing Kubernetes Service Account (for Workload Identity authentication configuration), theDeployment
can't create anyPod
. Usingkubectl describe deployment-name -n namespace
, you can find aCondition
ReplicaFailure
underStatus
like this:type: ReplicaFailure status: "True" reason: FailedCreate message: 'pods "cre-src-cloudauditlogssource-tebc146e4349a11efaf078c20f8ab65089-6b6b57997d-" is forbidden: error looking up service account default/test1: serviceaccount "test1" not found'
-
If this
Deployment
doesn't have minimum availability due to a missing Kubernetes Service Account annotation (for Workload Identity authentication configuration), thePod
which belongs to thisDeployment
(Pod
's name will have the same prefixcre-src-cloudauditlogssource-
as theDeployment
's name) will block atError
status. Usingkubectl log pod-name -n namespace
, you can find a Log containing information like this:"msg":"failed to start adapter: ", "commit":"9e8388f", "error":"unable to create subscription \"cre-src_default_cloudauditlogssource-test_663a7f95-8b82-46aa-85b8-8fee4e65b26f\", rpc error: code = PermissionDenied desc = User not authorized to perform this action.
-
If this
Deployment
doesn't have minimum availability due to theiam-policy-binding
being setup incorrectly (for Workload Identity authentication configuration), thePod
which belongs to thisDeployment
(Pod
's name will have the same prefixcre-src-cloudauditlogssource-
as theDeployment
's name) will block atError
status. Usingkubectl log pod-name -n namespace
, you can find a Log containing information like this:"msg":"failed to start adapter: ", "commit":"9e8388f", "error":"unable to create subscription \"cre-src_default_cloudauditlogssource-test_4daeacfe-6155-4f69-8d52-64745fa0591d\", rpc error: code = Unauthenticated desc = transport: compute: Received 403 `\nUnable to generate token; IAM returned 403 Forbidden: The caller does not have permission\n\nThis error could be caused by a missing IAM policy binding on the target IAM service account.
To solve this issue, you can:
-
Check the Google Cloud Service Account
cre-dataplane
for the Data Plane has all required permissions. -
Check authentication configuration is correct for this resource instance.
- If you are using Workload Identity for this resource instance, refer
here to
check the Google Cloud Service Account
cre-dataplane
, and the Kubernetes Service Account in the namespace where this resource instance resides. - If you are using Kubernetes Secrets for this resource instance, refer here to check the Kubernetes Secret in namespace where this resource instance resides.
- If you are using Workload Identity for this resource instance, refer
here to
check the Google Cloud Service Account
Note: For Workload Identity, there is a known issue #759 for credential sync delay (~1 min) in resources' underlying components. You'll probably encounter this issue, if you immediately send Events after you finish Workload Identity setup for a resource instance.
This error only exists when you use default scenario for Workload Identity. You can find detailed failure message by:
kubectl describe resource-kind resource-instance-name -n resource-instance-namespace
To solve this issue, you can:
- Make sure the
workloadIdentityMapping
underdefault-auth-config
inConfigMap
config-gcp-auth
is correct (a correct Kubernetes Service Account paired with a correct Google Cloud Service Account). - If the
Condition
Ready
has permission related error message like this:it is most likely that you didn't granttype: Ready status: "False" message: 'rpc error: code = PermissionDenied desc = Permission iam.serviceAccounts.setIamPolicy is required to perform this operation on service account projects/-/serviceAccounts/cre-dataplane@PROJECT_ID.iam.gserviceaccount.com.' reason: WorkloadIdentityFailed
iam.serviceAccountAdmin
permission of the Google Cloud Service Account to the Control Plane's Google Cloud Service Accountcloud-run-events
, refer to default scenario to grant permission. - If the
Condition
Ready
has concurrency related error message like this:the controller will retry it in the next reconciliation loop (the maximum retry period is 5 min). You can also use non-default scenario if this error lasts for a long time.type: Ready status: "False" message: 'adding iam policy binding failed with: failed to set iam policy: googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff' reason: WorkloadIdentityFailed