
fix clusterrole permissions: watch cninodes #2567

Closed
wants to merge 1 commit

Conversation

marcincuber
Contributor

@marcincuber marcincuber commented Sep 14, 2023

What type of PR is this?
Fix permissions issue

Which issue does this PR fix:
VPC CNI errors:

E0914 13:07:56.908969      12 reflector.go:148] pkg/mod/k8s.io/client-go@v0.27.3/tools/cache/reflector.go:231: Failed to watch *v1alpha1.CNINode: unknown (get cninodes.vpcresources.k8s.aws)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
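For reference, the change in this PR amounts to adding `watch` to the CNINode rule in the aws-node ClusterRole. The sketch below is illustrative only: the apiGroup and resource names come from the error message, while the surrounding rule structure and the other verbs are assumptions about the chart's existing rule, not a copy of the actual diff:

```yaml
# Sketch of the aws-node ClusterRole rule with "watch" added for CNINode.
# apiGroup/resource taken from the error (cninodes.vpcresources.k8s.aws);
# the other verbs are assumed to match the existing rule.
- apiGroups:
    - vpcresources.k8s.aws
  resources:
    - cninodes
  verbs:
    - get
    - list
    - watch
    - patch
```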

@marcincuber marcincuber requested a review from a team as a code owner September 14, 2023 13:15
@jdn5126
Contributor

jdn5126 commented Sep 14, 2023

@marcincuber in what context are you seeing this error? The VPC CNI only does a GET and PATCH on CNINode (https://github.com/aws/amazon-vpc-cni-k8s/blob/v1.15.0/pkg/ipamd/ipamd.go#L2283), so WATCH should not be required

@marcincuber
Contributor Author

marcincuber commented Sep 14, 2023

> @marcincuber in what context are you seeing this error? The VPC CNI only does a GET and PATCH on CNINode (https://github.com/aws/amazon-vpc-cni-k8s/blob/v1.15.0/pkg/ipamd/ipamd.go#L2283), so WATCH should not be required

I'm not sure which context, but we upgraded to 1.15.0 today and immediately saw the error being thrown every 50 seconds or so. Adding watch permissions fixed it. I believe it was happening in the main aws-node container.

Worth noting that it wasn't happening on the 1.14.x releases, which we also tested yesterday.

@jdn5126
Contributor

jdn5126 commented Sep 14, 2023

This is just an error log in the IPAMD logs, right? In other words, the container itself is not crashing and restarting? What I assume is happening is that the k8s client is trying to cache CNINode, so it issues a WATCH and logs an error since it lacks permission. We do not need or want to cache this resource locally, so the log is harmless, but we should still fix it in a future version.

@marcincuber
Contributor Author

> This is just an error log in the IPAMD logs, right? In other words, the container itself is not crashing and restarting? What I assume is happening is that the k8s client is trying to cache CNINode, so it issues a WATCH and logs an error since it lacks permission. We do not need or want to cache this resource locally, so the log is harmless, but we should still fix it in a future version.

Correct, the container runs fine and does not crash. It just logs the same error repeatedly without the watch permission.
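As a quick way to confirm which verbs the aws-node service account actually holds, `kubectl auth can-i` can be run against the cluster. The service-account name and namespace below are the usual VPC CNI defaults and are assumptions; adjust them for a customized install:

```shell
# Check whether the aws-node service account may WATCH CNINode resources.
# kube-system/aws-node is the default SA for the VPC CNI; adjust if customized.
kubectl auth can-i watch cninodes.vpcresources.k8s.aws \
  --as=system:serviceaccount:kube-system:aws-node
# "no" is consistent with the reflector error above; after adding the
# watch verb to the ClusterRole, this should report "yes".
```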

@jdn5126
Contributor

jdn5126 commented Sep 14, 2023

Thanks for finding this @marcincuber! I want to do some more digging on a solution that does not involve caching CNINode resources, as they scale with the number of nodes in the cluster. Adding watch permission is a great workaround in the meantime.

@jdn5126
Contributor

jdn5126 commented Sep 15, 2023

@marcincuber I created #2570 for the long-term fix, as this way no WATCH is created for CNINode resources.

@jdn5126
Contributor

jdn5126 commented Sep 19, 2023

Closing in favor of #2570

@jdn5126 jdn5126 closed this Sep 19, 2023
@marcincuber marcincuber deleted the fix/role-perms branch September 19, 2023 18:12
3 participants