cluster-autoscaler caches workload cluster kubeconfig #4784
Comments
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/remove-lifecycle rotten
/remove-lifecycle stale
/assign I'm gonna take a stab at this.
@charlie-haley Hello 👋 Wouldn't it be better to not cache the config at all, and instead always fetch the kubeconfig whenever it needs to access something? At least that's what I'm planning on doing.
(copied from slack) Another approach might be to add a flag that lets a user specify that the kubeconfig should be reloaded on each transaction. My concern is that the cluster-api provider is quite chatty on the kube client, and I wonder if rebuilding the client every 15 seconds would negatively affect performance. I don't think there would be an issue rebuilding it once every scan interval, so maybe we could focus on keeping one cached client per interval or something. One more thought: rebuilding on failure is another option here. On further reflection, the flag is probably not a good option. We should either rebuild every interval, or rebuild on failure, potentially keeping a single cached client as a backup.
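To make the rebuild-on-failure option concrete, here is a minimal sketch, assuming a hypothetical `loadCfg` hook that reads the CAPI kubeconfig secret and using `ListNodes` as the example operation; it is not the provider's actual code. The cached clientset is only rebuilt when a request comes back Unauthorized.

```go
package workload

import (
	"context"
	"sync"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// workloadClient keeps a single cached clientset for the workload cluster and
// rebuilds it from the CAPI kubeconfig secret only when authentication fails.
type workloadClient struct {
	mu      sync.Mutex
	cs      kubernetes.Interface
	loadCfg func(ctx context.Context) ([]byte, error) // hypothetical: reads the kubeconfig secret
}

func (w *workloadClient) client() kubernetes.Interface {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.cs
}

// rebuild replaces the cached clientset with one built from a freshly read kubeconfig.
func (w *workloadClient) rebuild(ctx context.Context) error {
	raw, err := w.loadCfg(ctx)
	if err != nil {
		return err
	}
	restCfg, err := clientcmd.RESTConfigFromKubeConfig(raw)
	if err != nil {
		return err
	}
	cs, err := kubernetes.NewForConfig(restCfg)
	if err != nil {
		return err
	}
	w.mu.Lock()
	w.cs = cs
	w.mu.Unlock()
	return nil
}

// ListNodes retries once with a rebuilt client if the cached token has expired.
func (w *workloadClient) ListNodes(ctx context.Context) (*corev1.NodeList, error) {
	nodes, err := w.client().CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil && apierrors.IsUnauthorized(err) {
		if rerr := w.rebuild(ctx); rerr != nil {
			return nil, rerr
		}
		nodes, err = w.client().CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	}
	return nodes, err
}
```

Rebuilding only on failure keeps the common path cheap, at the cost of one failed request whenever the token rotates; rebuilding once per scan interval instead trades that for a small, bounded amount of extra client construction.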
Cool, further talk with more ideas:
/remove-lifecycle rotten
Actively working on this. We have a way forward: we are going to fetch the token from the kubeconfig, create a client and override the transport, and keep refreshing the token in a goroutine separate from the actual calls. This will make sure we don't have to rebuild the informers all the time, and the client should remain authenticated. 🤞 Testing this will be super painful. :D
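For illustration, a rough sketch of that plan, with assumptions noted inline (the secret key, the five-minute refresh interval, and the helper names are placeholders, not the eventual implementation): read the bearer token out of the CAPI kubeconfig secret, build the clientset once with a transport wrapper that injects the current token, and refresh the token in a background goroutine.

```go
package workload

import (
	"context"
	"net/http"
	"sync"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/klog/v2"
)

type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// tokenFromKubeconfig pulls the bearer token out of a serialized kubeconfig.
func tokenFromKubeconfig(raw []byte) string {
	cfg, err := clientcmd.Load(raw)
	if err != nil {
		return ""
	}
	for _, auth := range cfg.AuthInfos {
		if auth.Token != "" {
			return auth.Token
		}
	}
	return ""
}

// newWorkloadClientset builds a clientset for the workload cluster whose transport
// always carries the latest token from the kubeconfig secret on the management cluster.
func newWorkloadClientset(ctx context.Context, mgmt kubernetes.Interface, ns, secretName string) (kubernetes.Interface, error) {
	readRaw := func() ([]byte, error) {
		s, err := mgmt.CoreV1().Secrets(ns).Get(ctx, secretName, metav1.GetOptions{})
		if err != nil {
			return nil, err
		}
		return s.Data["value"], nil // Cluster API stores the kubeconfig under the "value" key
	}

	raw, err := readRaw()
	if err != nil {
		return nil, err
	}

	var mu sync.RWMutex
	token := tokenFromKubeconfig(raw)
	getToken := func() string { mu.RLock(); defer mu.RUnlock(); return token }

	restCfg, err := clientcmd.RESTConfigFromKubeConfig(raw)
	if err != nil {
		return nil, err
	}
	restCfg.BearerToken = "" // the wrapper below supplies the live token instead
	restCfg.WrapTransport = func(rt http.RoundTripper) http.RoundTripper {
		return roundTripperFunc(func(req *http.Request) (*http.Response, error) {
			r := req.Clone(req.Context())
			r.Header.Set("Authorization", "Bearer "+getToken())
			return rt.RoundTrip(r)
		})
	}

	// Refresh the token in a goroutine, separate from the actual calls.
	go func() {
		ticker := time.NewTicker(5 * time.Minute) // refresh interval is an assumption
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if fresh, err := readRaw(); err != nil {
					klog.Warningf("refreshing workload cluster token: %v", err)
				} else if t := tokenFromKubeconfig(fresh); t != "" {
					mu.Lock()
					token = t
					mu.Unlock()
				}
			}
		}
	}()

	return kubernetes.NewForConfig(restCfg)
}
```

Because the rest.Config and the clientset are created once and only the Authorization header changes per request, informers built on top of this client never need to be rebuilt, which is the point of the approach described above.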
Sorry for the slow reply, I was away! That's correct, we'd reload the autoscaler when the secret changed, which was roughly every 10 minutes. It's not the cleanest solution, but it worked for our use case so I never delved into it further. Refreshing the kubeconfig token in the background definitely sounds like a good plan 🎉
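For reference, a minimal sketch of that kind of workaround, assuming placeholder namespace, secret name, and label selector (this is not necessarily how it was implemented): a small helper watches the kubeconfig secret and deletes the autoscaler pod whenever the secret's contents change, so its Deployment recreates it with the fresh token.

```go
package main

import (
	"bytes"
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)

	ns, secretName := "default", "my-cluster-kubeconfig" // placeholders

	// Watch only the single kubeconfig secret.
	lw := cache.NewListWatchFromClient(cs.CoreV1().RESTClient(), "secrets", ns,
		fields.OneTermEqualSelector("metadata.name", secretName))

	_, ctrl := cache.NewInformer(lw, &corev1.Secret{}, 0, cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldSecret, newSecret := oldObj.(*corev1.Secret), newObj.(*corev1.Secret)
			if bytes.Equal(oldSecret.Data["value"], newSecret.Data["value"]) {
				return // kubeconfig unchanged, nothing to do
			}
			// Delete the autoscaler pod; its Deployment recreates it and the new
			// pod reads the refreshed kubeconfig. The label selector is a placeholder.
			err := cs.CoreV1().Pods(ns).DeleteCollection(context.Background(),
				metav1.DeleteOptions{},
				metav1.ListOptions{LabelSelector: "app.kubernetes.io/name=cluster-autoscaler"})
			if err != nil {
				log.Printf("restarting cluster-autoscaler: %v", err)
			}
		},
	})

	stop := make(chan struct{})
	ctrl.Run(stop)
}
```

This simply forces the autoscaler to re-read the kubeconfig on restart; the in-process token refresh discussed above avoids the restart entirely.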
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
I'm not sure we've completely solved this yet, but I want to keep it open until we know, as this is important to CAPI. /reopen
@elmiko: Reopened this issue.
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Which component are you using?: cluster-autoscaler
What version of the component are you using?: 1.21.2
What k8s version are you using (kubectl version)?:
What environment is this in?: EKS managed by Cluster API
What did you expect to happen?:
The autoscaler should use the renewed kubeconfig for the workload cluster.
What happened instead?:
After roughly 10 minutes the token in the kubeconfig secret for the workload cluster is renewed and the autoscaler no longer works; the pod has to be killed for it to pick up the new token.
How to reproduce it (as minimally and precisely as possible):
Deploy cluster-autoscaler on a CAPI management cluster, pointing at a workload cluster.
Example Helm values: