Improve startup performance of IPAM by optimizing kube client creation #1855

Merged
merged 7 commits into from
Apr 11, 2022

@backjo backjo commented Feb 8, 2022

What type of PR is this?
bug

Which issue does this PR fix:

What does this PR do / Why do we need it:
This PR improves IPAM startup performance. Currently, when the IPAM initializes, the kube-controller client performs API discovery to build a RESTMapper, and it does so twice: once for the regular client and once for the cache client. This PR reuses a single RESTMapper for both clients and raises the client's burst setting to reduce client-side throttling while the mapper is being built.

Today, it often takes 5 to 10 seconds for these clients to initialize. See the gap (about 5 seconds in this example) between the log line confirming communication with the k8s cluster and the line emitted when the AWS session starts:

{"level":"info","ts":"2022-02-08T16:53:04.255Z","caller":"aws-k8s-agent/main.go:42","msg":"Successful communication with the Cluster! Cluster Version is: v1.21+. git version: v1.21.5-eks-bc4871b. git tree state: clean. commit: 5236faf39f1b7a7dabea8df12726f25608131aa9. platform: linux/amd64"}
{"level":"warn","ts":"2022-02-08T16:53:09.263Z","caller":"awssession/session.go:64","msg":"HTTP_TIMEOUT env is not set or set to less than 10 seconds, defaulting to httpTimeout to 10sec"}

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:

Testing done on this change:

Automation added to e2e:

Will this PR introduce any new dependencies?:

Will this break upgrades or downgrades. Has updating a running cluster been tested?:

Does this change require updates to the CNI daemonset config files to work?:

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@backjo backjo requested a review from a team as a code owner February 8, 2022 18:25
@backjo backjo changed the title Fix/improve cache performance Improve performance of IPAM by optimizing kube client creation Feb 8, 2022
@jayanthvn jayanthvn requested a review from achevuru February 8, 2022 18:26
@backjo backjo changed the title Improve performance of IPAM by optimizing kube client creation Improve startup performance of IPAM by optimizing kube client creation Feb 8, 2022

achevuru commented Feb 23, 2022

LGTM. Thanks for the PR.

Would be good if you can quickly run the integration test suites on your cluster - this & this

@jayanthvn jayanthvn left a comment

lgtm

@M00nF1sh M00nF1sh merged commit 8e94299 into aws:master Apr 11, 2022
@backjo backjo deleted the fix/ImproveCachePerformance branch April 12, 2022 14:38
@backjo backjo restored the fix/ImproveCachePerformance branch April 12, 2022 14:38
sushrk pushed a commit to sushrk/amazon-vpc-cni-k8s that referenced this pull request Jun 17, 2022
aws#1855)

* fix: improve startup performance of kube clients

* fix: improve startup performance of kube clients

* fix: improve startup performance of kube clients

* fix: improve startup performance of kube clients

* fix compile failure in cni-metrics-helper

Co-authored-by: M00nF1sh <yyyng@amazon.com>