Increase datastore pool at startup #2354

Merged
merged 1 commit into from
May 12, 2023


@jdn5126 jdn5126 commented Apr 19, 2023

What type of PR is this?
bug

Which issue does this PR fix:
#1724

What does this PR do / Why do we need it:
This PR fixes two issues:

  1. The container entrypoint was not properly validating IPAMD startup.
  2. During node initialization, IPAMD previously built the datastore only from already attached ENIs and available CIDRs. Now, if the datastore pool is too low (based on settings), IPAMD tries to grow the pool by attaching ENIs, which improves startup time for custom networking and branch ENIs. For custom networking there is an additional check: if the datastore is still empty after the increase attempt, there must be a configuration error, so IPAMD returns an error and the pod goes into CrashLoopBackOff until the configuration error is resolved. In the future, we want to generate an event for this case.

If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
N/A

Testing done on this change:
Manually verified that when ENIConfigs are missing, the node remains in the "Not Ready" state because IPAMD returns an error. Also verified that when ENIConfigs are present, ENIs are attached during node initialization.

Verified that CNI integration release tests pass. Scheduling GitHub runner as well.

Automation added to e2e:
N/A

Will this PR introduce any new dependencies?:
No

Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No, this will not break upgrades or downgrades. Yes, updating a running cluster has been tested.

Does this change require updates to the CNI daemonset config files to work?:
No

Does this PR introduce any user-facing change?:
Yes

Enhance node initialization to attach ENIs if needed.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jdn5126 jdn5126 requested a review from a team as a code owner April 19, 2023 16:23
cmd/aws-vpc-cni/main.go (outdated, resolved)
pkg/ipamd/ipamd.go (outdated, resolved)
@jdn5126 jdn5126 force-pushed the cn_opt branch 3 times, most recently from b96ef96 to ea99b53 Compare May 2, 2023 15:57
@jdn5126 jdn5126 force-pushed the cn_opt branch 5 times, most recently from 4b7ad82 to 08b3b42 Compare May 9, 2023 20:57
@jdn5126 jdn5126 force-pushed the cn_opt branch 3 times, most recently from 1122fce to 86b7843 Compare May 11, 2023 21:16
timeSinceLast := time.Since(c.lastNodeIPPoolAction)
if timeSinceLast <= interval {
	// Make an exception if node needs a trunk ENI and one is not currently attached.
One callout for this change: if the trunk ENI fails to attach because of any VPC RC issue, aws-node keeps running the reconciliation. Maybe as an enhancement we should make it a backoff retry in SGPP cases.

@jayanthvn jayanthvn left a comment
lgtm

Release testing should cover negative test cases for increasing the DS pool, as well as scenarios where the trunk ENI fails to attach.

2 participants