🐛 fix upgrade of managed node groups using custom AMIs #4830

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:main from fix-ng-lt-update on Mar 7, 2024

fad3t (Contributor) commented Feb 28, 2024

Hello all,

What type of PR is this?

/kind bug

What this PR does / why we need it:

When using a custom AMI with a launch template, node group upgrades get stuck with the following error message:

E0227 15:07:47.415572       1 controller.go:329] "Reconciler error" err=<
	failed to reconcile machine pool for AWSManagedMachinePool dev/pocg1eks80-mmp-infra: failed to reconcile nodegroup version: failed to update EKS nodegroup: InvalidParameterException: Launch template details can't be null for Custom ami type node group
	{
	  RespMetadata: {
	    StatusCode: 400,
	    RequestID: "e8b04ccf-56aa-4bab-b313-0f4b25137e5c"
	  },
	  Message_: "Launch template details can't be null for Custom ami type node group"
	}
 > controller="awsmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AWSManagedMachinePool" AWSManagedMachinePool="dev/pocg1eks80-mmp-infra" namespace="dev" name="pocg1eks80-mmp-infra" reconcileID="441e08da-f7ba-4fde-af7b-e3eb250aae02"

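This happens because the specVersion case comes first in the switch. A Go switch runs only the first case that matches, so when the Kubernetes version also needs upgrading, UpdateNodegroupVersion is called with only the version set and without the launch template details.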

switch {
case specVersion != nil && ngVersion.LessThan(specVersion):
	// NOTE: you can only upgrade increments of minor versions. If you want to upgrade 1.14 to 1.16 we
	// need to go 1.14-> 1.15 and then 1.15 -> 1.16.
	input.Version = aws.String(versionToEKS(ngVersion.WithMinor(ngVersion.Minor() + 1)))
	updateMsg = fmt.Sprintf("to version %s", *input.Version)
case specAMI != nil && *specAMI != ngAMI:
	input.ReleaseVersion = specAMI
	updateMsg = fmt.Sprintf("to AMI version %s", *input.ReleaseVersion)
case statusLaunchTemplateVersion != nil && *statusLaunchTemplateVersion != *ngLaunchTemplateVersion:
	input.LaunchTemplate = &eks.LaunchTemplateSpecification{
		Id:      s.scope.ManagedMachinePool.Status.LaunchTemplateID,
		Version: statusLaunchTemplateVersion,
	}
	updateMsg = fmt.Sprintf("to launch template version %s", *statusLaunchTemplateVersion)
}

Changing the order of the switch/case makes it possible to upgrade a node group using a custom launch template.
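
To illustrate why the case order matters, here is a minimal, self-contained sketch. It is not the merged diff: `updateInput`, `drift`, `planVersionFirst`, and `planLaunchTemplateFirst` are hypothetical stand-ins for the real request type and reconciler logic, and putting the launch template case first is an assumption about one plausible reordering.

```go
package main

import "fmt"

// updateInput is a hypothetical stand-in for the fields of the EKS
// UpdateNodegroupVersion request that the switch fills in.
type updateInput struct {
	Version               string // target Kubernetes version
	ReleaseVersion        string // target AMI release version
	LaunchTemplateVersion string // target launch template version
}

// drift is a hypothetical summary of what differs between the spec and the
// live node group. An empty field means no change is needed.
type drift struct {
	NextVersion           string
	SpecAMI               string
	LaunchTemplateVersion string
}

// planVersionFirst mirrors the case order quoted in the description:
// version, then AMI, then launch template. Only the first match runs.
func planVersionFirst(d drift) (updateInput, string) {
	var in updateInput
	var msg string
	switch {
	case d.NextVersion != "":
		in.Version = d.NextVersion
		msg = fmt.Sprintf("to version %s", in.Version)
	case d.SpecAMI != "":
		in.ReleaseVersion = d.SpecAMI
		msg = fmt.Sprintf("to AMI version %s", in.ReleaseVersion)
	case d.LaunchTemplateVersion != "":
		in.LaunchTemplateVersion = d.LaunchTemplateVersion
		msg = fmt.Sprintf("to launch template version %s", in.LaunchTemplateVersion)
	}
	return in, msg
}

// planLaunchTemplateFirst checks the launch template case first
// (assumed order), so the launch template details are never dropped
// when a version change is pending at the same time.
func planLaunchTemplateFirst(d drift) (updateInput, string) {
	var in updateInput
	var msg string
	switch {
	case d.LaunchTemplateVersion != "":
		in.LaunchTemplateVersion = d.LaunchTemplateVersion
		msg = fmt.Sprintf("to launch template version %s", in.LaunchTemplateVersion)
	case d.NextVersion != "":
		in.Version = d.NextVersion
		msg = fmt.Sprintf("to version %s", in.Version)
	case d.SpecAMI != "":
		in.ReleaseVersion = d.SpecAMI
		msg = fmt.Sprintf("to AMI version %s", in.ReleaseVersion)
	}
	return in, msg
}
```

With both a version change and a launch template change pending, `planVersionFirst` returns an input whose launch template field is empty, which corresponds to the request EKS rejects for custom AMI node groups. `planLaunchTemplateFirst` returns the launch template version instead.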

This would fix #4327

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release notes:

fix upgrade of managed node groups using custom AMIs

k8s-ci-robot added the release-note, kind/bug, and cncf-cla: yes labels on Feb 28, 2024
k8s-ci-robot added the needs-priority, needs-ok-to-test, and size/S labels on Feb 28, 2024
k8s-ci-robot (Contributor) commented:

Hi @fad3t. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mloiseleur

/ok-to-test

k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Feb 29, 2024

Ankitasw (Member) commented Mar 6, 2024

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

Ankitasw (Member) commented Mar 6, 2024

/retest

Ankitasw (Member) commented Mar 7, 2024

/lgtm
/approve

k8s-ci-robot added the lgtm label on Mar 7, 2024

k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ankitasw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label on Mar 7, 2024
k8s-ci-robot merged commit e7c9629 into kubernetes-sigs:main on Mar 7, 2024
26 checks passed
@fad3t fad3t deleted the fix-ng-lt-update branch March 7, 2024 07:32
Successfully merging this pull request may close these issues.

AMI_CUSTOM has no way for upgrading