🐛 There is no need to check instance refresh if ASG is not found #4662

Merged
merged 1 commit into from
Dec 13, 2023

Contributor

@fiunchinho fiunchinho commented Nov 28, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

In a previous PR we tried to fix #4655 by ignoring the "not found" error when describing instance refresh resources on AWS. However, the AWS API returns a 400 error with code ValidationError when describing the instance refreshes of a non-existent AutoScalingGroup. Since ValidationError is very generic, this PR does not ignore that error. Instead, it moves the call that looks up the AutoScalingGroup to before the definition of the function that checks whether an instance refresh can be started. That way, if the AutoScalingGroup is nil, we know we can skip the check. We still want to update the LaunchTemplate when the AutoScalingGroup does not exist, because an error in the LaunchTemplate may be the root cause of the ASG not having been created yet.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #4655

Special notes for your reviewer:

Unfortunately, there is no easy way to test this change, because ReconcileLaunchTemplate, which receives the anonymous function that we want to modify here, is mocked.

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Skip instance refresh attempt if ASG does not yet exist

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority labels Nov 28, 2023
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 28, 2023
@fiunchinho fiunchinho changed the title There is no need to check instance refresh if ASG is not found 🐛 There is no need to check instance refresh if ASG is not found Nov 28, 2023
@fiunchinho
Contributor Author

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

@fiunchinho
Contributor Author

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

Contributor

@AndiDog AndiDog left a comment

Mostly LGTM – one minor comment.

Can you please fill in the release note in the PR description? The release note, commit and PR title could read something like "Skip instance refresh attempt if ASG does not yet exist".

exp/controllers/awsmachinepool_controller.go
@@ -323,9 +323,6 @@ func (s *Service) CanStartASGInstanceRefresh(scope *scope.MachinePoolScope) (boo
describeInput := &autoscaling.DescribeInstanceRefreshesInput{AutoScalingGroupName: aws.String(scope.Name())}
refreshes, err := s.ASGClient.DescribeInstanceRefreshesWithContext(context.TODO(), describeInput)
if err != nil {
if awserrors.IsNotFound(err) {
Contributor

For the record, this is removed because AWS returns ValidationError instead of a clear "not found" code, right?

Contributor Author

Exactly. It never returns an error that would match IsNotFound() so it's useless to have the check here.

@fiunchinho
Contributor Author

> Mostly LGTM – one minor comment.
>
> Can you please fill in the release note in the PR description? The release note, commit and PR title could read something like "Skip instance refresh attempt if ASG does not yet exist".

I left the release note intentionally because I added it here, but I guess it's better to have a more specific one.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Nov 28, 2023
@fiunchinho
Contributor Author

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

Co-authored-by: Andreas Sommer <andreas@giantswarm.io>
@AndiDog
Contributor

AndiDog commented Nov 28, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 28, 2023
@fiunchinho
Contributor Author

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

@fiunchinho
Contributor Author

/test pull-cluster-api-provider-aws-e2e-eks

@richardcase
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: richardcase

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 13, 2023
@k8s-ci-robot k8s-ci-robot merged commit f2956f7 into kubernetes-sigs:main Dec 13, 2023
11 checks passed
@richardcase
Member

/cherrypick release-2.3

@k8s-infra-cherrypick-robot

@richardcase: #4662 failed to apply on top of branch "release-2.3":

Applying: There is no need to check instance refresh if ASG is not found
Using index info to reconstruct a base tree...
M	exp/controllers/awsmachinepool_controller.go
M	pkg/cloud/services/autoscaling/autoscalinggroup.go
M	pkg/cloud/services/autoscaling/autoscalinggroup_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/cloud/services/autoscaling/autoscalinggroup_test.go
CONFLICT (content): Merge conflict in pkg/cloud/services/autoscaling/autoscalinggroup_test.go
Auto-merging exp/controllers/awsmachinepool_controller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 There is no need to check instance refresh if ASG is not found
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-2.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Successfully merging this pull request may close these issues.

AWSMachinePool reconciliation stuck if ASG could not be created