Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v2.7.1] Enable support for custom node roles for node groups #73

Merged
merged 7 commits into from
Nov 30, 2022

Conversation

a-blender
Copy link

@a-blender a-blender commented Sep 13, 2022

Issue: rancher/rancher#29071

Problem

Enable custom node roles for node groups.

Solution

Changes

  • NodeRole (string) is added to the NodeGroup struct for a user-defined node group node role in Rancher
  • GeneratedNodeRole (string) is added to the config.Spec.EKSStatus for a node group node role generated and tracked by Rancher if a user doesn't provide one.

Testing

  1. Provision 2 node EKS cluster with no specified node group node role. Cluster should come up active and node role should be viewable on spec.config.EKSStatus in the v3 object
  2. Provision 2 node EKS cluster with user-defined node group node role
  3. Provision 2 node EKS cluster with 2 node groups, 1 with no specified node group node role and 1 with a user-defined node-group role.

In the UI issue, it is specified

The value passed to the Rancher API should be a string. If no role is selected for a nodegroup, then the empty string should be passed.

I simulated rancher UI input by setting the empty string and a valid user-defined string on the existing node group NodeRole field in the code and verify the expected flow happens in the operator. NodeRole cannot be nil. Changes have been tested and verified with the PR rancher/rancher#38939.

@a-blender a-blender requested review from a team and cmurphy September 13, 2022 21:33
jakefhyde
jakefhyde previously approved these changes Sep 13, 2022
@jakefhyde jakefhyde requested a review from a team September 13, 2022 21:37
@a-blender a-blender changed the title Enable support for custom node roles for node groups [DNM] Enable support for custom node roles for node groups Sep 16, 2022
@a-blender a-blender changed the title [DNM] Enable support for custom node roles for node groups [Work in Progress] Enable support for custom node roles for node groups Sep 16, 2022
Copy link
Contributor

@rmweir rmweir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using a test catalog in your setup that has the new crd?

controller/nodegroup.go Outdated Show resolved Hide resolved
@a-blender a-blender changed the title [Work in Progress] Enable support for custom node roles for node groups Enable support for custom node roles for node groups Oct 7, 2022
controller/nodegroup.go Outdated Show resolved Hide resolved
controller/nodegroup.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
Copy link
Contributor

@jakefhyde jakefhyde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple nits

@a-blender a-blender changed the title Enable support for custom node roles for node groups [DNM] Enable support for custom node roles for node groups Oct 20, 2022
@a-blender a-blender force-pushed the enable-node-group-node-role branch 3 times, most recently from 60cf084 to 35ca225 Compare October 20, 2022 17:55
@a-blender
Copy link
Author

a-blender commented Oct 21, 2022

This has been retested on rancher v2.7 head with PR comments addressed!

I had an issue with generatedNodeRole not being set on the Status because I had accidentally mod some previous code while testing that caused the config obj to be shadowed. So, config was being modified before a new node group was created causing k8s to throw an obj modification err when I tried to update the Status. This has been fixed and my solution is working fine - thank you @jakefhyde for the 2nd pair of eyes.

The changes to the fix include

  • generatedNodeRole is not passed to the operator from rancher when building the upstream state.
  • generatedNodeRole is now set on the Status in the handler instead of in nodegroup.go because it doesn't coincide with the utility of that function
  • unnecessary deepCopies have been removed/reorganized

Ready for re-review

jakefhyde
jakefhyde previously approved these changes Oct 24, 2022
controller/eks-cluster-config-handler.go Outdated Show resolved Hide resolved
rmweir
rmweir previously approved these changes Oct 26, 2022
Copy link
Contributor

@rmweir rmweir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

jakefhyde
jakefhyde previously approved these changes Oct 27, 2022
@a-blender a-blender changed the title [DNM] Enable support for custom node roles for node groups [DNM v2.7.1] Enable support for custom node roles for node groups Nov 1, 2022
@a-blender a-blender changed the title [DNM v2.7.1] Enable support for custom node roles for node groups [v2.7.1] Enable support for custom node roles for node groups Nov 17, 2022
generatedNodeRole := config.Status.GeneratedNodeRole

if aws.StringValue(group.NodeRole) == "" {
if config.Status.GeneratedNodeRole == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.Status.GeneratedRole is not set until after the first reconciliation, so every node group will end up creating a stack. Is that by design?

@a-blender a-blender dismissed stale reviews from jakefhyde and rmweir via e87b089 November 18, 2022 06:23
@rancher rancher deleted a comment from jakefhyde Nov 18, 2022
@rancher rancher deleted a comment from jakefhyde Nov 18, 2022

// if a generated node role has not been set on the Status yet and it
// was just generated, set it
if config.Status.GeneratedNodeRole == "" && generatedNodeRole != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the only potential concerns I might have are that if an error returns later on, and we have to re-reconcile this object, it could potentially create more node groups than necessary (i.e. if another error happens between now and the time we end up updating the config object). I consider that outside of the scope of this PR because it looks like the entirety of this operator (and likely all the others) is written in a way that is contradictory to the design of wrangler's controller framework. We should revisit this at some point to prevent unnecessary thrashing by refactoring this. For example, once config.Status.GeneratedNodeRole is set, we should update the config object, and if it fails delete the node group and reenqueue the config object, and if it succeeds, return early since the next reconciliation will shore up the rest of the status updates. Once again, out of the scope of this PR though.

Copy link
Contributor

@rmweir rmweir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants