Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mask GKE Sandbox-specific labels and taints #3749

Merged

Conversation

impl
Copy link
Contributor

@impl impl commented Jul 16, 2020

Hi friends,

This change prevents the GKE Sandbox labels and taints (currently sandbox.gke.io/runtime=gvisor) from causing node pools to always need to be recreated. Without this change, once a node pool is created, it needs to be manually updated to include the additional GKE Sandbox label and taint items.

I believe this PR fixes hashicorp/terraform-provider-google#4210.

The addition of the DiffSuppressFunc on a TypeList is a little weird (not sure if it's using prescribed functionality or not; I see hashicorp/terraform-plugin-sdk#477 but it seems to work here). I'd be happy to take any suggestions on how to improve it.

Once I get an OK on the general approach, I will add a few tests as well. Thanks for taking a look!

Release Note Template for Downstream PRs (will be copied)

compute: Masked automatically applied GKE Sandbox node labels and taints on node pools

@modular-magician
Copy link
Collaborator

Hello! I am a robot who works on Magic Modules PRs.

I have detected that you are a community contributor, so your PR will be assigned to someone with a commit-bit on this repo for initial review.

Thanks for your contribution! A human will be with you soon.

@danawillow, please review this PR or find an appropriate assignee.

@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform Beta: Diff ( 1 file changed, 64 insertions(+), 4 deletions(-))

Copy link
Contributor

@danawillow danawillow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some comments for a first pass! I'd definitely like to see some tests so we can be sure this actually works.

@@ -502,4 +509,58 @@ func flattenSandboxConfig(c *containerBeta.SandboxConfig) []map[string]interface
}
return result
}

func containerNodePoolLabelsSuppress(k, old, new string, d *schema.ResourceData) bool {
o, n := d.GetChange("node_config.0.sandbox_config.0.sandbox_type")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is trying to suggest that we only care about the sandbox label if sandbox_type is set, is that correct? If so can you add a comment to that regard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed the case; I added a comment about this to both methods.

return false
}

if strings.HasPrefix(k, "node_config.0.labels.sandbox.gke.io/") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this will work. It would be worth verifying (with some print statements or a test) but I'm pretty sure 'k' here is going to be the full list, rather than this func being called on each element of the list.

Copy link
Contributor Author

@impl impl Jul 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the behavior here is really interesting. You actually get each key and value under the root independently and can return a different result for each. For example, dumping the k, old, and new values for each invocation you'd see something like:

k=node_config.0.labels.# old=2 new=1
k=node_config.0.labels.test.terraform.io/test old=foo new=foo
k=node_config.0.labels.sandbox.gke.io/runtime old=gvisor new=

It turns out to not be important (see my additional comment below) now, but the initial pass did depend on this admittedly strange behavior. I think it has to do with old and new being strings.

@@ -502,4 +509,58 @@ func flattenSandboxConfig(c *containerBeta.SandboxConfig) []map[string]interface
}
return result
}

func containerNodePoolLabelsSuppress(k, old, new string, d *schema.ResourceData) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the node config schema gets used both in the node pool resource and in the container cluster resource (and within container cluster, it's top level and part of the node pool block), so the address of the field you're looking for will be different depending on where the user is specifying it. I'd recommend using the value of 'k' to help you figure out where you are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they both actually use the same hierarchy by chance, but it raises the well-taken point that there's no restriction on this schema being used elsewhere in the future. I have updated the code to derive the root of the node config from k and added a comment to clarify.

return false
}

func containerNodePoolTaintSuppress(k, old, new string, d *schema.ResourceData) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a fairly similar function in resource_container_cluster: containerClusterAddedScopesSuppress. Can you look at that and see if using similar logic makes sense here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic makes way more sense than the initial way I put this PR together. I have updated the code to use a similar strategy. Note that it does get called multiple times -- once for each "stringable" key/value Terraform provides -- but I think the performance hit is worth the clarity in this case. Now the suppression will be consistent across the entire field, rather than changing for some sub-keys (it is not clear at all what the upward propagation behavior is). Thank you for the pointer!

@impl impl force-pushed the gke-sandbox-labels-taints branch from 44495fe to be6ed8e Compare July 17, 2020 06:40
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform Beta: Diff ( 3 files changed, 274 insertions(+), 4 deletions(-))

@impl
Copy link
Contributor Author

impl commented Jul 17, 2020

Here are some comments for a first pass! I'd definitely like to see some tests so we can be sure this actually works.

Thank you for the quick and detailed review! I've updated the code to address your concerns and added tests for both the container_cluster and container_node_pool resources.

@impl impl requested a review from danawillow July 20, 2020 17:17
@danawillow
Copy link
Contributor

Hey, just wanted to pop in and let you know I haven't forgotten about this! I've had a bunch of other things on my plate, but I should be able to look at this again next week.

@impl
Copy link
Contributor Author

impl commented Jul 27, 2020

Hi there, no worries! Totally understand. Thanks for the update -- much appreciated!

// GKE sets automatic labels and taints on nodes. This makes
// sure we ignore the automatic ones and keep our own.
Config: testAccContainerCluster_withSandboxConfig(clusterName),
PlanOnly: true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a comment here about what this does, since it's a less common feature in our tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

PlanOnly: true,
},
{
// Now we'll modify the labels, which should force a change to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI labels aren't modifiable in-place, so this test step destroys and recreates the cluster. If that's fine for this test, can you add a note here so it doesn't surprise people if they come across this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed intentional, and I've added a comment to note that!

@impl impl force-pushed the gke-sandbox-labels-taints branch from be6ed8e to 67d4b88 Compare July 30, 2020 23:28
This change prevents the GKE Sandbox labels and taints (currently
sandbox.gke.io/runtime=gvisor) from causing node pools to always need to
be recreated. Without this change, once a node pool is created, it needs
to be manually updated to include the additional GKE Sandbox label and
taint items.
@impl impl force-pushed the gke-sandbox-labels-taints branch from 67d4b88 to c7b06d8 Compare July 30, 2020 23:29
@impl impl requested a review from danawillow July 30, 2020 23:29
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform Beta: Diff ( 3 files changed, 286 insertions(+), 4 deletions(-))

1 similar comment
@modular-magician
Copy link
Collaborator

Hi! I'm the modular magician. Your PR generated some diffs in downstreams - here they are.

Diff report:

Terraform Beta: Diff ( 3 files changed, 286 insertions(+), 4 deletions(-))

Copy link
Contributor

@danawillow danawillow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks @impl!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GKE Container Node Pool with sandbox type of gvisor missing label "sandbox.gke.io/runtime" = "gvisor"
4 participants