
Extend google_container_node_pool documentation on nodepool changes #17522

Open
nemethloci opened this issue Mar 8, 2024 · 12 comments


nemethloci commented Mar 8, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version

v1.7.4

Affected Resource(s)

google_container_node_pool

Terraform Configuration

# Before

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-medium"

    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

# After

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 1

  node_config {
    preemptible  = true
    machine_type = "e2-standard-2"

    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

Debug Output

No response

Expected Behavior

In case of a node pool change that requires changing the underlying template (which is immutable), the provider would ideally:

  • create a new nodepool with the new template
  • drain the old pool
  • delete the old pool

Unfortunately this is not the current behavior, probably because the above process would require the 'interaction' of multiple providers (the google provider to manage the node pools and the kubernetes provider to drain the nodes; I am not sure whether the node pool resize operation in the GCP API could be abused for draining). Considering that node pool version upgrades already work as expected following the above approach, I think it would be worth emphasizing in the documentation what happens in case of non-version-related changes to the node pool resource.

This would be especially important because not being aware of the above can cause a service outage (see Actual Behavior for details).

Actual Behavior

Without the create_before_destroy flag, the node pool is deleted and then recreated, resulting in a complete outage of the services bound to the node pool. If the flag is set (see the sketch after this list), there will still be an outage of all services because:

  • the old pool is not drained but terminated abruptly (scheduling appears to be disabled immediately once a pool is marked for deletion, but pods are not given time to migrate to new pools while respecting their PodDisruptionBudget; they are killed immediately, and the autoscaler keeps trying to create new nodes in the pool under deletion, which get deleted as well)
  • the new node pool's initial node count might be significantly lower than the current (autoscaled) size of the old one, so pods would have to wait for the autoscaler before they could be scheduled
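
For reference, a minimal sketch of the create_before_destroy variant described above (an illustrative configuration, not taken from this issue; note that the replacement pool needs a distinct name, e.g. via name_prefix, because node pool names must be unique):

resource "google_container_node_pool" "primary_preemptible_nodes" {
  # name_prefix instead of name, so the replacement pool can coexist
  # with the old one while it is being created.
  name_prefix = "my-node-pool-"
  cluster     = google_container_cluster.primary.id
  node_count  = 1

  node_config {
    preemptible  = true
    machine_type = "e2-standard-2"
  }

  lifecycle {
    # The replacement pool is created before the old one is destroyed,
    # but the old pool is still deleted without a graceful drain.
    create_before_destroy = true
  }
}

Even with this setup the outage described above still occurs, because the old pool is never drained.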

Steps to reproduce

  1. terraform apply

Important Factoids

No response

References

#10895

b/331218437

@nemethloci nemethloci added the bug label Mar 8, 2024
@github-actions github-actions bot added forward/review In review; remove label to forward service/container labels Mar 8, 2024
@nemethloci (Author)

Just to be clear: this is a request to improve the documentation. Solving the underlying issue goes beyond the scope of this ticket and probably requires cooperation between the GCP Terraform provider developers and GCP itself.

ggtisc (Collaborator) commented Mar 8, 2024

Hi @nemethloci, you can find an alternative for this scenario documented here:
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/container_node_pool

You can use features like strategy and blue_green_settings for upgrades, adjusting them according to your requirements, as sketched below.
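
For reference, a rough sketch of such an upgrade configuration (assuming a blue/green strategy with a standard rollout policy; field names as documented for google_container_node_pool, but treat this as an illustrative example rather than a tested configuration):

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name    = "my-node-pool"
  cluster = google_container_cluster.primary.id

  upgrade_settings {
    strategy = "BLUE_GREEN"

    blue_green_settings {
      # Keep the old ("blue") nodes around for 10 minutes after the
      # new ("green") nodes are ready before tearing them down.
      node_pool_soak_duration = "600s"

      standard_rollout_policy {
        batch_percentage    = 0.5
        batch_soak_duration = "120s"
      }
    }
  }

  node_config {
    machine_type = "e2-medium"
  }
}

As the next comment points out, this only applies to version upgrades, not to other node_config changes.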

@nemethloci (Author)

Hi @ggtisc, the feature you referenced is valid only with regard to upgrades and unfortunately does not apply when changing other aspects of the node pool, which is what my request is about (e.g. disk type, machine type, ...).

@ggtisc ggtisc assigned NickElliot and unassigned NickElliot Mar 14, 2024
ggtisc (Collaborator) commented Mar 14, 2024

@nemethloci could you please provide the code of the resource google_container_cluster.primary?

@nemethloci (Author)

Here you go:

resource "google_container_cluster" "primary" {
  provider = google-beta
  project  = var.project_id
  location = var.region
  name     = var.cluster_name

  network    = var.cluster_network
  subnetwork = var.cluster_subnetwork

  logging_service    = var.logging_service
  monitoring_service = var.monitoring_service
  min_master_version = local.kubernetes_version
  release_channel {
    channel = "UNSPECIFIED"
  }

  enable_shielded_nodes = var.enable_shielded_nodes

  dynamic "master_authorized_networks_config" {
    for_each = var.master_authorized_networks_config
    content {
      dynamic "cidr_blocks" {
        for_each = lookup(master_authorized_networks_config.value, "cidr_blocks", [])
        content {
          cidr_block   = cidr_blocks.value.cidr_block
          display_name = lookup(cidr_blocks.value, "display_name", null)
        }
      }
    }
  }

  ip_allocation_policy {
    cluster_secondary_range_name  = var.cluster_ip_range_pods
    services_secondary_range_name = var.cluster_ip_range_services
  }

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  node_pool {
    name               = "default-pool"
    initial_node_count = 1
  }
  remove_default_node_pool = true

  # We can optionally control access to the cluster
  # See https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters
  private_cluster_config {
    enable_private_endpoint = var.disable_public_endpoint
    enable_private_nodes    = var.enable_private_nodes
    master_ipv4_cidr_block  = var.master_ipv4_cidr_block
  }

  master_auth {
    client_certificate_config {
      issue_client_certificate = false
    }
  }

  dynamic "authenticator_groups_config" {
    for_each = var.enable_authenticator_groups_config ? [1] : []
    content {
      security_group = var.authenticator_groups_config_security_group
    }
  }

  binary_authorization {
    evaluation_mode = var.enable_binary_authorization
  }

  pod_security_policy_config {
    enabled = var.enable_pod_security_policy_config
  }

  // Configure the workload identity "identity namespace".  Requires additional
  // configuration on the node pool for workload identity to function.
  // See PATT-24
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  addons_config {

    http_load_balancing {
      disabled = !var.http_load_balancing
    }

    horizontal_pod_autoscaling {
      disabled = !var.horizontal_pod_autoscaling
    }

    network_policy_config {
      disabled = !var.enable_network_policy
    }

    istio_config {
      disabled = !var.enable_managed_istio
      auth     = var.managed_istio_auth
    }

    config_connector_config {
      enabled = var.enable_config_connector
    }

  }

  network_policy {
    # Based on: https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/issues/656#issuecomment-910108440
    # Enabling NetworkPolicy for clusters with DatapathProvider=ADVANCED_DATAPATH is not allowed (yields error)
    enabled = var.enable_dataplane_v2 ? false : (var.enable_network_policy ? true : false)

    # Tigera (Calico Felix) is the only provider
    # CALICO provider overrides datapath_provider setting, leaving Dataplane v2 disabled
    provider = var.enable_dataplane_v2 ? "PROVIDER_UNSPECIFIED" : (var.enable_network_policy ? "CALICO" : "PROVIDER_UNSPECIFIED")
  }

  # This is where Dataplane V2 is enabled.
  datapath_provider = var.enable_dataplane_v2 ? "ADVANCED_DATAPATH" : "DATAPATH_PROVIDER_UNSPECIFIED"

  dynamic "resource_usage_export_config" {
    for_each = var.enable_metering ? [1] : []

    content {
      enable_network_egress_metering       = true
      enable_resource_consumption_metering = true

      bigquery_destination {
        dataset_id = google_bigquery_dataset.cluster_resource_usage[0].dataset_id
      }
    }
  }

  lifecycle {
    ignore_changes = [node_pool]
  }

  maintenance_policy {
    daily_maintenance_window {
      start_time = var.maintenance_start_time
    }
  }

  timeouts {
    create = "30m"
    update = "30m"
    delete = "30m"
  }
}

@nemethloci (Author)

One more update: Google support has confirmed that it is expected behavior that, both upon node pool deletion and upon manually draining all nodes of a node pool, new nodes will be spun up by the autoscaler without being cordoned, unless the autoscaler is disabled first. As such, the 'workaround' process to use is (a rough Terraform sketch follows the list):

  • spin up a new node pool and, at the same time, disable autoscaling for the old one
  • drain the old node pool's nodes manually (e.g. via kubectl) and wait for the drain to finish
  • delete the old node pool
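
A rough Terraform sketch of the first step (assuming the old pool previously had an autoscaling block; the drain and the final deletion still happen outside of, or after, Terraform, and the resource names here are only placeholders):

# Step 1: create the replacement pool and disable autoscaling on the old one.
resource "google_container_node_pool" "old" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 3

  # The autoscaling block that used to be here has been removed, so the
  # autoscaler no longer replaces nodes that are drained manually.
  # autoscaling {
  #   min_node_count = 1
  #   max_node_count = 5
  # }

  node_config {
    machine_type = "e2-medium"
  }
}

resource "google_container_node_pool" "new" {
  name               = "my-node-pool-v2"
  cluster            = google_container_cluster.primary.id
  initial_node_count = 3

  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

  node_config {
    machine_type = "e2-standard-2"
  }
}

# Step 2 (outside Terraform): cordon and drain the old pool's nodes, e.g.
#   kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
# Step 3: remove the "old" resource from the configuration and apply again,
#   so Terraform deletes the now-empty pool.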

ggtisc (Collaborator) commented Mar 25, 2024

@nemethloci is this the same issue as #17668?

@melinath (Collaborator)

Note from triage: this doesn't look like a duplicate - that issue was raised by the same author.

@melinath melinath added size/s and removed forward/review In review; remove label to forward labels Mar 25, 2024
@melinath melinath added this to the Goals milestone Mar 25, 2024
@nemethloci (Author)

@nemethloci is this the same issue as #17668?

Yes and no:

  • yes: both relate to the same workflow (deleting/changing a node pool)
  • no: the other item intends to solve one problem of many (by turning off autoscaling before deletion), while this one is about documenting how deletion of a node pool should be done in general, given that and other deficiencies that cannot be solved without Google changing their APIs and behavior. If the other bug report were implemented, there would be one fewer issue to include in the updated documentation.

Why I raised two tickets: when I originally reported this documentation-related ticket, I thought (based on some early feedback from Google engineers) that Google would acknowledge this as a bug and fix it themselves, hence there would be no need to alter the Terraform provider itself.

@nemethloci (Author)

FYI: Google has created a feature request (without any commitment regarding the timeline):
https://issuetracker.google.com/issues/331403526

wyardley commented Oct 21, 2024

I think GoogleCloudPlatform/magic-modules#12014 will at least partially help, by allowing in-place updates (without recreating the node pool) of at least some of the fields in google_container_cluster.node_config

I didn't mark this as being "fixed by" that, because I'm not 100% sure it solves all cases or whether it's the same underlying issue, but feel free to update that and/or comment here if you think it should fix this.

roaks3 (Collaborator) commented Oct 23, 2024

Agreed that in-place updates help, but there are still some ForceNew fields within node_config, so I think we should keep this open as a request to mention the dangers (like abrupt node termination) of changing those fields on this resource.
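
For readers looking for a guard rail until the documentation is updated, one generic Terraform pattern (not something prescribed in this issue) is to set prevent_destroy on the node pool, so that a plan touching a ForceNew field fails instead of silently replacing the pool:

resource "google_container_node_pool" "primary_preemptible_nodes" {
  name    = "my-node-pool"
  cluster = google_container_cluster.primary.id

  node_config {
    machine_type = "e2-medium"
  }

  lifecycle {
    # Any change that would delete and recreate the pool now aborts the
    # apply; remove the flag deliberately when a replacement is intended.
    prevent_destroy = true
  }
}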
