[Bug]: Error: The terraform-provider-aws_v5.42.0_x5 plugin crashed! #37289

Open
marcellpatonay opened this issue May 6, 2024 · 5 comments
Labels
bug: Addresses a defect in current functionality.
crash: Results from or addresses a Terraform crash or kernel panic.
service/cloudsearch: Issues and PRs that pertain to the cloudsearch service.

marcellpatonay commented May 6, 2024

Terraform Core Version

1.5.7

AWS Provider Version

5.42.0

Affected Resource(s)

aws_subnets
aws_iam_policy
aws_iam_role
aws_security_group
aws_kms_key
aws_vpc
aws_cloudsearch_domain

Expected Behavior

Expected plan to complete

Actual Behavior

Terraform failed with the following error:

Error: The terraform-provider-aws_v5.42.0_x5 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

A partial debug log is attached.

Relevant Error/Panic Output Snippet

Stack trace from the terraform-provider-aws_v5.42.0_x5 plugin:

panic: set item just set doesn't exist

goroutine 219 [running]:
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).setSet(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0}, 0x14001317cc0)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:330 +0x720
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).set(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:110 +0x120
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).WriteField(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:92 +0x388
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*ResourceData).Set(0x14002f58100, {0x110792897, 0xb}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource_data.go:230 +0x1a0
github.com/hashicorp/terraform-provider-aws/internal/service/cloudsearch.resourceDomainRead({0x11560d2a8, 0x14002f5a810}, 0x14002f58100, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-provider-aws/internal/service/cloudsearch/domain.go:333 +0x1470
github.com/hashicorp/terraform-provider-aws/internal/provider.New.(*wrappedResource).Read.interceptedHandler[...].func9(0x0?, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-provider-aws/internal/provider/intercept.go:113 +0x1d4
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0x11560d2a8?, {0x11560d2a8?, 0x14002f40ed0?}, 0xd?, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource.go:790 +0x64
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0x14001312a80, {0x11560d2a8, 0x14002f40ed0}, 0x140029a9ee0, {0x1153e1160, 0x140025bd420})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource.go:1089 +0x430
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0x140033290c8, {0x11560d2a8?, 0x14002f40de0?}, 0x14002f1c640)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/grpc_provider.go:667 +0x3e4
github.com/hashicorp/terraform-plugin-mux/tf5muxserver.(*muxServer).ReadResource(0x11560d2e0?, {0x11560d2a8?, 0x14002f40ae0?}, 0x14002f1c640)
        github.com/hashicorp/terraform-plugin-mux@v0.15.0/tf5muxserver/mux_server_ReadResource.go:35 +0x184
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0x14000b25ea0, {0x11560d2a8?, 0x14002f40330?}, 0x14002c6b380)
        github.com/hashicorp/terraform-plugin-go@v0.22.0/tfprotov5/tf5server/server.go:775 +0x3c4
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x11518ab20?, 0x14000b25ea0}, {0x11560d2a8, 0x14002f40330}, 0x140029a3e00, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.22.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:482 +0x164
google.golang.org/grpc.(*Server).processUnaryRPC(0x14001584400, {0x11560d2a8, 0x14002f402a0}, {0x115646138, 0x1400270a1a0}, 0x14002f3cc60, 0x14002689020, 0x11d621cc8, 0x0)
        google.golang.org/grpc@v1.62.0/server.go:1383 +0xb8c
google.golang.org/grpc.(*Server).handleStream(0x14001584400, {0x115646138, 0x1400270a1a0}, 0x14002f3cc60)
        google.golang.org/grpc@v1.62.0/server.go:1794 +0xc70
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        google.golang.org/grpc@v1.62.0/server.go:1027 +0x8c
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 25
        google.golang.org/grpc@v1.62.0/server.go:1038 +0x150

Terraform Configuration Files

example:

################################################################################
# EKS Module
################################################################################
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.8.4"

  cluster_name                   = var.cluster_name
  cluster_version                = "1.29"
  cluster_endpoint_public_access = true

  enable_cluster_creator_admin_permissions = true

  # Enable EFA support by adding necessary security group rules
  # to the shared node security group
  enable_efa_support = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
  }

  vpc_id                   = data.aws_vpc.aws-vpc.id
  subnet_ids               = data.aws_subnets.k8s_subnets_ids.ids
  control_plane_subnet_ids = data.aws_subnets.k8s_subnets_ids.ids

  # External encryption key
  create_kms_key = false
  cluster_encryption_config = {
    resources        = ["secrets"]
    provider_key_arn = module.kms.key_arn
  }

  self_managed_node_group_defaults = {
    # enable discovery of autoscaling groups by cluster-autoscaler
    autoscaling_group_tags = {
      "k8s.io/cluster-autoscaler/enabled" : true,
      "k8s.io/cluster-autoscaler/${var.cluster_name}" : "owned",
    }
  }

  self_managed_node_groups = {
    # Default node group - as provisioned by the module defaults
    #default_node_group = {}

    # Complete
    default_node_group = {
      name                      = "${var.cluster_name}-node-group"
      use_name_prefix           = true
      wait_for_capacity_timeout = "0"

      subnet_ids = data.aws_subnets.k8s_subnets_ids.ids

      min_size     = 2
      max_size     = 3
      desired_size = 3

      ami_id = "${data.aws_ami.eks_node.id}"

      pre_bootstrap_user_data = <<-EOT
        export FOO=bar
      EOT

      post_bootstrap_user_data = <<-EOT
        echo "you are free little kubelet!"
      EOT

      instance_type = "m6i.large"
      key_name      = var.cluster_name

      launch_template_name            = "${var.cluster_name}-node-lt"
      launch_template_use_name_prefix = true
      launch_template_description     = "node group launch template"

      ebs_optimized     = true
      enable_monitoring = true

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 20
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 150
            delete_on_termination = true
          }
        }
      }


      metadata_options = {
        http_endpoint               = "enabled"
        http_tokens                 = "required"
        http_put_response_hop_limit = 2
        instance_metadata_tags      = "disabled"
      }

      create_iam_role          = true
      iam_role_name            = "${var.cluster_name}-node-role"
      iam_role_use_name_prefix = false
      iam_role_description     = "node group iam role"
      iam_role_tags = {
        terraform = true
        env       = var.env
        org       = var.org
      }
      iam_role_additional_policies = {
        AmazonEC2ContainerRegistryReadOnly                  = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        AmazonEKS_CNI_Policy                                = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
        AmazonEC2ContainerRegistryFullAccess                = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"
        AmazonEKSWorkerNodePolicy                           = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        AmazonSSMManagedInstanceCore                        = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
        EC2InstanceProfileForImageBuilderECRContainerBuilds = "arn:aws:iam::aws:policy/EC2InstanceProfileForImageBuilderECRContainerBuilds"
        AmazonEBSCSIDriverPolicy                            = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
        EKSAutoScalingPolicy                                = "${module.iam_policy.arn}"
        NodeWorkerPolicy                                    = "${module.iam_eks_policy.arn}"
      }

      tags = {
        terraform = true
        env       = var.env
        org       = var.org
      }
    }

  }

  tags = {
    terraform = true
    env       = var.env
    org       = var.org
  }
}

Please note that the configuration applies successfully when run against an empty state.

Steps to Reproduce

This would be hard to reproduce: if the configuration is run against an empty state, the issue described above doesn't appear.

Debug Output

debug.log

Panic Output

No response

Important Factoids

State is managed by GitLab; apart from that, it's pure Terraform.
The issue happens both locally and in GitLab pipelines.
Local env: ARM Macs
GitLab pipelines: saas-linux-small-amd64
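
For anyone trying to reproduce this, the environment described above could be pinned roughly as follows. This is a minimal sketch, not the reporter's actual configuration: the exact-version constraints are taken from the versions listed in this report, and the empty `http` backend block is an assumption based on the usual GitLab-managed state setup, where the address and credentials are supplied at `terraform init` time.

```hcl
terraform {
  # Versions as reported in this issue.
  required_version = "= 1.5.7"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "= 5.42.0"
    }
  }

  # Assumption: GitLab-managed state is reached through Terraform's
  # generic "http" backend; the address, lock settings, and credentials
  # are expected to come from -backend-config or environment variables.
  backend "http" {}
}
```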

References

similar issues:
#36588
#32212

Would you like to implement a fix?

None

marcellpatonay added the bug label May 6, 2024

github-actions bot commented May 6, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

github-actions bot added the crash label May 6, 2024
terraform-aws-provider bot added the needs-triage label May 6, 2024
@aristosvo (Contributor)

Hi @marcellpatonay!

Can you share a cleaned up version of your aws_cloudsearch_domain config? It seems that the crash has to do with that resource only.

marcellpatonay (Author) commented May 6, 2024

Hi @aristosvo
here's the config:

resource "aws_cloudsearch_domain" "domain" {
  name = var.domain_name

  scaling_parameters {
    desired_instance_type = var.instance_type
  }

  endpoint_options {
    enforce_https = var.enforce_https
  }

  dynamic "index_field" {
    for_each = var.indexes
    content {
      name            = index_field.value["name"]
      type            = index_field.value["type"]
      search          = index_field.value["search"]
      return          = index_field.value["return"]
      sort            = index_field.value["sort"]
      highlight       = index_field.value["highlight"]
      analysis_scheme = index_field.value["analysis_scheme"]
    }
  }
}

variable "enforce_https" {
  description = "Whether to enforce HTTPS on the domain"
  type        = bool
  default     = false
}

variable "domain_name" {
  description = "The CloudSearch domain name"
  type        = string
}

variable "instance_type" {
  description = "Size of the instance to use for the CloudSearch domain"
  type        = string
  default     = "search.large"
}

module "cloudsearch" {

  source  = "gitlab.com/cardmarket/terraform-modules/aws//cloudsearch"
  version = "1.0.1-1-beta"

  domain_name   = "product-dev"
  instance_type = "search.small"
  enforce_https = "false"
  multi_az      = "false"

  indexes = [
    {
      name            = "title"
      type            = "text"
      search          = true
      return          = true
      sort            = false
      highlight       = false
      analysis_scheme = "_en_default_"
    },
  ]

}
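
Since the module source above is private, a standalone configuration may be easier to test against. The following is a sketch of what the module call would be expected to expand to, assuming the module contains only the resource shown above and passes `multi_az` straight through to the resource. The `multi_az` and `indexes` variable declarations are not part of the posted snippet, so both are assumptions, and the `local.indexes` name is introduced here purely for illustration.

```hcl
# Hypothetical standalone equivalent of the module call above.
locals {
  # Same single index definition that the module call passes as `indexes`.
  indexes = [
    {
      name            = "title"
      type            = "text"
      search          = true
      return          = true
      sort            = false
      highlight       = false
      analysis_scheme = "_en_default_"
    },
  ]
}

resource "aws_cloudsearch_domain" "domain" {
  name = "product-dev"

  # Assumption: the module forwards multi_az to the resource unchanged.
  multi_az = false

  scaling_parameters {
    desired_instance_type = "search.small"
  }

  endpoint_options {
    enforce_https = false
  }

  # One index_field block is generated per entry in local.indexes.
  dynamic "index_field" {
    for_each = local.indexes
    content {
      name            = index_field.value.name
      type            = index_field.value.type
      search          = index_field.value.search
      return          = index_field.value.return
      sort            = index_field.value.sort
      highlight       = index_field.value.highlight
      analysis_scheme = index_field.value.analysis_scheme
    }
  }
}
```

Per the report, applying against an empty state succeeds, so this on its own may not trigger the panic; it mainly provides a baseline domain to refresh against.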

justinretzolk added the service/cloudsearch label and removed the needs-triage label May 6, 2024
aristosvo (Contributor) commented May 7, 2024

@marcellpatonay I cannot really locate the issue without extra information, I'm afraid. I cannot replicate the issue in a test.

Has there been any external interaction with the indexes on the CloudSearch domain resource, or any service that might have interacted with it? Was the resource imported, by any chance?

@marcellpatonay (Author)

@aristosvo We did some further testing.

  1. Copied the state to my local machine and ran operations against that copy.

    • Removed everything related to CloudSearch with terraform state rm.
    • terraform plan succeeded immediately afterwards.
  2. The issue also seems to have resolved itself. Today, about an hour ago, we noticed that our GitLab pipelines were passing.

    • Note that we haven't made a single change to the GitLab-managed state.
    • We are still looking into what exactly could have caused this.
    • Could this be something on the AWS API side?
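
The `terraform state rm` step above only drops the resource from state; the domain itself still exists in AWS. If the resource needs to be brought back under management afterwards, an `import` block (available from Terraform 1.5) is one option. This is a sketch under assumptions: the address assumes the `cloudsearch` module is called from the root module, and the ID assumes the domain name `product-dev` from the module call shown earlier.

```hcl
# Hypothetical re-import of the CloudSearch domain after `terraform state rm`.
# Assumes the module call lives in the root module and the domain is
# named "product-dev"; CloudSearch domains are imported by name.
import {
  to = module.cloudsearch.aws_cloudsearch_domain.domain
  id = "product-dev"
}
```

Re-importing triggers a read of the domain, so it could run into the same panic if whatever caused it is still present.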

And to answer your questions:

  • I can't say for sure yet. We suspect something similar too, but are not seeing any configuration changes compared to what we have in terraform.
  • The resource was not imported.

+1 Really appreciate the help! Thank you!
