-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Increases the timeout for TransitGateway routes #8417
Conversation
Hi @gui-don 👋 Thanks for submitting this and sorry for any trouble caused here. Would you be able to quickly verify which error is being returned by EC2 during the 1 minute retry loop? This can be seen by enabling Terraform debug logging, e.g. If the EC2 error is something related to a throttling error or rate limiting error, then I believe a fix like #7873 is a little more appropriate because it removes the time bounding problem where any amount of timeout increases may only help in some situations. Thanks and hopefully we can get to the bottom of this. |
Hi @bflad. Sure, I understand. I’ll do the test today. |
Here are the WARNs I get, confirming the issue. I have as many logs like this as the number of route I create:
Here is the “minimal” configuration to reproduce the issue. Oddly, I also noted that the
|
Hi again @gui-don 👋 I'm still having trouble replicating this issue with our acceptance testing frameworking. Using a configuration similar to yours above passes the testing: // Reference: https://github.com/terraform-providers/terraform-provider-aws/pull/8417
// This acceptance test is for verifying concurrent route creation
func TestAccAWSEc2TransitGatewayRoute_concurrency(t *testing.T) {
var transitGatewayRoute ec2.TransitGatewayRoute
rBgpAsn := acctest.RandIntRange(64512, 65534)
resource.ParallelTest(t, resource.TestCase{
PreCheck: func() { testAccPreCheck(t); testAccPreCheckAWSEc2TransitGateway(t) },
Providers: testAccProviders,
CheckDestroy: testAccCheckAWSEc2TransitGatewayRouteDestroy,
Steps: []resource.TestStep{
{
Config: testAccAWSEc2TransitGatewayRouteConfigConcurrency(rBgpAsn),
Check: resource.ComposeTestCheckFunc(
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.0", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.1", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.2", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.3", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.4", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.5", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.6", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.7", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.8", &transitGatewayRoute),
testAccCheckAWSEc2TransitGatewayRouteExists("aws_ec2_transit_gateway_route.test.9", &transitGatewayRoute),
),
},
},
})
}
func testAccAWSEc2TransitGatewayRouteConfigConcurrency(rBgpAsn int) string {
return fmt.Sprintf(`
data "aws_availability_zones" "available" {
# IncorrectState: Transit Gateway is not available in availability zone us-west-2d
blacklisted_zone_ids = ["usw2-az4"]
state = "available"
}
resource "aws_vpc" "test" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "tf-acc-test-ec2-transit-gateway-route"
}
}
resource "aws_subnet" "test" {
availability_zone = "${data.aws_availability_zones.available.names[0]}"
cidr_block = "10.0.0.0/24"
vpc_id = "${aws_vpc.test.id}"
tags = {
Name = "tf-acc-test-ec2-transit-gateway-route"
}
}
resource "aws_ec2_transit_gateway" "test" {
auto_accept_shared_attachments = "enable"
}
resource "aws_ec2_transit_gateway_vpc_attachment" "test" {
subnet_ids = ["${aws_subnet.test.id}"]
transit_gateway_id = "${aws_ec2_transit_gateway.test.id}"
vpc_id = "${aws_vpc.test.id}"
}
resource "aws_customer_gateway" "test" {
bgp_asn = %[1]d
ip_address = "178.0.0.1"
type = "ipsec.1"
tags = {
Name = "tf-acc-test-ec2-transit-gateway-route"
}
}
resource "aws_vpn_connection" "test" {
customer_gateway_id = "${aws_customer_gateway.test.id}"
static_routes_only = true
transit_gateway_id = "${aws_ec2_transit_gateway.test.id}"
type = "${aws_customer_gateway.test.type}"
}
resource "aws_ec2_transit_gateway_route" "test" {
count = 10
destination_cidr_block = "10.${count.index}.0.0/16"
transit_gateway_attachment_id = "${aws_vpn_connection.test.transit_gateway_attachment_id}"
transit_gateway_route_table_id = "${aws_ec2_transit_gateway.test.association_default_route_table_id}"
}
`, rBgpAsn)
} $ TF_ACC=1 go test ./aws -v -timeout 120m -parallel 20 -run='TestAccAWSEc2TransitGatewayRoute_concurrency'
=== RUN TestAccAWSEc2TransitGatewayRoute_concurrency
=== PAUSE TestAccAWSEc2TransitGatewayRoute_concurrency
=== CONT TestAccAWSEc2TransitGatewayRoute_concurrency
--- PASS: TestAccAWSEc2TransitGatewayRoute_concurrency (484.70s) I do notice something odd about your output above, that the destination CIDR reference in the warning log does not match your local: locals {
routes = [
"192.168.10.0/24",
"192.168.11.0/24",
"192.168.12.0/24",
"192.168.13.0/24",
"192.168.14.0/24",
"192.168.15.0/24",
"192.168.16.0/24",
"192.168.17.0/24",
"192.168.18.0/24",
"192.168.19.0/24",
"192.168.20.0/24",
"192.168.21.0/24",
]
}
# 2019-04-25T10:47:58.150-0400 [DEBUG] plugin.terraform-provider-aws_v2.7.0_x4: 2019/04/25 10:47:58 [WARN] EC2 Transit Gateway Route (tgw-rtb-00b582c9e05bbc2bd_192.168.4.0/24) not found, removing from state Are there other contributing factors that might be present? |
Hi! Our issue was also fixed auto-magically. It’s been a week since I last hit this problem. The problem is that it might appear again if the the creation time hit the threshold of 1m again. It is indeed very hard to come up with a test that is replicable with a moving target like the time required to create a resource… I would suggest to modify the code to allow more than 1m to create the routes but at the same time the problem appears to be fixed. |
Hi @gui-don - I'm glad to hear this issue has resolved itself for you! Since that's the case, and because I added an additional retry with #8726 that should help in general, I'm going to go ahead and close this PR. However, if this problem comes back up for you, please feel free to open a new issue and we can take a further look. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks! |
When creating a lot of routes (+4), idempotency is broken. This is what happen:
There maybe a better solution to this problem, but this quick fix works.
Community Note