Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helper/resource: Require explicit usage of NonRetryableError and RetryableError #199

Merged
merged 1 commit into from
Feb 6, 2020

Conversation

bflad
Copy link
Contributor

@bflad bflad commented Oct 8, 2019

Reference: hashicorp/terraform#17220
Reference: hashicorp/terraform-provider-aws#9779 (comment)
Reference: hashicorp/terraform-provider-aws#9812 (comment)

It is currently possible to introduce subtle bugs or crashes when using resource.RetryableError and resource.NonRetryableError and allowing a nil error input. This PR proposes behavior that requires providers to be explicit with their usage of these errors, otherwise returns a bug reporting message to the operator.

For example (https://github.com/terraform-providers/terraform-provider-aws/blob/4c0387645f982ddcd51b3ffe2cc8992c06fb9c2c/aws/resource_aws_elastic_beanstalk_application.go#L105):

	var app *elasticbeanstalk.ApplicationDescription
	err := resource.Retry(30*time.Second, func() *resource.RetryError {
		var err error
		app, err = getBeanstalkApplication(d.Id(), conn)
		if err != nil {
			return resource.NonRetryableError(err)
		}

		if app == nil {
			if d.IsNewResource() {
				return resource.RetryableError(fmt.Errorf("Elastic Beanstalk Application %q not found", d.Id()))
			}
			// err is nil here
			return resource.NonRetryableError(err)
		}
		return nil
	})
	if err != nil {
		return err
	}
	// app can be nil here, so this can crash Terraform
	d.Set("name", app.ApplicationName)

Another example (https://github.com/terraform-providers/terraform-provider-alicloud/blob/927a8e82386e0b32718aa5eef254255fdc36b070/alicloud/resource_alicloud_disk_attachment.go#L112):

	return resource.Retry(5*time.Minute, func() *resource.RetryError {
		err := conn.DetachDisk(instanceID, diskID)
		if err != nil {
			if IsExceptedError(err, DiskIncorrectStatus) || IsExceptedError(err, InstanceLockedForSecurity) ||
				IsExceptedError(err, DiskInvalidOperation) {
				return resource.RetryableError(fmt.Errorf("Detach Disk timeout and got an error: %#v", err))
			}
		}

		disks, _, descErr := conn.DescribeDisks(&ecs.DescribeDisksArgs{
			RegionId: getRegion(d, meta),
			DiskIds:  []string{diskID},
		})

		if descErr != nil {
			log.Printf("[ERROR] Disk %s is not detached.", diskID)
			// err can be nil here
			return resource.NonRetryableError(err)
		}

		for _, disk := range disks {
			if disk.Status != ecs.DiskStatusAvailable {
				return resource.RetryableError(fmt.Errorf("Detach Disk timeout and got an error: %#v", err))
			}
		}
		return nil
	})

Another example (https://github.com/terraform-providers/terraform-provider-aws/blob/75c32b375140813b7994f8031e74b8588a08035a/aws/resource_aws_kms_alias.go#L176-L190):

func retryFindKmsAliasByName(conn *kms.KMS, name string) (*kms.AliasListEntry, error) {
	var resp *kms.AliasListEntry
	err := resource.Retry(1*time.Minute, func() *resource.RetryError {
		var err error
		resp, err = findKmsAliasByName(conn, name, nil)
		if err != nil {
			return resource.NonRetryableError(err)
		}
		if resp == nil {
			// err is nil here, so returns nil
			return resource.RetryableError(err)
		}
		return nil
	})
	return resp, err
}

Another example (https://github.com/terraform-providers/terraform-provider-aws/blob/00909998d919faf5494ab8f6b38241deb1957d99/aws/resource_aws_internet_gateway.go#L55-L65):

	err = resource.Retry(5*time.Minute, func() *resource.RetryError {
		igRaw, _, err := IGStateRefreshFunc(conn, d.Id())()
		if igRaw != nil {
			return nil
		}
		if err == nil {
			// err is always nil here, so it will never retry
			return resource.RetryableError(err)
		} else {
			return resource.NonRetryableError(err)
		}
	})

@bflad bflad added enhancement New feature or request breaking-change Implementation which breaks compatibility or other promises labels Oct 8, 2019
@paultyng paultyng added this to the v2.0.0 milestone Nov 12, 2019
@appilon appilon self-assigned this Feb 6, 2020
@appilon
Copy link
Contributor

appilon commented Feb 6, 2020

@bflad I am going to pickup this PR and get it merged into the V2 branch

…yableError

Reference: hashicorp/terraform#17220
Reference: hashicorp/terraform-provider-aws#9779 (comment)
Reference: hashicorp/terraform-provider-aws#9812 (comment)

It is currently possible to introduce subtle bugs or crashes when using `resource.RetryableError` and `resource.NonRetryableError` and allowing a `nil` error input. This PR proposes behavior that requires providers to be explicit with their usage of these errors, otherwise returns a bug reporting message to the operator.

For example (https://github.com/terraform-providers/terraform-provider-aws/blob/4c0387645f982ddcd51b3ffe2cc8992c06fb9c2c/aws/resource_aws_elastic_beanstalk_application.go#L105):

```go
	var app *elasticbeanstalk.ApplicationDescription
	err := resource.Retry(30*time.Second, func() *resource.RetryError {
		var err error
		app, err = getBeanstalkApplication(d.Id(), conn)
		if err != nil {
			return resource.NonRetryableError(err)
		}

		if app == nil {
			if d.IsNewResource() {
				return resource.RetryableError(fmt.Errorf("Elastic Beanstalk Application %q not found", d.Id()))
			}
			// err is nil here
			return resource.NonRetryableError(err)
		}
		return nil
	})
	if err != nil {
		return err
	}
	// app can be nil here, so this can crash Terraform
	d.Set("name", app.ApplicationName)
```

Another example (https://github.com/terraform-providers/terraform-provider-alicloud/blob/927a8e82386e0b32718aa5eef254255fdc36b070/alicloud/resource_alicloud_disk_attachment.go#L112):

```go
	return resource.Retry(5*time.Minute, func() *resource.RetryError {
		err := conn.DetachDisk(instanceID, diskID)
		if err != nil {
			if IsExceptedError(err, DiskIncorrectStatus) || IsExceptedError(err, InstanceLockedForSecurity) ||
				IsExceptedError(err, DiskInvalidOperation) {
				return resource.RetryableError(fmt.Errorf("Detach Disk timeout and got an error: %#v", err))
			}
		}

		disks, _, descErr := conn.DescribeDisks(&ecs.DescribeDisksArgs{
			RegionId: getRegion(d, meta),
			DiskIds:  []string{diskID},
		})

		if descErr != nil {
			log.Printf("[ERROR] Disk %s is not detached.", diskID)
			// err can be nil here
			return resource.NonRetryableError(err)
		}

		for _, disk := range disks {
			if disk.Status != ecs.DiskStatusAvailable {
				return resource.RetryableError(fmt.Errorf("Detach Disk timeout and got an error: %#v", err))
			}
		}
		return nil
	})
```

Another example (https://github.com/terraform-providers/terraform-provider-aws/blob/75c32b375140813b7994f8031e74b8588a08035a/aws/resource_aws_kms_alias.go#L176-L190):

```go
func retryFindKmsAliasByName(conn *kms.KMS, name string) (*kms.AliasListEntry, error) {
	var resp *kms.AliasListEntry
	err := resource.Retry(1*time.Minute, func() *resource.RetryError {
		var err error
		resp, err = findKmsAliasByName(conn, name, nil)
		if err != nil {
			return resource.NonRetryableError(err)
		}
		if resp == nil {
			// err is nil here, so returns nil
			return resource.RetryableError(err)
		}
		return nil
	})
	return resp, err
}
```

Another example (https://github.com/terraform-providers/terraform-provider-aws/blob/00909998d919faf5494ab8f6b38241deb1957d99/aws/resource_aws_internet_gateway.go#L55-L65):

```go
	err = resource.Retry(5*time.Minute, func() *resource.RetryError {
		igRaw, _, err := IGStateRefreshFunc(conn, d.Id())()
		if igRaw != nil {
			return nil
		}
		if err == nil {
			// err is always nil here, so it will never retry
			return resource.RetryableError(err)
		} else {
			return resource.NonRetryableError(err)
		}
	})
```
@appilon appilon force-pushed the f-helper-resource-explicit-RetryError branch from 1e3a363 to 4f58f0e Compare February 6, 2020 23:08
@appilon appilon changed the base branch from master to version2 February 6, 2020 23:09
@ghost
Copy link

ghost commented Mar 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Mar 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
breaking-change Implementation which breaks compatibility or other promises enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants