Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout error retries for SSM resources #8992

Merged
merged 3 commits into from
Jun 19, 2019
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 13 additions & 3 deletions aws/resource_aws_ssm_document.go
Original file line number Diff line number Diff line change
Expand Up @@ -164,21 +164,26 @@ func resourceAwsSsmDocumentCreate(d *schema.ResourceData, meta interface{}) erro
}

log.Printf("[DEBUG] Waiting for SSM Document %q to be created", d.Get("name").(string))
var resp *ssm.CreateDocumentOutput
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

err := resource.Retry(5*time.Minute, func() *resource.RetryError {
resp, err := ssmconn.CreateDocument(docInput)
var err error
resp, err = ssmconn.CreateDocument(docInput)

if err != nil {
return resource.NonRetryableError(err)
}

d.SetId(*resp.DocumentDescription.Name)
return nil
})

if isResourceTimeoutError(err) {
resp, err = ssmconn.CreateDocument(docInput)
}
if err != nil {
return fmt.Errorf("Error creating SSM document: %s", err)
}

d.SetId(*resp.DocumentDescription.Name)

if v, ok := d.GetOk("permissions"); ok && v != nil {
if err := setDocumentPermissions(d, meta); err != nil {
return err
Expand Down Expand Up @@ -369,6 +374,11 @@ func resourceAwsSsmDocumentDelete(d *schema.ResourceData, meta interface{}) erro

return resource.RetryableError(fmt.Errorf("SSM Document (%s) still exists", d.Id()))
})
if isResourceTimeoutError(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo! Okay this one is strange currently, let's fix it!

The resource.Retry() function above is using DescribeDocument calls to wait explicitly for the InvalidDocument error from the API to say "hey! this is actually gone after you deleted it". So in our case, we likely will need some additional logic here, not only to retry the last call, but also say "great! you deleted it!" after this new DescribeDocument call.

So maybe this logic can look something like:

	input := &ssm.DescribeDocumentInput{
		Name: aws.String(d.Get("name").(string)),
	}

	log.Printf("[DEBUG] Waiting for SSM Document %q to be deleted", d.Get("name").(string))
	err = resource.Retry(10*time.Minute, func() *resource.RetryError {
		_, err := ssmconn.DescribeDocument(input)

		if isAWSErr(err, ssm.ErrCodeInvalidDocument, "") {
			return nil
		}

		if err != nil {
			return resource.NonRetryableError(err)
		}

		return resource.RetryableError(fmt.Errorf("SSM Document (%s) still exists", d.Id()))
	})

	if isResourceTimeoutError(err) {
		_, err = ssmconn.DescribeDocument(input)
	}

	if isAWSErr(err, ssm.ErrCodeInvalidDocument, "") {
		return nil
	}

	if err != nil {
		return fmt.Errorf("error waiting for SSM Document (%s) deletion: %s", d.Id(), err)
	}

Please reach out if you have any questions 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bflad gotcha! Pushed an update to this, but I am left with one question. After the retry, we return nil if it was an ErrCodeInvalidDocument (indicating it was deleted), and return an error if there was one. But, if there wasn't an error, wouldn't that imply that it still described the document and so it wasn't deleted?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye, that's the rabbit hole with this type of deletion logic. You are correct that it should probably return an error indicating if the document was still found, but up to you if you'd like to implement that now this logic is considered best effort. 😅

_, err = ssmconn.DescribeDocument(&ssm.DescribeDocumentInput{
Name: aws.String(d.Get("name").(string)),
})
}
if err != nil {
return fmt.Errorf("error waiting for SSM Document (%s) deletion: %s", d.Id(), err)
}
Expand Down
11 changes: 7 additions & 4 deletions aws/resource_aws_ssm_resource_data_sync.go
Original file line number Diff line number Diff line change
Expand Up @@ -67,12 +67,12 @@ func resourceAwsSsmResourceDataSync() *schema.Resource {

func resourceAwsSsmResourceDataSyncCreate(d *schema.ResourceData, meta interface{}) error {
conn := meta.(*AWSClient).ssmconn
input := &ssm.CreateResourceDataSyncInput{
S3Destination: expandSsmResourceDataSyncS3Destination(d),
SyncName: aws.String(d.Get("name").(string)),
}

err := resource.Retry(1*time.Minute, func() *resource.RetryError {
input := &ssm.CreateResourceDataSyncInput{
S3Destination: expandSsmResourceDataSyncS3Destination(d),
SyncName: aws.String(d.Get("name").(string)),
}
_, err := conn.CreateResourceDataSync(input)
if err != nil {
if isAWSErr(err, ssm.ErrCodeResourceDataSyncInvalidConfigurationException, "S3 write failed for bucket") {
Expand All @@ -82,6 +82,9 @@ func resourceAwsSsmResourceDataSyncCreate(d *schema.ResourceData, meta interface
}
return nil
})
if isResourceTimeoutError(err) {
_, err = conn.CreateResourceDataSync(input)
}

if err != nil {
return err
Expand Down