
Openstack destroy order ends in error_deleting volume #8030

Closed
KostyaSha opened this issue Aug 7, 2016 · 19 comments

@KostyaSha

Terraform Version

Terraform v0.6.14

Affected Resource(s)

  • openstack_blockstorage_volume_v2
  • openstack_compute_instance_v2


Terraform Configuration Files

variable "count_offset" {
  // first instance num
  default = "9"
}

variable "servers_count" {
  default = 7
}

resource "openstack_blockstorage_volume_v2" "volume_srv" {
  count = "${var.servers_count}"
  name = "${format(var.servername_mask, count.index + var.count_offset)}_srv"
  size = 80
  volume_type = "Latency optimized"
  availability_zone = "nova"
}

resource "openstack_blockstorage_volume_v2" "volume_docker" {
  count = "${var.servers_count}"
  name = "${format(var.servername_mask, count.index + var.count_offset)}_docker"
  size = 80
  volume_type = "Latency optimized"
  region = "Lithuania"
}

# Create teamcity agents
resource "openstack_compute_instance_v2" "vm_tcalin" {
  count = "${var.servers_count}"
  name = "${format(var.servername_mask, count.index + var.count_offset)}"
  image_name = ".centos-7.2"
  flavor_name = "c8.m8.10g"
  availability_zone = "ZONE1"
  network {
    name = "NETWORK"
  }
  volume {
    volume_id = "${element(openstack_blockstorage_volume_v2.volume_srv.*.id, count.index)}"
  }
  volume {
    volume_id = "${element(openstack_blockstorage_volume_v2.volume_docker.*.id, count.index)}"
  }
}

Debug Output

I can provide it later if needed.

Expected Behavior

  • Detach the volumes
  • Delete the VM
  • Delete the volumes

Actual Behavior

  • The VM is deleted
  • Volume deletion fails with error_deleting

Steps to Reproduce

  1. terraform apply
  2. terraform destroy

Important Factoids

The Cinder backend is EMC ScaleIO, which does not allow deleting volumes that are still mapped (attached).

@jtopjian
Contributor

jtopjian commented Aug 7, 2016

Thanks for reporting this.

The Cinder backend is EMC ScaleIO, which does not allow deleting volumes that are still mapped (attached).

Can you provide some more details about this? I'm not familiar with this Cinder driver. :(

@jtopjian
Contributor

jtopjian commented Aug 7, 2016

OK, so when terraform destroy is finished and you do cinder list, you still see the two volumes? If so, do they have a status of in-use?

Does Terraform report that everything succeeded? Can you provide debug output of both terraform apply and terraform destroy? (A GitHub gist is the best way to share it, given the size.)

@KostyaSha
Author

Sorry, I forgot to attach the destroy log:

openstack_compute_instance_v2.vm_tcalin.4: Refreshing state... (ID: 53d17fc5-17eb-47ae-ad53-05963da2d3b0)
openstack_compute_instance_v2.vm_tcalin.1: Refreshing state... (ID: c6d0730b-8731-4d09-87ed-6a23b2c04a1b)
openstack_compute_instance_v2.vm_tcalin.5: Refreshing state... (ID: 2857091b-b5ee-4ab6-9620-ee65e013281a)
openstack_compute_instance_v2.vm_tcalin.6: Refreshing state... (ID: d8dee22a-a5f5-4bf0-95da-c249f5d3906e)
openstack_compute_instance_v2.vm_tcalin.2: Refreshing state... (ID: faf39036-5b31-46e0-8c80-5981712533fb)
openstack_compute_instance_v2.vm_tcalin.3: Refreshing state... (ID: 0d6ba35c-3ea6-4090-bf9c-2795bdf655ea)
openstack_compute_instance_v2.vm_tcalin.0: Refreshing state... (ID: 15734af5-b863-4e8a-b9c6-39ddfd3b2a0e)
openstack_compute_instance_v2.vm_tcalin.2: Destroying...
openstack_compute_instance_v2.vm_tcalin.1: Destroying...
openstack_compute_instance_v2.vm_tcalin.0: Destroying...
openstack_compute_instance_v2.vm_tcalin.5: Destroying...
openstack_compute_instance_v2.vm_tcalin.6: Destroying...
openstack_compute_instance_v2.vm_tcalin.4: Destroying...
openstack_compute_instance_v2.vm_tcalin.3: Destroying...
openstack_compute_instance_v2.vm_tcalin.0: Destruction complete
openstack_compute_instance_v2.vm_tcalin.6: Destruction complete
openstack_compute_instance_v2.vm_tcalin.5: Destruction complete
openstack_compute_instance_v2.vm_tcalin.3: Destruction complete
openstack_compute_instance_v2.vm_tcalin.4: Destruction complete
openstack_compute_instance_v2.vm_tcalin.2: Destruction complete
openstack_compute_instance_v2.vm_tcalin.1: Destruction complete
openstack_blockstorage_volume_v1.volume_srv.2: Destroying...
openstack_blockstorage_volume_v1.volume_srv.4: Destroying...
openstack_blockstorage_volume_v1.volume_srv.3: Destroying...
openstack_blockstorage_volume_v1.volume_srv.6: Destroying...
openstack_blockstorage_volume_v1.volume_srv.5: Destroying...
openstack_blockstorage_volume_v1.volume_srv.1: Destroying...
openstack_blockstorage_volume_v1.volume_docker.0: Destroying...
openstack_blockstorage_volume_v1.volume_docker.6: Destroying...
openstack_blockstorage_volume_v1.volume_srv.0: Destroying...
openstack_blockstorage_volume_v1.volume_docker.3: Destroying...
openstack_blockstorage_volume_v1.volume_srv.6: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.5: Destroying...
openstack_blockstorage_volume_v1.volume_srv.4: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.4: Destroying...
openstack_blockstorage_volume_v1.volume_srv.5: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.1: Destroying...
openstack_blockstorage_volume_v1.volume_srv.1: Destruction complete
openstack_blockstorage_volume_v1.volume_srv.3: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.2: Destroying...
openstack_blockstorage_volume_v1.volume_docker.3: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.6: Destruction complete
openstack_blockstorage_volume_v1.volume_srv.2: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.4: Destruction complete
openstack_blockstorage_volume_v1.volume_docker.2: Destruction complete
Error applying plan:

4 error(s) occurred:

* openstack_blockstorage_volume_v1.volume_srv.0: Error waiting for volume (317653dc-c90e-4fe3-8b8c-70806736bcdf) to delete: unexpected state 'error_deleting', wanted target '[deleted]'
* openstack_blockstorage_volume_v1.volume_docker.0: Error waiting for volume (e19fbe94-7ac5-4c81-8549-5953045d1011) to delete: unexpected state 'error_deleting', wanted target '[deleted]'
* openstack_blockstorage_volume_v1.volume_docker.5: Error waiting for volume (390a99aa-54eb-454e-a837-d280bd084fa3) to delete: unexpected state 'error_deleting', wanted target '[deleted]'
* openstack_blockstorage_volume_v1.volume_docker.1: Error waiting for volume (6718baf4-4e95-43a2-a983-85a036e33927) to delete: unexpected state 'error_deleting', wanted target '[deleted]'

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

@KostyaSha
Author

The volumes are then stuck in error_deleting, and only an admin can fix them by hacking the database.
According to the admins, the Ceph storage backend allows deleting mapped volumes, while EMC does not. Terraform should somehow allow defining the destroy-order behaviour.

@jtopjian
Contributor

jtopjian commented Aug 7, 2016

Ah, I think I see what's going on.

Terraform should somehow allow defining the destroy-order behaviour.

Terraform destroys resources in the reverse order of creation. Since you create the volumes and then reference their IDs in the instance, there is an implicit dependency between the instance and the volumes: the volumes are created first, then the instance. On destroy, the instance is destroyed first and then the volumes.

This seems to be fine for most Cinder drivers, but it looks like the EMC driver wants the volumes to be explicitly detached first. Therefore, the instance should explicitly detach all volumes during its destroy action.
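
As a possible user-side workaround (separate from the provider fix, and assuming a provider version that includes the openstack_compute_volume_attach_v2 resource), the attachment can be managed as its own resource instead of an inline volume block on the instance, so detaching becomes an explicit destroy step. A minimal sketch for the srv volumes:

# Hypothetical sketch: manage the attachment as a separate resource and drop
# the corresponding inline "volume" block from the instance.
resource "openstack_compute_volume_attach_v2" "attach_srv" {
  count       = "${var.servers_count}"
  instance_id = "${element(openstack_compute_instance_v2.vm_tcalin.*.id, count.index)}"
  volume_id   = "${element(openstack_blockstorage_volume_v2.volume_srv.*.id, count.index)}"
}

On terraform destroy, the attachment resource is destroyed first (detaching the volume) before the instance and the volume themselves are deleted.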

I'll look into this. Thank you for reporting it. 😄

@jtopjian
Contributor

jtopjian commented Aug 7, 2016

Side note: does the cinder reset-state command help avoid the DB hack? Or does it refuse to update the status because the volume is in an error state?

@wizardmatas

Update: while the EMC volume is still in the attached state (from the storage side), the cinder commands don't help either. cinder reset-state does what it should, but if you then try to delete the volume, you get the same error, because it is still physically attached. So the correct order is to detach first.

@jtopjian
Contributor

jtopjian commented Aug 7, 2016

Understood. Thanks for the info!

@jtopjian
Contributor

I just created #8172. Would you be able to test these changes locally and see if they fix this problem?

@mvaitiekunas

Terraform v0.7.1 destroys the EMC volumes without any error. Looks like your fix is working. Thank you!

@jtopjian
Contributor

But the fix hasn't been merged yet...

@KostyaSha
Author

@jtopjian it looks like our cloud team applied some fix to prevent the locked state. @wizardmatas?

@mvaitiekunas

Yes, we set sio_unmap_volume_before_deletion = True in the OpenStack Cinder configuration, and that option did the trick :)

@KostyaSha
Author

@mvtks but this PR is about a feature in Terraform that detaches volumes first, so as to avoid situations like this one: https://raymii.org/s/articles/Fix_inconsistent_Openstack_volumes_and_instances_from_Cinder_and_Nova_via_the_database.html

@mvaitiekunas

mvaitiekunas commented Aug 24, 2016

Yes, they should do a fix.

And we do not edit the database! When a volume gets the error_deleting status, we go to EMC ScaleIO and tell it to unmap the volume. Then, after a state reset, the volume can be deleted.

@jtopjian
Contributor

That's great news regarding sio_unmap_volume_before_deletion. Regarding the PR in question, I will be merging that shortly.

@jtopjian
Contributor

The PR has been merged and will be available in 0.7.2 (or available now if you compile the master branch from source).

I'm going to close this issue, but do let me know if this didn't fix the problem or if the PR actually caused more problems...

Thank you for reporting this. :)

@ghost

ghost commented Apr 23, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 23, 2020