This repository has been archived by the owner on Mar 1, 2024. It is now read-only.

Replacing volume fails #154

Open
andrewbaxter opened this issue Feb 12, 2023 · 4 comments · May be fixed by #190

Comments

@andrewbaxter

When replacing a volume attached to a machine, the delete fails with an error like "cannot delete volume, still in use". The volume property on the machine should probably be marked as "forces replacement".
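Until the provider marks the attribute that way, roughly the same effect can probably be approximated from the configuration side with Terraform's `replace_triggered_by` lifecycle argument (Terraform 1.2 or later). This is an untested sketch: the resource names `data` and `app` and the elided arguments are placeholders, and it only changes what Terraform plans, not how quickly the Fly API releases the volume.

```hcl
resource "fly_volume" "data" {
  # ... (existing volume arguments, unchanged)
}

resource "fly_machine" "app" {
  # ... (existing machine arguments, unchanged)
  mounts = [{
    # ...
    volume = fly_volume.data.id
  }]

  lifecycle {
    # Replace the machine whenever the volume is replaced, so the machine
    # is destroyed before Terraform tries to delete the old volume.
    replace_triggered_by = [fly_volume.data]
  }
}
```

Because the machine already depends on the volume, Terraform should then destroy the machine first, replace the volume, and recreate the machine last. The cost is that the machine is unavailable while this happens.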

@OJFord
Contributor

OJFord commented Feb 22, 2023

Additionally, changing an attribute such as size on the volume itself fails with:

╷
│ Error: The fly api does not allow updating volumes once created
│
│   with [resource.path].fly_volume.data,
│   on [path].tf line 18, in resource "fly_volume" "data":
│   18: resource "fly_volume" "data" {
│
│ Try deleting and then recreating a volume with new options
╵

The provider shouldn't tell the user to 'try deleting and then recreating' manually; it should just force a replacement in the plan.
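In the meantime, a configuration-side approximation might be to route the size through the built-in `terraform_data` resource (Terraform 1.4 or later) and let `replace_triggered_by` turn a size change into a planned replacement. This is a sketch under assumptions: the variable name `volume_size` and its default are made up for illustration, and the behaviour has not been verified against this provider.

```hcl
# Hypothetical variable holding the desired volume size.
variable "volume_size" {
  type    = number
  default = 10
}

# Built-in resource that only records the value, so a change to it
# shows up as a change to a managed resource.
resource "terraform_data" "volume_size" {
  input = var.volume_size
}

resource "fly_volume" "data" {
  # ... (other volume arguments, unchanged)
  size = var.volume_size

  lifecycle {
    # Plan a destroy-and-create of the volume when the recorded size
    # changes, instead of an in-place update the API rejects.
    replace_triggered_by = [terraform_data.volume_size]
  }
}
```

For a one-off change, `terraform apply -replace=fly_volume.data` requests the same replacement without touching the configuration. Either way, a volume that is still bound to a machine cannot be deleted, so the machine has to be replaced as well.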

OJFord added a commit to OJFord/terraform-provider-fly that referenced this issue Feb 22, 2023
@OJFord
Contributor

OJFord commented Feb 22, 2023

Regarding the:

╷
│ Error: Delete volume failed
│
│ input:3: deleteVolume volume is currently bound to machine: [id]
│
╵

error - I think the best way to address this would be a separate fly_volume_attachment resource (replacing mounts in fly_machine) that would be replaced on volume ID (or machine ID) change.

Currently the fly_machine update (and we do want it to update, not force a whole replacement just because the volume changed) has to wait for the fly_volume replacement to know the new ID to mount.

With the attachment handled through a third resource instead, a forced replacement of that resource would first destroy it (detaching the volume from the machine), which allows deleteVolume to succeed. The fly_volume would then be replaced, and finally the attachment resource would be recreated.

This is how such relationships are typically handled in the AWS provider, for example.
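For reference, the AWS pattern looks roughly like the following. The resource types are real AWS provider resources, but the AMI ID, availability zone, instance type, and device name are placeholder values chosen for illustration.

```hcl
resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 10
}

resource "aws_instance" "app" {
  ami               = "ami-00000000000000000" # placeholder AMI ID
  instance_type     = "t3.micro"
  availability_zone = "us-east-1a"
}

# The attachment is its own resource. Changing volume_id or instance_id
# replaces only the attachment, so the old volume is detached before it
# is deleted and the instance itself is left in place.
resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.data.id
  instance_id = aws_instance.app.id
}
```

A fly_volume_attachment resource along these lines does not exist in this provider; it is only the shape being proposed here.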

@OJFord
Contributor

OJFord commented Feb 22, 2023

Actually, it seems (superfly/flyctl#1758) that there's no way to detach a volume without destroying the machine anyway, so perhaps the provider needs to force replacement of fly_machine, at least for now, even though that seems undesirable.

I'll add it to #157.

OJFord added a commit to OJFord/terraform-provider-fly that referenced this issue Feb 22, 2023
floydspace pushed a commit to floydspace/terraform-provider-fly that referenced this issue Apr 8, 2023
zxaos pushed a commit that referenced this issue Jun 6, 2023
@zxaos zxaos linked a pull request Jun 6, 2023 that will close this issue
@mootari

mootari commented Jul 3, 2023

There seems to be a delay of up to a minute after a machine has been destroyed before the volume is no longer registered as attached.

I'm applying the following workaround, which has worked reliably over several runs now:

resource "fly_volume" "db" {
  # ...
}

resource "time_sleep" "db" {
  destroy_duration = "60s"
  lifecycle {
    replace_triggered_by = [ fly_volume.db ]
  }
}

resource "fly_machine" "db" {
  # ...
  mounts = [{
    # ...
    volume = fly_volume.db.id
  }]
  
  lifecycle {
    replace_triggered_by = [ time_sleep.db ]
  }
}

Run:

% terraform taint fly_volume.db && terraform apply -auto-approve

Plan: 3 to add, 0 to change, 3 to destroy.
fly_machine.db: Destroying... [id=**************]
fly_machine.db: Destruction complete after 7s
time_sleep.db: Destroying... [id=2023-07-03T17:37:10Z]
time_sleep.db: Still destroying... [id=2023-07-03T17:37:10Z, 10s elapsed]
time_sleep.db: Still destroying... [id=2023-07-03T17:37:10Z, 20s elapsed]
time_sleep.db: Still destroying... [id=2023-07-03T17:37:10Z, 30s elapsed]
time_sleep.db: Still destroying... [id=2023-07-03T17:37:10Z, 40s elapsed]
time_sleep.db: Still destroying... [id=2023-07-03T17:37:10Z, 50s elapsed]
time_sleep.db: Destruction complete after 1m0s
fly_volume.db: Destroying... [id=**************]
fly_volume.db: Destruction complete after 3s
fly_volume.db: Creating...
fly_volume.db: Creation complete after 4s [id=**************]
time_sleep.db: Creating...
time_sleep.db: Creation complete after 0s [id=2023-07-03T17:41:15Z]
fly_machine.db: Creating...
fly_machine.db: Still creating... [10s elapsed]
fly_machine.db: Creation complete after 19s [id=**************]

(This is in a CI environment where the downtime is acceptable.)
