-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add configurable timeout for VM PowerOn operations #990
Conversation
There is a conflict right now between the way the vsphere provider builds VMs and how vSphere expects VMs that are part of a vApp to behave, especially in the way things are powered on. vApps expect that the VMs will be built but not powered on, because it handles powering on the VM on its own in a specific order. On the other hand the provider will build the VM and instantly power it on regardless of where it belongs. The result of this conflict is that vsphere will not allow any VM to power on until all VMs in the vApp are ready to be powered on. So for example while cloning two VMs in the same vApp, we are only allowed to power on the first VM once both of them are done cloning. Any attempts to power on a VM earlier result in a "InvalidState" response. This commit introduces a configurable timeout, `poweron_timeout`, that represents the amount of time we are going to give a VM to power on. The default (and minimum) is 300 seconds (5 minutes). The full power on process goes like this: - Timer starts (for as long as poweron_timeout is set to). - First we poll the `Disabled Methods` list for the VM object until PowerOnVM_Task is no longer in it. - Next we enter a loop for the remainder of the time where we issue PowerOnVM requests, and check the response. If it's successful we return, otherwise we check the error message. If the error message is about the VM being in an invalid state then we sleep for 500ms and retry, otherwise we return an error. - If we hit the timeout and we have not been able to successfully power on the VM so far, we return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor comments, but looks good!
//tctx, tcancel := context.WithTimeout(context.Background(), provider.DefaultAPITimeout) | ||
//defer tcancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the original lines got left in, but commented out. These can be removed if they aren't needed in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, will do.
pTimeoutStr := fmt.Sprintf("%ds", d.Get("poweron_timeout").(int)) | ||
pTimeout, err := time.ParseDuration(pTimeoutStr) | ||
if err != nil { | ||
return nil, fmt.Errorf("failed to parse poweron_timeout as a valid duration: %s", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding an IntToString function to the structure helper since you've its been used a couple times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to change the way the timeout is calculated, so this is no longer used.
@koikonom Why are you resorting to time outs? Why not check if the VM is part of a vApp and skip power on? Wouldn't it make sense to treat vApp as a single entity. |
Hey @rchadalawada, great question! In terraform we have two vApp related resources, one is the vapp_container and the other is the vapp_entity. Unfortunately none of them is able to control powering on the VM because neither of them is aware of the full picture. The container just creates the vApp and vapp_entity can be used to associate VMs to the vApp, manage their startup order etc. When the provider runs each resource is only given its own configuration, as a result neither of those entities is aware of the full picture. You are correct when you say that I could skip powering on the VM if it's a member of a vApp, but then there is no resource to control the power status of the vApp once everything is built. In order to do this right we would have to revisit many basic elements of how the VMs are built and how we manage vApps as a whole and this is a much larger project that needs to be prioritised accordingly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks! |
There is a conflict right now between the way the vsphere provider
builds VMs and how vSphere expects VMs that are part of a vApp to
behave, especially in the way things are powered on.
vApps expect that the VMs will be built but not powered on, because it
handles powering on the VM on its own in a specific order. On the other
hand the provider will build the VM and instantly power it on regardless
of where it belongs.
The result of this conflict is that vsphere will not allow any VM to
power on until all VMs in the vApp are ready to be powered on. So for
example while cloning two VMs in the same vApp, we are only allowed to
power on the first VM once both of them are done cloning. Any attempts
to power on a VM earlier result in a "InvalidState" response.
This commit introduces a configurable timeout,
poweron_timeout
, thatrepresents the amount of time we are going to give a VM to power on. The
default (and minimum) is 300 seconds (5 minutes).
The full power on process goes like this:
Disabled Methods
list for the VM object untilPowerOnVM_Task is no longer in it.
PowerOnVM requests, and check the response. If it's successful we
return, otherwise we check the error message. If the error message is
about the VM being in an invalid state then we sleep for 500ms and
retry, otherwise we return an error.
on the VM so far, we return an error.