Improve vmpooler scheduling logic #167
Conversation
Force-pushed fba1f73 to 3492f66
This change is ready for review, but is blocked from moving forward until I present the changes next week. Please feel free to offer any review in the interim.
Force-pushed 3135ca2 to c36fd3a
@@ -228,6 +228,7 @@
allow(redis).to receive(:smembers).with('vmpooler__running__pool1').and_return([])
allow(redis).to receive(:smembers).with('vmpooler__completed__pool1').and_return([])
allow(redis).to receive(:smembers).with('vmpooler__discovered__pool1').and_return([])
allow(redis).to receive(:smembers).with('vmpooler__migrating__pool1').and_return([])
The pool manager has the only remaining hive of "mockist" specs in the pooler (I converted the rest of the specs over to non-mockist specs). The "mockist" specs don't really exercise much of anything, but I stopped short here because rewriting these specs is a pain.
It would not block this PR if you don't add real specs for the new functionality, but I would definitely give big ✨ / 🎉 to the effort.
If you end up wanting to go that route, there are some helpers in https://github.com/puppetlabs/vmpooler/blob/master/spec/helpers.rb that are designed to set up actual vm state in tests, which allows us to then call our methods to manipulate and then assert that things look right in redis.
For example, here's a fairly stateful non-mockist spec: https://github.com/puppetlabs/vmpooler/blob/master/spec/vmpooler/api/v1/vm_template_spec.rb#L173-L184
Anyway, don't worry about it if it's a pain (it probably is). It's on our (my?) wishlist at some point to have our suite be more real, and to evolve the pool manager so that it has a reasonable domain model reflected in the code and those test helpers (or something similar) migrate down into the code itself.
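For illustration, a minimal sketch of what a stateful, non-mockist spec around the new migration set could look like, assuming mock_redis is available (the describe structure, key, and VM names here are illustrative, not the real project helpers):
require 'rspec'
require 'mock_redis'

describe 'migration queue state' do
  let(:redis) { MockRedis.new }

  it 'removes a vm from the migrating set once it has been handled' do
    # arrange real (mock) redis state rather than stubbing smembers
    redis.sadd('vmpooler__migrating__pool1', 'vm1')

    # the code under test would call srem when the migration completes
    redis.srem('vmpooler__migrating__pool1', 'vm1')

    expect(redis.smembers('vmpooler__migrating__pool1')).to be_empty
  end
end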
I do want to set up some real tests for the redis usage, as I think it will uncover some bugs. I would like to move this change forward separately from that, but I am definitely happy to put in some time to do this since I think it will help identify some of the problem areas of the current redis usage and increase confidence in future changes. The examples are very helpful for me since I have not written much in the way of ruby tests. I'm happy to learn, though.
I started adding some specs here and making an effort to test things a little more thoroughly. I have not written rspec tests before, so I would like to get some feedback about how I'm approaching this to identify if I'm on the right path, or if I need to adjust the approach.
I went through some months ago and converted most of our specs away from mocking/stubbing on redis, etc. I stopped short on the pool_manager specs because there's just a lot of mess in that library and it needs deeper work (I think) to clean things up.
I'm fine here with continuing in the existing style (because the cost of not doing so is large), even if the tests end up less than ideal: brittle, not fully exercising functionality.
def _migrate_vm(vm, pool)
  $redis.srem('vmpooler__migrating__' + pool, vm)
  vm_object = $vsphere[pool].find_vm(vm) || $vsphere[pool].find_vm_heavy(vm)
Again (see below), not a priority to fix here, but if it's easy, this incantation could become a method with a name (e.g., find_vsphere_pool_vm(pool, vm)).
I updated this as suggested.
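For reference, a sketch of the extracted helper in the shape the suggestion names (the method name comes from the review comment; the body is the light/heavy lookup already in the diff):
def find_vsphere_pool_vm(pool, vm)
  # try the cheap lookup first, then fall back to the heavier inventory search
  $vsphere[pool].find_vm(vm) || $vsphere[pool].find_vm_heavy(vm)
end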
if host == parent_host
  $logger.log('s', '[ ] [' + pool + "] No migration required for '" + vm + "'")
else
  start = Time.now
This might be a good place to put a $stats timing block and emit migration time stats to graphite.
This has been added. It looks like it follows the currently used patterns, but please let me know if I've missed something or if it needs further adjustment.
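As a rough sketch of the timing pattern being described (the $metrics reporter, its timing method, the key name, and the migrate_vm_to_host call are all assumptions; the actual stats interface used by vmpooler may differ):
start = Time.now
migrate_vm_to_host(vm_object, target_host)            # hypothetical call performing the vMotion
finish = format('%.2f', Time.now - start)
$metrics.timing("migrate.#{pool}", finish)             # assumed reporter interface
$logger.log('s', "[>] [#{pool}] '#{vm}' migrated in #{finish} seconds")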
if inventory[vm]
  begin
    migrate_vm(vm, pool['name'])
  rescue
It might be handy for logging purposes here to make this:
begin
  migrate_vm(vm, pool['name'])
rescue => err
  $logger.log('s', '[x] [' + pool['name'] + "] '" + vm + "' failed to migrate: #{err}")
end
The log line could also be written as:
$logger.log('s', "[x] [#{pool['name']}] '#{vm}' failed to migrate: #{err}")
I have added the context of the failure to the log message and cleaned up the logger string.
end
return nil
end
A potential refactoring of this method (untested):
def get_host_utilization(host, model=nil, limit=90)
  return nil unless model
  return nil unless host_has_cpu_model? host, model
  return nil if host.runtime.inMaintenanceMode
  return nil unless host.overallStatus == 'green'
  cpu_utilization = cpu_utilization_for host
  memory_utilization = memory_utilization_for host
  return nil if cpu_utilization > limit
  return nil if memory_utilization > limit
  [ cpu_utilization + memory_utilization, host ]
end

def host_has_cpu_model?(host, model)
  host.hardware.cpuPkg[0].description.split().include? model
end

def cpu_utilization_for(host)
  cpu_usage = host.summary.quickStats.overallCpuUsage
  cpu_size = host.summary.hardware.cpuMhz * host.summary.hardware.numCpuCores
  (cpu_usage.to_f / cpu_size.to_f) * 100
end

def memory_utilization_for(host)
  memory_usage = host.summary.quickStats.overallMemoryUsage
  memory_size = host.summary.hardware.memorySize / 1024 / 1024
  (memory_usage.to_f / memory_size.to_f) * 100
end
I ran through and put together something that looks nearly like this and still does the same thing. I think it captures what you were suggesting and helps clean it up. Thank you.
hosts[hosts_sort.sort_by { |_k, v| v }[0][0]]
def find_least_used_compatible_host(vm)
  begin
This block is a bit cryptic -- I think we're basically calling a thing for the side effect of throwing an error if uninitialized and then initializing in that case. Probably could extract it to a method and give it an intention-revealing name?
Yes. It's an existing pattern that I duplicated here that tries to see if a connection is alive and available for reuse. I moved it to a method and replaced usage of this pattern with a call to the method.
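A sketch of what the extracted check might look like, assuming an rbvmomi connection object (the method name, the use of CurrentTime as a liveness probe, and the connect_to_vsphere helper are assumptions):
def ensure_connected(connection)
  # a cheap API call that raises if the session is no longer usable
  connection.serviceInstance.CurrentTime
  connection
rescue
  # re-initialize the connection if the existing one is dead
  connect_to_vsphere   # hypothetical helper that opens a fresh connection
end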
end
source_host = vm.summary.runtime.host
model = source_host.hardware.cpuPkg[0].description.split()[4]
This bit: host.hardware.cpuPkg[0].description.split() turns up in multiple places. What is this thing? Can we use the name for the meaning of it, wrap it in a method, and call it? The [4] bit is a bit cryptic. If we extract to a method we could document there what the structure is. Maybe even make a method with a clear name that does the [4] part.
I've made an attempt to clarify this with methods. I think it should be more clear now, but am happy to increase clarity if it still requires improvement.
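A sketch of the kind of helpers being described (the method names are assumptions; the [4] index comes from the original code and is treated here as the CPU model token):
# The CPU package description is a space-separated string; splitting it gives
# the tokens we compare between hosts.
def host_cpu_description_tokens(host)
  host.hardware.cpuPkg[0].description.split
end

# Token 4 of the description is used as the CPU architecture/model identifier
# when checking whether a target host is compatible with the source host.
def get_host_cpu_arch_version(host)
  host_cpu_description_tokens(host)[4]
end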
  host_usage = get_host_utilization(host, model)
  target_hosts << host_usage if host_usage
end
return target_hosts.sort[0][1]
Nitpick: we don't need return at the end of a method in ruby, it's enough just to say target_hosts.sort[0][1].
Also, I notice that we're doing similar things here as we are higher up in the diff. Makes me think there's a more fundamental operation happening that we could extract to a method and share?
Thanks. I wasn't sure what the best practice is and have adjusted this.
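One possible shared helper for the common "pick the least utilized host" step, sketched from the two call sites in the diff (the method name is an assumption; the sort-based selection is the one already used in the original code):
def least_used_host(target_hosts)
  # target_hosts is an array of [utilization, host] pairs; sorting puts the
  # least utilized host first
  return nil if target_hosts.empty?
  target_hosts.sort[0][1]
end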
Mostly style / refactoring / readability / nitpick comments. Nothing that I mentioned would block this PR. Am totally game to pair on this if you want to at some point.
@@ -196,31 +196,62 @@ def find_folder(foldername)
base
end

def get_host_utilization(host, model=nil, limit=90)
This method could use some doc strings to:
- identify what model and limit mean
- indicate what kind of return values to expect
I added some comments to try and provide some context for what these mean. Let me know if you think further comments are necessary, or would be helpful.
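As an illustration of the kind of doc comment being asked for, layered over the refactored method from earlier in the thread (the comment wording is an assumption):
# Evaluate a candidate host for a VM.
#   model - CPU model string the host must share with the VM's current host
#   limit - CPU/memory utilization percentage above which the host is rejected
# Returns [combined_utilization, host] for a usable host, or nil if the host
# is incompatible, unhealthy, or over the limit.
def get_host_utilization(host, model = nil, limit = 90)
  return nil unless model
  return nil unless host_has_cpu_model?(host, model)
  return nil if host.runtime.inMaintenanceMode
  return nil unless host.overallStatus == 'green'
  cpu_utilization = cpu_utilization_for host
  memory_utilization = memory_utilization_for host
  return nil if cpu_utilization > limit
  return nil if memory_utilization > limit
  [cpu_utilization + memory_utilization, host]
end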
end
end
end
Although it's not necessarily part of the goal of the current work you are doing, I think this would be a good place (and time) to add hypervisor data to the vmpooler log regardless of whether a migration happened...see https://tickets.puppetlabs.com/browse/POOLER-29. Not a blocker for this PR, but worth considering.
As for the "graphite stats collection" part of that ticket, that probably isn't necessary unless you can think of a good way to represent individual vm <-> hypervisor mapping in graphite.
What hypervisor data are you hoping to see? The only thing that stands out to me is that if you wanted to know the host your VM is running on, we could report that when logging that no migration is required. This is already reported when a migration is performed.
@mattkirby VM<->host mapping is the hypervisor data I would like to see...and reporting it when logging that no migration is required sounds sufficient to meet that need.
@waynr, I have added this change.
@@ -455,6 +455,31 @@ def _check_snapshot_queue
end
end

def migrate_vm(vm, pool)
  Thread.new do
So is this creating a new thread for each VM that needs to be migrated? What's the likelihood that this might happen simultaneously for a large number of VMs at once leading to post-migration overutilization on a host that was previously underutilized?
Yes, a new thread for each VM that needs to be migrated. The limiting factor ends up being the number of vsphere connections, and there are shared vsphere connections per pool that end up being used (I'm not sure how many). There's a concept of limiting concurrent tasks in vmpooler, but I haven't gotten into the nuts and bolts of how that works yet. I suspect there is some room for optimization. It looks like the model of vmpooler is to always create more threads, so this is consistent with how the application handles clone operations and other actions of that nature. The number of operations able to be performed at any one time will always be limited by the number of vsphere connections.
Since the question of which host is the best target for a VM right now is asked for every VM, and the answer is not shared between migration jobs, it should be able to accurately track the utilization of hosts and deploy new jobs to lesser-used hosts, minimizing the possibility of hot spots. I would expect a more even distribution of CPU ready time across the cluster to be representative of this.
Force-pushed c75a0f1 to 23055c1
I have added an additional commit that makes VM migration at checkout optional, defaulting to disabled. When migration_limit is in the config section of vmpooler.yaml and the value is greater than 0, migration at checkout will happen. In either case, this change adds logging of a VM's parent host, whether or not a migration is involved.
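A sketch of how the checkout-time gate described above might look (the helper name and the $config[:config] access pattern are assumptions; the vmpooler__migrating__ set and the use of scard follow the diff and the later commit messages):
def migration_enabled?(config)
  migration_limit = config[:config]['migration_limit']
  migration_limit && migration_limit.to_i > 0
end

# only migrate when enabled and when fewer migrations are in flight than the limit
migrations_in_progress = $redis.scard('vmpooler__migrating__' + pool)
if migration_enabled?($config) && migrations_in_progress < $config[:config]['migration_limit']
  migrate_vm(vm, pool)
end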
I'm 👍 on getting this in and then tuning. I have not done a deep review, but if @rick has looked it over that's good enough for me.
@@ -225,6 +225,13 @@
# If set, prefixes all created VMs with this string. This should include
# a separator.
# (optional; default: '')
#
# - migration_limit
#   When set to any value greateer than 0 enable VM migration at checkout.
Typo: "greateer"
Thanks. Fixed it.
One typo in a comment, otherwise: LGTM. ✨ 🚀 🎸 🤘
This looks pretty good to me. @mattkirby is there a particular dashboard we should watch once this change rolls out to production?
Force-pushed 23055c1 to e9d8eab
👍 from me to move this forward to testing on vmpooler-dev. Could also coordinate with @puppetlabs/cinext about trying it on the cinext vmpooler. If it's looking good there, then prod vmpooler (branch deploy during time of low activity) and eventually merge.
Force-pushed e9d8eab to ff46bfc
  end
end
datacenter.hostFolder.children.each do |cluster_object|
  return cluster_object if cluster_object.name == cluster
could shorten this to:
datacenter.hostFolder.children.find { |cluster_object| cluster_object.name == cluster }
Thanks. I have shortened this as suggested and updated the PR.
Force-pushed ff46bfc to de8560c
Force-pushed a52391a to 6f51075
A few comments inline, but nothing really blocking. If this has been tested to behave as desired, I'm 👍
I updated pool_manager_spec for the method changes implemented. All tests pass now. This is because part of the changes I made involve many methods receiving the vsphere connection as an argument, rather than assuming they know where to find it.
I'm 👍
Force-pushed 5770331 to 8434048
$logger.log('d', "[!] [#{pool}] '#{vm}' marked as 'failed' after #{timeout} minutes")
def fail_pending_vm(vm, pool, timeout, exists=true)
  begin
One tiny nit -- ruby lets us leave off a begin...end block for the purposes of enabling rescue in the case of a method definition (i.e., a method definition acts like a begin...end block).
That's handy. I have updated the fail_pending_vm method accordingly. Let me know if it doesn't look right.
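For reference, a sketch of the implicit-begin form being described (the rescue body and log message here are illustrative, not the actual ones in the PR):
def fail_pending_vm(vm, pool, timeout, exists = true)
  # the method body itself acts as the implicit begin block
  # ... work that may raise goes here ...
rescue => err
  $logger.log('d', "[!] [#{pool}] '#{vm}' errored while being marked as failed: #{err}")
end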
👍 on the lost VM changes.
👍 on
user: vsphere['username'],
password: vsphere['password'],
insecure: true
def initialize(credentials)
One benefit to this change is that connections are no longer made to vsphere when a thread is initialized at application start time. Instead, they are opened when work requiring a vsphere connection is first performed by the thread. The result is that the application and its threads are available faster, and we do not open as many connections as there are pools when the application starts.
This model still doesn't do anything to prevent an unnecessarily large number of connections to vsphere, but the logic change does reduce the immediate stress we would otherwise see from opening all connections at once.
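A sketch of the lazy-connection idea being described, assuming rbvmomi (the memoized connection method and the credential key names are assumptions):
require 'rbvmomi'

class VsphereHelper
  def initialize(credentials)
    # store credentials only; do not open a connection yet
    @credentials = credentials
  end

  def connection
    # opened on first use by a thread, not at application start
    @connection ||= RbVmomi::VIM.connect(
      host: @credentials['server'],
      user: @credentials['username'],
      password: @credentials['password'],
      insecure: true
    )
  end
end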
yield sock if block_given?
ensure
sock.close
return true
I think you'll have fewer changes if this method either yields the socket or raises an exception (so the caller doesn't need to check the return value)
You're right. I have updated it and it seems more clear this way.
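A sketch of an open_socket helper in the shape the review suggests: it yields the socket or raises, always closes the socket, and appends the configured domain (the signature and defaults are assumptions based on the commit messages later in this thread):
require 'socket'
require 'timeout'

def open_socket(host, domain = nil, timeout = 5, port = 22)
  target = domain ? "#{host}.#{domain}" : host
  Timeout.timeout(timeout) do
    sock = TCPSocket.new(target, port)
    begin
      # the caller gets the socket; any failure simply raises to the caller
      yield sock if block_given?
    ensure
      sock.close
    end
  end
end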
fail_pending_vm(vm, pool, timeout) if ! socket
move_pending_vm_to_ready(vm, pool, host)
rescue
fail_pending_vm(vm, pool, timeout)
for example this could just be:
open_socket vm
move_pending_vm_to_ready(vm, pool, host)
rescue
  fail_pending_vm(vm, pool, timeout)
if $redis.smove('vmpooler__ready__' + pool, 'vmpooler__completed__' + pool, vm)
  $logger.log('d', "[!] [#{pool}] '#{vm}' is unreachable, removed from 'ready' queue")
else
  $logger.log('d', "[!] [#{pool}] '#{vm}' is unreachable, and failed to remove from 'ready' queue")
Similarly this could be:
open_socket
rescue
  if $redis.smove('vmpooler__ready__' + pool, 'vmpooler__completed__' + pool, vm)
    ...
I think this latest commit should address the suggestions in the last few comments @joshcooper.
I would like to propose merging this on 11/22 at 11am PST if there are no further issues to be raised here. Please let me know if this seems unreasonable and if additional changes are required before merging this. I plan to rebase the changes before merging, but will hold off for tomorrow morning in order to provide reasonable time for feedback.
Just to double-check: have you been able to deploy this successfully to vmpooler-dev and observe the migration?
@rick, yes. The pre-tests incarnation of these changes is running on vmpooler-dev, vmpooler-cinext, and vmpooler-provisioner-dev-2 and has been running for the past several weeks. The latest commit here is running on vmpooler-provisioner-dev-2. I am happy to deploy the latest commit here to dev and vmpooler-cinext before deploying to production if that is helpful as well. My plan was otherwise to update each of those to be running against the merged commit prior to updating production vmpooler.

Update: When checking in on vmpooler-cinext it looks like there is a bug causing the queue tracking migrations to have stale entries and eventually fill up, which results in VMs not being migrated because it believes too many migrations are in progress. I believe that the change since what's running here improves upon this situation and will fix this, but I would like to run the updated version to validate this. I'm going to update which version this is running against to latest and validate the issue does not persist.

Update #2: I have confirmed that the latest commit running on vmpooler-cinext resolves the issue with VMs getting stuck in the migration queue, thanks to improved error handling. If a VM is deleted before migration happens then the migration will fail. Previously, that failure would result in the VM being stuck in the queue. Now it logs an error indicating the migration failed because the object could not be found and removes the VM from the migration queue.
Sounds great to me 👍
…ce credentials from the configuration object that itself loads the configuration file when the application starts. Without this change the configuration file is reloaded every time vspherehelper is called. Additionally, this change makes it more straightforward to test vspherehelper connections. A method is added to make more clear what's happening when checking if a socket can be opened to a pending VM on port 22. Additionally, the connection appends domain from the configuration, when present, to the VM name so DNS search is not required.
Remove unneeded begin block in method. Fix formatting of rescue block in fail_pending_vm.
…o test. Simplify migrate_vm method by breaking out some components. Improve error handling around migrate_vm. Add helpers to support setting up redis for migration at checkout testing.
Use mock_redis instead of redis. Make passing of mock redis to helper calls more clear. Update pool_manager_spec to specify vsphere argument where applicable. Update pool_helper calls to vsphere where needed for tests to pass. Without this change rspec tests for pool_manager_spec exhibit 12 failures. Update pool_manager_spec test with open_socket. Pool_manager_spec stubs a tcpsocket connection to simulate this happening directly within _check_pending_vm. This commit updates this to look more like its usage with open_socket, which allows the test to pass.
…ity. Update migrate_vm to make clear when an if block is the end of the line by returning. Use scard instead of smembers.size() for determining migrations in progress.
This commit updates pool manager to use a method for opening a socket instead of opening it directly from check_pending_vm. Support is added for specifying the domain of the VM to connect to, which lays the groundwork for doing away with the assumption of having DNS search domains set for vmpooler to move VMs to the ready state. Additionally, this commit adds a block to ensure open_socket closes open connections. Without this change sockets are opened to each VM before moving to the ready state, and never explicitly closed. Also, use open socket for check_ready_vm
This commit updates vmpooler to understand how to resolve a situation where a pending VM does not exist. Without this change a pending VM that does not exist in vmware inventory gets stuck in the pending state, preventing the pool from ever reaching its target capacity. As a part of this change the find_vm method is updated to perform a light, then heavy search each time find_vm is called and all usage of find_vm || find_vm_heavy is replaced. This makes find_vm usage consistent across pool_manager. Additionally, open_socket method is updated to resolve an incorrect reference to the host name.
Force-pushed 48ad69d to 02327df
I have squashed commits to clean up the history in preparation for merge.
This change implements improved scheduling of pooled VMs by updating the logic used to find the least used host in a target cluster. Without this change host selection is based on VM count, which does not give an indication of VM size or current host utilization.
Additionally, this change implements a queue for migrating VMs at request time. The least used compatible host is selected for migration, which occurs transparently to the consumer, and migration is performed if required.
Without these changes vmpooler may cause uneven load distribution across a cluster of resources since the consumption of VMs is separated from scheduling.
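Putting the pieces discussed in this review together, host selection might end up looking roughly like this sketch (get_host_utilization comes from the diff; get_host_cpu_arch_version and least_used_host follow the hypothetical helpers sketched earlier in the thread; cluster_hosts_for is a placeholder for enumerating the hosts in the target cluster):
def find_least_used_compatible_host(vm)
  source_host = vm.summary.runtime.host
  model = get_host_cpu_arch_version(source_host)

  target_hosts = []
  cluster_hosts_for(source_host).each do |host|
    host_usage = get_host_utilization(host, model)
    target_hosts << host_usage if host_usage
  end

  # least utilized compatible host, or nil if none qualifies
  least_used_host(target_hosts)
end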