Granular max-in-flight updates #4124

sethboyles · 2024-12-03T23:20:46Z

* The scale action now compares the number of starting/running non-routable instances to max-in-flight, instead of waiting for all instances to become routable.
* The scale action does not continue scaling up when any instances are 'unhealthy' (e.g. 'crashed', 'down', etc.), as it's difficult to determine whether unhealthy instances belong to the 'max in flight' group.
* The number of desired nondeploying instances is now recalculated each iteration instead of being decremented by 'max-in-flight' without checking whether it's in a correct state. This mitigates bugs where deployment trains (continually creating new deployments before the previous one had completed) could result in the number of app instances exceeding the max-in-flight limit.
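
For readers skimming the description, the three changes above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in scale.rb: the method names, the state strings, and the shape of the instance data are all hypothetical, and the formula for the desired nondeploying count is one plausible reading of "recalculated each iteration".

```ruby
# Hypothetical sketch; none of these names are taken from scale.rb.
# State strings are assumed for illustration.
UNHEALTHY_STATES = %w[CRASHED DOWN].freeze
IN_FLIGHT_STATES = %w[STARTING RUNNING].freeze

# instances: array of hashes such as { state: 'RUNNING', routable: true }
# An instance counts as in flight while it is starting or running
# but not yet routable.
def in_flight_count(instances)
  instances.count { |i| IN_FLIGHT_STATES.include?(i[:state]) && !i[:routable] }
end

def any_unhealthy?(instances)
  instances.any? { |i| UNHEALTHY_STATES.include?(i[:state]) }
end

# How many more deploying instances may be started in this iteration.
def scale_up_allowance(instances, max_in_flight:, target_total:)
  # Pause scale-up entirely while anything is unhealthy.
  return 0 if any_unhealthy?(instances)

  free_slots = max_in_flight - in_flight_count(instances)
  remaining = target_total - instances.size
  [[free_slots, remaining].min, 0].max
end

# Assumed formula: derive the count from current state on every iteration
# instead of decrementing a stored value by max-in-flight.
def desired_non_deploying(target_total:, routable_deploying:)
  [target_total - routable_deploying, 0].max
end
```

Because both values are derived from what is currently reported, a second deployment started before the first has finished would begin from the real instance counts rather than from a stale running total.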

See this thread for some of the issues this PR aims to solve: #4101 (comment)

I have reviewed the contributing guide
I have viewed, signed, and submitted the Contributor License Agreement
I have made this pull request to the main branch
I have run all the unit tests using bundle exec rake
I have run CF Acceptance Tests
I have run BARAS

lib/cloud_controller/deployment_updater/actions/scale.rb

Samze · 2024-12-04T16:55:59Z


+          healthy_instances.reject { |_, val| val[:state] == VCAP::CloudController::Diego::LRP_RUNNING && val[:routable] }
+        end
+
+        def routable_instances


thought/non-blocking: routable_instances/healthy_instances/unhealthy_instances/reported_instances are all methods for the newly deploying process, right? I wonder if there's any way we can be more explicit about that, as technically those functions could apply to the older web process. We could prefix them with new or deploying or something...

That's a good point. I do worry the prefixes might hurt readability if every method is called deploying_abc_instances or something.

Maybe an alternative is to separate the scale down and scale up into separate actions entirely? I think their logic is fairly disentangled at this point.
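
To make the naming question in this thread concrete, here is a rough sketch of how the four filters relate to one another. It is an assumption-laden illustration, not the PR's code: the `DeployingInstances` class is hypothetical, the constants stand in for `VCAP::CloudController::Diego::LRP_RUNNING` and its siblings, and the hash shape is inferred from the `reject` line in the diff above.

```ruby
# Hypothetical stand-ins for the Diego LRP state constants.
LRP_RUNNING = 'RUNNING'
LRP_STARTING = 'STARTING'

# Hypothetical wrapper: scoping the filters to one object says which
# process they describe without prefixing every method name.
class DeployingInstances
  # reported: { index => { state: String, routable: Boolean } }
  def initialize(reported)
    @reported = reported
  end

  # Instances that are starting or running.
  def healthy
    @reported.select { |_, val| [LRP_STARTING, LRP_RUNNING].include?(val[:state]) }
  end

  # Everything reported that is not healthy (e.g. crashed or down).
  def unhealthy
    @reported.reject { |index, _| healthy.key?(index) }
  end

  # Healthy instances that are running and routable.
  def routable
    healthy.select { |_, val| val[:state] == LRP_RUNNING && val[:routable] }
  end

  # Mirrors the reject shown in the diff: healthy but not yet routable.
  def non_routable
    healthy.reject { |_, val| val[:state] == LRP_RUNNING && val[:routable] }
  end
end
```

Something like `DeployingInstances.new(reported).non_routable.size` would then read as "in-flight instances of the deploying process". Whether that is clearer than prefixed method names or separate scale-up and scale-down actions is a judgment call.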


sethboyles mentioned this pull request Dec 3, 2024

Granular max in flight #4112

Closed

5 tasks

sethboyles force-pushed the granular_max_in_flight branch from c7fc97b to 0e6bbc2 December 4, 2024 00:00

Samze reviewed Dec 4, 2024

lib/cloud_controller/deployment_updater/actions/scale.rb (resolved)

Samze reviewed Dec 4, 2024

lib/cloud_controller/deployment_updater/actions/scale.rb (outdated, resolved)

Samze reviewed Dec 4, 2024

lib/cloud_controller/deployment_updater/actions/scale.rb (outdated, resolved)

Samze reviewed Dec 4, 2024

sethboyles force-pushed the granular_max_in_flight branch from 0e6bbc2 to c100b1b December 4, 2024 23:30

sethboyles force-pushed the granular_max_in_flight branch from c100b1b to 3b62205 December 5, 2024 00:13

Samze approved these changes Dec 5, 2024


sethboyles merged commit 749e6fc into main Dec 5, 2024
8 checks passed

sethboyles deleted the granular_max_in_flight branch December 5, 2024 23:45

ari-wg-gitbot added a commit to cloudfoundry/capi-release that referenced this pull request Dec 6, 2024

Bump cloud_controller_ng

9c62181

Changes in cloud_controller_ng: - Granular max-in-flight updates PR: cloudfoundry/cloud_controller_ng#4124 Author: Seth Boyles <seth.boyles@broadcom.com>
