Granular max-in-flight updates #4124

Merged
merged 1 commit into from
Dec 5, 2024


sethboyles
Member

@sethboyles sethboyles commented Dec 3, 2024

  • The scale action now compares the number of starting or running non-routable instances to max-in-flight, instead of waiting for all instances to become routable.
  • The scale action does not continue scaling up while any instances are 'unhealthy' (e.g. 'crashed', 'down'), because it is difficult to determine whether unhealthy instances belong to the 'max in flight' group.
  • The number of desired non-deploying instances is now recalculated each iteration, instead of being decremented by 'max-in-flight' without checking whether it is in a correct state. This mitigates bugs where deployment trains (continually creating new deployments before the previous one had completed) could result in the number of app instances exceeding the max-in-flight limit.
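
To make the three rules above concrete, here is a minimal sketch of how one scaling iteration might decide what to do. It is not the code from this PR: the `Instance` struct, the state strings, the method names, and the formula used for the non-deploying count are all assumptions made for illustration.

```ruby
# Hypothetical sketch; names, states, and formulas are illustrative assumptions,
# not the implementation merged in this PR.
UNHEALTHY_STATES = %w[CRASHED DOWN].freeze
IN_FLIGHT_STATES = %w[STARTING RUNNING].freeze

# One instance of the newly deploying process.
Instance = Struct.new(:state, :routable)

# Starting/running instances that are not yet routable count against max-in-flight.
def in_flight_count(instances)
  instances.count { |i| IN_FLIGHT_STATES.include?(i.state) && !i.routable }
end

# Running instances that are already receiving traffic.
def routable_count(instances)
  instances.count { |i| i.state == 'RUNNING' && i.routable }
end

def any_unhealthy?(instances)
  instances.any? { |i| UNHEALTHY_STATES.include?(i.state) }
end

# How many new deploying instances may be started in this iteration.
def instances_to_start(instances, desired_total:, max_in_flight:)
  # Do not scale up further while anything is unhealthy.
  return 0 if any_unhealthy?(instances)

  free_slots = max_in_flight - in_flight_count(instances)
  remaining = desired_total - instances.size
  [[free_slots, remaining].min, 0].max
end

# Desired count for the old (non-deploying) process, recomputed from the
# observed state each iteration rather than decremented by max_in_flight.
# The formula is an assumption: one old instance per not-yet-routable new one.
def desired_non_deploying(instances, desired_total:)
  [desired_total - routable_count(instances), 0].max
end
```

Under these assumptions, a deployment with one routable and two non-routable new instances and a max-in-flight of 4 would be allowed to start two more instances, while a single crashed instance would pause scale-up entirely.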

See this thread for some of the issues this PR aims to solve: #4101 (comment)

  • I have reviewed the contributing guide

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the main branch

  • I have run all the unit tests using bundle exec rake

  • I have run CF Acceptance Tests

  • I have run BARAS

@sethboyles sethboyles mentioned this pull request Dec 3, 2024
5 tasks
@sethboyles sethboyles force-pushed the granular_max_in_flight branch from c7fc97b to 0e6bbc2 Compare December 4, 2024 00:00
healthy_instances.reject { |_, val| val[:state] == VCAP::CloudController::Diego::LRP_RUNNING && val[:routable] }
end

def routable_instances
Contributor


thought/non-blocking: routable_instances, healthy_instances, unhealthy_instances, and reported_instances are all methods for the newly deploying process, right? I wonder if there's any way we can be more explicit about that, since technically those methods could also apply to the older web process. We could prefix them with new or deploying or something.

Member Author


That's a good point. I do worry the prefixes might hurt readability if every method is called deploying_abc_instances or something.

Maybe an alternative is to separate the scale down and scale up into separate actions entirely? I think their logic is fairly disentangled at this point.
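
As a rough illustration of that alternative, the sketch below splits the work into two classes, so that each one only ever holds one process's instances and the unprefixed method names stay unambiguous. The class names `ScaleUp` and `ScaleDown`, the `InstanceQueries` module, and the `'RUNNING'` string (a stand-in for the running-state constant seen in the diff) are all hypothetical and not taken from the PR.

```ruby
# Hypothetical sketch; class, module, and constant names are assumptions.
RUNNING = 'RUNNING' # stand-in for the running-state constant used in the diff

# Shared queries over a hash of instance index => { state:, routable: }.
module InstanceQueries
  def routable_instances
    instances.select { |_, val| val[:state] == RUNNING && val[:routable] }
  end

  def non_routable_instances
    instances.reject { |_, val| val[:state] == RUNNING && val[:routable] }
  end
end

# Hypothetical action that only ever sees the newly deploying process.
class ScaleUp
  include InstanceQueries
  attr_reader :instances

  def initialize(deploying_instances)
    @instances = deploying_instances
  end
end

# Hypothetical action that only ever sees the older web process.
class ScaleDown
  include InstanceQueries
  attr_reader :instances

  def initialize(original_instances)
    @instances = original_instances
  end
end
```

With this shape, `ScaleUp.new(...).routable_instances` can only refer to the deploying process, so no `deploying_` prefix is needed at the call site.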

@sethboyles sethboyles force-pushed the granular_max_in_flight branch from 0e6bbc2 to c100b1b Compare December 4, 2024 23:30
@sethboyles sethboyles force-pushed the granular_max_in_flight branch from c100b1b to 3b62205 Compare December 5, 2024 00:13
@sethboyles sethboyles merged commit 749e6fc into main Dec 5, 2024
8 checks passed
@sethboyles sethboyles deleted the granular_max_in_flight branch December 5, 2024 23:45
ari-wg-gitbot added a commit to cloudfoundry/capi-release that referenced this pull request Dec 6, 2024
Changes in cloud_controller_ng:

- Granular max-in-flight updates
    PR: cloudfoundry/cloud_controller_ng#4124
    Author: Seth Boyles <seth.boyles@broadcom.com>