Sync with upstream 1.22 #121

himanshu-kun · 2022-05-16T13:00:47Z

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #123

Special notes for your reviewer:

CA integration tests (IT) are running locally.
SYNC_CHANGES will be updated.

Release note:

Sync the changes up to v1.22.0 of the upstream autoscaler.

This change adds 4 metrics that can be used to monitor the minimum and maximum limits for CPU and memory, as well as the current counts in cores and bytes, respectively. The four metrics added are:

* `cluster_autoscaler_cpu_limits_cores`
* `cluster_autoscaler_cluster_cpu_current_cores`
* `cluster_autoscaler_memory_limits_bytes`
* `cluster_autoscaler_cluster_memory_current_bytes`

This change also adds the `max_cores_total` metric to the metrics proposal doc, as it was previously not recorded there.

User story: As a cluster autoscaler user, I would like to monitor my cluster through metrics to determine when the cluster is nearing its limits for cores and memory usage.
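As a rough illustration of the user story, the sketch below compares a "current" value against its maximum limit and flags when usage crosses a threshold. It does not scrape Prometheus; the `resourceGauge` type, the `nearLimit` helper, the sample values, and the 80% threshold are all hypothetical and not part of the upstream change.

```go
package main

import "fmt"

// resourceGauge is a hypothetical holder for one sample of a "current"
// metric together with the corresponding maximum limit.
type resourceGauge struct {
	name    string
	current float64
	max     float64
}

// nearLimit reports whether current usage has reached the given fraction
// of the maximum. A non-positive max is treated as "no limit configured".
func nearLimit(g resourceGauge, threshold float64) bool {
	if g.max <= 0 {
		return false
	}
	return g.current/g.max >= threshold
}

func main() {
	// Made-up sample values standing in for scraped metric values.
	samples := []resourceGauge{
		{name: "cpu (cores)", current: 450, max: 500},
		{name: "memory (bytes)", current: 1 << 40, max: 4 << 40},
	}
	for _, s := range samples {
		fmt.Printf("%s: near limit = %v\n", s.name, nearLimit(s, 0.8))
	}
}
```

In practice this kind of check would more likely live in an alerting rule than in Go code; the sketch only shows the comparison the metrics make possible.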

Scaling to zero is now supported by magnum: https://review.opendev.org/c/openstack/magnum/+/737580/ If using node group autodiscovery, older versions of magnum will still forbid scaling to zero or setting the minimum node count to zero.

Force-refreshing everything on every DeleteNodes call causes slowdowns and throttling on large clusters with many ASGs (and a lot of activity). That function might be called several times in a row during scale-down (once for each ASG that has a node to be removed). Each time, the forced refresh re-discovers all ASGs and all LaunchConfigurations, then re-lists all instances from the discovered ASGs. That immediate refresh isn't required anyway, as the cache's concrete DeleteInstances implementation decrements the node group size, and we can schedule a grouped refresh for the next loop iteration.
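The idea can be sketched as follows: each delete adjusts the cached size and only marks the cache as stale, and a single refresh runs at the start of the next loop iteration. This is a simplified illustration, not the upstream AWS provider code; `asgCache`, `deleteInstances`, and `refreshIfNeeded` are hypothetical names.

```go
package main

import "fmt"

// asgCache is a hypothetical stand-in for the provider's ASG cache.
type asgCache struct {
	sizes        map[string]int // node group name -> cached target size
	needsRefresh bool           // set when a full refresh is due on the next loop
	refreshCount int            // number of full refreshes performed (illustration only)
}

// deleteInstances decrements the cached size right away and defers the
// expensive full refresh instead of running it on every call.
func (c *asgCache) deleteInstances(group string, n int) {
	c.sizes[group] -= n
	c.needsRefresh = true
}

// refreshIfNeeded is meant to run once per loop iteration; it performs at
// most one full refresh no matter how many deletes happened before it.
func (c *asgCache) refreshIfNeeded() {
	if !c.needsRefresh {
		return
	}
	c.refreshCount++ // a real implementation would re-list ASGs and instances here
	c.needsRefresh = false
}

func main() {
	c := &asgCache{sizes: map[string]int{"asg-a": 5, "asg-b": 3}}
	c.deleteInstances("asg-a", 1) // scale-down touching two ASGs in a row
	c.deleteInstances("asg-b", 1)
	c.refreshIfNeeded() // next loop iteration: one grouped refresh
	fmt.Println(c.sizes["asg-a"], c.sizes["asg-b"], c.refreshCount)
}
```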

Sets the `kubernetes.io/arch` (and legacy `beta.kubernetes.io/arch`) labels to the proper instance architecture. While at it, regenerate the instance types list (adding new instance types that were missing).
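A minimal sketch of what setting both labels could look like. The label keys come from the note above; the `archLabels` helper, the assumption that the instance architecture arrives as `"arm64"` or `"x86_64"`, and the fallback to `amd64` are illustrative only and not taken from the upstream code.

```go
package main

import "fmt"

const (
	archLabel       = "kubernetes.io/arch"
	legacyArchLabel = "beta.kubernetes.io/arch"
)

// archLabels returns both architecture labels for a node template.
// Assumption: anything other than "arm64" is treated as amd64.
func archLabels(instanceArch string) map[string]string {
	arch := "amd64"
	if instanceArch == "arm64" {
		arch = "arm64"
	}
	return map[string]string{
		archLabel:       arch,
		legacyArchLabel: arch,
	}
}

func main() {
	fmt.Println(archLabels("arm64")[archLabel])
	fmt.Println(archLabels("x86_64")[legacyArchLabel])
}
```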

…WNERS

The current implementation assumes MIG ids have the "https://content.googleapis.com" prefix, while the canonical id format seems to begin with "https://www.googleapis.com". Both formats work while talking to the GCE API, but the API returns the latter and other GCP services seem to assume it as well.
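To make the difference between the two formats concrete, here is a small sketch that rewrites the old prefix to the canonical one. The two prefixes are the ones quoted above; the `canonicalMigID` helper and the example resource path are hypothetical and do not reflect how the upstream change is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	legacyPrefix    = "https://content.googleapis.com"
	canonicalPrefix = "https://www.googleapis.com"
)

// canonicalMigID rewrites an id that uses the legacy prefix so that it
// uses the canonical one; ids in any other form are returned unchanged.
func canonicalMigID(id string) string {
	if strings.HasPrefix(id, legacyPrefix) {
		return canonicalPrefix + strings.TrimPrefix(id, legacyPrefix)
	}
	return id
}

func main() {
	// The path below is a made-up example, not a real resource.
	id := legacyPrefix + "/compute/v1/projects/p/zones/z/instanceGroups/g"
	fmt.Println(canonicalMigID(id))
}
```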

Cluster Autoscaler GCE: change the format of MIG id

…ycloud-provider cloudprovider: add Bizflycloud provider

Remove vivekbagade, add towca as an approver in cluster-autoscaler/OWNERS

Enable magnum provider scale to zero

…piles aws: Don't pile up successive full refreshes during AWS scaledowns

Release leader election lock on shutdown

FetchAllMigs (unfiltered InstanceGroupManagers.List) is costly as it's not bounded to MIGs attached to the current cluster, but rather lists all MIGs in the project/zone, and therefore equally affects all clusters in that project/zone. Running the calls concurrently over the region's zones (so at most 4 concurrent API calls, about once per minute) contains that impact.

findMigsInRegion might be scoped to the current cluster (name pattern), but also benefits from the same improvement, as it's also costly and called at each refreshInterval (1 minute).

Also: we're calling the GCE mig.Get() API again for each MIG (at ~300ms per API call, in my tests), sequentially and with the global cache lock held (when updateClusterState -> ... -> GetMigForInstance kicks in). Yet we already get that bit of information (the MIG's basename) from any other mig.Get or mig.List call, like the one fetching target sizes. Leveraging this helps significantly on large fleets (for instance, this shaves 8 minutes off startup time on the large cluster I tested on).
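The per-zone fan-out can be sketched with one goroutine per zone and a mutex guarding the merged result. This is only an illustration of the pattern under stated assumptions: `listFunc` and `listAllZones` are hypothetical names, and the fake lister in `main` stands in for the real GCE list call.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// listFunc is a hypothetical stand-in for a per-zone MIG list call.
type listFunc func(zone string) ([]string, error)

// listAllZones calls list once per zone concurrently and merges the
// results. It returns the first error recorded, if any.
func listAllZones(zones []string, list listFunc) ([]string, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		all      []string
		firstErr error
	)
	for _, zone := range zones {
		wg.Add(1)
		go func(z string) {
			defer wg.Done()
			migs, err := list(z) // the slow API call runs outside the lock
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			all = append(all, migs...)
		}(zone)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	sort.Strings(all) // goroutine completion order is not deterministic
	return all, nil
}

func main() {
	fake := func(zone string) ([]string, error) {
		return []string{zone + "/mig-1", zone + "/mig-2"}, nil
	}
	migs, err := listAllZones([]string{"zone-b", "zone-a"}, fake)
	fmt.Println(migs, err)
}
```

Because a region has only a handful of zones, one goroutine per zone keeps the concurrency naturally bounded without a worker pool.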

…locatables support "/"separators in custom allocatable overrides via vmss tags

add stable zone labels in azure template generation

Document that TLS bootstrapping may be necessary for scale-up

Enable custom k8s fork in update-vendor.sh

Right now the file is breaking `go mod` commands.

…tream-1.22

ashwani2k

Thanks @himanshu-kun for the PR.
Looks good to me.

mcristina422 and others added 30 commits March 12, 2021 12:51

Release leader election lock on shutdown

4cf9a98

Enable magnum provider scale to zero

d103b70

Now supported by magnum. https://review.opendev.org/c/openstack/magnum/+/737580/ If using node group autodiscovery, older versions of magnum will still forbid scaling to zero or setting the minimum node count to zero.

Fix/dependencies

71353a6

Fix/Provider name

b57ba6e

Fix/Add bizflycloud package in skipped_dirs

90bd1eb

Add Bizfly Cloud provider to README

dd8005d

Update license for Bizfly Cloud dependencies

e959385

add required api resources to hetzner cluster-autoscaler example

ec2676b

aws: support arm64 instances

3ffe4b3

Sets the `kubernetes.io/arch` (and legacy `beta.kubernetes.io/arch`) labels to the proper instance architecture. While at it, regenerate the instance types list (adding new instance types that were missing).

Cluster Autoscaler: remove vivekbagade, add towca as an approver in O…

249a728

…WNERS

Merge pull request kubernetes#4047 from towca/jtuznik/mig-id

3c28030

Cluster Autoscaler GCE: change the format of MIG id

Merge pull request kubernetes#4009 from bizflycloud/bizflycloud/bizfl…

1330ab1

…ycloud-provider cloudprovider: add Bizflycloud provider

Merge pull request kubernetes#4040 from towca/jtuznik/owner

89b2373

Remove vivekbagade, add towca as an approver in cluster-autoscaler/OWNERS

Merge pull request kubernetes#3995 from tghartland/magnum-scale-to-zero

35b8e30

Enable magnum provider scale to zero

Merge pull request kubernetes#3797 from DataDog/aws-not-refreshes-dog…

6c4101b

…piles aws: Don't pile up successive full refreshes during AWS scaledowns

Merge pull request kubernetes#3940 from mcristina422/patch-1

200415e

Release leader election lock on shutdown

support separators in custom allocatable overrides via vmss tags

3e53369

Merge pull request kubernetes#4056 from marwanad/support-separator-al…

c6d4535

…locatables support "/"separators in custom allocatable overrides via vmss tags

add stable zone labels in azure template generation

dda7db0

Merge pull request kubernetes#4061 from marwanad/stable-zone-labels

67dc894

add stable zone labels in azure template generation

Document that TLS bootstrapping may be necessary for scale-up

1b0aa0c

Merge pull request kubernetes#4067 from dharmab/scale-up-q

b70dce3

Document that TLS bootstrapping may be necessary for scale-up

Enable custom k8s fork in update-vendor.sh

23b4329

Merge pull request kubernetes#4023 from BigDarkClown/update-vendor-fork

2e6ccac

Enable custom k8s fork in update-vendor.sh

Replace package satori/go.uuid for cloudprovider ionoscloud

a1577ef

BizFly: remove go.mod from the inlined "gobizfly" client

a5d2700

Right now the file is breaking `go mod` commands.

gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 24, 2022

Merge branch 'machine-controller-manager-provider' into sync-with-ups…

53c9f33

…tream-1.22

gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 26, 2022

gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 26, 2022

gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 27, 2022

himanshu-kun force-pushed the sync-with-upstream-1.22 branch from f71030b to e6dc0af Compare May 27, 2022 08:42

gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 27, 2022

himanshu-kun added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 30, 2022

gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 30, 2022

himanshu-kun added 2 commits May 30, 2022 18:31

update FAQ to ignore integration package for testing

50ee5e8

update ci files to GO111MODULE=off

03d4e3d

himanshu-kun force-pushed the sync-with-upstream-1.22 branch from 8487624 to 03d4e3d Compare May 30, 2022 13:05

gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022

gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022

update more ci files to GO111MODULE=off

0a846c4

gardener-robot-ci-2 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels May 30, 2022

SYNC_CHANGES updated for v1.22 release

0c0c195

gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022

himanshu-kun requested a review from ashwani2k May 30, 2022 13:27

gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label May 30, 2022

ashwani2k approved these changes Jun 1, 2022


himanshu-kun merged commit b000e84 into gardener:machine-controller-manager-provider Jun 1, 2022

gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jun 1, 2022

himanshu-kun deleted the sync-with-upstream-1.22 branch June 1, 2022 08:43
