Commit 0c0214a

kishen-v authored and Karthik-K-N committed
Fix KEP doc for node-resize.
1 parent e06e2f6 commit 0c0214a

File tree

1 file changed

  • keps/sig-node/3953-dynamic-node-resize

keps/sig-node/3953-dynamic-node-resize/README.md

Lines changed: 116 additions & 72 deletions
@@ -14,30 +14,30 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Release Signoff Checklist](#release-signoff-checklist)
 - [Summary](#summary)
 - [Motivation](#motivation)
-  - [Goals](#goals)
-  - [Non-Goals](#non-goals)
+  - [Goals](#goals)
+  - [Non-Goals](#non-goals)
 - [Proposal](#proposal)
-  - [User Stories (Optional)](#user-stories-optional)
-    - [Story 1](#story-1)
-    - [Story 2](#story-2)
-  - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
-  - [Risks and Mitigations](#risks-and-mitigations)
+  - [User Stories (Optional)](#user-stories-optional)
+    - [Story 1](#story-1)
+    - [Story 2](#story-2)
+  - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
+  - [Risks and Mitigations](#risks-and-mitigations)
 - [Design Details](#design-details)
-  - [Test Plan](#test-plan)
-    - [Prerequisite testing updates](#prerequisite-testing-updates)
-    - [Unit tests](#unit-tests)
-    - [Integration tests](#integration-tests)
-    - [e2e tests](#e2e-tests)
-  - [Graduation Criteria](#graduation-criteria)
-  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
-  - [Version Skew Strategy](#version-skew-strategy)
+  - [Test Plan](#test-plan)
+    - [Prerequisite testing updates](#prerequisite-testing-updates)
+    - [Unit tests](#unit-tests)
+    - [Integration tests](#integration-tests)
+    - [e2e tests](#e2e-tests)
+  - [Graduation Criteria](#graduation-criteria)
+  - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy)
+  - [Version Skew Strategy](#version-skew-strategy)
 - [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
-  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
-  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
-  - [Monitoring Requirements](#monitoring-requirements)
-  - [Dependencies](#dependencies)
-  - [Scalability](#scalability)
-  - [Troubleshooting](#troubleshooting)
+  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
+  - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning)
+  - [Monitoring Requirements](#monitoring-requirements)
+  - [Dependencies](#dependencies)
+  - [Scalability](#scalability)
+  - [Troubleshooting](#troubleshooting)
 - [Implementation History](#implementation-history)
 - [Drawbacks](#drawbacks)
 - [Alternatives](#alternatives)
@@ -74,52 +74,49 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

 ## Summary

-This proposal aims at enabling dynamic node resizing. This will help in resizing cluster resource capacity by just updating resources of nodes rather than adding new node or removing existing node and
-also enable node configurations to be reflected at the node and cluster levels automatically without the need to manually resetting the kubelet
+This proposal aims at enabling dynamic node resizing. This will help in updating cluster resource capacity by resizing the compute resources of existing nodes rather than adding nodes to, or removing nodes from, a cluster.
+The updated node configuration is reflected at the node and cluster levels automatically, without the need to restart the kubelet.

-This proposal also aims to improvise the initialisation and reinitialisation of resource managers like cpu manager, memory manager with the dynamic change in machine's CPU and memory configurations.
+This proposal also aims to improve the initialization and reinitialization of resource managers, such as the CPU manager and memory manager, in response to changes in a node's CPU and memory configuration.

 ## Motivation
-In a typical Kubernetes environment, the cluster resources may need to be altered because of various reasons like
-- Incorrect resource assignment while creating a cluster.
-- Workload on cluster is increased over time and leading to add more resources to cluster.
-- Workload on cluster is decreased over time and leading to resources under utilization.
+In a typical Kubernetes environment, cluster resources may need to be altered for reasons such as:
+- Incorrect resource assignment during cluster creation.
+- Increased workload over time, leading to the need for additional resources in the cluster.
+- Decreased workload over time, leading to resource underutilization in the cluster.

-To handle these scenarios currently we can
-- Horizontally scale up or down cluster by the addition or removal of compute nodes
-- Vertically scale up or down cluster by increasing or decreasing the node’s capacity, but the current workaround for the node resize to be captured by the cluster is only by the means of restarting Kubelet.
+To handle these scenarios, we can:
+- Horizontally scale the cluster up or down by adding or removing compute nodes.
+- Vertically scale the cluster up or down by increasing or decreasing node capacity. However, the only way to have a node resize captured by the cluster today is to restart the kubelet.

-The dynamic node resize will give advantages in case of scenarios like
-- Handling the resource demand with limited set of machines by increasing the capacity of existing machines rather than creating new ones.
-- Creating/Deleting new machine takes more time when compared to increasing/decreasing the capacity of existing ones.
+Dynamic node resizing will provide advantages in scenarios such as:
+- Handling resource demand with a limited set of nodes by increasing the capacity of existing nodes instead of creating new ones.
+- Creating or deleting nodes takes more time than increasing or decreasing the capacity of existing ones.

 ### Goals

-* Dynamically resize the node without restarting the kubelet
-* Add ability to reinitialize resource managers(cpu manager, memory manager) to adopt changes in machine resource
-
+* Dynamically resize the node without restarting the kubelet.
+* Add the ability to reinitialize resource managers (CPU manager, memory manager) to adopt changes in the node's resources.

 ### Non-Goals

 * Update the autoscaler to utilize dynamic node resize.

 ## Proposal

-This KEP adds a polling mechanism in kubelet to fetch the machine-info using cadvisor, The information will be fetched repeatedly based on configured time interval.
-Later node status updater will take care of updating this information at node level.
+This KEP adds a polling mechanism in the kubelet to fetch the machine info using cAdvisor. The information is fetched periodically at a configured interval, after which the node status updater is responsible for updating this information at the node level in the cluster.

-This KEP also improvises the resource managers like memory manager, cpu manager initialization and reinitialization so that these resource managers will
-adapt to the dynamic change in machine configurations.
+Additionally, this KEP aims to improve the initialization and reinitialization of resource managers, such as the memory manager and CPU manager, so that they can adapt to changes in the node's configuration.

 ### User Stories (Optional)

 #### Story 1

-As a cluster admin, I want to increase the cluster resource capacity without adding a new node to the cluster.
+As a cluster admin, I must be able to increase the cluster resource capacity without adding a new node to the cluster.

 #### Story 2

-As a cluster admin, I want to decrease the cluster resource capacity without removing an existing node from the cluster.
+As a cluster admin, I must be able to decrease the cluster resource capacity without removing an existing node from the cluster.

 ### Notes/Constraints/Caveats (Optional)

@@ -148,12 +145,12 @@ Consider including folks who also work outside the SIG or subproject.

 ## Design Details

-Below diagram is shows the interaction between kubelet and cadvisor
+The diagram below shows the interaction between the kubelet and cAdvisor.

 ```
 +----------+ +-----------+ +-----------+ +--------------+
 | | | | | | | |
-| node | | kubelet | | cadvisor | | machine-info |
+| node | | kubelet | | cAdvisor | | machine-info |
 | | | | | | | |
 +----+-----+ +-----+-----+ +-----+-----+ +-------+------+
 | | | |
@@ -177,7 +174,7 @@ Below diagram is shows the interaction between kubelet and cadvisor
 | node status update | | |
 |<-------------------------------| | |
 | | | |
-| | | |
+| if shrink in resource | | |
 | re-run pod admission | | |
 |<-------------------------------| | | |
 | | | |
@@ -188,14 +185,76 @@ Below diagram is shows the interaction between kubelet and cadvisor
 ```

 The interaction sequence is as follows
-1. Kubelet will be polling cadvisor with interval of configured time like one minute to fetch the machine resource information
-2. Cadvisor will fetch and update the machine resource information
-3. kubelet cache will be updated with the latest machine resource information
-4. node status updater will update the node's status with new resource information
-5. In case of shrink in cluster resources will re-run the pod admission to evict pods which lack resources
-6. kubelet will reinitialize the resource managers to keep them up to date with dynamic resource changes
+1. The kubelet polls cAdvisor at a configured interval (for example, one minute) to fetch the machine resource information.
+2. cAdvisor fetches and updates the machine resource information.
+3. The kubelet's cache is updated with the latest machine resource information.
+4. The node status updater updates the node's status with the latest resource information.
+5. In case of a shrink in cluster resources, the kubelet re-runs pod admission, which evicts pods that can no longer be accommodated.
+6. The kubelet reinitializes the resource managers to keep them up to date with dynamic resource changes.
+
+Note: In case of an increase in cluster resources, the scheduler will automatically schedule any pending pods.
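The shrink check in step 5 can be sketched as a small, self-contained comparison on the polled machine info. `MachineInfo` here is a simplified stand-in for the cAdvisor type, keeping only the fields this KEP compares; it is illustrative, not the real API:

```go
package main

import "fmt"

// MachineInfo is a simplified stand-in for the cAdvisor machine info
// compared in the sequence above (illustrative only).
type MachineInfo struct {
	NumCores       int
	MemoryCapacity uint64
}

// shrunk reports whether the freshly polled machine info represents a
// reduction in CPU or memory relative to the cached value; only then
// does the kubelet need to re-run pod admission (step 5).
func shrunk(cached, latest MachineInfo) bool {
	return latest.NumCores < cached.NumCores ||
		latest.MemoryCapacity < cached.MemoryCapacity
}

func main() {
	cached := MachineInfo{NumCores: 8, MemoryCapacity: 32 << 30}
	fmt.Println(shrunk(cached, MachineInfo{NumCores: 16, MemoryCapacity: 32 << 30})) // false: growth, scheduler handles pending pods
	fmt.Println(shrunk(cached, MachineInfo{NumCores: 4, MemoryCapacity: 32 << 30}))  // true: shrink, re-run pod admission
}
```

On growth the kubelet does nothing special: pending pods are picked up by the scheduler once the larger capacity is reported.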
+
+**Kubelet Configuration changes**
+
+A new boolean field named `DynamicNodeResize` will be added to the kubelet configuration; it will be false by default.
+Users need to enable it to make use of dynamic node resize.
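If the flag is wired through the KubeletConfiguration API, enabling it might look like the fragment below. The camelCase serialization and placement are assumptions based on the proposal, not a committed API:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Hypothetical field proposed by this KEP; defaults to false.
dynamicNodeResize: true
```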
+
+**Proposed Code changes**
+
+**Dynamic Node resize and Pod Re-admission logic**
+
+```go
+if kl.kubeletConfiguration.DynamicNodeResize {
+    // Handle the node dynamic resize
+    machineInfo, err := kl.cadvisor.MachineInfo()
+    if err != nil {
+        klog.ErrorS(err, "Error fetching machine info")
+    } else {
+        cachedMachineInfo, _ := kl.GetCachedMachineInfo()
+
+        if !reflect.DeepEqual(cachedMachineInfo, machineInfo) {
+            kl.setCachedMachineInfo(machineInfo)
+
+            // Resync the resource managers
+            if err := kl.ResyncComponents(machineInfo); err != nil {
+                klog.ErrorS(err, "Error resyncing the kubelet components with machine info")
+            }
+
+            // Rerun pod admission only in case of a shrink in cluster resources
+            if machineInfo.NumCores < cachedMachineInfo.NumCores || machineInfo.MemoryCapacity < cachedMachineInfo.MemoryCapacity {
+                klog.InfoS("Observed shrink in node resources, rerunning pod admission")
+                kl.HandlePodAdditions(activePods)
+            }
+        }
+    }
+}
+```
+
+**Changes to resource managers to adapt to dynamic resize**
+
+1. Add a ResyncComponents() method to the ContainerManager interface:
+```go
+// Manages the containers running on a machine.
+type ContainerManager interface {
+    ...
+    // ResyncComponents will resync the resource managers, such as the CPU,
+    // memory and topology managers, with the updated machineInfo.
+    ResyncComponents(machineInfo *cadvisorapi.MachineInfo) error
+    ...
+}
+```
+
+2. Add a Sync method to all the resource managers; it will be invoked whenever there is a dynamic resource change.
+
+```go
+// Sync will sync the CPU Manager with the latest machine info.
+Sync(machineInfo *cadvisorapi.MachineInfo) error
+```
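The two additions above can be sketched together as a minimal fan-out: `ResyncComponents` walks every resource manager and calls its `Sync`. All type and field names below are illustrative stand-ins, not the real kubelet or cadvisorapi types:

```go
package main

import "fmt"

// MachineInfo is a simplified stand-in for cadvisorapi.MachineInfo
// (illustrative only).
type MachineInfo struct {
	NumCores       int
	MemoryCapacity uint64
}

// resourceManager is a hypothetical common interface: each resource
// manager (CPU, memory, topology) gains the proposed Sync method.
type resourceManager interface {
	Sync(machineInfo *MachineInfo) error
}

// cpuManager and memoryManager are toy managers that rebuild their
// internal view of the machine from the new MachineInfo.
type cpuManager struct{ cores int }

func (m *cpuManager) Sync(mi *MachineInfo) error {
	m.cores = mi.NumCores
	return nil
}

type memoryManager struct{ capacity uint64 }

func (m *memoryManager) Sync(mi *MachineInfo) error {
	m.capacity = mi.MemoryCapacity
	return nil
}

// containerManager fans ResyncComponents out to every manager, stopping
// at the first error, mirroring the interface addition above.
type containerManager struct{ managers []resourceManager }

func (cm *containerManager) ResyncComponents(mi *MachineInfo) error {
	for _, m := range cm.managers {
		if err := m.Sync(mi); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	cpu := &cpuManager{cores: 8}
	mem := &memoryManager{capacity: 32 << 30}
	cm := &containerManager{managers: []resourceManager{cpu, mem}}

	if err := cm.ResyncComponents(&MachineInfo{NumCores: 16, MemoryCapacity: 64 << 30}); err != nil {
		panic(err)
	}
	fmt.Println(cpu.cores) // 16
}
```

Failing fast on the first Sync error keeps the managers' views consistent: a partially applied resync would be retried on the next polling cycle anyway.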

-Note: In case of increase in cluster resources scheduler will automatically schedule any pending pods
+
+Note: PoC code changes: https://github.com/kubernetes/kubernetes/pull/115755

 ### Test Plan

@@ -212,26 +271,11 @@ implementing this enhancement to ensure the enhancements have also solid foundat

 ##### Unit tests

-<!--
-In principle every added code should have complete unit test coverage, so providing
-the exact set of tests will not bring additional value.
-However, if complete unit test coverage is not possible, explain the reason of it
-together with explanation why this is acceptable.
--->
-
-<!--
-Additionally, for Alpha try to enumerate the core package you will be touching
-to implement this enhancement and provide the current unit coverage for those
-in the form of:
-- <package>: <date> - <current test coverage>
-The data can be easily read from:
-https://testgrid.k8s.io/sig-testing-canaries#ci-kubernetes-coverage-unit
-
-This can inform certain test coverage improvements that we want to do before
-extending the production code to implement this enhancement.
--->
+1. Add necessary tests in kubelet_node_status_test.go to check the node status behaviour with dynamic node resize.
+2. Add necessary tests in kubelet_pods_test.go to check the pod cleanup and pod addition workflow.
+3. Add necessary tests in eventhandlers_test.go to check the scheduler behaviour with dynamic node capacity changes.
+4. Add necessary tests in the resource managers to check the managers' behaviour when adapting to dynamic node capacity changes.

-- `<package>`: `<date>` - `<test coverage>`

 ##### Integration tests
