KEP-1972: kubelet exec probe timeouts
Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>
1 parent 1e80263, commit 95a36f6
Showing 2 changed files with 144 additions and 0 deletions.
108 changes: 108 additions & 0 deletions
keps/sig-node/1972-kubelet-exec-probe-timeouts/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
# KEP-1972: Kubelet Exec Probe Timeouts | ||
|
||
<!-- toc --> | ||
- [Release Signoff Checklist](#release-signoff-checklist) | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Design Details](#design-details) | ||
- [Test Plan](#test-plan) | ||
- [Graduation Criteria](#graduation-criteria) | ||
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
- [Version Skew Strategy](#version-skew-strategy) | ||
- [Implementation History](#implementation-history) | ||
- [Drawbacks](#drawbacks) | ||
- [Alternatives](#alternatives) | ||
<!-- /toc --> | ||
|
||
## Release Signoff Checklist | ||
|
||
Items marked with (R) are required *prior to targeting to a milestone / release*. | ||
|
||
- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) | ||
- [X] (R) KEP approvers have approved the KEP status as `implementable` | ||
- [X] (R) Design details are appropriately documented | ||
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input | ||
- [X] (R) Graduation criteria is in place | ||
- [ ] (R) Production readiness review completed | ||
- [ ] Production readiness review approved | ||
- [ ] "Implementation History" section is up-to-date for milestone | ||
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] | ||
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes | ||
|
||
[kubernetes.io]: https://kubernetes.io/ | ||
[kubernetes/enhancements]: https://git.k8s.io/enhancements | ||
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes | ||
[kubernetes/website]: https://git.k8s.io/website | ||
|
||
## Summary | ||
|
||
The kubelet does not currently enforce exec probe timeouts. This is a bug that should be fixed, because | ||
the timeout value is part of the Container Probe API. Because exec probe timeouts | ||
were never enforced by the kubelet, a new feature gate, `ExecProbeTimeouts`, will be introduced | ||
so that users have an easy way to restore the old behavior if the newly enforced timeout results | ||
in unexpected behavior. | ||
|
||
## Motivation | ||
|
||
Kubelet not respecting the probe timeout is a bug and should be fixed. | ||
|
||
### Goals | ||
|
||
* fix exec probe timeouts in kubelet | ||
|
||
### Non-Goals | ||
|
||
* ensuring exec processes that timed out have been killed by kubelet. | ||
|
||
## Proposal | ||
|
||
### Risks and Mitigations | ||
|
||
* existing workloads on Kubernetes that relied on this bug may unexpectedly see their probes time out. Disabling the `ExecProbeTimeouts` feature gate restores the previous behavior. | ||
|
||
## Design Details | ||
|
||
Changes to kubelet: | ||
* Ensure kubelet handles timeout errors and registers them as failing probes. | ||
* Add feature gate `ExecProbeTimeouts` that is GA and on by default. | ||
* If the feature gate `ExecProbeTimeouts` is disabled and an exec probe timeout is reached, add warning logs to inform users that exec probes are timing out. | ||
* Re-enable existing exec liveness probe e2e test. | ||
* Add new exec readiness probe e2e test. | ||
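
The handling described above can be sketched roughly as follows. This is a minimal illustration, not the kubelet's actual code: `ProbeResult`, `ExecFunc`, `runExecProbe`, `hangingCmd`, `quickCmd`, and `demoTimeout` are hypothetical names, the feature gate is modeled as a plain boolean, and reporting success when the gate is disabled is an assumption made here to mimic the old behavior of ignoring the timeout.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"time"
)

// ProbeResult is a hypothetical stand-in for the kubelet's probe result type.
type ProbeResult string

const (
	Success ProbeResult = "success"
	Failure ProbeResult = "failure"
)

// demoTimeout plays the role of the probe's configured timeout,
// shortened so the demo finishes quickly.
const demoTimeout = 50 * time.Millisecond

// ExecFunc is a hypothetical abstraction over "run a command in the container".
type ExecFunc func(ctx context.Context) error

// runExecProbe runs cmd with a deadline and maps the outcome to a probe result.
// gateEnabled models the ExecProbeTimeouts feature gate.
func runExecProbe(gateEnabled bool, timeout time.Duration, cmd ExecFunc) ProbeResult {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	err := cmd(ctx)
	switch {
	case err == nil:
		return Success
	case errors.Is(err, context.DeadlineExceeded):
		if gateEnabled {
			// Gate on: a timeout is registered as a failing probe.
			return Failure
		}
		// Gate off: only warn. Returning Success is an assumption of this sketch.
		log.Printf("warning: exec probe exceeded its %v timeout, but ExecProbeTimeouts is disabled", timeout)
		return Success
	default:
		// Any other error (for example a non-zero exit) fails the probe.
		return Failure
	}
}

// hangingCmd simulates a command that never finishes on its own.
func hangingCmd(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// quickCmd simulates a command that exits successfully right away.
func quickCmd(ctx context.Context) error { return nil }

func main() {
	fmt.Println(runExecProbe(true, demoTimeout, hangingCmd))  // failure
	fmt.Println(runExecProbe(false, demoTimeout, hangingCmd)) // success (warning logged)
	fmt.Println(runExecProbe(true, demoTimeout, quickCmd))    // success
}
```

In line with the non-goals, the sketch only decides the probe result; it makes no attempt to ensure that a timed-out process is actually killed.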
|
||
### Test Plan | ||
|
||
E2E tests: | ||
* re-enable [existing exec liveness probe e2e test](https://github.com/kubernetes/kubernetes/blob/ea1458550077bdf3b26ac34551a3591d280fe1f5/test/e2e/common/container_probe.go#L210-L227) that is currently being skipped | ||
* add new exec readiness probe e2e test. | ||
|
||
### Graduation Criteria | ||
|
||
This is a bug fix, so the feature gate will be GA and on by default from the start. | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
N/A | ||
|
||
### Version Skew Strategy | ||
|
||
N/A | ||
|
||
## Implementation History | ||
|
||
* 2020-09-08 - the KEP was merged as implementable for v1.20 | ||
|
||
## Drawbacks | ||
|
||
* Existing workloads may depend on the fact that exec probe timeouts were never respected. Introducing | ||
the timeout now may result in unexpected behavior for some workloads. | ||
|
||
## Alternatives | ||
|
||
Some alternatives that were considered: | ||
1. Increasing the default timeout for exec probes | ||
2. Continuing to ignore the exec probe timeout | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
title: Kubelet Exec Probe Timeouts | ||
kep-number: 1972 | ||
authors: | ||
- "@andrewsykim" | ||
- "@SergeyKanzhelev" | ||
owning-sig: sig-node | ||
participating-sigs: | ||
status: implementable | ||
creation-date: 2020-09-08 | ||
reviewers: | ||
- "@dchen1107" | ||
- "@derekwaynecarr" | ||
approvers: | ||
- "@dchen1107" | ||
- "@derekwaynecarr" | ||
|
||
# The target maturity stage in the current dev cycle for this KEP. | ||
stage: stable | ||
|
||
# The most recent milestone for which work toward delivery of this KEP has been | ||
# done. This can be the current (upcoming) milestone, if it is being actively | ||
# worked on. | ||
latest-milestone: "v1.20" | ||
|
||
# The milestone at which this feature was, or is targeted to be, at each stage. | ||
milestone: | ||
stable: "v1.20" | ||
|
||
# The following PRR answers are required at alpha release | ||
# List the feature gate name and the components for which it must be enabled | ||
feature-gates: | ||
- name: ExecProbeTimeouts | ||
components: | ||
- kubelet | ||
disable-supported: true | ||
|