Skip to content

Commit

Permalink
corev1.PodSucceeded alone should always mean "static pod pending"
Browse files Browse the repository at this point in the history
the following conditions:
- a) pod.Status.Phase == PodSucceeded
- b) pod.Status.Condition[type==corev1.PodReady]: status ==
corev1.ConditionTrue

if kubelet does not update the Pod status (a and b) atomically
then there is a gap between when 1) pod.Status.Phase is set to
PodSucceeded and 2) the PodReady condition is updated to False

if a abd b are true then we may signal that "static pod is ready"
on the designated node, which may cause the installer controller
to create a new installer pod on a different node

if a is true, this alone can be used to say that "static pod is
pending" on the designated node.
  • Loading branch information
tkashem committed Jan 13, 2025
1 parent 5e8bc56 commit acaa152
Show file tree
Hide file tree
Showing 2 changed files with 9 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -264,7 +264,7 @@ func (c *InstallerController) getStaticPodState(ctx context.Context, nodeName st
}

switch pod.Status.Phase {
case corev1.PodRunning, corev1.PodSucceeded:
case corev1.PodRunning:
for _, c := range pod.Status.Conditions {
if c.Type == corev1.PodReady {
if c.Status == corev1.ConditionTrue {
Expand All @@ -273,11 +273,18 @@ func (c *InstallerController) getStaticPodState(ctx context.Context, nodeName st
return staticPodStatePending, revision, "static pod is not ready", nil, c.LastTransitionTime.Time, nil
}
}

ts := pod.CreationTimestamp.Time
if pod.Status.StartTime != nil {
ts = pod.Status.StartTime.Time
}
return staticPodStatePending, revision, "static pod is not ready", nil, ts, nil
case corev1.PodSucceeded:
ts := pod.CreationTimestamp.Time
if pod.Status.StartTime != nil {
ts = pod.Status.StartTime.Time
}
return staticPodStatePending, revision, "static pod has completed", nil, ts, nil
case corev1.PodFailed:
// we need a definitive timestamp here that does not change again. Because PodFail is a terminal state, we can assume
// the containers don't change anymore. So we use the latest container termination state.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1780,7 +1780,7 @@ func TestCreateInstallerPodMultiNode(t *testing.T) {
newStaticPod(mirrorPodNameForNode("test-pod", "test-node-1"), 1, corev1.PodRunning, true),
newStaticPod(mirrorPodNameForNode("test-pod", "test-node-2"), 1, corev1.PodSucceeded, true),
},
expectedUpgradeOrder: []int{1, 2},
expectedUpgradeOrder: []int{2, 1},
},
{
name: "two nodes at different revisions behind and 1 node on latest available revision",
Expand Down

0 comments on commit acaa152

Please sign in to comment.