Skip to content
This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

terraform: fix ignored ConditionPathExists= in unit files #1518

Merged
merged 2 commits into from
Jun 30, 2021

Conversation

iaguis
Copy link
Contributor

@iaguis iaguis commented Jun 29, 2021

Over time we've added ConditionPathExists= parameters to several unit
files. However, in some instances we were adding them in the [Service]
section instead of the [Unit] section. This meant they were ignored and
didn't have any effect.

This moves those conditions to the [Unit] section everywhere.

Fixes #1066

@iaguis iaguis requested review from pothos and invidian June 29, 2021 09:42
invidian
invidian previously approved these changes Jun 29, 2021
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So confusing diff 😄 LGTM

pothos
pothos previously approved these changes Jun 29, 2021
@iaguis iaguis dismissed stale reviews from pothos and invidian via 3c7122e June 29, 2021 13:45
@iaguis iaguis force-pushed the iaguis/condition-path-exists branch from 3c7122e to ed0bf41 Compare June 29, 2021 13:52
@@ -64,7 +64,6 @@ systemd:
[Unit]
Description=Kubelet
Wants=rpc-statd.service
ConditionPathExists=/etc/kubernetes/kubeconfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine to have the kubelet try to start again and again until the file exists.

I'm not completely sure about this. I think we can remove this from platforms, where we don't copy the kubeconfig over SSH to nodes, as it will be delivered via Ignition.

Removing it everywhere may result in systemd backoff when restarting the service? And generating error logs, which makes noise and make debugging of actual issues more difficult.

I'd look in this direction:

0 ✓ (64.1ms) 20:23:37 invidian@dellxps15mateusz ~/repos/kinvolk/lokomotive/assets/terraform-modules (⎇  pr/1516) (⎈ newrelic-gcp-standard:mat-reroller-test) $ git grep /etc/kubernetes/kubeconfig | grep Cond
aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl:        ConditionPathExists=/etc/kubernetes/kubeconfig
controller/main.tf:        ConditionPathExists=/etc/kubernetes/kubeconfig
node/templates/node.yaml.tmpl:        ConditionPathExists=/etc/kubernetes/kubeconfig
packet/flatcar-linux/kubernetes/cl/controller.yaml.tmpl:        ConditionPathExists=/etc/kubernetes/kubeconfig
0 ✓ (53.6ms) 20:23:40 invidian@dellxps15mateusz ~/repos/kinvolk/lokomotive/assets/terraform-modules (⎇  pr/1516) (⎈ newrelic-gcp-standard:mat-reroller-test) $ git grep /etc/kubernetes/kubeconfig | grep 'mv '
aws/flatcar-linux/kubernetes/ssh.tf:      "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
bare-metal/flatcar-linux/kubernetes/ssh.tf:      "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",
platforms/tinkerbell/controllers/ssh.tf:      "sudo mv $HOME/kubeconfig /etc/kubernetes/kubeconfig",

To see where we can remove it and where we cannot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is adding ConditionPathExists= doesn't cause systemd to retry to start the unit until the file is present, it will check for the file the first time the unit is started and if it's not there it will skip it and never try to start it again.

To achieve the desired behavior we might need to combine this with a path unit so systemd tries to start the unit when the file appears in the filesystem. I agree this would be nicer to avoid generating error logs but it is a bigger change not in scope here.

Note removing ConditionPathExists= just preserves the current behavior since the directive is currently ignored.

Regarding backoff, with the current configuration (RestartSec=5 and the default StartLimitIntervalSec=10s and StartLimitBurst=5) it will never be triggered so it will restart indefinitely.

Copy link
Member

@invidian invidian Jun 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explanation @iaguis, I think removal make sense then. I wonder if we should create an issue to eventually address that, though I think we should start adding bootstrap kubeconfig to controllers Ignition, then we don't need this condition at all.

Leaving conversation unresolved so we can decide if we create an issue for this.

invidian
invidian previously approved these changes Jun 30, 2021
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we could swap the order of commits to avoid creating broken commit and to make diff smaller.

@iaguis
Copy link
Contributor Author

iaguis commented Jun 30, 2021

Nit: we could swap the order of commits to avoid creating broken commit and to make diff smaller.

Will do, maybe this time CI passes :D

iaguis added 2 commits June 30, 2021 16:46
It's added to the [Service] section which means it has no effect.
Adding it to the right place (the [Unit] section) means
/etc/kubernetes/kubeconfig must exist the first time the Kubelet starts,
otherwise the unit is skipped and it never tries to start again.

This removes that directive: it's fine to have the kubelet try to start
again and again until the file exists. Note given the current
configuration (RestartSec=5 and the default StartLimitIntervalSec=10s
and StartLimitBurst=5) backoff will never be triggered so it will
restart indefinitely.
Over time we've added ConditionPathExists= parameters to several unit
files. However, in some instances we were adding them in the [Service]
section instead of the [Unit] section. This meant they were ignored and
didn't have any effect.

This moves those conditions to the [Unit] section.
@iaguis iaguis force-pushed the iaguis/condition-path-exists branch from ed0bf41 to 106c59c Compare June 30, 2021 14:49
@iaguis iaguis requested review from invidian and pothos June 30, 2021 14:50
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@iaguis
Copy link
Contributor Author

iaguis commented Jun 30, 2021

Since this is a subset of the original PR I'll take @pothos LGTM into account and merge this :D

@iaguis iaguis merged commit ef2a5a8 into master Jun 30, 2021
@iaguis iaguis deleted the iaguis/condition-path-exists branch June 30, 2021 18:19
@invidian invidian added area/kubernetes Core Kubernetes stuff bug Something isn't working labels Jul 16, 2021
@invidian invidian added this to the v0.9.0 milestone Jul 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/kubernetes Core Kubernetes stuff bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

etcd systemd unit should wait for certificate files to become available
3 participants