operator: simplify object creation #586
Conversation
When an assertion fails in a helper function, Ginkgo shows the error as belonging to that function, which is not very helpful because it does not show where that function was called. It's better to skip frames in the call stack so that errors always refer to the test itself.
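A minimal sketch of that pattern with Gomega's ExpectWithOffset; the helper and the code under test here are hypothetical stand-ins, not the actual test code of this PR:

```go
package example_test

import (
	"testing"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
)

func TestExample(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "offset example")
}

// expectNoError is a hypothetical helper. ExpectWithOffset(1, ...) skips one
// frame in the call stack, so a failure is reported at the line in the test
// that called expectNoError, not at the assertion inside the helper.
func expectNoError(err error, description string) {
	ExpectWithOffset(1, err).NotTo(HaveOccurred(), description)
}

var _ = Describe("deployment", func() {
	It("creates objects", func() {
		err := doSomething() // placeholder for the real operation under test
		expectNoError(err, "create deployment objects")
	})
})

// doSomething stands in for the real code under test.
func doSomething() error {
	return nil
}
```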
Previously, the secrets were created first, then the phase changed without waiting for anything, and then the next set of objects was created. This was unnecessarily complex and confusing. Now all objects get created together in the "new" phase and the deployment then goes straight to "running". Once the "is driver usable" check is introduced, the deployment will first have to go into "initializing" and move to "running" only after the check passes.
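A rough sketch of the simplified flow, using stand-in types and phase names that are assumptions rather than the operator's actual API:

```go
package deployment

import "fmt"

// Phase models the deployment phases mentioned above; the operator's actual
// constants may differ.
type Phase string

const (
	PhaseNew          Phase = "New"
	PhaseInitializing Phase = "Initializing"
	PhaseRunning      Phase = "Running"
)

// object and client stand in for runtime.Object and the API client used by
// the real operator.
type object interface{ Name() string }
type client interface{ Create(obj object) error }

// reconcile creates all objects in one step while the deployment is still in
// the "New" phase and then moves it straight to "Running". Once an
// "is the driver usable?" check exists, the phase would first become
// "Initializing" and only switch to "Running" after that check passes.
func reconcile(c client, objects []object, phase *Phase) error {
	for _, obj := range objects {
		if err := c.Create(obj); err != nil {
			return fmt.Errorf("create %s: %w", obj.Name(), err)
		}
	}
	*phase = PhaseRunning
	return nil
}
```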
@avalluri this passed testing, can we merge?
// getDeploymentObjects returns all objects that are part of a driver deployment.
func (d *PmemCSIDriver) getDeploymentObjects(r *ReconcileDeployment) ([]runtime.Object, error) { |
@pohly By adding secret creation to getDeploymentObjects, calling this method more than once creates new keys/certificates on every call when the deployment chooses to use the default certificates.
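One way to avoid regenerating them on every call, sketched here with hypothetical names and client-go calls rather than the operator's actual helpers: look up the secret first and only generate new keys/certificates when none exists yet.

```go
package deployment

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// ensureSecret is a hypothetical helper: it reuses an existing secret and
// only generates fresh keys/certificates when the secret does not exist yet,
// so calling it repeatedly does not churn the certificates.
func ensureSecret(ctx context.Context, client kubernetes.Interface, namespace, name string,
	generate func() (map[string][]byte, error)) (*corev1.Secret, error) {
	existing, err := client.CoreV1().Secrets(namespace).Get(ctx, name, metav1.GetOptions{})
	if err == nil {
		// Reuse what is already stored instead of regenerating it.
		return existing, nil
	}
	if !apierrors.IsNotFound(err) {
		return nil, fmt.Errorf("get secret %s/%s: %w", namespace, name, err)
	}
	data, err := generate()
	if err != nil {
		return nil, fmt.Errorf("generate keys/certificates: %w", err)
	}
	secret := &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: name},
		Data:       data,
	}
	return client.CoreV1().Secrets(namespace).Create(ctx, secret, metav1.CreateOptions{})
}
```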
Is that a bad thing? It would ensure that those certificates are always fresh.
We haven't sorted out how to handle periodic updates of those certificates and the current code also doesn't have your "populate deployment from objects" code yet. Let's figure out how to avoid updating the secrets too often later, okay?
> Is that a bad thing? It would ensure that those certificates are always fresh.
> We haven't sorted out how to handle periodic updates of those certificates and the current code also doesn't have your "populate deployment from objects" code yet. Let's figure out how to avoid updating the secrets too often later, okay?
That depends on how often we call this method. At the least, I can think of every restart of the operator pod. Updating the secrets also leads to a restart of all driver pods.
> That depends on how often we call this method. At the least, I can think of every restart of the operator pod.
Which isn't very often. And even if it was called more often, why would that be a reason to be concerned?
> Updating the secrets also leads to a restart of all driver pods.
No, it doesn't: https://kubernetes.io/docs/concepts/configuration/secret/#mounted-secrets-are-updated-automatically
It's still not perfect because the certificates may get rolled out on different nodes by kubelet at different times, which leads to a time window where controller and node cannot communicate. When designing the certificate update feature we need to take that into account, for example by keeping old and new CA root as trusted for a certain period of time.
For now I think it is okay that the system eventually recovers.
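A sketch of that trust-overlap idea using the Go standard library (the names are illustrative, not from this repository): keep both the old and the new CA root in the pool used to verify peers during the transition window.

```go
package tlsrotation

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
)

// transitionalTLSConfig builds a TLS config that trusts both the old and the
// new CA root, so connections keep working while kubelet rolls out the
// updated secret to different nodes at different times.
func transitionalTLSConfig(oldCAPEM, newCAPEM, certPEM, keyPEM []byte) (*tls.Config, error) {
	pool := x509.NewCertPool()
	for _, pem := range [][]byte{oldCAPEM, newCAPEM} {
		if !pool.AppendCertsFromPEM(pem) {
			return nil, fmt.Errorf("failed to parse CA certificate")
		}
	}
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load key pair: %w", err)
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool, // trusted when dialing the peer
		ClientCAs:    pool, // trusted when accepting connections
		ClientAuth:   tls.RequireAndVerifyClientCert,
	}, nil
}
```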
> Which isn't very often. And even if it was called more often, why would that be a reason to be concerned?
Okay, if you feel it's not an issue.
> No, it doesn't: https://kubernetes.io/docs/concepts/configuration/secret/#mounted-secrets-are-updated-automatically
> It's still not perfect because the certificates may get rolled out on different nodes by kubelet at different times, which leads to a time window where controller and node cannot communicate. When designing the certificate update feature we need to take that into account, for example by keeping old and new CA root as trusted for a certain period of time.
> For now I think it is okay that the system eventually recovers.
Interesting, thanks for sharing.
Spreading object creation across phases is unnecessarily complex and would have to be changed anyway when introducing the "is driver running" check.
While working on this, I noticed that controller test errors were attributed to the helper functions instead of the actual call site where the assertion was made, so I changed that, too.