Managed resource reconciler refactoring #65

negz · 2019-10-31T08:01:53Z

Description of your changes

Opening this as a draft for now, because it's fairly wide ranging and should probably be more than one PR. In the meantime everything is broken down by commit fairly cleanly.

The PR:

Adds finalizers right before the creation block - see rationale in that commit message.
Triggers reference resolution on every reconcile - this will require updates to referencers to ensure they're idempotent (i.e. only append the resolved value to a slice once).
Avoids updating the managed resource if it doesn't change during the reference resolution process.
Removes the check for the attributereferencer struct tag - see rationale in that commit message.
Cleans up the managed reconciler and reference resolver tests.

Fixes #63
Fixes #62
Fixes #53
Fixes #51

I've tested this all out by creating a GKE cluster and CloudSQL database using the examples/workload manifests.

Checklist

I have:

Run make reviewable to ensure this PR is ready for review.
Ensured this PR contains a neat, self documenting set of commits.
Updated any relevant documentation, examples, or release notes.
Updated the RBAC permissions in clusterrole.yaml to include any new types.

upbound-bot · 2019-10-31T08:04:21Z

upbound-bot · 2019-10-31T08:15:05Z

muvaf · 2019-10-31T13:15:00Z

I think I need some strong reasons to introduce Annotator chain. It seems that the reason we separate annotation process from initialization is we want to separate the identification from the life-cycle management that we achieve through k8s finalizers. But per #51 there is no need to make k8s finalizer mechanism an Initializer at all. So, this removes the life-cycle responsibility of Initialize call, meaning we don't need to call the Initialize after Observe anymore. We can keep the Initialize call as is and add the finalizer add/delete to the appropriate places.

Something like this;

Annotate call is Initialize again.
APIFinalizerAdder & APIFinalizerRemover are made internal objects.
We call addKubernetesFinalizer in the place of Initialize call as in this PR(right before if !observation.ResourceExists)
We call removeKubernetesFinalizer right after Finalize call.

muvaf

Thanks for digging into these issues right away! I know they look different but also underlying issues are pretty similar, but I think we can still separate this PRs into two or three different ones. Maybe one for references+comments and one for initializer thing?

muvaf · 2019-10-31T13:30:02Z

pkg/resource/claim_binding_reconciler.go

@@ -138,6 +138,20 @@ type ManagedFinalizer interface {
 	Finalize(ctx context.Context, cm Managed) error
 }

+// A FinalizerChain chains multiple managed finalizers.
+type FinalizerChain []ManagedFinalizer


Noticed it the first time in this PR but why do we call r.managed.Finalize in claim reconciler? Shouldn't claim reconciler just send the deletion signal and checks whether object is deleted, i.e (NotFound) and then calls its claim's finalizer? Pretty much like how managed reconciler treats its external resource.

In both of these reconcilers the "Finalize" interface is not intended explicitly to mean "remove a finalizer" in the metadata sense, but rather "finalize the lifecycle of this object". I can see how this is confusing, but struggled to find a better name for the interface.

The managed resource reconciler finalizes its work on a managed resource by removing its finalizer. The claim binding reconclise finalizes its work on a managed resource by removing its claim ref and setting it to unbound.

pkg/resource/managed_reconciler.go

muvaf · 2019-10-31T13:43:55Z

pkg/resource/managed_reconciler.go

@@ -339,7 +405,7 @@ func NewManagedReconciler(m manager.Manager, of ManagedKind, o ...ManagedReconci

 // Reconcile a managed resource with an external resource.
 func (r *ManagedReconciler) Reconcile(req reconcile.Request) (reconcile.Result, error) { // nolint:gocyclo
-	// NOTE(negz): This method is a little over our cyclomatic complexity goal.
+	// NOTE(negz): This method is a well over our cyclomatic complexity goal.


Sad but true.

negz · 2019-10-31T19:25:50Z

But per #51 there is no need to make k8s finalizer mechanism an Initializer at all.

I disagree with this. You're right that in practice pretty much no-one is going to disable adding and removing finalizers, but I think we should keep them abstracted in something like an Initializer. Currently the responsibility of the managed reconciler's Reconcile method is:

Update the Synced condition.
Update the managed resource status (only, never anything else).
Orchestrate various interfaces that do the real work.

I'd like to keep the responsibility of updating finalizer metadata contained within a separate type that we can stub out. However per our discussion this morning I think we could probably make this less obtuse by:

Removing the Annotator interface, and keeping Initializer. This would set the external name.
Naming the interfaces that add and remove the finalizer more specifically FinalizerAdder and FinalizerRemover.

negz · 2019-10-31T19:44:11Z

I've broken out two PRs from this one:

I'll break some more out once the above are merged, since much of my updates assume the new testing style.

muvaf · 2019-10-31T20:13:20Z

Removing the Annotator interface, and keeping Initializer. This would set the external name.
Naming the interfaces that add and remove the finalizer more specifically FinalizerAdder and FinalizerRemover.

This seems like a good path forward to me as discussed. Having the FinalizerAdder and FinalizerRemover pluggable would make the function more testable as you pointed out.

About the naming of the ManagedFinalizer interface and Finalize function, what do you think about using Release/TearDown/Complete instead of Finalize? It's just so easy to mix this Finalize word with k8s finalizers.

This commit refactors ResolveReferencers to allow the code that finds types within a struct that satisfy AttributeReferencer to be swapped out. It also updates the default AttributeReferencerFinder to avoid checking struct tags. Previously errors were returned when: 1. A struct field tagged as a referencer did not satisfy AttributeReferencer 2. A struct field not tagged as a referencer satisfied AttributeReferencer If either of these scenarios occurred, ResolveReferences would panic with the returned error the first time it encountered an incorrectly written API type. My feeling is that both of these conditions are testing for programmer errors that would be better caught at build time than at runtime. Signed-off-by: Nic Cope <negz@rk0n.org>

Signed-off-by: Nic Cope <negz@rk0n.org>

the established convention is for the managed resource reconciler to requeue after a short wait (typically 30 seconds) when it knows it is waiting for an operation. Signed-off-by: Nic Cope <negz@rk0n.org>

Signed-off-by: Nic Cope <negz@rk0n.org>

Reference resolution is now a no-op if nothing changes, so we run it on every reconcile. We also run it after delete has been handled, so unresolved references will only block creates and updates. This commit means we'll make more get calls to the cache (or API) in order to resolve our references each reconcile, and also risk potentially changing the values of 'immutable' fields automatically if and when our references resources change. I believe we should address this by having referencers be no-ops when the field value they would set is already set. Signed-off-by: Nic Cope <negz@rk0n.org>

The mock binding status was identical to the real one, while the mock conditioned status set only the most recent condition, leading to a few slightly broken managed resource reconciler tests. Signed-off-by: Nic Cope <negz@rk0n.org>

* By the time we get here we know our Observe call worked. If (for example) our cloud provider credentials were completely wrong, we'd never proceed far enough to add the finalizer. * If Observe works but Create fails (for example because we had RO cloud provider credentials) we would already have added the finalizer, but... * When the managed resource was deleted we'd be able to Observe that the external resource does not exist (because we were never able to Create it) and thus would not call Delete on the external resource and go straight to unpublishing credentials and removing the finalizer. Signed-off-by: Nic Cope <negz@rk0n.org>

The initialization process sets the finalizer, while the annotation process sets the managed resource's external name. We now set the finalizer later in the reconciliation process, but still want to set the annotations early in the reconcile process. Signed-off-by: Nic Cope <negz@rk0n.org>

It turns out certain managed resources use part of their spec to observe their external resource, and that part of the spec can depend on a reference. GCP connections are an example of this; there's no "get connections" call so instead we list all connections within a network, which is a spec field that could be set by a reference. Signed-off-by: Nic Cope <negz@rk0n.org>

We no longer need CanReference types to satisfy the metav1.Object interface. It was used only to determine the namespace of the referencing object before all such objects became cluster scoped. Signed-off-by: Nic Cope <negz@rk0n.org>

upbound-bot · 2019-10-31T20:18:50Z

negz · 2019-11-01T02:41:44Z

#68 and #70 cover all the changes I was pondering in this draft, so I'm closing this in favor of those smaller PRs.

negz requested a review from muvaf October 31, 2019 08:01

negz changed the title ~~Reconciliation~~ Managed resource reconciler refactoring Oct 31, 2019

muvaf reviewed Oct 31, 2019

View reviewed changes

negz added 11 commits October 31, 2019 13:14

Fix grammar in reference resolution condition message

e72d0e7

Signed-off-by: Nic Cope <negz@rk0n.org>

Make reference resolution a no-op if assignments did not change anything

fa747f3

Signed-off-by: Nic Cope <negz@rk0n.org>

Requeue after a short wait when references are not ready

f6cd27b

the established convention is for the managed resource reconciler to requeue after a short wait (typically 30 seconds) when it knows it is waiting for an operation. Signed-off-by: Nic Cope <negz@rk0n.org>

Don't requeue after a successful deletion

10c2ceb

Signed-off-by: Nic Cope <negz@rk0n.org>

Make resource.CanReference a type of runtime.Object

ffe80d2

We no longer need CanReference types to satisfy the metav1.Object interface. It was used only to determine the namespace of the referencing object before all such objects became cluster scoped. Signed-off-by: Nic Cope <negz@rk0n.org>

negz force-pushed the reconciliation branch from a9be18f to ffe80d2 Compare October 31, 2019 20:16

negz closed this Nov 1, 2019

negz deleted the reconciliation branch November 1, 2019 02:41

muvaf mentioned this pull request Nov 1, 2019

Add managed resource finalizer immediately before creating #70

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Managed resource reconciler refactoring #65

Managed resource reconciler refactoring #65

negz commented Oct 31, 2019 •

edited

Loading

upbound-bot commented Oct 31, 2019

upbound-bot commented Oct 31, 2019

muvaf commented Oct 31, 2019

muvaf left a comment

muvaf Oct 31, 2019

negz Oct 31, 2019

muvaf Oct 31, 2019

negz commented Oct 31, 2019

negz commented Oct 31, 2019

muvaf commented Oct 31, 2019

upbound-bot commented Oct 31, 2019

negz commented Nov 1, 2019

Managed resource reconciler refactoring #65

Managed resource reconciler refactoring #65

Conversation

negz commented Oct 31, 2019 • edited Loading

Description of your changes

Checklist

upbound-bot commented Oct 31, 2019

upbound-bot commented Oct 31, 2019

muvaf commented Oct 31, 2019

muvaf left a comment

Choose a reason for hiding this comment

muvaf Oct 31, 2019

Choose a reason for hiding this comment

negz Oct 31, 2019

Choose a reason for hiding this comment

muvaf Oct 31, 2019

Choose a reason for hiding this comment

negz commented Oct 31, 2019

negz commented Oct 31, 2019

muvaf commented Oct 31, 2019

upbound-bot commented Oct 31, 2019

negz commented Nov 1, 2019

negz commented Oct 31, 2019 •

edited

Loading