Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry support to objectTracker #1014

Conversation

julianKatz
Copy link
Contributor

@julianKatz julianKatz commented Dec 9, 2020

Add the readiness-retries flag, allowing the user to configure how many
attempts should be made to injest a resource into OPA while blocking the
webhook.

Previously, a failure to injest an object into the OPA cache would
strike that object from the ObjectTracker, a type tasked with blocking
the webhook from serving requests until all the necessary objects had
been observed on the API server. This setup optimizes for availability
(the webhook serving requests ASAP) over security.

By adding retry functionality, we configure gatekeeper to block the
webhook for longer, but unblock with a more complete set of constraints.

Signed-off-by: juliankatz juliankatz@google.com

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 875c024 to 54b66cd Compare December 9, 2020 02:42
@maxsmythe
Copy link
Contributor

Note that this PR overlaps with #952

Copy link
Contributor

@maxsmythe maxsmythe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look at this!

pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved
}

func newObjTracker(gvk schema.GroupVersionKind) *objectTracker {
func newObjTracker(gvk schema.GroupVersionKind, fns ...objDataMutator) *objectTracker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Golang only lets you have one variadic argument type, I'm not sure we want to spend it exclusively on configuring object data specifically.

We could just have:

func newObjTracker(gvk schema.GroupVersionKind, mutator objDataMutator) *objectTracker {
   if mutator == nil {
      mutator = setRetriesFromFlag
   }
}

Also, if we are mutating anyway, why not just pass a constructor method that creates an appropriately formed object? That would save the need to pass arguments to mutatedObjData() on line 115 of the new code (though the name would likely change).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did my way because I wouldn't have to change the usages of newObjTracker around the code, though admittedly that isn't much of a reason.

I've changed it to what you have above.

Given that there isn't the ability to pass in arbitrary objDataMutators, I went ahead and changed objectTracker.mutators (of type []objDataMutator) to a singular objectTracker.mutator (type objDataMutator). LMK if you think I should switch it back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are functional options still cool these days? That's what I usually use. Fine either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a functional options pattern initially, and @maxsmythe advocated for saving the variadic parameter for the future if possible. I figure this method leaves things open-ended for the future and perhaps follows a "YAGNI" mentality.

@@ -249,7 +281,7 @@ func (t *objectTracker) Satisfied() bool {
}
delete(t.seen, k)
delete(t.expect, k)
t.satisfied[k] = struct{}{}
t.satisfied[k] = mutatedObjData(t.mutators...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to leave this as an empty struct to avoid unnecessary memory usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually can't do that anymore, because the value is now of type objData (since it has the retries field in it). Is there a different implementation you'd suggest that dodges this?

Passing in a raw objData would save a call to a mutation function, but it's O(1).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just have the different arrays have different types. That being said, I'm not sure the extra memory savings is worth the extra complexity.

@shomron any opinions on uniformity vs. lower memory usage here? IIRC this gets GCd after bootstrapping anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just have the different arrays have different types

I made the same suggestion for other reasons (see my review). I'm not worried about the memory usage (and yes we nuke these maps after bootstrapping) - but more about how we can reason about the code.

pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved
pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved
pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved

func (o objSet) decrementOrDelete(key objKey) bool {
val := o[key]
if val.retries > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should set this so a retries value of -1 always returns false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. I'll add a test for this as well


func (o objSet) decrementOrDelete(key objKey) bool {
val := o[key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the key is missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would ultimately return a zero-valued objData, which would trigger the else clause below. The delete would do nothing (key doesn't exist in map) and then the function would return true.

This really isn't a break from the previous behavior, which was a call to delete. That said, it's not as clear as it could be. I can make it val, ok := o[key] and return true (given the non-existent record is not decremented).

That said, I think this is poking at a question that's bigger than this function: how should objectTracker deal with bogus keys? With or without my change, those keys end up in t.cancelled. Is that a problem?

o[key] = val
return false
} else {
delete(o, key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just return a recommendation to delete or not so the actual management of all the different maps is handled in one place by the code?


func (o objSet) decrementOrDelete(key objKey) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if these methods were on objData instead? Would that give any benefits to the consumer code? Greater uniformity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the "Should we just return..." comment are related. Let's discuss here.

I think it would be good to make this a method of objData, since that's what it's really about. This makes it logical to move the delete call out of this function, as the objData does not have access to the map it is part of.

I think this is a better separation of concerns, and makes it easier to understand the objectTracker without having to jump to the objData code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it looks much nicer like this!

@maxsmythe
Copy link
Contributor

It looks like there is a linter failure as well

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 54b66cd to 24b0a75 Compare December 10, 2020 01:10
@codecov-io
Copy link

codecov-io commented Dec 10, 2020

Codecov Report

Merging #1014 (6bad8bc) into master (c36b54c) will increase coverage by 0.25%.
The diff coverage is 78.72%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1014      +/-   ##
==========================================
+ Coverage   47.66%   47.92%   +0.25%     
==========================================
  Files          62       62              
  Lines        4225     4259      +34     
==========================================
+ Hits         2014     2041      +27     
- Misses       1956     1962       +6     
- Partials      255      256       +1     
Flag Coverage Δ
unittests 47.92% <78.72%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/controller/constraint/constraint_controller.go 4.24% <0.00%> (ø)
pkg/readiness/noop_expectations.go 14.28% <20.00%> (+14.28%) ⬆️
pkg/readiness/object_tracker.go 77.83% <83.87%> (-0.08%) ⬇️
pkg/readiness/objset.go 87.50% <100.00%> (+87.50%) ⬆️
pkg/readiness/ready_tracker.go 69.81% <100.00%> (+0.72%) ⬆️
pkg/readiness/tracker_map.go 80.55% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c36b54c...6bad8bc. Read the comment docs.

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 24b0a75 to 71b1658 Compare December 10, 2020 01:13
pkg/readiness/noop_expectations.go Outdated Show resolved Hide resolved
pkg/readiness/objset.go Outdated Show resolved Hide resolved

type objDataMutator func(*objData) *objData

func setRetriesFromFlag(o *objData) *objData {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setRetriesFromFlag and mutatedObjData could be:

func newObjDataFromFlags() *objData {
   return &objData{retries: *readinessRetries}
}

and for tests you could pass in the following function as an argument:

func() *objData { return &objData{ retries: <<whatever # of retries>> } }

This avoids having mutators embedded in the code, which is a bit more confusing to follow b/c of the extra layer of indirection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to this implementation (which I called a Factory, since it generates a new instance of a type)

Why return a pointer? It seems better to just return an objData.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was avoiding Factory because it seems like the cool golang kids don't use it (too Java), but the analogy is apt. Personally, I usually don't name the functional argument types unless the function signature is long or there is a lot of repetition, but both ways work.

WRT why return a pointer... I suppose it doesn't really matter either way, as this is a small struct that only returns an int and I don't really see us taking advantage of any mutate-in-place behavior.

I tend to default to struct pointers unless I'm returning something I'm persisting to a library consumer. Then I'll return a ref or a deepcopied pointer in order to protect the library's internal state.

TL;DR pointers can help for large structs, they may have a performance/memory penalty for smaller structs. Pointers vs structs also lead to shared state behavior that can be very useful (lots of mutate-in-place logic happening in mutation), or very annoying.

For an internal-to-the-package API I really don't put much thought into it b/c it's so cheap to change later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented in the review that I think the tip off here is that every place we're using the return value, we immediately dereference it.
As an aside, I like geeking out on the value vs pointer debate. I tend to default to values to avoid accidentally allowing mutations, but you can make bugs either way you cut it.
In terms of performance, pointers will escape to the heap and put more pressure on the GC, but also provide less data locality which leads to all kinds of cache line misses. Completely irrelevant micro-optimizations in our case :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does golang make a heap vs. stack distinction for pointers vs. values? I would have thought the ability to send &value and have GC still work meant that they are treated the same WRT memory management?

There may be a golang subtlety I'm missing here though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is a stack vs. heap optimization. I wonder when the decision is made... compile time?

https://medium.com/@meeusdylan/when-to-use-pointers-in-go-44c15fe04eac

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my understanding is that the compiler performs escape analysis, not during runtime. You can see some of the decisions it's making by passing -gcflags '-m' to your build.

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 71b1658 to d82198d Compare December 11, 2020 00:57
Copy link
Contributor

@maxsmythe maxsmythe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting close!

pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved
pkg/readiness/objset.go Outdated Show resolved Hide resolved

obj := t.expect[k]
shouldDelete := obj.decrementRetries()
t.expect[k] = obj
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than copying the code from CancelExpect, I think we should call it directly.

We'll need to move CancelExpect to a different function that doesn't acquire a lock, say cancelExpectNoLock and rewrite CancelExpect() to acquire a lock and call the same function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the decrementRetries() call in the middle, I had to pass that part in as a callback. Let me know if you think it makes it too complicated.


ot.TryCancelExpect(ct[0])
g.Expect(ot.Satisfied()).To(gomega.BeTrue(), "should be satisfied")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test the infinite behavior too...

I realize that we can't test for infinite retries, but if we set retry count to -1, cancel, and retry count is still -1 that implies that we will retry indefinitely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test that sets retries to -1 and then does 20 TryCancelExpect calls. This was the best I could think of as far as testing the .*Expect() functions.

The test you're suggesting seems to be more of a test of the objData struct itself. Happy to add that, just didn't because it seemed to break from the existing testing pattern.

Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the 20x test is sufficient

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch 2 times, most recently from 6ff2696 to 8964247 Compare December 11, 2020 20:16
func (t *objectTracker) TryCancelExpect(o runtime.Object) {
t.mu.Lock()
defer t.mu.Unlock()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not decrement retries here and if shouldCancel is true call cancelExpectNoLock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to duplicate this section:

k, err := objKeyFromObject(o)            
if err != nil {                          
    log.Error(err, "skipping")           
    return                               
}                                        

Would it be better to duplicate it? I could also move this logic into CancelExpect so that this logic isn't called twice in any codepath

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO error handling code is just kind of duplicative anyway, I don't sweat it too much. IMO the added code complexity here is more costly than duplicating trivial code.

Personally, I'd have the interior function take an object key and duplicate the key generation on the caller function.

I'd also duplicate the t.allSatisfied check to occur before the call to objKeyFromObject() (the way it is in the original code), just to make sure the short-circuited state (active 99.9% of the time if the pod is long-lived) is as cheap as possible.

Copy link
Contributor

@maxsmythe maxsmythe Dec 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between copying those bits and copying the various deletes() is that the deletes are highly coordinated, where one missing delete could put the system in a bad state. In that situation, avoiding copying lowers the likelihood that a future change misses one copy, introducing a bug.

For the get-key logic and the short-circuit logic, the risk of duplication is annoying a programmer and lower performance if a copy is mishandled, respectively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great feedback. I like the part about trying to minimize complexity in the most crucial parts of the logic. DRYing up trivial logic is less valuable, as that's easier to grok/harder to break.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After moving all of this out of cancelExpectNoLock(), it made more sense to me to remove all conditional logic from that function. All of the "should delete" business is now held in TryCancelExpect() where it belongs. That way none of the complexity of that codepath is bleeding into the simpler CancelExpect()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, awesome!

pkg/readiness/object_tracker.go Outdated Show resolved Hide resolved
pkg/readiness/object_tracker_test.go Outdated Show resolved Hide resolved
pkg/readiness/object_tracker_test.go Outdated Show resolved Hide resolved
@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 8964247 to 21f17af Compare December 15, 2020 21:15
Copy link
Contributor

@maxsmythe maxsmythe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shomron LGTY?

func (t *objectTracker) TryCancelExpect(o runtime.Object) {
t.mu.Lock()
defer t.mu.Unlock()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, awesome!

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch 2 times, most recently from dd5d6ad to 57cfb2f Compare December 17, 2020 19:21
}

type objDataFactory func() *objData
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: from what I can tell, all uses of these factory functions always immediately dereference the pointer. Should we just return a value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's dereferences across the board. Personally, I vote in favor of returning objData by value, as that's consistent with the usage in the package (values of an objSet are objData structs, not pointers to said structs).

I'm going to switch this to returning a value unless @maxsmythe feels strongly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

// Check if it's time to delete an expectation or just decrement its allotted retries
obj := t.expect[k]
shouldDel := obj.decrementRetries()
t.expect[k] = obj // set the changed obj back to the map, as the value is not a pointer
Copy link
Contributor

@shomron shomron Dec 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the counter is exhausted, subsequent calls to TryCancelExpect cause the expectation to be added and immediately removed. I think it's fine, but a comment might help future us.

Copy link
Contributor Author

@julianKatz julianKatz Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to add explicit conditional logic around this, as opposed to relying on the zero-value behavior? Something like:

// Check if it's time to delete an expectation or just decrement its allotted retries
obj, exists := t.expect[k]
if !exists {
  // Do a log.Info() here?
  return
}

shouldDel := obj.decrementRetries()
t.expect[k] = obj // set the changed obj back to the map, as the value is not a pointer

if shouldDel {
    t.cancelExpectNoLock(k)
}

This has more code, but it prevents the behavior that you mentioned. Perhaps the verbosity is worth the clarity? LMK your thoughts @shomron

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option would be to just use pointers in the map to avoid the extra code in the first place.

If you add the if-statement, no need for the log.Info() IMO.

I like the if-statement. It seems less brittle as it wont break if decrementRetries() changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to be careful - if you skip calling cancelExpectNoLock then the object is not marked in t.cancelled - which changes the old behavior for certain out-of-order scenarios.

A comment and a test would work for me, otherwise we need to define the correct behavior for cancelling unexpected objects in light of the new code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The out-of-order execution is a good point... I had forgotten about that. Given that, I agree the comment is the better choice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... out-of-order also adds a new layer of difficulty here.

If we perform a TryCancel on something that has no expectations yet, we should probably track the # of attempts. We shouldn't short-circuit the retry behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So @maxsmythe and I have been discussing the right way to implement this and came up with the idea of tracking TryCancelExpect calls in their own separate map. Our thinking was this would allow us to track the number of attempts made to cancel, regardless of the other actions that have been called on an object.

It turns out that this will provide a fix for a nasty edge case. Because of this section in TryCancelExpect():

obj := t.expect[k]
shouldDel := obj.decrementRetries()
t.expect[k] = obj // set the changed obj back to the map, as the value is not a pointer

An as yet unexpected obj will actually be added to t.expect as a zero-value version of the struct. This over-writes the retry behavior, which is set via a constructor function.

This provides further indication that separating these ideas into two maps will make the code easier to grok, as I obviously did this by mistake. Yay for TDD!

}

func newObjTracker(gvk schema.GroupVersionKind) *objectTracker {
func newObjTracker(gvk schema.GroupVersionKind, fns ...objDataMutator) *objectTracker {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are functional options still cool these days? That's what I usually use. Fine either way.

return &objData{retries: *readinessRetries}
}

type objSet map[objKey]objData
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong, but only the expect set uses counters. I'm finding it confusing reading the code and checking whether Observe is still idempotent because it overwrites a counter nobody is looking at. Should we create a separate type for expect and all the rest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good to me.

I've added a second type: objRetrySet. This type is used only for t.expect, while t.seen and others will be assigned like:

t.seen[k] = struct{}{}

as they were done before. I believe makes the distinction clear in code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per this other comment, I've now moved the retries tracking into its own map.

So, t.expect is back to using the original objSet.

@@ -249,7 +281,7 @@ func (t *objectTracker) Satisfied() bool {
}
delete(t.seen, k)
delete(t.expect, k)
t.satisfied[k] = struct{}{}
t.satisfied[k] = mutatedObjData(t.mutators...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just have the different arrays have different types

I made the same suggestion for other reasons (see my review). I'm not worried about the memory usage (and yes we nuke these maps after bootstrapping) - but more about how we can reason about the code.

@shomron
Copy link
Contributor

shomron commented Dec 19, 2020

LGTM, no blockers.

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch 2 times, most recently from 928ce30 to 1c3050d Compare December 21, 2020 23:41
Copy link
Contributor

@shomron shomron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julianKatz lgtm, thanks for working on this enhancement!

@julianKatz
Copy link
Contributor Author

@julianKatz lgtm, thanks for working on this enhancement!

Thanks for the thorough review!! @shomron

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from 1c3050d to 3633049 Compare December 29, 2020 02:16
@julianKatz
Copy link
Contributor Author

I think I've addressed all the comments on this. Could @maxsmythe or @shomron merge it for me?

@maxsmythe
Copy link
Contributor

Thanks for fixing that order-of-ops issue @shomron do you want to take a second look or are we good to merge?

@shomron
Copy link
Contributor

shomron commented Jan 7, 2021

lgtm. Only nit is that we could release the tryCancelled map in objectTracker.Satisfied once the circuit breaker trips (countering golang/go#20135)

@@ -29,13 +30,16 @@ import (
"k8s.io/apimachinery/pkg/types"
)

var readinessRetries = flag.Int("readiness-retries", 0, "The number of resource ingestion attempts allowed before the resource is disregarded. A value of -1 will retry indefinitely.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add this flag to the readme so people know how to use it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ritazh I've added some information to the README. Could you take a look?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @julianKatz!

@julianKatz
Copy link
Contributor Author

lgtm. Only nit is that we could release the tryCancelled map in objectTracker.Satisfied once the circuit breaker trips (countering golang/go#20135)

Good call. Will do.

@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch 2 times, most recently from f298a8f to fda0b7a Compare January 9, 2021 01:33
@julianKatz
Copy link
Contributor Author

lgtm. Only nit is that we could release the tryCancelled map in objectTracker.Satisfied once the circuit breaker trips (countering golang/go#20135)

This is done

README.md Outdated Show resolved Hide resolved
@@ -51,20 +55,28 @@ type objectTracker struct {
gvk schema.GroupVersionKind
cancelled objSet // expectations that have been cancelled
expect objSet // unresolved expectations
tryCancelled objRetrySet // TryCancelExpect calls, decrementing allotted retries for an object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment should read: tracks TryCancelExpect calls, ...

attempts should be made to injest a resource into OPA while blocking the
webhook.

Previously, a failure to injest an object into the OPA cache would
strike that object from the ObjectTracker, a type tasked with blocking
the webhook from serving requests until all the necessary objects had
been observed on the API server.  That setup optimizes for availability
(the webhook serving requests ASAP) over security.

By adding retry functionality, we configure gatekeeper to block the
webhook for longer, but unblock with a more complete set of constraints.

fixes: #213

Signed-off-by: juliankatz <juliankatz@google.com>
@julianKatz julianKatz force-pushed the wait-for-sync-before-webhook-goes-live-213-173056532 branch from fda0b7a to 6bad8bc Compare January 11, 2021 21:59
@maxsmythe
Copy link
Contributor

Looks like all outstanding comments have been addressed, merging.

@maxsmythe maxsmythe merged commit 1fc4ad1 into open-policy-agent:master Jan 11, 2021
@julianKatz julianKatz deleted the wait-for-sync-before-webhook-goes-live-213-173056532 branch January 11, 2021 23:06
@maxsmythe maxsmythe mentioned this pull request Jan 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants