-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
caching: Support out-of-thread (async) HttpCache implementations #12622
Conversation
…thread HttpCache implementations Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…ntext Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@@ -32,8 +33,12 @@ CacheFilter::CacheFilter(const envoy::extensions::filters::http::cache::v3alpha: | |||
: time_source_(time_source), cache_(http_cache) {} | |||
|
|||
void CacheFilter::onDestroy() { | |||
lookup_ = nullptr; | |||
insert_ = nullptr; | |||
if (lookup_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to calling onDestroy on lookup_ and insert_, we also need to remember that our onDestroy_ has been called. They might have posted callbacks that haven't happened yet; when we get them, we need to ignore them. I suggest adding a new FilterState for this. (Another approach would be to hand these pointers to dispatcher().deferredDelete, but making it a state seems cleaner.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattklein123 This PR relies on details of event handling that aren't documented, but that I think are intended. Specifically, it assumes that all callbacks posted before the return of CacheFilter::onDestroy will be executed before this CacheFilter is deleted. (If this isn't intended, then I don't understand the purpose of deferred deletion.) Can you confirm? (Either way, we should update the docs.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tracked this PR, but no, this is not guaranteed. Can you describe a bit more what is going on or do you want me to review the entire PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of CacheFilter's plugin cache implementations will complete requests asynchronously, and there's no guarantee of what thread those completions will run on, so their callbacks get posted to our dispatcher. Suppose a plugin posts a callback, then (perhaps due to a client disconnect) CacheFilter::onDestroy is called. That callback must happen before the CacheFilter gets deleted (or else it will access a deleted CacheFilter). AFAICT, the current code does give us the correct order (because it runs pending callbacks before it runs deferred deletion), and I'll be surprised if it ever needs to change, but we need to make sure.
I think that all we need to do is add documentation and tests to lock in the current behavior of running pending callbacks before deferred deletion.
If we can't rely on this, we'll have to go back to a prior iteration of the original CacheFilter PR, and use either weak_ptrs or shared_ptrs to the CacheFilter, but that complicates the code and adds bus-locked operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The solution here is to not use post, but use a "timer" which can be cancelled. Can we switch to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmarantz the problem arises if the cache lookup has already finished and posted the callback to the dispatcher, but then the client closes the connection and the filter chain is destroyed. Currently, there's no guarantee that the posted callback will run before the filter is deleted. If the filter is deleted first, the posted callback will run on a destroyed filter.
One solution here is to capture a weak_ptr to the CacheFilter in the posted callback, and check that the CacheFilter is still alive before accessing any of its members/methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we arrange a destroy-notify callback from the filter chain? The cache system could then mark the filter structure as stale and avoid referencing it further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course weak_ptr does the job too, if the thing being pointed to is already a shared_ptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more thing: the advantage of a callback over a weak_ptr is that you might be able to actively cancel an outstanding lookup request (if that is possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattklein123 I am following the discussion at #12364. Can you elaborate how is it related to our problem here?
It's loosely related in the sense that we have TLS post/callbacks that are being running beyond the acceptable lifetime.
one more thing: the advantage of a callback over a weak_ptr is that you might be able to actively cancel an outstanding lookup request (if that is possible).
This is how a cancellable post would be accomplished within dispatcher. This is what I recommend if you want a core solution, otherwise you can implement this yourself.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
…ered while the filter is being destroyed. Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
I assigned myself so can help review the entire PR once we sort out the implementation plan. /wait |
Calling onDestroy on lookup_ and insert_ takes care of the cancellation
callback. Holding a weak_ptr<CacheFilter> in the callback lambda takes care
of CacheFilter getting deleted before the post happens. (See PR #7198 for
an earlier state of this approach.)
…On Tue, Aug 18, 2020 at 1:49 PM Matt Klein ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In source/extensions/filters/http/cache/cache_filter.cc
<#12622 (comment)>:
> @@ -32,8 +33,12 @@ CacheFilter::CacheFilter(const envoy::extensions::filters::http::cache::v3alpha:
: time_source_(time_source), cache_(http_cache) {}
void CacheFilter::onDestroy() {
- lookup_ = nullptr;
- insert_ = nullptr;
+ if (lookup_) {
@mattklein123 <https://github.com/mattklein123> I am following the
discussion at #12364 <#12364>.
Can you elaborate how is it related to our problem here?
It's loosely related in the sense that we have TLS post/callbacks that are
being running beyond the acceptable lifetime.
one more thing: the advantage of a callback over a weak_ptr is that you
might be able to actively cancel an outstanding lookup request (if that is
possible).
This is how a cancellable post would be accomplished within dispatcher.
This is what I recommend if you want a core solution, otherwise you can
implement this yourself.
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#12622 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AFRAWPPRY7CMV7FFNGKPGF3SBLSLXANCNFSM4P5NVFGQ>
.
|
…r being deleted, in the case where the cache callback is posted to the dispatcher then the filter is deleted before the callback is executed Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
I added safeguard against the "callback posted then filter deleted" case we have been discussing by using a weak_ptr. I tried to make it as simple as possible. I added a TODO to look into other solutions as they arise (guaranteed dispatcher ordering of posts and deletions, cancellable posts, etc.). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks generally LGTM but a few small bugs/questions. Thank you!
/wait
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
@mattklein123 I fixed the test. Turns out the lookup in the test did not find any cache entries so |
Great! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Commit Message:
Added support to out-of-thread (async) implementations of HttpCache.
Signed-off-by: Yosry Ahmed yosryahmed@google.com
Additional Description:
Risk Level: Low
Testing: Updated existing tests
Docs Changes: N/A
Release Notes: N/A