Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use context cache in most queue handlers to avoid warning logs #23218

Closed
wants to merge 1 commit into from
Closed

Use context cache in most queue handlers to avoid warning logs #23218

wants to merge 1 commit into from

Conversation

wolfogre
Copy link
Member

@wolfogre wolfogre commented Mar 1, 2023

Follow: #23186 #23054

Related to: #22294

The point is that the context cache is not only "HTTP request level cache", but also "queue handling level cache". And there are so many queues...

@wolfogre wolfogre added the type/enhancement An improvement of existing functionality label Mar 1, 2023
@wolfogre wolfogre added this to the 1.20.0 milestone Mar 1, 2023
@wolfogre wolfogre requested a review from lunny March 1, 2023 10:47
@wxiaoguang
Copy link
Contributor

Is it correct to use cache for all these background tasks?

If yes, that's fine.

If no (eg: cache items get out-of-sync and cause bugs?), I think it's worth to add a cache.WithNoCacheContext() to explicitly declare that this context doesn't support cache, then the cache code could skip the Warn log if it detects that the context doesn't support cache by purpose.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 1, 2023
@wolfogre
Copy link
Member Author

wolfogre commented Mar 2, 2023

Is it correct to use cache for all these background tasks?

TBH, I'm not entirely sure. It's correct to do this only if all queue handling tasks should take only a short time(less than 5 minutes), so it's fine when the cache gets out-of-sync. But I can't prove it or disprove it.

... add a cache.WithNoCacheContext() ...

Yeah, I believe it can resolve the warning logs. But there will be three ways to use a context:

  • ctx = cache.WithCacheContext(ctx)
  • ctx = cache.WithNoCacheContext(ctx)
  • just ctx

What's the difference? And what if cache.WithCacheContext(cache.WithNoCacheContext(ctx)) or cache.WithNoCacheContext(cache.WithCacheContext(ctx))? 😂

Of course, we can define those cases, but I think it would be quite confusing.


So I think the point is not how to avoid warning logs, but that the context cache is designed for "short life cycle tasks", like handling a HTTP request, and (maybe) handling a task from queue, but developers could use it for "long life cycle tasks" by accident. IMO, five minutes is long enough.

There are some ideas for it:

  • Remove the warning logs, ctx means "no cache", so we don't need cache.WithNoCacheContext. And use cache.WithCacheContext for handling HTTP requests only. Then add the cache for tasks which we are sure are worth it and have a short life cycle.
  • Refactor queue.HandlerFunc from type HandlerFunc func(...Data) (unhandled []Data) to type HandlerFunc func(ctx context.Context, ...Data) (unhandled []Data), so it will be easy to enable or disable cache context for the handlers before invoking them.
  • Add a lifetime to the cache context, like 5 minutes. If the cache data is still being read after 5 minutes after it was created, skip using the cache data and print warning logs, and it's also a good idea to log stack trace to identify where it is being misused.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Mar 2, 2023

I think we should explicitly declare a context as "cache-support" or "no-cache".

Think about a case (your confusing part):

ctx := cache.WithCache(...)

foo(ctx)

func foo(ctx) {
    // here, what if `foo` want to bypass the cache?
    // if we have "cache.WithNoCache", we can overwrite the "WithCache"
    // by design, Go's context only use "Value" to access it's context data, so IMO it's overwrite-able.
    ctx = cache.WithNoCache(ctx)
}

As a framework mechanism, I think it's always worth to make everything explicitly clear, instead of using implicit behaviors.

@wxiaoguang
Copy link
Contributor

I have the same idea with "Add a lifetime to the cache context, like 5 minutes" , while I think 5-min is still too long. IMO 5sec is long enough.

@lunny
Copy link
Member

lunny commented Mar 2, 2023

Is it correct to use cache for all these background tasks?

TBH, I'm not entirely sure. It's correct to do this only if all queue handling tasks should take only a short time(less than 5 minutes), so it's fine when the cache gets out-of-sync. But I can't prove it or disprove it.

... add a cache.WithNoCacheContext() ...

Yeah, I believe it can resolve the warning logs. But there will be three ways to use a context:

* `ctx = cache.WithCacheContext(ctx)`

* `ctx = cache.WithNoCacheContext(ctx)`

* just `ctx`

What's the difference? And what if cache.WithCacheContext(cache.WithNoCacheContext(ctx)) or cache.WithNoCacheContext(cache.WithCacheContext(ctx))? 😂

Of course, we can define those cases, but I think it would be quite confusing.

So I think the point is not how to avoid warning logs, but that the context cache is designed for "short life cycle tasks", like handling a HTTP request, and (maybe) handling a task from queue, but developers could use it for "long life cycle tasks" by accident. IMO, five minutes is long enough.

There are some ideas for it:

* Remove the warnlogs logs, `ctx` means "no cache", so we don't need `cache.WithNoCacheContext`. And use `cache.WithCacheContext` for handling HTTP requests only. Then add the cache for tasks which we are sure are worth it and have a short life cycle.

* Refactor `queue.HandlerFunc` from `type HandlerFunc func(...Data) (unhandled []Data)` to `type HandlerFunc func(ctx context.Context, ...Data) (unhandled []Data)`, so it will be easy to enable or disable cache context for the handlers before invoking them.

* Add a lifetime to the cache context, like 5 minutes. If the cache data is still being read after 5 minutes after it was created, skip using the cache data and print warning logs, and it's also a good idea to log stack trace to identify where it is being misused.

Or we can just remove the warning log?

@wxiaoguang
Copy link
Contributor

Or we can just remove the warning log?

I do not think so, that log really helps us to think about these problems.

I always prefer to make framework mechanisms have stable/explicit behaviors, to avoid surprises and bugs. Many legacy bugs/problems in Gitea are caused by unclear/misued behaviors.

But if most people like the implicit behaviors and think that no need to distinguish a context with/without cache support, I do not have strong objection at the moment - until some new bug comes.

@wolfogre wolfogre added the pr/wip This PR is not ready for review label Mar 6, 2023
@wolfogre wolfogre mentioned this pull request Mar 6, 2023
@wolfogre wolfogre closed this Mar 6, 2023
@lunny lunny removed this from the 1.20.0 milestone Mar 6, 2023
jolheiser pushed a commit that referenced this pull request Mar 8, 2023
Related to: #22294 #23186 #23054

Replace: #23218

Some discussion is in the comments of #23218.

Highlights:
- Add Expiration for cache context. If a cache context has been used for
more than 10s, the cache data will be ignored, and warning logs will be
printed.
- Add `discard` field to `cacheContext`, a `cacheContext` with `discard`
true will drop all cached data and won't store any new one.
- Introduce `WithNoCacheContext`, if one wants to run long-life tasks,
but the parent context is a cache context,
`WithNoCacheContext(perentCtx)` will discard the cache data, so it will
be safe to keep the context for a long time.
- It will be fine to treat an original context as a cache context, like
`GetContextData(context.Backgraud())`, no warning logs will be printed.

Some cases about nesting:

When:
- *A*, *B* or *C* means a cache context.
- ~*A*~, ~*B*~ or ~*C*~ means a discard cache context.
- `ctx` means `context.Backgrand()`
- *A(ctx)* means a cache context with `ctx` as the parent context.
- *B(A(ctx))* means a cache context with `A(ctx)` as the parent context.
- `With` means `WithCacheContext`
- `WithNo` means `WithNoCacheContext`

So:
- `With(ctx)` -> *A(ctx)*
- `With(With(ctx))` -> *A(ctx)*, not *B(A(ctx))*
- `With(With(With(ctx)))` -> *A(ctx)*, not *C(B(A(ctx)))*
- `WithNo(ctx)` -> *ctx*, not *~A~(ctx)*
- `WithNo(With(ctx))` -> *~A~(ctx)*
- `WithNo(WithNo(With(ctx)))` -> *~A~(ctx)*, not *~B~(~A~(ctx))*
- `With(WithNo(With(ctx)))` -> *B(~A~(ctx))*
- `WithNo(With(WithNo(With(ctx))))` -> *~B~(~A~(ctx))*
- `With(WithNo(With(WithNo(With(ctx)))))` -> *C(~B~(~A~(ctx)))*
@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. pr/wip This PR is not ready for review type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants