-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup ContextInstances if there are null instances #38761
Conversation
fb75964
to
46da21d
Compare
46da21d
to
a25c127
Compare
...projects/arc/processor/src/main/java/io/quarkus/arc/processor/ContextInstancesGenerator.java
Show resolved
Hide resolved
92bc2e5
to
8aa4ea2
Compare
This comment has been minimized.
This comment has been minimized.
8aa4ea2
to
be0d26d
Compare
This comment has been minimized.
This comment has been minimized.
...projects/arc/processor/src/main/java/io/quarkus/arc/processor/ContextInstancesGenerator.java
Outdated
Show resolved
Hide resolved
...projects/arc/processor/src/main/java/io/quarkus/arc/processor/ContextInstancesGenerator.java
Show resolved
Hide resolved
...projects/arc/processor/src/main/java/io/quarkus/arc/processor/ContextInstancesGenerator.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added few comments but it looks really good!
I will try to run our microbenchmarks to see if it makes any difference...
Thanks @mkouba yep, in case the micro doesn't impact, I suggest to perform one where the context instance is allocated on the fly and have different instance fields while using a parameter to decide how many to compute, and including a remove at the end of the usage life-cycle ; having such benchmark will help capturing the actual possible behaviours for request scoped beans, which seems to be allocated, computed, removed in the hot path |
So I ran the microbenchmarks and the only benchmark that shows significant improvement is the The However, the RequestContextBenchmark, which activates the context, invokes client proxies of 5 |
Exactly and happy that you already had crafted other benchmarks which capture the use case (which is farly realistic for the request scoped case) of the changes sent here, thanks again for checking!!!! Probably RequestContextBenchmark, could have been relevant if the benchmark had a parameter to decide how many of these beans are computed eg for a total of 5 beans, just 1(,2,3,4,5) is computed. It will impact the removal as well, given that this pr remove only what is actually computed, too. Just for reference this was the call-stack of a compute case with 1/5 beans were computed (hence saving 3 * 4 = 12 atomic volatile ops in the hot path) |
So I've tried to adapt the
And this PR indeed increased the throughput in the first scenario (1/10) by ~ 40%, in the second (3/10) by ~ 15% and the results of the third are more or less the same. In other words, the more request scoped beans exist in the app and the less beans are actually used in the benchmark the better throughput with this PR. And since it's very likely that only a fraction of all request scoped beans is used in a single request of a "real world" app I think that it's really a great improvement. Good job! |
be0d26d
to
0608ddd
Compare
I'm adding a note here to help ourselves of the future; in order to "fix" the existing impl and this PR in the case where a compute is started after (no need to be concurrent really) it (regardless the remove has found any value to remove, before):
The last loop is necessary because other(s) concurrent compute can end up "recomputing" the value again, so it's important for them to keep on trying to remove the value till is stable as This mechanism is to prevent @mkouba @Ladicek |
This comment has been minimized.
This comment has been minimized.
0608ddd
to
c256a0c
Compare
This comment has been minimized.
This comment has been minimized.
c256a0c
to
a86cddb
Compare
Ok @geoand this should be a very good one, once merged. Just need to make sure the test failures are all flaky ones |
This comment has been minimized.
This comment has been minimized.
https://github.com/franz1981/quarkus/actions/runs/7941041918 seems to show that the failure reported here is real or another flaky test |
a86cddb
to
ab94de7
Compare
This comment has been minimized.
This comment has been minimized.
ab94de7
to
b0b28a4
Compare
@mkouba this looks good, WDYT? |
This comment has been minimized.
This comment has been minimized.
b0b28a4
to
9905b94
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getAllPresent()
will no longer return a consistent snapshot, but that's not guaranteed for ComputingCacheContextInstances
either, so should be fine. Nice work!
Status for workflow
|
This is an alternative to #38737 with no need of cutoff values and always performing fast-path unlocked checks, if possible, and lazy allocation of a
ReentrantLock
.In order to reduce the stack-map it has introduced smaller methods to remove/lazy allocate locks which are reused whenever possible