
Speedup ContextInstances if there are null instances #38761

Merged
1 commit merged into quarkusio:main on Feb 20, 2024

Conversation

franz1981
Contributor

This is an alternative to #38737 that needs no cutoff values, always performs fast-path unlocked checks where possible, and allocates the ReentrantLock lazily.

In order to reduce the stack map, it introduces smaller methods to remove/lazily allocate locks, which are reused whenever possible.
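
For illustration only, here is a minimal sketch of the fast-path/lazy-lock pattern described above; the class and method names are hypothetical and do not reflect the actual generated ContextInstances code:

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Illustrative sketch: a single slot with an unlocked fast path for reads and a
// ReentrantLock that is allocated lazily on the first slow-path access and reused afterwards.
final class LazyLockedSlot<T> {

    private volatile T value;                                                   // fast-path reads never lock
    private final AtomicReference<ReentrantLock> lockRef = new AtomicReference<>();

    T get() {
        return value;                                                           // unlocked fast path
    }

    T computeIfAbsent(Supplier<T> supplier) {
        T existing = value;                                                     // unlocked check first
        if (existing != null) {
            return existing;
        }
        ReentrantLock lock = lazyLock();
        lock.lock();
        try {
            existing = value;                                                   // re-check under the lock
            if (existing == null) {
                existing = supplier.get();
                value = existing;
            }
            return existing;
        } finally {
            lock.unlock();
        }
    }

    // Smaller helper that lazily allocates the lock and reuses whichever instance wins the race.
    private ReentrantLock lazyLock() {
        ReentrantLock lock = lockRef.get();
        if (lock == null) {
            lock = new ReentrantLock();
            if (!lockRef.compareAndSet(null, lock)) {
                lock = lockRef.get();                                           // another thread installed it first
            }
        }
        return lock;
    }
}
```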

@franz1981
Contributor Author

@Ladicek @mkouba

@quarkus-bot quarkus-bot bot added the area/arc Issue related to ARC (dependency injection) label Feb 13, 2024
@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch 2 times, most recently from fb75964 to 46da21d Compare February 13, 2024 15:47
@franz1981 franz1981 marked this pull request as ready for review February 13, 2024 15:50
@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from 46da21d to a25c127 Compare February 13, 2024 15:55
@franz1981
Contributor Author

@mkouba I've added 92bc2e5 and updated the Java code snippet to explain what it is doing: it shouldn't be a big deal, but it seems more correct to return only the value that was read, instead of performing an unguarded volatile read there.
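
As a hedged illustration of that point (the class and field names below are made up, not the PR's actual snippet): returning the local that was already read avoids a second, unguarded volatile read that might observe a different value than the one just checked.

```java
final class VolatileReadExample<T> {

    private volatile T value;

    // Second, unguarded volatile read: the returned value may differ from the one
    // that was just null-checked if a concurrent writer changes the field in between.
    T returnFieldAgain() {
        if (value != null) {
            return value;
        }
        return null;
    }

    // Single volatile read: the caller gets exactly the value that was checked.
    T returnValueRead() {
        T read = value;
        return read;
    }
}
```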

@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from 92bc2e5 to 8aa4ea2 Compare February 13, 2024 17:43


@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from 8aa4ea2 to be0d26d Compare February 14, 2024 12:22


Contributor

@mkouba mkouba left a comment


I've added a few comments but it looks really good!

I will try to run our microbenchmarks to see if it makes any difference...

@franz1981
Contributor Author

Thanks @mkouba, yep. In case the micro doesn't show any impact, I suggest adding one where the context instance is allocated on the fly, has several instance fields, uses a parameter to decide how many of them to compute, and includes a remove at the end of the usage life-cycle; having such a benchmark will help capture the actual possible behaviours of request scoped beans, which seem to be allocated, computed, and removed in the hot path.

@mkouba
Contributor

mkouba commented Feb 15, 2024

So I ran the microbenchmarks and the only benchmark that shows a significant improvement is the RequestContextActivationBenchmark, which merely tests request context activation/deactivation with no bean operations; that's expected due to the lazy init of the Lock. The throughput is doubled!

The ContextProviderBenchmark that tests ArcContextProvider (SmallRye Context Propagation integration) shows a decent improvement too (because it does not do much more than RequestContextActivationBenchmark). So this PR could slightly improve the performance in use cases where SR Context Propagation is used extensively.

However, the RequestContextBenchmark, which activates the context, invokes client proxies of 5 @RequestScoped beans and finally terminates the context, shows no significant change. And I think that it's expected too because this PR does not improve the ContextInstances#computeIfAbsent() method which is critical for this benchmark. @franz1981 is the use case you describe in the previous comment different?

@franz1981
Contributor Author

franz1981 commented Feb 15, 2024

@mkouba

@franz1981 is the use case you describe in the previous comment different?

Exactly, and I'm happy that you had already crafted other benchmarks which capture the use case of the changes sent here (which is fairly realistic for the request scoped case). Thanks again for checking!

RequestContextBenchmark could probably have been more relevant if the benchmark had a parameter to decide how many of these beans are computed, e.g. for a total of 5 beans, just 1 (or 2, 3, 4, 5) is computed. It would impact the removal as well, given that this PR removes only what was actually computed.

Just for reference

[call-stack screenshot]

this was the call-stack of a compute case where 1 of 5 beans was computed (hence saving 3 * 4 = 12 atomic volatile ops in the hot path)

@mkouba
Contributor

mkouba commented Feb 16, 2024

Probably RequestContextBenchmark, could have been relevant if the benchmark had a parameter to decide how many of these beans are computed eg for a total of 5 beans, just 1(,2,3,4,5) is computed. It will impact the removal as well, given that this pr remove only what is actually computed, too.

So I've tried to adapt the RequestContextBenchmark in a way that:

  1. The app contains 10 request scoped beans
  2. We always test 15 client proxy invocations but in three variants:
    1. 1 of the 10 beans is used (i.e. instantiated in ContextInstances.computeIfAbsent()),
    2. 3 of the 10 beans are used, and
    3. 5 of the 10 beans are used.

And this PR indeed increased the throughput in the first scenario (1/10) by ~ 40%, in the second (3/10) by ~ 15% and the results of the third are more or less the same.

In other words, the more request scoped beans exist in the app and the fewer beans are actually used in the benchmark, the better the throughput with this PR. And since it's very likely that only a fraction of all request scoped beans is used in a single request of a "real world" app, I think that it's really a great improvement. Good job!
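
For illustration, a toy JMH sketch of that benchmark shape (this is not the actual Quarkus RequestContextBenchmark; the class and the stand-in "bean" slots are purely illustrative): 10 slots stand in for request scoped beans, 15 accesses per "request", and a parameter controls how many distinct slots are actually populated.

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Toy reconstruction of the adapted benchmark shape, not the real RequestContextBenchmark.
@State(Scope.Thread)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class RequestContextShapeBenchmark {

    private static final int BEANS = 10;        // request scoped beans in the app
    private static final int INVOCATIONS = 15;  // client proxy invocations per request

    @Param({"1", "3", "5"})
    int used;                                   // how many distinct beans get instantiated

    @Benchmark
    public int request() {
        // "Activate" the request context: one empty slot per known bean.
        Object[] instances = new Object[BEANS];
        int sum = 0;
        for (int i = 0; i < INVOCATIONS; i++) {
            int bean = i % used;
            // computeIfAbsent-like access: only `used` slots are ever populated.
            if (instances[bean] == null) {
                instances[bean] = new Object();
            }
            sum += instances[bean].hashCode();
        }
        // "Terminate" the context: only populated slots would need destroy/removal work.
        for (int i = 0; i < BEANS; i++) {
            instances[i] = null;
        }
        return sum;
    }
}
```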

@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from be0d26d to 0608ddd Compare February 16, 2024 09:35
@franz1981
Contributor Author

franz1981 commented Feb 16, 2024

I'm adding a note here to help our future selves; in order to "fix" the existing impl and this PR in the case where a compute is started after a remove (they don't even need to be concurrent, and regardless of whether the remove found any value to remove before):

  1. compute always has to check for any existing invalidation before starting
  2. after the compute has happened, in case it has stored any value, it should check the invalidation status again, and if it is set, it should loop removing until everything is null

The last loop is necessary because other concurrent computes can end up "recomputing" the value again, so it's important for them to keep trying to remove the value until it is stably null.

This mechanism prevents @PreDestroy from NOT being called when the remove reads a null volatile instance value and the compute sets it right afterwards, without cleaning it up.

@mkouba @Ladicek
I know it's crazy, but I can easily forget this concurrent stuff, so let me brain dump it here.
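
A minimal sketch of that mechanism, with purely illustrative names (this is not the actual ContextInstances implementation), assuming a single slot guarded by an invalidation flag:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch of "check invalidation before and after compute".
final class InvalidatableSlot<T> {

    private final AtomicReference<T> instance = new AtomicReference<>();
    private final AtomicBoolean invalidated = new AtomicBoolean();

    T computeIfAbsent(Supplier<T> supplier) {
        // 1. never start a compute against an already invalidated context
        if (invalidated.get()) {
            return null;
        }
        T existing = instance.get();
        if (existing != null) {
            return existing;
        }
        T created = supplier.get();
        if (!instance.compareAndSet(null, created)) {
            return instance.get();                 // another compute won the race
        }
        // 2. re-check invalidation: a remove may have run while we were computing
        if (invalidated.get()) {
            // 3. keep removing until the slot is stably null, because other in-flight
            //    computes may briefly re-install a value
            T leaked;
            while ((leaked = instance.getAndSet(null)) != null) {
                destroy(leaked);                   // so the @PreDestroy-style cleanup still happens
            }
            return null;
        }
        return created;
    }

    void invalidate() {
        invalidated.set(true);
        T existing = instance.getAndSet(null);
        if (existing != null) {
            destroy(existing);
        }
    }

    private void destroy(T value) {
        // placeholder for the container's destroy/@PreDestroy callback
    }
}
```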


@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from 0608ddd to c256a0c Compare February 16, 2024 13:48


@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from c256a0c to a86cddb Compare February 17, 2024 10:23
@franz1981
Contributor Author

Ok @geoand, this should be a very good one, once merged. I just need to make sure the test failures are all flaky ones.


@franz1981
Contributor Author

https://github.com/franz1981/quarkus/actions/runs/7941041918 seems to show that the failure reported here is real or another flaky test

@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from a86cddb to ab94de7 Compare February 17, 2024 17:09


@geoand
Contributor

geoand commented Feb 19, 2024

@mkouba this looks good, WDYT?


@franz1981 franz1981 force-pushed the main_lazy_ctx_instances branch from b0b28a4 to 9905b94 Compare February 20, 2024 08:55
@franz1981
Contributor Author

Done @mkouba and @geoand, waiting for the CI again, although I've just changed a comment and rebased.

@mkouba mkouba added the triage/waiting-for-ci Ready to merge when CI successfully finishes label Feb 20, 2024
Contributor

@Ladicek Ladicek left a comment


getAllPresent() will no longer return a consistent snapshot, but that's not guaranteed for ComputingCacheContextInstances either, so it should be fine. Nice work!


quarkus-bot bot commented Feb 20, 2024

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit 9905b94.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

@geoand geoand merged commit da7f89a into quarkusio:main Feb 20, 2024
49 checks passed
@quarkus-bot quarkus-bot bot added this to the 3.9 - main milestone Feb 20, 2024
@quarkus-bot quarkus-bot bot removed the triage/waiting-for-ci Ready to merge when CI successfully finishes label Feb 20, 2024
Labels
area/arc Issue related to ARC (dependency injection), triage/flaky-test
4 participants