Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix concurrency bug in TransactionScoped beans initialization #29159

Merged
merged 1 commit into from
Nov 24, 2022

Conversation

imperatorx
Copy link
Contributor

@imperatorx imperatorx commented Nov 9, 2022

Fix #29157

Introduced locking to prevent double, triple etc. initialization of TransactionScoped beans when called from different threads concurrently (think virtual threads and StucturedTaskScope forks).

ReentrantLock is used instead of synchronized blocks to make the locking virtual thread friendly. The computeIfAbsent usage of the ConcurrentHashMap in TransactionContextState would also be problematic (uses synchronized blocks internally) when bean creation does IO, like establishing a TCP connection, thus pinning the virtual threads carrier thread.

@quarkus-bot quarkus-bot bot added the area/narayana Transactions / Narayana label Nov 9, 2022
@gastaldi
Copy link
Contributor

/cc @mmusgrov

@gsmet
Copy link
Member

gsmet commented Nov 10, 2022

/cc @Sanne too

Copy link
Contributor

@mmusgrov mmusgrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look ok but it'd be good if someone with CDI experience (mine is in transactions) could take a look, maybe @mkouba or @manovotn .

But we need a test for this because the MP CP TCK @Ignore`s the test [1] for the feature (the SmallRye implementation did pass the test before it was ignored). Can a similar test be added to this PR.

[1] https://github.com/eclipse/microprofile-context-propagation/blob/master/tck/src/main/java/org/eclipse/microprofile/context/tck/cdi/JTACDITest.java#L250

@mmusgrov
Copy link
Contributor

Do we have a feel for the performance overhead added by the extra locking? Is the performance hit only on the first request for a contextual instance (jakarta.enterprise.context.spi.Context.get(...))?

@manovotn
Copy link
Contributor

Do we have a feel for the performance overhead added by the extra locking? Is the performance hit only on the first request for a contextual instance (jakarta.enterprise.context.spi.Context.get(...))?

As per CDI documentation, Context#get() is called whenever you need to obtain a bean instance. It either returns an existing one, or creates a new one.
How and when you choose to perform locking is up to the context implementation. Obviously, you ideally want to only lock when you find out there is no instance and you are about to create a new one store it into context before returning it.

@imperatorx
Copy link
Contributor Author

imperatorx commented Nov 22, 2022

Do we have a feel for the performance overhead added by the extra locking? Is the performance hit only on the first request for a contextual instance (jakarta.enterprise.context.spi.Context.get(...))?

As the PR uses double-checked locking for both the TransactionContextState and the instance inside the TransactionContextState, only the first invocation will hit the locks, subsequent invocations will use the same path as before.

Slight correction: the first lock (TransactionContextState) will only lock once per transaction if uncontested, the second once per bean in uncontested.

@Sanne
Copy link
Member

Sanne commented Nov 22, 2022

Hi @imperatorx , thanks - it looks good at high level but there's some details I'd like to clarify.

You seem to have wrapped within the lock also the acces to the TransactionSynchronizationRegistry (the invocation of transactionSynchronizationRegistry.get(), which is a LazyValue.

LazyValue already provides concurrent initialization safeguards - but it's relying on synchronization.
Is that something that should be fixed as well? There's more access to it beyond the block you've just protected; in general I wonder how any of this code could work safely with virtual threads.

@imperatorx
Copy link
Contributor Author

I updated the PR:

  • Added a test
  • Removed the LazyValue.get() from the synchronized block as @Sanne suggested.
  • Removed the double-checked locking from the first lock, since uncontended locking (single threaded regular use case) uses a simple compare-and-set operation (see NonfairSync.initialTryLock in ReentrantLock), but the two getResource calls do a lot of things. So IMHO one compare-and-set is better performance-wise than an additional getResource call (calls TransactionManager.getTransaction(), accessess a synhronized hashmap, etc.)

As for refactoring LazyValue: Syncronized blocks work well with virtual threads if there is no blocking done inside them (IO, native calls, sleep, etc.), and the TransactionContext use case has none of it, it just returns a new instance. If other LazyValue use cases have blocking work in the creation part of the new instance, the synhronized block should be changed to a ReentrantLock.

@Sanne
Copy link
Member

Sanne commented Nov 22, 2022

Excellent, thanks 👍 and I agree with the decision regarding LazyValue; I asked because in the previous revision of the patch is seemed like you intentetionally included that in the critical section.

Could you squash the changes together and avoid the import javax.transaction.*; please? It's a very minor nitpick, but we prefer listing all individual imports.

Copy link
Member

@Sanne Sanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested two minor changes, however even in current state I think it's good.

@quarkus-bot

This comment has been minimized.

@Sanne Sanne added the triage/waiting-for-ci Ready to merge when CI successfully finishes label Nov 22, 2022
Copy link
Contributor

@mmusgrov mmusgrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test and for answering my performance question.

@manovotn
Copy link
Contributor

@manovotn
Copy link
Contributor

@mmusgrov Also how does this solution link into the discussion in #29157 (comment)?
Does Narayna support that or not? This PR makes sure there is no bean creation race but IMO doesn't answer the MP CP question and we still lack test that would verify it either way.

@imperatorx
Copy link
Contributor Author

imperatorx commented Nov 24, 2022

@mmusgrov Also how does this solution link into the discussion in #29157 (comment)? Does Narayna support that or not? This PR makes sure there is no bean creation race but IMO doesn't answer the MP CP question and we still lack test that would verify it either way.

No, sadly Narayana does not fully support this workload: in case an exception is thrown inside a child virtual thread, the unwinding of the context propagation causes the transaction on the parent thread to get deactivated (or rather not re-activated). I'll look into that if I have time, the culprit seems to be https://github.com/smallrye/smallrye-context-propagation/blob/main/jta/src/main/java/io/smallrye/context/jta/context/propagation/JtaContextProvider.java#L65

@mmusgrov
Copy link
Contributor

mmusgrov commented Nov 24, 2022

@mmusgrov FYI, the very same problem is in Narayana codebase, see https://github.com/jbosstm/narayana/blob/master/ArjunaJTA/cdi/classes/com/arjuna/ats/jta/cdi/TransactionContext.java

That code was added to support the wildfly-mp-reactive-feature-pack (https://github.com/wildfly-extras/wildfly-mp-reactive-feature-pack) which we did at the request of the WildFly team. I'll check with Kabir to see if there is any mechanism to get some concurrency testing (with respect to transaction propagation) added to the feature pack.

And, if that isn't possible then I'll investigate if it's feasible to add the test to the narayana testsuite.

@manovotn
Copy link
Contributor

That code was added to support the wildfly-mp-reactive-feature-pack (https://github.com/wildfly-extras/wildfly-mp-reactive-feature-pack) which we did at the request of the WildFly team.

The Context impl class must have been present (in one form or another) since JTA wanted to integrate with CDI and have their own custom @TransactionalScoped.

@mmusgrov
Copy link
Contributor

@manovotn Yes but isn't this bug only present when used with MP CP with parallel threads sharing the same transaction and we aren't testing that in narayana and the code was requested by WildFly as an incubator for reactive messaging, reactive streams operators and context propagation and, it turns out, is not used after all by WildFly, I'm checking to see who else might use it.

So if nobody is using it then I'd prefer to either send users to the Quarkus JTA extension for the functionality or ask them to raise an RFE.

And if nobody is using the feature in narayana and we remove it would we still need to protect the Context (I don't see any other code protecting the CDI Context in this way).

But that said, I will still look into how easy it would be for us to add the testing of MP CP to our own testsuite even if there is every chance it never get used!!

@gsmet gsmet merged commit 701b620 into quarkusio:main Nov 24, 2022
@quarkus-bot quarkus-bot bot added this to the 2.15 - main milestone Nov 24, 2022
@quarkus-bot quarkus-bot bot added kind/bugfix and removed triage/waiting-for-ci Ready to merge when CI successfully finishes labels Nov 24, 2022
@gsmet gsmet changed the title Fix concurrency bug Fix concurrency bug in TransactionScoped beans initialization Nov 24, 2022
@manovotn
Copy link
Contributor

I was simply pointing out that it might make sense to keep the TransactionalContext code/behavior identical (or as identical as possible) between both codebases. Whether you do it is ultimately up to you :)

@mmusgrov
Copy link
Contributor

Ok it may make sense to keep them aligned and thanks for pointing it out. I will report back if/when we decide to include SmallRye testing in the Naryana CDI module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/narayana Transactions / Narayana kind/bugfix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make @TransactionScoped thread safe (virtual and platform threads)
6 participants