Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved efficiency of ConnectionHandlingMode: deeper integration between Hibernate ORM and Agroal #7242

Closed
Sanne opened this issue Feb 17, 2020 · 21 comments · Fixed by #14305
Assignees
Labels
area/hibernate-orm Hibernate ORM kind/enhancement New feature or request
Milestone

Comments

@Sanne
Copy link
Member

Sanne commented Feb 17, 2020

After having switched the Hibernate ORM's PhysicalConnectionHandlingMode to DELAYED_ACQUISITION_AND_RELEASE_AFTER_TRANSACTION as I believed it would have been a more efficient choice, I had to revert it as short term fix for #6770 . This is to track the propert fix.

The problem
Hibernate ORM's is assuming it deals with the "physical" connection, as the name of the property suggests, but when there is a running transaction Agroal doesn't return the raw connection but a wrapper.
At transaction commit, Agroal verifies that all wrappers which depend on the currenty physical connection have been closed first - otherwise it's assuming one has "escaped" the scope of its indended lifespan and it logs a warning about a possible leak.

At high level, assuming a JTA environment (like Quarkus normally is):

  1. a JTA transaction is started
  2. a connection is acquired from the pool, so it needs to be enlisted in the existing TX
  3. connection is being used by the application (Hibernate ORM in this case)
  4. TX is committed

In such a scenario, if Hibernate's EntityManager [or Session] isn't closed before the TX is committed, the connection wrapper will still be open - hence the warning log.
A simple way forward is to expect the EntityManager to have been closed before commit, but this is counter-intuitive if we allow code looking like:

    EntityManager em = emf.createEntityManager();
    EntityTransaction transaction = em.getTransaction();
    transaction.begin();
    em.createNativeQuery("Delete from Person").executeUpdate();
    transaction.commit();
    em.close();

One problem of such code is that we seem to allow em.getTransaction(), which is normally not allowed in JTA (it's supposed to throw an IllegalStateException) as one should either let the container manage transactions, or explicitly request a reference to a UserTransaction. Clearly a benefit of letting the container handle such details is that it can control the lifecycle of the various parties and cleanup stuff in some controlled order.

Question

What should EntityManager.getTransaction() do ? Is the current semantics reasonable?

N.B. if I understood the Agroal experts correctly, a single Agroal connection pool is either configured to integrate with JTA or not: if JTA integration is enabled, it will require a running transaction for any new connection request.

Also Jesper reminds: "we can't support deferred enlistement, like JCA can"

Some Ideas

Some notes from previous brainstorming:

Implement a new DELAYED_ACQUISITION_AND_RELEASE_BEFORE_TRANSACTION

We could have Hibernate ORM release the connection just before the commit.

@barreiro already created a branch: hibernate/hibernate-orm@master...barreiro:before-completion

This seems simple and would be rather effective, but @jesperpedersen is warnings us against the use of interposed Synchronization(s); in WildFly for example they had to guarantee ordering to avoid some more complex problems with them.

[Q: is the use of Synchronization really a problem? ORM is already registering one, it seems Luis's code is merely using the existing synchronization]

Allow Hibernate ORM to bypass Agroal's safety net.

I'm not suggesting we should, but it would be interesting to consider this: both Agroal and Hibernate ORM seem to implement strict measures against leaking of resources. For sake of performance we should allow the two components to trust each other, and avoid duplicating checks.

It's tempting to have ORM access the real "physical" connection and:

  • bypass the wrapper creation
  • bypass the resource tracking of statements provided by Agroal
  • as a consequence, no warnings about the leaked wrappers would be logged.

[I'm aware that resource tracking of statements can be disabled - but if we disable it on the pool, then it's also disabled for user code using the pool directly, which is likely not a good idea]

I suspect the strongest argument against this would be that Agroal itself would need to expose an unsafe API. Maybe not my best idea? But it would be nice to allow a "friend framework" some deeper integration for sake of efficiency, as it would allow for more optimisations than just solve this specific issue.

For example, both Agroal and Hibernate ORM perform #clearWarnings on the connection before returning it - I noticed it's actually not such a cheap operation on some drivers, even when there's no warnings to clear.

Have Agroal change design?

Agroal has chosen to implement this "Connection Wrapper" approach as a way to model the fact that each resource in an XA transaction needs to have reached the "end state" for the operations on each resource before the commit is allowed to proceed.

However, the usage of Hibernate ORM is not illegal per se: it is in a de-facto "end state" as it will no longer issue any statement on the connection - it simply hasn't really closed it yet.

This makes me think that perhaps the Connection Wrapper design in Agroal is too strict: there might be other ways to model the end state? For example, the wrapper could simply disallow any further use of the underlying connection rather than expecting it to have been closed already.

But first

The ideas above all seem viable, and are not mutually exclusive. Personally I think I prefer @barreiro 's proposal to introduce a new before-commit release mode in ORM - and yet we could also optimise the Agroal/ORM integration by removing some more redundant checks.

And yet.. we probably need to consider first what are the implications of allowing people to use EntityManager.getTransaction() : honestly that's confusing me a bit.

@nimo23
Copy link
Contributor

nimo23 commented Feb 17, 2020

we probably need to consider first what are the implications of allowing people to use EntityManager.getTransaction()

What are the use cases for the user to bypass JTA?

Maybe distributed transactions, XA transactions, or nested transactions where the user needs fine grained control for each of its transaction. In JEE, we have UserTransaction which is a managed kind of EntityManager.getTransaction(). But for those cases, today, we can also use @Transactional..

@Sanne
Copy link
Member Author

Sanne commented Feb 18, 2020

What are the use cases for the user to bypass JTA?

I don't think we shuld encourage people to bypass JTA, as I can't think of very compelling reasons. My main concern is that right now we seem to allow this, could be tricky to remove it.

@emmanuelbernard
Copy link
Member

Some infos.

In pure JPA spec, only RESOURCE_LOCAL entityManagers are allowed to use the EntityTransaction object. A JTA EM can use @transactional or the UserTransaction to control transaction boundaries. See https://quarkus.io/guides/transaction#api-approach

Is the EntityManager explicitly set to RESOURCE_LOCAL? if not then we are in JTA mode and we have two choices:

  1. we raise an exception and recommend the other approach
  2. we noop the EntityTransaction usage to only check that a Tx was started and mark a Tx for rollback or not. This was the transparent mode that Hibernate used ot have (don't know about today). But at any rate, it requires the JTA transaction to have been started before.

Am I missing a case?

If the EM as set to RESOURCE_LOCAL, we could think of a way for Hibernate to influence Agroal or vice versa in the extensions to ensure this no JTA thing you guys discussed

@emmanuelbernard
Copy link
Member

Reading Sanne's message, looks like I reiterated some points he made. Sorry :)

@jesperpedersen
Copy link

I would suggest to split this into 2 issues.

  1. Enforce JTA transactions
  • Have EM.getTransaction() throw an IllegalStateException in all cases
  • Merge @barreiro branch, and use that as the default
  • Implement Synchronization ordering with the help of @mmusgrov

The last item is to guarantee that the Synchronization instance registered by Hibernate is executed before the Synchronization instance registered by Agroal.

  1. Support RESOURCE_LOCAL deployments

Requirements

  • EM.getTransaction() scoping work with c.setAutoCommit(false), and associated c.commit() / c.rollback()
  • Have a jta=false mode for Agroal that doesn't integrate with Narayana (TransactionIntegration.none())

This is really to support cases following the bean managed transaction scenarios.

@emmanuelbernard
Copy link
Member

Why have bean managed transaction though vs only the UserTransaction approach?

@jesperpedersen
Copy link

In the bean managed transaction scenario the application controls the transaction boundary local to the resources in question. For the Hibernate case - EntityTransaction::begin() would call c.setAutoCommit(false), and EntityTransaction::commit() would call c.commit(); c.close().

For UserTransaction - it is still a JTA transaction, just controlled by the user, so resources will have to be automatically enlisted. Since deferred enlistment isn't supported by Agroal - if any resource manager (MQ, ...) - you will have to make sure that UserTransaction.begin() has been executed before any connection from any resource manager is obtained.

@sebersole
Copy link
Contributor

FWIW adding a BEFORE_COMMIT is worthwhile regardless of the outcome here..

@Sanne
Copy link
Member Author

Sanne commented Feb 18, 2020

FWIW adding a BEFORE_COMMIT is worthwhile regardless of the outcome here..

right, thanks @sebersole let's start with that aspect then; I think I like Jesper's plan, some of the details can be figured out later.

@barreiro could you wrap up your prototype and send an HHH PR ?

@sebersole
Copy link
Contributor

I would suggest to split this into 2 issues.

  1. Enforce JTA transactions
  • Have EM.getTransaction() throw an IllegalStateException in all cases
  • Merge @barreiro branch, and use that as the default
  • Implement Synchronization ordering with the help of @mmusgrov

The last item is to guarantee that the Synchronization instance registered by Hibernate is executed before the Synchronization instance registered by Agroal.

Personally not a fan of "ordered sync" approach. We tried this once with JBoss AS and Hibernate and the JBoss Cache implementation. It was difficult to get right in all situations.

Plus, Quarkus could handle this itself by registering a custom org.hibernate.engine.transaction.jta.platform.spi.JtaPlatform (which it seems like maybe it ought to do anyway based on this discussion) and managing the ordering.

As for the rest, well that depends on whether Quarkus wants/needs to support non-JTA transactions. I can't answer that

@sebersole
Copy link
Contributor

FWIW adding a BEFORE_COMMIT is worthwhile regardless of the outcome here..

right, thanks @sebersole let's start with that aspect then; I think I like Jesper's plan, some of the details can be figured out later.

@barreiro could you wrap up your prototype and send an HHH PR ?

My guess regarding your trouble with DELAYED_ACQUISITION is that its handling not implemented correctly. I know of no one using this, so maybe its worthwhile to verify that first. IMO that is the cleanest (and intended) solution

@sebersole
Copy link
Contributor

@Sanne A lot of what you propose in Allow Hibernate ORM to bypass Agroal's safety net. is interesting because it would seem beneficial also wrt performance. You mentioned one such improvement with #clearWarnings; there are others. Would be interesting to see what @barreiro thinks of that suggestion.

@sebersole
Copy link
Contributor

In the bean managed transaction scenario the application controls the transaction boundary local to the resources in question. For the Hibernate case - EntityTransaction::begin() would call c.setAutoCommit(false), and EntityTransaction::commit() would call c.commit(); c.close().

That's not strictly true - it depends on the configuration. If one has configured JTA then Hibernate will start a Transaction/UserTransaction as opposed to using Connection's auto-commit

For UserTransaction - it is still a JTA transaction, just controlled by the user, so resources will have to be automatically enlisted. Since deferred enlistment isn't supported by Agroal - if any resource manager (MQ, ...) - you will have to make sure that UserTransaction.begin() has been executed before any connection from any resource manager is obtained.

However, this would require DELAYED_ACQUISITION as discussed above to avoid the missing "deferred enlistment" support in Agroal

@emmanuelbernard
Copy link
Member

In the bean managed transaction scenario the application controls the transaction boundary local to the resources in question. For the Hibernate case - EntityTransaction::begin() would call c.setAutoCommit(false), and EntityTransaction::commit() would call c.commit(); c.close().

For UserTransaction - it is still a JTA transaction, just controlled by the user, so resources will have to be automatically enlisted. Since deferred enlistment isn't supported by Agroal - if any resource manager (MQ, ...) - you will have to make sure that UserTransaction.begin() has been executed before any connection from any resource manager is obtained.

Right but what would be the value or use case for a user to use the setAutoCommit(true); in a Quarkus app? I don't see one for now.

@Sanne
Copy link
Member Author

Sanne commented Feb 18, 2020

Regarding the ordered synch:

  • I need to try learn more about it, I kinda suspect we won't actually need it in our specific circumstances - but certainly not planning to ignore Jesper. I'll talk with him about this as a sub-task of this.
  • Quarkus already has its custom implementation of JtaPlatform, so in case we actually need it, we can certainly handle this aspect within the Quarkus codebase without bothering with ORM.

Regarding the plan to support non-JTA transaction: agree with Jesper, let's consider to take the existing undefined support out; we can always plan to reintroduce later.

Regarding DELAYED_ACQUISITION : it seems to be working fine (debugged it). The focus of the present problem is the fact that the connection isn't closed yet at the time in which Agroal commits. So we're opening fine, but closing "too late" - not really an ORM bug IMO as I'd agree it's reasonable to have the connection open still, but it's how Agroal verifies that we're done with all work.

@sebersole
Copy link
Contributor

Regarding DELAYED_ACQUISITION : it seems to be working fine (debugged it). The focus of the present problem is the fact that the connection isn't closed yet at the time in which Agroal commits. So we're opening fine, but closing "too late" - not really an ORM bug IMO as I'd agree it's reasonable to have the connection open still, but it's how Agroal verifies that we're done with all work.

Regarding releasing after the transaction rather than before... Maybe not a bug per-se, but I would not call it the best option. Releasing during the before-commit phase is definitely a better approach imo and should at least be an option as we are discussing here

@Sanne
Copy link
Member Author

Sanne commented Feb 18, 2020

@emmanuelbernard

Right but what would be the value or use case for a user to use the setAutoCommit(true); in a Quarkus app? I don't see one for now.

There's one, which is the transaction-less use case. The techempower benchmark badly needs it :(

@emmanuelbernard
Copy link
Member

@emmanuelbernard

Right but what would be the value or use case for a user to use the setAutoCommit(true); in a Quarkus app? I don't see one for now.

There's one, which is the transaction-less use case. The techempower benchmark badly needs it :(

sorry I mean setAutoCommit(false). true is the default so it works I suppose, does it?

@radcortez
Copy link
Member

@Sanne @barreiro Not sure if this was considered in Agroal design, but I may have found another possible issue with the Connection Wrapper.

If you set https://docs.oracle.com/en/java/javase/11/docs/api/java.sql/java/sql/ResultSet.html#HOLD_CURSORS_OVER_COMMIT, because of the Connection Wrapper, Agroal will not honour the ResultSet type and will just close the connection on commit (like designed) and completely ignore the flag.

@yrodiere
Copy link
Member

yrodiere commented Jan 8, 2021

It's been a while and this could solve other issues, so I'll try to revive this...

From my understanding, switching the PhysicalConnectionHandlingMode to DELAYED_ACQUISITION_AND_RELEASE_BEFORE_TRANSACTION_COMPLETION would solve the problem.

The only remaining concerns are:

  1. Transactions started explicitly by the user (em.getTransaction().begin()) would not work well or would trigger warnings. It's been discussed we should just remove the ability to do that and push users to rely on UserTransaction, and maybe re-introduce resource-local mode later. Seems reasonsable.
  2. There are concerns that Synchronizations would not be executed in the right order, but THB I don't understand how that's related to this issue. Even without changing the PhysicalConnectionHandlingMode, we're already relying on a synchronization on the ORM side to execute org.hibernate.internal.SessionImpl#beforeTransactionCompletion which, crucially, executes the ORM flush before the transaction. So if there's something wrong with Synchronization ordering, there's already a problem and switching the PhysicalConnectionHandlingMode won't change that.

Unless someone disagrees, I'll go ahead and submit a PR that removes support for em.getTransaction().begin() and switches the PhysicalConnectionHandlingMode. I'll also open a ticket to implement Synchronization ordering.

@Sanne
Copy link
Member Author

Sanne commented Jan 8, 2021

@yrodiere yes switching to DELAYED_ACQUISITION_AND_RELEASE_BEFORE_TRANSACTION_COMPLETION is the right plan, but we already tried and there was some problem. I don't remember the details, so it's worth trying again.. but I expect some complex debugging is needed.

@yrodiere yrodiere self-assigned this Jan 14, 2021
yrodiere added a commit to yrodiere/quarkus that referenced this issue Jan 14, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Jan 14, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
This was referenced Jan 14, 2021
yrodiere added a commit to yrodiere/quarkus that referenced this issue Jan 21, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Feb 5, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Feb 10, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Feb 22, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Feb 22, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
yrodiere added a commit to yrodiere/quarkus that referenced this issue Feb 22, 2021
DELAYED_ACQUISITION_AND_RELEASE_AFTER_STATEMENT, on top of being less
efficient, also leads to resource leaks (that are caught by Agroal,
but still).

Signed-off-by: Yoann Rodière <yoann@hibernate.org>
@quarkus-bot quarkus-bot bot added this to the 1.13 - master milestone Feb 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/hibernate-orm Hibernate ORM kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants