You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We ran into an issue that seem to indicate a possible connection leak within the SpringTransactionManager. I do not have a reproducer at the moment, but I believe that simple code analysis does confirm the issue.
The issue
The logic within SpringTransactionManager allows for an unhandled exception to not close a connection properly, essentially leaking it. Namely, the invocation of defaultIsolationLevelhere can fail. If that happens, the connection does not get closed within the scope of the method and there doesn't seem to be any code higher up the call stack that would correctly handle it either.
Real-world scenario
To give you a better understanding of how the issue above can happen, here's a detailed breakdown of how it happened to us.
In our case, the invocation of defaultIsolationLevel failed because of no available connection in the connection pool. We have a scheduled task executor with X worker threads and - coincidentally - the same X number of connections in the connection pool. Upon startup, the scheduler would start execution of X tasks at once, each of which making a DB query (rather simple SELECT) via a @Transactional-annotated Spring @Repository. This means that all the X scheduler worker threads reach the repository method at almost the same time.
One of these (let's call it T1) would then get as far as this:
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128) <------- BLOCK
at org.jetbrains.exposed.sql.Database$Companion$connect$3.invoke(Database.kt:130)
at org.jetbrains.exposed.sql.Database$Companion$connect$3.invoke(Database.kt:130)
at org.jetbrains.exposed.sql.Database$Companion$doConnect$3.invoke(Database.kt:117)
at org.jetbrains.exposed.sql.Database$Companion$doConnect$3.invoke(Database.kt:116)
at org.jetbrains.exposed.sql.Database.metadata$exposed_core(Database.kt:25)
at org.jetbrains.exposed.sql.Database$vendor$2.invoke(Database.kt:38)
at org.jetbrains.exposed.sql.Database$vendor$2.invoke(Database.kt:37)
at kotlin.SynchronizedLazyImpl.getValue(LazyJVM.kt:74) <------- LOCK
at org.jetbrains.exposed.sql.Database.getVendor(Database.kt:37)
at org.jetbrains.exposed.sql.Database$Companion.getDefaultIsolationLevel(Database.kt:188)
at org.jetbrains.exposed.spring.SpringTransactionManager.getDefaultIsolationLevel(SpringTransactionManager.kt:36)
at org.jetbrains.exposed.spring.SpringTransactionManager.initTransaction(SpringTransactionManager.kt:100)
at org.jetbrains.exposed.spring.SpringTransactionManager.doBegin(SpringTransactionManager.kt:47)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManager.java:400)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:373)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:595)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:382)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:119)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:750)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:692)
at <call to our @Transactional-annotated repository method>
...
acquiring a lock on the Database.vendor property (see the line marked LOCK above). This makes all the other threads wait on this monitor, each having already acquired a connection within Spring's Tx tooling.
However, as we have the same number of threads as there is the number of connections, T1 gets blocked on the connection pool as there is no available connection (see the line marked BLOCK above), effectively deadlocking on the connection pool.
Of course, there is a connection timeout set on the pool that should resolve this. But once the timeout kicks in, it throws a timeout exception on T1 from the blocking getConnection call on the pool marked above. That, in turn, throws from within the aforementioned invocation of the defaultIsolationLevel which leaks the connection.
In our case, this was particularly problematic as now we had only X-1 connections in the connection pool, making it impossible to ever recover from such a state.
Summary
While we understand that having the same size of the connection pool as the expected concurrency level may not be the best idea, it does not invalidate the point about the connection leakage.
As a workaround, we simply increased the connection pool size. Even adding a single additional connection resolved the problem, confirming our understanding of the issue.
For a proper fix, I believe a simple try - finally block around this should do.
The text was updated successfully, but these errors were encountered:
Thank you for a detailed overview of the problem. I'll fix it in the next release.
As a workaround, you can define an explicit default isolation level to prevent invocation of metadata part.
Hi Exposed team 👋
We ran into an issue that seem to indicate a possible connection leak within the
SpringTransactionManager
. I do not have a reproducer at the moment, but I believe that simple code analysis does confirm the issue.The issue
The logic within
SpringTransactionManager
allows for an unhandled exception to not close a connection properly, essentially leaking it. Namely, the invocation ofdefaultIsolationLevel
here can fail. If that happens, the connection does not get closed within the scope of the method and there doesn't seem to be any code higher up the call stack that would correctly handle it either.Real-world scenario
To give you a better understanding of how the issue above can happen, here's a detailed breakdown of how it happened to us.
In our case, the invocation of
defaultIsolationLevel
failed because of no available connection in the connection pool. We have a scheduled task executor with X worker threads and - coincidentally - the same X number of connections in the connection pool. Upon startup, the scheduler would start execution of X tasks at once, each of which making a DB query (rather simple SELECT) via a@Transactional
-annotated Spring@Repository
. This means that all the X scheduler worker threads reach the repository method at almost the same time.One of these (let's call it
T1
) would then get as far as this:acquiring a lock on the Database.vendor property (see the line marked
LOCK
above). This makes all the other threads wait on this monitor, each having already acquired a connection within Spring's Tx tooling.However, as we have the same number of threads as there is the number of connections,
T1
gets blocked on the connection pool as there is no available connection (see the line markedBLOCK
above), effectively deadlocking on the connection pool.Of course, there is a connection timeout set on the pool that should resolve this. But once the timeout kicks in, it throws a timeout exception on
T1
from the blockinggetConnection
call on the pool marked above. That, in turn, throws from within the aforementioned invocation of thedefaultIsolationLevel
which leaks the connection.In our case, this was particularly problematic as now we had only X-1 connections in the connection pool, making it impossible to ever recover from such a state.
Summary
While we understand that having the same size of the connection pool as the expected concurrency level may not be the best idea, it does not invalidate the point about the connection leakage.
As a workaround, we simply increased the connection pool size. Even adding a single additional connection resolved the problem, confirming our understanding of the issue.
For a proper fix, I believe a simple
try - finally
block around this should do.The text was updated successfully, but these errors were encountered: