
Memory leak in HikariCP connection pool #913

Closed · chuwy opened this issue Jun 10, 2019 · 11 comments


chuwy commented Jun 10, 2019

I'm testing my RESTful service built on top of http4s and doobie and I'm encountering a cryptic (as always) OOM.

I suspect it happens under any load, but the results below were produced by stress-testing with Apache Bench (ab -n 1000000 -c 800 from the same host).

Memory dump, a couple of minutes after launch:

[Screenshot: memory dump, 2019-06-10 at 19:44:06]

The connectionEC is the only object in memory that looks even slightly suspicious so far. Its work queue has 778 elements.

I dump memory every 5 minutes and see more or less the same picture: connectionEC is always on the first line, but never significantly above 3 MB. Then, roughly an hour later, I get the OOM, and the heap dump produced with -XX:+HeapDumpOnOutOfMemoryError shows the following picture:

[Screenshot: heap dump after the OOM, 2019-06-10 at 19:54:07]

Basically it filled my heap with... something doobie-ish inside something IO-ish (sorry, not very familiar with cats-effect internals).

I think this particular setup was poorly configured: a small 1 CPU / 2 GB machine, 16 threads in the connection EC, a maximumPoolSize of 5 (which looks too big), a fairly heavy load, and heavy DB responses (~400 KB over the network), but other benchmarks have shown similar results.

I'm running doobie 0.7.0 (the same problem occurred with 0.6.0) and http4s-blaze 0.20.1 with NIO1. Here's the pool initialization code: https://github.com/snowplow-incubator/iglu-server/blob/release/0.6.0/src/main/scala/com/snowplowanalytics/iglu/server/storage/Storage.scala#L95-L116 (in case I'm missing something very obvious).
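
For context, the setup described above corresponds roughly to the sketch below (doobie 0.7.x-style API; the driver class, JDBC URL, and credentials are placeholders, not the project's real values, which live in the linked Storage.scala):

```scala
import cats.effect.{ContextShift, IO, Resource}
import doobie.hikari.HikariTransactor
import doobie.util.ExecutionContexts

import scala.concurrent.ExecutionContext

object PoolSketch {
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // A 16-thread EC for awaiting connections, a cached EC for blocking JDBC calls,
  // and a Hikari maximumPoolSize of 5, as described above.
  val transactor: Resource[IO, HikariTransactor[IO]] =
    for {
      connectEC  <- ExecutionContexts.fixedThreadPool[IO](16)
      transactEC <- ExecutionContexts.cachedThreadPool[IO]
      xa <- HikariTransactor.newHikariTransactor[IO](
              "org.postgresql.Driver",            // placeholder driver
              "jdbc:postgresql://localhost/iglu", // placeholder URL
              "user",                             // placeholder credentials
              "pass",
              connectEC,
              transactEC
            )
      _ <- Resource.liftF(xa.configure(ds => IO(ds.setMaximumPoolSize(5))))
    } yield xa
}
```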

This looks suspiciously similar to #824 / http4s/http4s-okhttp-client#3, except that I'm not using the http4s client.


chuwy commented Jun 10, 2019

Just in case, here's the message I'm getting along with the OOM:

[iglu-hikaricp-pool housekeeper] WARN com.zaxxer.hikari.pool.HikariPool - iglu-hikaricp-pool - Thread starvation or clock leap detected (housekeeper delta=1m238ms398µs104ns).
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /iglu/java_pid6.hprof ...
Heap dump file created [388489171 bytes in 10.352 secs]
Aborting due to java.lang.OutOfMemoryError: Java heap space
...
#  Internal Error (debug.cpp:321), pid=6, tid=72
#  fatal error: OutOfMemory encountered: Java heap space
...
Exception in thread "pool-3-thread-1"


chuwy commented Jun 14, 2019

It seems the problem was in the http4s Timeout middleware. The problem went away after we removed it.

I'm leaving the issue open as I don't know whether the problem is in http4s (unlikely, as the Timeout implementation is so simple that I can't believe it could be wrong), in doobie (maybe something is in fact not cancellable?), or in cats-effect.
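
For reference, the middleware in question is applied roughly as in the sketch below (http4s 0.20.x-style; `routes` and the 30-second duration are hypothetical placeholders, not the service's actual values). Removing this wrapper is what made the OOM disappear:

```scala
import cats.effect.{Concurrent, IO, Timer}
import org.http4s.HttpApp
import org.http4s.server.middleware.Timeout

import scala.concurrent.duration._

object TimeoutSketch {
  // Wrap the whole app so that slow responses are cancelled after 30 seconds.
  // Every request schedules a timeout; if cancelled timeouts are never purged
  // from the underlying scheduler, they can pile up on the heap under heavy load.
  def withTimeout(routes: HttpApp[IO])(implicit C: Concurrent[IO], T: Timer[IO]): HttpApp[IO] =
    Timeout(30.seconds)(routes)
}
```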


chuwy commented Jun 25, 2019

I think this cats-effect PR might have fixed it: typelevel/cats-effect#568. I'll try to reproduce with both versions.

@darekmydlarz

@chuwy any updates on that?


mdedetrich commented Jul 2, 2019

I was about to open an issue, but I also have a leak; not sure if it's entirely related though: https://github.com/mdedetrich/task-doobie-memory-leak


chuwy commented Jul 4, 2019

@darek1024, sorry for the delay, not yet. My problem went away once I removed the Timeout middleware from the http4s code. I still haven't had a chance to write a minimal project to reproduce it and check whether typelevel/cats-effect#568 really fixed it.


chuwy commented Jul 4, 2019

@mdedetrich based on monix/monix#912, Task also doesn't yet use setRemoveOnCancelPolicy, so it can certainly be related.
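
To illustrate what setRemoveOnCancelPolicy does (a plain JDK sketch, not the actual cats-effect or monix scheduler code): without it, a cancelled scheduled task stays in the executor's queue until its delay elapses, which is one way cancelled timeouts could accumulate on the heap under load.

```scala
import java.util.concurrent.{ScheduledThreadPoolExecutor, TimeUnit}

object RemoveOnCancelSketch extends App {
  val scheduler = new ScheduledThreadPoolExecutor(1)
  // Without this flag, a cancelled task remains queued until its delay expires.
  scheduler.setRemoveOnCancelPolicy(true)

  // Schedule a dummy "timeout" and cancel it immediately; with the policy enabled
  // the task is removed from the work queue right away instead of lingering there.
  val handle = scheduler.schedule(new Runnable { def run(): Unit = () }, 30, TimeUnit.SECONDS)
  handle.cancel(false)

  scheduler.shutdown()
}
```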

@mdedetrich

The leak happens even when you don't use Task.


chuwy commented Jul 4, 2019

Which implementation do you use? I assume that only IO from cats-effect 2.0.0 and ZIO don't suffer from it.

@mdedetrich

In the repository I linked to, I just used IO (well, I ran IO.unsafeToFuture) without any Task, and the leak was still there.
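
For clarity, the pattern described is roughly the sketch below (the IO program itself is a hypothetical placeholder): plain cats-effect IO run via unsafeToFuture, with no monix Task anywhere.

```scala
import cats.effect.IO

import scala.concurrent.Future

object UnsafeToFutureSketch {
  // Run a plain cats-effect IO (e.g. a doobie query executed through a transactor)
  // as a Future; no monix Task is involved in this path.
  def run(program: IO[Int]): Future[Int] =
    program.unsafeToFuture()
}
```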

@jatcwang (Collaborator)

Closing this as it's probably not an issue anymore. Please reopen if that's not the case.
