Performance issue reporting EhCache metrics with a large cache #1584
It will be hard to diagnose the cause of the OOM error without more information. Can you check a heap dump when the OOM occurs? If it is related to Micrometer, the most likely cause is that unique metrics are being created with high (unbounded) cardinality. If that's the cause, we'd want to see which metric there are a lot of. Perhaps check the Prometheus endpoint before an OOM happens to see whether any metric has a very large number of different label combinations.
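One way to spot that before it turns into an OOM (a sketch on top of the standard Micrometer API, not something from this thread) is to group the registered meters by name and count how many tag combinations each name has; the class and method names below are made up for illustration.

```java
import io.micrometer.core.instrument.MeterRegistry;

import java.util.Map;
import java.util.stream.Collectors;

// Illustrative helper: counts distinct tag combinations per meter name so an
// unbounded-cardinality metric stands out before memory becomes a problem.
public class CardinalityProbe {

    public static Map<String, Long> tagCombinationsPerMeterName(MeterRegistry registry) {
        return registry.getMeters().stream()
                .collect(Collectors.groupingBy(m -> m.getId().getName(), Collectors.counting()));
    }
}
```

A meter name whose count keeps climbing between calls is the usual suspect.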
In a project I helped with, we weren't hitting OOM errors, but we found that the way EhCache computes its size and its local memory usage was inefficient in both cases. We had to turn those checks off on the EhCache side.
We weren't able to work around it with Micrometer, because the expensive calls into EhCache itself are what cause the problem. Alternatively, you could watch the metrics being reported before the OOM and limit the cache size to roughly that amount. I would have taken that approach on the aforementioned project, but they opted instead to increase their memory and keep their caches as they were.
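As a sketch of that "cap the cache at what the metrics show" idea, assuming the EhCache 2.x programmatic configuration API; the cache name and the limits are placeholders, not values from this thread.

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;

// Sketch: bound the cache at roughly the entry count the metrics reported
// before the OOM. The name and numbers below are placeholders.
public class BoundedCacheSetup {

    public static void registerBoundedCache(CacheManager cacheManager) {
        CacheConfiguration config = new CacheConfiguration()
                .name("boundedCache")
                .maxEntriesLocalHeap(50_000)   // cap derived from the observed metrics
                .timeToLiveSeconds(600);
        cacheManager.addCache(new Cache(config));
    }
}
```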
Would a MeterFilter that denies those metrics work as a workaround? If those two metrics are generally problematic, maybe we should not register them by default. What do you think?
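For reference, a filter along those lines might look like the sketch below. The `cache.local` name prefix is an assumption about which meters are the expensive ones, not something confirmed in this thread; registering the filter as a bean lets Spring Boot apply it to the auto-configured registry.

```java
import io.micrometer.core.instrument.config.MeterFilter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Sketch: deny the (assumed) expensive EhCache size gauges by name prefix so
// they are never registered or sampled.
@Configuration
public class EhcacheMetricsFilterConfig {

    @Bean
    public MeterFilter denyEhcacheSizeGauges() {
        return MeterFilter.denyNameStartsWith("cache.local");
    }
}
```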
Since this is the first issue opened in this regard, I don't think they are generally problematic. At most we could add a note to the docs. The project I saw that ran into this was a little oblivious to how much they were caching and how large the objects were. I hadn't checked for an issue against EhCache, since I've largely been migrating to Caffeine for in-memory caches going forward.
I also experience this problem in a large Spring web application. We've disabled the metrics for the time being, because Prometheus calls to the scrape endpoint take longer and longer, to the point where tens of threads end up occupied by scrape calls stuck inside EhCache's size calculation.
We are also having this issue with a big Spring web application that caches a huge amount of data. I ended up here while searching for a way to disable just the cache or EhCache binding. Would @shakuzen's idea with a MeterFilter be an option?
You can simply exclude the Spring auto-configuration:
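The exact snippet isn't reproduced above; assuming the auto-configuration in question is Spring Boot's CacheMetricsAutoConfiguration, the exclusion might look like this:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.autoconfigure.metrics.cache.CacheMetricsAutoConfiguration;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Assumption: excluding CacheMetricsAutoConfiguration keeps Spring Boot from
// binding cache (and therefore EhCache) metrics at all.
@SpringBootApplication(exclude = CacheMetricsAutoConfiguration.class)
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
```

The same exclusion can also be expressed with the `spring.autoconfigure.exclude` property instead of the annotation.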
Thanks @mnowotnik :-) I experimented a bit and also found that using just one of @checketts' suggested settings seems to disable exactly the metric that causes the problem.
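The setting itself isn't shown above; if the culprit really is a single meter, one way to express that kind of targeted opt-out (sketched here with an assumed metric name) is to deny it by exact name on the registry.

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.config.MeterFilter;

// Sketch: deny one meter by exact name. "cache.local.heap.size" is an assumed
// name for the expensive gauge, not something confirmed in this thread.
public class SingleMetricDisabler {

    public static void disableExpensiveGauge(MeterRegistry registry) {
        registry.config()
                .meterFilter(MeterFilter.deny(id -> "cache.local.heap.size".equals(id.getName())));
    }
}
```

Meter filters only affect meters registered after they are configured, so the filter needs to be in place before the cache binder runs.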
Glad it works for you.
Is it still a problem with the latest versions?
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days, this issue will be closed.
Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open it.
Spring Boot: 2.0.x
EhCache: 2.10.6
Micrometer: 1.0.8