
drop measurement causes cache maximum memory size exceeded #7161

Closed
sebito91 opened this issue Aug 16, 2016 · 16 comments
@sebito91
Contributor

Bug report

We encountered stop-the-world data ingest issues after issuing multiple drop measurement commands on a limited set of series. The series belonged to a test collector, and during cleanup the primary ingest mechanisms became completely blocked and never unlocked. The problem is getting worse...

We see many of these errors in the logs:

[write] 2016/08/16 12:20:56 write failed for shard 3: engine: cache maximum memory size exceeded
[udp] 2016/08/16 12:20:56 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded

System info:

[run] 2016/08/16 12:25:38 InfluxDB starting, version 1.1.0~n201608130800, branch master, commit 35f2fdafcb7a6c5ad4de0135c1c1355a6097b2cc

RHEL7.2 3.10.0-327.10.1.el7

Dell PowerEdge R730xd
768GB RAM
22x 400GB SSD


[data]
  dir = "/data/influxdb-data/metrics"
  engine = "tsm1"

  #max-wal-size = 104857600 # Maximum size the WAL can reach before a flush. Defaults to 100MB.
  #wal-flush-interval = "10m" # Maximum time data can sit in WAL before a flush.
  #wal-partition-flush-delay = "2s" # The delay time between each WAL partition being flushed.

  wal-dir = "/data/influxdb-wal/wal" # Issue 4498 to make this work for tsm1
  wal-logging-enabled = true
  query-log-enabled = true
  max-series-per-database = 1000000000
  trace-logging-enabled = false

  #wal-ready-series-size = 512000
  #wal-compaction-threshold = 0.25
  #wal-max-series-size = 1048576

  cache-max-memory-size = 524288000
  cache-snapshot-memory-size = 262144000
  # cache-snapshot-write-cold-duration = "1h"
  # compact-min-file-count = 3
  compact-full-write-cold-duration = "4h"

  # max-points-per-block = 1000

Steps to reproduce:

  1. Start with an active InfluxDB database that is receiving writes
  2. drop measurement (first measurement)
  3. drop measurement (second measurement)
  4. check the logs and see incoming metrics blocked (a sketch of the command sequence follows this list)
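
For reference, a minimal command sequence along these lines (the database name matches the config above, but the measurement names are placeholders rather than our actual collector measurements):

  # Drop two measurements back to back; "cpu" and "disk" here are placeholders.
  influx -database 'tg_udp' -execute 'DROP MEASUREMENT "cpu"'
  influx -database 'tg_udp' -execute 'DROP MEASUREMENT "disk"'
  # Then watch the log for rejected writes while the drops are processed.
  tail -f /var/log/influxdb/influxdb.log | grep 'cache maximum memory size exceeded'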

Expected behavior: dropping a measurement should not block all incoming metrics indefinitely while the cache fails to clear.

Actual behavior: all incoming metrics are blocked indefinitely and the cache does not unlock.

Additional info: support files have been added to the FTP server.

@jwilder
Contributor

jwilder commented Aug 16, 2016

@sebito91 If it's currently locked up, can you attach these profiles from the instance to help identify the cause of the lockup?

curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1" 
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1" 
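
A heap profile from the same pprof endpoint may also be useful (this assumes the standard Go pprof handlers are exposed on the same port, as with the block and goroutine endpoints above):

curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"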

@sebito91
Contributor Author

sebito91 commented Aug 16, 2016

The heap is on your SFTP, under /uploads/issue_7161

block.txt
goroutine.txt

jwilder added a commit that referenced this issue Aug 17, 2016
If a delete takes a long time to process while writes to the
shard are occurring, it was possible for the cache to fill up
and writes to be rejected.  This occurred because we disabled
all compactions while writing the tombstone file to prevent deleted
data from re-appearing after a compaction completed.

Instead, we only disable the level compactions and allow snapshot
compactions to continue.  Snapshots already handle deleted data
with the cache and wal.

Fixes #7161
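
As a rough illustration of the approach described in that commit message, here is a minimal Go sketch; it is not the actual tsm1 engine code, and every type and method name below is hypothetical:

// Minimal sketch only, not the actual InfluxDB tsm1 code; every type and
// method name here is hypothetical and exists purely for illustration.
package main

import (
    "fmt"
    "sync"
)

type engine struct {
    mu                  sync.Mutex
    levelCompactions    bool // merging TSM files into higher levels
    snapshotCompactions bool // flushing the in-memory cache/WAL out to TSM files
}

func (e *engine) setLevelCompactions(enabled bool) {
    e.mu.Lock()
    defer e.mu.Unlock()
    e.levelCompactions = enabled
}

// deleteMeasurement stands in for the tombstone-writing delete path.
func (e *engine) deleteMeasurement(name string) {
    // Before the fix: all compactions were paused here, so a slow delete let
    // the cache grow until writes failed with "cache maximum memory size exceeded".
    // After the fix: only level compactions pause; snapshot compactions keep
    // running, and they already account for deleted data in the cache and WAL.
    e.setLevelCompactions(false)
    defer e.setLevelCompactions(true)

    fmt.Printf("writing tombstones for %q; snapshot compactions still enabled: %v\n",
        name, e.snapshotCompactions)
}

func main() {
    e := &engine{levelCompactions: true, snapshotCompactions: true}
    e.deleteMeasurement("cpu")
}
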
@jwilder
Contributor

jwilder commented Aug 17, 2016

@sebito91 I have a possible fix for this in #7165

@sebito91
Contributor Author

@jwilder thank you, should we build and test or is this in the current nightly?

@jwilder
Contributor

jwilder commented Aug 17, 2016

@sebito91 This is not merged to master yet. If you are able to test this PR and let me know if it helps, that would be great.

@sebito91
Contributor Author

@jwilder will do and get back to you

@sebito91
Contributor Author

sebito91 commented Aug 18, 2016

Unfortunately still seeing the same issue :/

With the fix applied:

[root@carf-metrics-influx02 ~]# tail -f /var/log/influxdb/influxdb.log
[write] 2016/08/17 22:16:48 write failed for shard 113: engine: cache maximum memory size exceeded
[udp] 2016/08/17 22:16:48 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded
[write] 2016/08/17 22:16:48 write failed for shard 113: engine: cache maximum memory size exceeded
[udp] 2016/08/17 22:16:48 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded
[write] 2016/08/17 22:16:48 write failed for shard 113: engine: cache maximum memory size exceeded
[udp] 2016/08/17 22:16:48 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded
[write] 2016/08/17 22:16:49 write failed for shard 113: engine: cache maximum memory size exceeded
[udp] 2016/08/17 22:16:49 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded
[write] 2016/08/17 22:16:53 write failed for shard 113: engine: cache maximum memory size exceeded
[udp] 2016/08/17 22:16:53 failed to write point batch to database "tg_udp": engine: cache maximum memory size exceeded
[query] 2016/08/17 22:17:00 SELECT last(usage_user) FROM tg_udp.autogen.cpu WHERE host = 'carf-metrics-influx02' AND 

block.txt
goroutine.txt

@jwilder
Contributor

jwilder commented Aug 18, 2016

I'll keep looking into it. You could also try setting the cache max memory size to 0, which should disable that limit.
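
For reference, that change would look like this in the [data] section of the config quoted earlier (the current value there is 524288000):

  [data]
    # 0 disables the maximum cache size limit, per the suggestion above
    cache-max-memory-size = 0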

@sebito91
Contributor Author

Checked again this morning and the cache did ultimately clear; it just took some time. Metrics are coming in again now. Deleting one measurement did take an hour, though :)

(screenshot attachment: new_dash_2)

@jwilder
Contributor

jwilder commented Aug 18, 2016

@sebito91 So this PR is working better? Or did you disable the max size of the cache as well?

@sebito91
Contributor Author

@jwilder this is before touching the max size...still the same settings as before. It's a bit concerning that we block metrics for an hour during deletion though; shouldn't that be caught by the WAL?

@jwilder
Contributor

jwilder commented Aug 18, 2016

Your cache was likely already backed up from the prior failures. I would expect new deletes to not block writes with this patch.

@sebito91
Contributor Author

This was our 'backup' node, which we didn't actually run the deletes on. I'll try a fresh start and do the following:

  1. delete with existing cache settings
  2. delete with max size set to 0

This is our build of yesterday's HEAD + your patch

@jwilder jwilder added this to the 1.1.0 milestone Aug 18, 2016
@sebito91
Contributor Author

Been out for a couple of days, sorry for the delay.

Implemented the fix again against HEAD on Friday and noticed the following:

  1. when we restart the machine and perform a delete, everything works very quickly and no data is lost (caches are virtually empty)
  2. if we perform the delete after a few hours of uptime, data starts to get blocked and lost during the delete (we are using UDP primarily)
  3. the longer the uptime, say two or three days, the longer the block period lasts

I would have thought we'd be able to cache incoming metrics in the WAL even while writes out to TSM are blocked, and then ultimately flush from the WAL. I understand the race condition (we might still be receiving points for the measurements we're deleting), but if those points sit in the WAL and ONLY the existing TSM files are consulted for deletes, then this shouldn't result in lost data.

Thoughts?

@sebito91
Contributor Author

Looking still closer at this one: it appears that ANY drop measurement blocks all incoming data for exactly one hour before finally returning (and allowing new metrics to come in). We just tested this on another measurement that had only been collecting for about 5 minutes.

To clarify, the CLI returns relatively quickly and the database remains functional and interactive...but the measurement we delete from becomes completely blocked for precisely one hour due to the cache max being exceeded. Once the hour passes, ingest resumes on that measurement, but the window is empty for all measurements.

We will now try with cache-max-memory-size = 0 and report back.

@dsmilyanov

@sebito91 Hey, any update on this particular problem? Did you try with the cache-max-memory-size config?

jwilder added a commit that referenced this issue Oct 18, 2016
@jwilder jwilder self-assigned this Oct 18, 2016