Version 1.3.4 uses 26GB more memory to load same dataset as 1.2.4 #8741

Closed
edufelipe opened this issue Aug 24, 2017 · 8 comments · Fixed by #8804

@edufelipe

Bug report

System info:

InfluxDB 1.3.4 vs InfluxDB 1.2.4
Both running under Ubuntu 16.04 on a 12GB Linode VM.

Steps to reproduce:

  1. Upgraded my server from 1.2.4 to 1.3.4
  2. Server started getting killed by the OS due to OOM
  3. Added 50GB swap file just to get version 1.3.4 to load
  4. Verified that it used almost 30GB of memory (18GB swap plus all resident memory on my VM) but was working as expected, returning valid results and accepting writes.
  5. Downgraded to version 1.2.4 and verified that it used less than 3GB of resident memory and no swap and still contained correct data.

Expected behavior:
Use around the same amount of memory between versions for the same dataset.

Actual behavior:
InfluxDB 1.3.4 needed almost 30GB of memory to hold a dataset that used less than 3GB under version 1.2.4.

Additional info:
I've dumped all the vars and profiling for both versions. I've also included the logs of influxd starting up.

influx-debug.tar.gz

@jwilder
Contributor

jwilder commented Aug 24, 2017

This appears to be an issue caused by da6bdfd.

You have about 20GB in the heap because there are about 20k WAL segments. Each WAL segment reader allocates a 1M buffer, so roughly 20k readers account for about 20GB of heap at startup. These WAL segments are zero-length, so the readers don't need to be created at all.
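
The estimate above can be checked against a local WAL directory. The following is a minimal sketch, not an InfluxDB tool: the function name `estimate_wal_buffer_mib` is hypothetical, and the 1 MiB per reader figure is an assumption taken from the "1M buffer" mentioned in this comment.

```shell
#!/bin/sh
# Hypothetical helper, not part of InfluxDB.
# Assumption: each WAL segment reader allocates a 1 MiB buffer at startup,
# as described in the comment above.
BUF_MIB=1

# Print "<total files> <zero-length files> <estimated MiB of reader buffers>"
# for the WAL directory given as $1.
estimate_wal_buffer_mib() {
    total=$(find "$1" -type f | wc -l)
    empty=$(find "$1" -type f -size 0 | wc -l)
    echo "$((total)) $((empty)) $((total * BUF_MIB))"
}
```

Calling `estimate_wal_buffer_mib /var/lib/influxdb/wal` on a directory with about 20k segment files would report an estimate near 20000 MiB, which is in line with the heap size seen here. The second number shows how many of those readers are for empty files.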

@edufelipe
Author

@jwilder Is there a workaround for this? Is there a way I can reduce the number of wal segments?

@jwilder
Contributor

jwilder commented Aug 24, 2017

@edufelipe I have not tried this, but running find /var/lib/influxdb/wal -size 0 -type f -print -delete might be a workaround.
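
Because the command above is untested and deletes files, a more cautious variant is to list the candidates first and delete them only as a separate step. The sketch below wraps the same `find` predicates in two functions; the function names are hypothetical, and stopping influxd and backing up the WAL directory beforehand is a precaution assumed here, not something stated in this thread.

```shell
#!/bin/sh
# Hypothetical wrappers around the find command suggested above.

# List zero-length regular files under the WAL directory ($1) without
# changing anything.
list_empty_wal_segments() {
    find "$1" -type f -size 0 -print
}

# Delete the same set of files, printing each path as it is removed.
# -delete is a GNU/BSD find extension, as used in the original command.
delete_empty_wal_segments() {
    find "$1" -type f -size 0 -print -delete
}
```

Running `list_empty_wal_segments /var/lib/influxdb/wal` first gives a chance to review the paths before running `delete_empty_wal_segments` on the same directory.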

@edufelipe
Author

@jwilder I've deleted all zero-sized files and updated to 1.3.4. That cut memory use by about a third, but it still uses over 20GB. Would you like me to send the vars.txt for this last run?

@jwilder
Contributor

jwilder commented Aug 24, 2017

@edufelipe If you can include everything you included in influx-debug.tar.gz that would be great.

@oiooj
Contributor

oiooj commented Aug 29, 2017

@jwilder Why are there so many zero-size WAL segments? Can I delete all of these files?

find /data/influxdb/wal -size 0 -type f -print | wc -l
30304

@edufelipe
Author

@jwilder Do you have any suggestions I can test for now?

@ghost ghost assigned jwilder Sep 7, 2017
@ghost ghost added the review label Sep 7, 2017
@jwilder
Contributor

jwilder commented Sep 8, 2017

@edufelipe #8804 should fix this.

@ghost ghost removed the review label Sep 8, 2017