TSI engine consumes significantly more memory than TSM engine #11830

Closed
andreykaipov opened this issue Feb 12, 2019 · 12 comments

@andreykaipov

Ever since switching our InfluxDB nodes to use TSI-based indices, we've seen nodes slowly climb up in memory until they've started crashing frequently during midnight compactions.

We set up a version 1.7.2 test node with around 4 million series spread across 30 databases to try and investigate, and found TSI-based indices consuming significantly more memory. In our tests we let Influx run for a while before stopping it, converting between TSI and TSM-based indices, and restarting it.

Here are the top heap allocations for the node when TSI is enabled and all indices are TSI-based:

$ go tool pprof -top ~/Downloads/tsi_based.pprof.influxd.alloc_objects.alloc_space.inuse_objects.inuse_space.pb.gz | head -n20
File: influxd
Type: inuse_space
Time: Jan 23, 2019 at 10:32am (EST)
Showing nodes accounting for 22114.98MB, 94.93% of 23296.71MB total
Dropped 349 nodes (cum <= 116.48MB)
      flat  flat%   sum%        cum   cum%
10376.94MB 44.54% 44.54% 16915.98MB 72.61%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*LogFile).execSeriesEntry
 3111.78MB 13.36% 57.90%  3679.67MB 15.79%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*logTagValue).addSeriesID
 1815.08MB  7.79% 65.69%  1815.08MB  7.79%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*logTagKey).createTagValueIfNotExists (inline)
 1568.99MB  6.73% 72.43%  1568.99MB  6.73%  github.com/influxdata/influxdb/pkg/rhh.assign (inline)
  743.07MB  3.19% 75.62%   743.07MB  3.19%  github.com/influxdata/influxdb/pkg/estimator/hll.(*Plus).toNormal
  571.64MB  2.45% 78.07%   571.64MB  2.45%  github.com/influxdata/influxdb/pkg/rhh.(*HashMap).alloc (inline)
  401.58MB  1.72% 79.79%   406.12MB  1.74%  github.com/influxdata/influxdb/vendor/github.com/influxdata/roaring.(*arrayContainer).iaddReturnMinimized
  383.02MB  1.64% 81.44%   383.02MB  1.64%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*logMeasurement).createTagSetIfNotExists (inline)
  361.01MB  1.55% 82.99%   361.01MB  1.55%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*LogFile).createMeasurementIfNotExists (inline)
  350.75MB  1.51% 84.49%  2489.86MB 10.69%  github.com/influxdata/influxdb/tsdb.(*SeriesIndex).execEntry
  347.47MB  1.49% 85.98%   347.47MB  1.49%  bytes.makeSlice
  345.17MB  1.48% 87.47%   789.95MB  3.39%  github.com/influxdata/influxdb/tsdb/engine/tsm1.(*partition).write
  339.77MB  1.46% 88.92%   339.77MB  1.46%  github.com/influxdata/influxdb/tsdb/engine/tsm1.(*entry).add
  272.60MB  1.17% 90.09%   327.10MB  1.40%  github.com/influxdata/influxdb/tsdb.(*MeasurementFieldSet).load

And here they are for the same node when TSM is enabled and all indices are TSM-based:

$ go tool pprof -top ~/Downloads/tsm_based.pprof.influxd.alloc_objects.alloc_space.inuse_objects.inuse_space.pb.gz | head -n20
File: influxd
Type: inuse_space
Time: Jan 23, 2019 at 10:02am (EST)
Showing nodes accounting for 15612.24MB, 96.19% of 16229.92MB total
Dropped 248 nodes (cum <= 81.15MB)
      flat  flat%   sum%        cum   cum%
 2941.28MB 18.12% 18.12%  4735.34MB 29.18%  github.com/influxdata/influxdb/vendor/github.com/influxdata/platform/models.Tags.Clone
 2117.10MB 13.04% 31.17%  9213.32MB 56.77%  github.com/influxdata/influxdb/tsdb/index/inmem.(*Index).CreateSeriesListIfNotExists
 1794.06MB 11.05% 42.22%  1794.06MB 11.05%  github.com/influxdata/influxdb/vendor/github.com/influxdata/platform/models.Tag.Clone (inline)
 1593.51MB  9.82% 52.04%  1593.51MB  9.82%  github.com/influxdata/influxdb/pkg/rhh.assign (inline)
 1212.86MB  7.47% 59.51%  1409.37MB  8.68%  github.com/influxdata/influxdb/tsdb/index/inmem.(*tagKeyValue).InsertSeriesIDByte
 1061.20MB  6.54% 66.05%  1061.20MB  6.54%  github.com/influxdata/influxdb/tsdb/engine/tsm1.FloatArrayDecodeAll
  774.15MB  4.77% 70.82%   774.15MB  4.77%  github.com/influxdata/influxdb/vendor/github.com/influxdata/platform/tsdb/cursors.(*FloatArray).Merge
  659.56MB  4.06% 74.88%   659.56MB  4.06%  github.com/influxdata/influxdb/tsdb/index/inmem.newSeries (inline)
  582.79MB  3.59% 78.48%   582.79MB  3.59%  github.com/influxdata/influxdb/pkg/rhh.(*HashMap).alloc (inline)
  392.93MB  2.42% 80.90%   392.93MB  2.42%  github.com/influxdata/influxdb/tsdb/engine/tsm1.timeBatchDecodeAllSimple
  343.73MB  2.12% 83.01%  2519.52MB 15.52%  github.com/influxdata/influxdb/tsdb.(*SeriesIndex).execEntry
  319.79MB  1.97% 84.98%   319.79MB  1.97%  bytes.makeSlice
  290.33MB  1.79% 86.77%   290.33MB  1.79%  github.com/influxdata/influxdb/tsdb/engine/tsm1.(*entry).add
  274.92MB  1.69% 88.47%   330.93MB  2.04%  github.com/influxdata/influxdb/tsdb.(*MeasurementFieldSet).load

These heap profiles were captured by hitting the /debug/pprof/heap endpoint and are attached below.
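For anyone who wants to reproduce this, the profile can be pulled with something like the following, assuming the HTTP API is listening on the default localhost:8086 and the pprof endpoints are enabled (the default):

curl -s -o heap.pb.gz "http://localhost:8086/debug/pprof/heap"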

When aggregating over the line level, the TSI-based profile shows the following line to be responsible for a good chunk of memory: https://github.com/influxdata/influxdb/blob/v1.7.2/tsdb/index/tsi1/log_file.go#L718.
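For the curious, pprof can aggregate at the line level with its -lines granularity flag, e.g.:

go tool pprof -lines -top ~/Downloads/tsi_based.pprof.influxd.alloc_objects.alloc_space.inuse_objects.inuse_space.pb.gz | head -n20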

From what I can tell, this is where InfluxDB saves the tag values for a specific tag key when writing to the write-ahead log. When saving tags for high-cardinality series, allocating a new string via string(v) so many times must be overloading the heap. I don't know how accurate that guess is, but why doesn't the TSM engine have a similar issue?
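Roughly, the pattern I'm picturing is something like the following. This is only a simplified sketch to illustrate the guess, not the actual InfluxDB code; the real types live in tsdb/index/tsi1:

package sketch

// logTagKey is a stand-in for the per-tag-key structure the log file keeps
// in memory: every unique tag value is stored as its own heap-allocated
// string and stays resident until the log file is compacted to a TSI file.
type logTagKey struct {
    tagValues map[string]struct{}
}

func newLogTagKey() logTagKey {
    return logTagKey{tagValues: make(map[string]struct{})}
}

func (tk logTagKey) createTagValueIfNotExists(v []byte) {
    if _, ok := tk.tagValues[string(v)]; ok {
        return // the lookup conversion alone doesn't allocate
    }
    // storing the key does allocate: string(v) copies the tag value onto the
    // heap, once per unique value, which adds up quickly at high cardinality
    tk.tagValues[string(v)] = struct{}{}
}

With millions of unique tag values per shard, that is a lot of small, long-lived string allocations.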


TSI profile: tsi_based.pprof.influxd.alloc_objects.alloc_space.inuse_objects.inuse_space.pb.gz
TSM profile: tsm_based.pprof.influxd.alloc_objects.alloc_space.inuse_objects.inuse_space.pb.gz

@e-dard
Contributor

e-dard commented Feb 12, 2019

Hi @andreykaipov

Thanks for the very comprehensive ticket! Since you have 30 databases you will have at least 30 indexes, and more if you have more than one shard per database.

Sometimes, depending on the cardinality, your index data will never flush from the in-memory log files to the file-backed TSI files, because it never reaches the size limit needed to trigger a flush. That limit is controlled by the max-index-log-file-size config variable, which defaults to 1048576 (1 MB).

My initial thought is that your high heap usage comes from having lots of log files sitting on the heap, with none of them holding enough cardinality to get flushed to a TSI file.

Could you try reducing max-index-log-file-size down to, say, max-index-log-file-size = 131072 and see whether that improves things?
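For reference, that setting lives in the [data] section of influxdb.conf alongside index-version, so the change would look something like this (the value here is just an example):

[data]
  index-version = "tsi1"
  # flush in-memory index log files to TSI files at a smaller size
  max-index-log-file-size = "128k"   # default is "1m" (1048576 bytes)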

@andreykaipov
Author

Hey @e-dard - thanks for getting back to me so quickly!

Before we reverted our nodes back to using the TSM engine, we tried tweaking that setting too. Across five identical TSI nodes with the same data, we tried the following values for max-index-log-file-size:

  1. the default "1m"
  2. "512k"
  3. "256k"
  4. "128k"
  5. "64k"

After restarting the nodes and letting them run throughout the evening, all of them climbed back up to the same memory usage they were at before. From what I remember, the heap profiles also didn't look too different, but unfortunately I didn't save those.

Since we didn't see any noticeable difference in memory usage between a log file size of "1m" and "64k", we didn't try setting it any lower... but maybe we should have kept going?

@seanlook

seanlook commented Mar 6, 2019

I have the same issue with InfluxDB 1.6.2 and 1.6.6 using the tsi1 index:

go tool pprof heap_profile.pprof
(pprof) top
Showing nodes accounting for 40.13GB, 96.92% of 41.40GB total
Dropped 224 nodes (cum <= 0.21GB)
Showing top 10 nodes out of 62
      flat  flat%   sum%        cum   cum%
   20.40GB 49.27% 49.27%    38.25GB 92.39%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*LogFile).execSeriesEntry
    9.81GB 23.69% 72.97%    11.81GB 28.52%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*logTagValue).addSeriesID
    5.59GB 13.49% 86.46%     5.59GB 13.49%  github.com/influxdata/influxdb/tsdb/index/tsi1.(*logTagKey).createTagValueIfNotExists (inline)
    1.11GB  2.68% 89.14%     1.19GB  2.88%  github.com/influxdata/influxdb/vendor/github.com/RoaringBitmap/roaring.(*arrayContainer).iaddReturnMinimized
    0.80GB  1.92% 91.06%     1.52GB  3.67%  github.com/influxdata/influxdb/tsdb.(*SeriesIndex).execEntry
    0.65GB  1.57% 92.63%     0.65GB  1.57%  bytes.makeSlice
    0.57GB  1.38% 94.02%     0.57GB  1.38%  github.com/influxdata/influxdb/vendor/github.com/RoaringBitmap/roaring.newArrayContainer (inline)
    0.50GB  1.20% 95.21%     0.50GB  1.20%  github.com/influxdata/influxdb/pkg/rhh.assign (inline)
    0.48GB  1.15% 96.37%     0.48GB  1.15%  github.com/influxdata/influxdb/vendor/github.com/RoaringBitmap/roaring.(*roaringArray).insertNewKeyValueAt (inline)
    0.23GB  0.55% 96.92%     0.23GB  0.55%  github.com/influxdata/influxdb/pkg/rhh.(*HashMap).alloc (inline)

The InfluxDB process has almost 120GB of memory reserved and gets OOM-killed every week.

@qphien

qphien commented Mar 14, 2019

I have the same issue with InfluxDB v1.5.2. Is there any news on this issue?

@couloum

couloum commented May 23, 2019

Hi,

Same issue here.
I'm running 1.7.6 (latest stable release).
I have only 4 databases, including two with high cardinality (around 500k series).
The databases are not that big: 4.5GB for the /var/lib/influxdb directory (this is a preview environment).

When I run InfluxDB with the tsi1 engine, after having built all indexes with sudo -u influxdb influx_inspect buildtsi -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal, I see it consuming around 10GB of memory on my server:

$ systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-05-23 20:03:46 UTC; 20min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 2436 (influxd)
   Memory: 10.1G (limit: 14.0G)
   CGroup: /system.slice/influxdb.service
           └─2436 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Then, I stop InfluxDB, change the configuration to use the inmem index-version, and delete all tsi1 index files with the command find /var/lib/influxdb/data -type d -name "index" -exec rm -rf '{}' \;

After I start InfluxDB and all inmem indexes are built, InfluxDB is consuming around 4.5GB of memory:

$ systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-05-23 20:25:56 UTC; 1min 13s ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 5019 (influxd)
   Memory: 4.5G (limit: 14.0G)
   CGroup: /system.slice/influxdb.service
           └─5019 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

The overall startup time of InfluxDB is not noticeably better with tsi1 than with inmem:
With tsi1: 51 seconds
With inmem: 53 seconds

@seanlook

seanlook commented Jun 2, 2019

To follow up on my earlier comment: I found out the cause was that my InfluxDB instance had too many shards. After setting a longer shard group duration, my tsi1 memory usage decreased from 120GB to 70GB.

I also disabled the shard precreation option, because data points with future timestamps could lead to creating too many shards.
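For anyone wanting to do the same, the two changes look roughly like this (the database and retention policy names here are just examples):

-- in the influx CLI: set a longer shard group duration on the retention policy
ALTER RETENTION POLICY "autogen" ON "mydb" SHARD DURATION 7d

# in influxdb.conf: don't precreate shards for future time ranges
[shard-precreation]
  enabled = false

Note that the new shard group duration only applies to shard groups created after the change; existing shards keep their original duration until they expire.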

@e-dard
Contributor

e-dard commented Aug 13, 2019

@andreykaipov this might be the world’s longest wait for a reply... But I was reviewing this ticket and noticed something: when you change the max log file size, it only affects new shards. Therefore, in order to see the impact on existing shards, you need to rebuild your TSI indexes with the buildtsi tool (which accepts an optional max-log-file-size).
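Roughly, with influxd stopped and the paths adjusted to your layout, something along these lines:

sudo -u influxdb influx_inspect buildtsi \
    -datadir /var/lib/influxdb/data \
    -waldir /var/lib/influxdb/wal \
    -max-log-file-size 131072

(You may need to remove the existing index directories first so the shards actually get rebuilt.)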

@stale

stale bot commented Nov 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 11, 2019
@andreykaipov
Author

@e-dard I must also apologize for taking a bit to get back with an update! 😅

At the time of this ticket, I totally missed that the buildtsi tool had an option for that. We figured the setting in the config applied to existing shards too. Passing a lower max-log-file-size value to the buildtsi tool when converting from TSM to TSI did indeed get InfluxDB to start up with lower memory usage, but things were still running rather hot.

So next we also went with @seanlook's suggestion of increasing our shard group durations. By default we had daily shards, so as an experiment we converted half our nodes in one cluster to use weekly shards. I wish we had tried this out sooner. As shard counts dropped over the next few months, so did memory and load average on the TSI nodes, eventually steadying out. Unfortunately it's hard to say at what shard count things got better; we just looked at our dashboards one day and noticed the gradual improvement!

We also noticed that the TSI nodes with fewer shards haven't had any significant or unusual buffered writes over these few months, and they start up a lot quicker too (which makes sense given there are fewer indices to open).

So yeah, that's the story. As we slowly convert our other datacenters and environments to use TSI and weekly shards, I'll try to report back the results. I'm hopeful. Thank you to everybody on this thread. We learned a lot and we really appreciate it. :-)

I'm okay with closing out this issue. I hope the operational knowledge helps and addresses the issue for everybody else too.

@stale stale bot removed the wontfix label Nov 11, 2019
@stale

stale bot commented Feb 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 9, 2020
@russorat russorat removed the wontfix label Feb 18, 2020
@Shelvak

Shelvak commented Apr 9, 2020

I hit the same issue with indexes. The only way to start my InfluxDB (about 90GB on disk) was to delete the indexes and recreate them with influx_inspect buildtsi -max-log-file-size 32678, and after it finished, the service consumes more RAM than before:

  • with inmem: about 10GB
  • with tsi1 indexes: about 12GB...

With both index versions the startup process takes about 7 minutes =/

I'm running 1.7.9 on an Ubuntu server (on DigitalOcean) with 8 CPUs and 16GB of RAM...
Any other workaround or recommendation?

Cheers

@lesam
Contributor

lesam commented Nov 22, 2021

Likely fixed in 2.0.9 by #22520

@lesam lesam closed this as completed Nov 22, 2021