Improve TSM1 cache performance #7633
Conversation
}

wg.Wait()
close(res)
If this Wait and close are in their own goroutine, you can start looping over res sooner. Because res is buffered, it looks like there's no risk of deadlock.
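A minimal sketch of the suggested pattern, with illustrative names rather than the PR's actual code: moving `wg.Wait()` and `close(res)` into their own goroutine lets the consumer start ranging over the buffered channel as soon as the first result arrives.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const workers = 4
	res := make(chan int, workers) // buffered, so workers never block on send

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			res <- n * n // stand-in for the real per-worker result
		}(i)
	}

	// Wait and close in their own goroutine so the consumer below can start
	// draining res immediately; close still happens exactly once, after
	// every worker has sent its result.
	go func() {
		wg.Wait()
		close(res)
	}()

	for v := range res {
		fmt.Println(v)
	}
}
```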
One small nit, but looks great!
// {1, 2, 4, 8, 16, 32, 64, 128, 256}.
//
func newring(n int) (*ring, error) {
	if n <= 0 || n > 256 {
Move 256 to a const since it's repeated a few times in this func?
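A sketch of what that nit might look like; the constant name and the surrounding types are illustrative, not the actual tsm1 code.

```go
package tsm1

import "fmt"

// maxPartitions replaces the repeated literal 256 within newring.
const maxPartitions = 256

type partition struct{}

type ring struct {
	partitions []*partition
}

func newring(n int) (*ring, error) {
	if n <= 0 || n > maxPartitions {
		return nil, fmt.Errorf("partitions must be between 1 and %d, got %d", maxPartitions, n)
	}
	return &ring{partitions: make([]*partition, n)}, nil
}
```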
Also, we should probably test w/ higher partitions under high load to see if that has any benefit.
p.mu.RLock()
for k, e := range p.store {
	if err := f(k, e); err != nil {
When I got a lot of timeouts under high load, this was one of the spots that showed up where many goroutines were busy. The issue was that each entry was being sorted, which was expensive, and there were millions of them, which I think caused CPU starvation for the waiting write HTTP goroutines.
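For context, a simplified sketch (hypothetical types, not the real cache code) of the pattern being discussed: a function applied to every entry while a read lock is held. If that function does heavy per-entry work, such as sorting each entry's values, and there are millions of entries, the loop can dominate the CPUs and starve other goroutines, e.g. the ones serving write HTTP requests.

```go
package tsm1

import "sync"

type entry struct {
	values []float64 // stand-in for the real per-entry values
}

type partition struct {
	mu    sync.RWMutex
	store map[string]*entry
}

// apply calls f for every key/entry pair while holding the read lock.
// Expensive work inside f (such as sorting each entry's values) keeps the
// lock held and the CPUs busy for the whole iteration.
func (p *partition) apply(f func(key string, e *entry) error) error {
	p.mu.RLock()
	defer p.mu.RUnlock()

	for k, e := range p.store {
		if err := f(k, e); err != nil {
			return err
		}
	}
	return nil
}
```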
This is huge, thank you very much for adding this!
Currently, whenever a snapshot occurs the Cache is reset, so many allocations are repeated as the same type of data is re-added to the Cache. This commit allows the stores to keep track of the number of values within an entry, and to use that size as a hint when the same entry needs to be recreated after a snapshot. To avoid hints persisting over a long period of time, they are deleted after every snapshot and rebuilt using only the most recent entries.
This PR optimises the TSM1 cache, improving write performance under load by up to 70% compared to 1.1. It works by using a simple (crude, really) hash ring, which allows any series key to be consistently mapped to a partition in the ring. Each partition contains a map of series keys to TSM1 entries, and manages safe access to that map. Entries are distributed uniformly across a small number of partitions (currently fixed at 128).
Since each partition manages access to the entries that map to it, contention over the entire cache is minimised and is restricted to cases where keys mapping to the same partition need to operate on the cache concurrently.
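A minimal sketch of the partitioning idea, not the PR's actual implementation: hash each series key and use the hash to pick a fixed partition, so the same key always maps to the same partition and keys in different partitions never contend for the same lock. The FNV hash and the type names here are assumptions for illustration only.

```go
package tsm1

import (
	"hash/fnv"
	"sync"
)

const partitions = 128 // the PR currently fixes the partition count at 128

type entry struct{} // stand-in for the real TSM1 entry type

// partition owns one shard of the key space and guards it with its own lock.
type partition struct {
	mu    sync.RWMutex
	store map[string]*entry
}

// ring consistently maps series keys onto its partitions.
type ring struct {
	partitions [partitions]*partition
}

// getPartition hashes the series key and returns the partition it maps to;
// the same key always lands on the same partition.
func (r *ring) getPartition(key []byte) *partition {
	h := fnv.New64a()
	h.Write(key)
	return r.partitions[h.Sum64()%partitions]
}
```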
Further, the cache is currently essentially re-allocated and re-built after a snapshot is taken, which is expensive. This PR adds some logic that keeps track, as hints, of how entries within the cache are currently allocated, and uses those hints to pre-allocate the cache after a snapshot, so that subsequent additions of entries do not result in new allocations within the cache data structure.
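A simplified sketch of the size-hint idea using hypothetical types (this is not the actual cache code): when a snapshot resets the store, remember how many values each key held, and use those counts to pre-size entries when the same keys are written again.

```go
package tsm1

type Value struct{} // stand-in for the real tsm1 value type

type sizedEntry struct {
	values []Value
}

type hintedCache struct {
	store map[string]*sizedEntry
	hints map[string]int // key -> value count observed at snapshot time
}

// snapshot records per-key size hints, then resets the store, mimicking the
// "cache is rebuilt after a snapshot" behaviour described above. Hints are
// rebuilt from scratch each time so stale keys do not linger.
func (c *hintedCache) snapshot() {
	c.hints = make(map[string]int, len(c.store))
	for k, e := range c.store {
		c.hints[k] = len(e.values)
	}
	c.store = make(map[string]*sizedEntry, len(c.hints))
}

// write appends values for a key, using the hint (if any) to pre-allocate
// capacity so that re-adding the same kind of data avoids repeated growth.
func (c *hintedCache) write(key string, vs []Value) {
	e, ok := c.store[key]
	if !ok {
		capacity := c.hints[key]
		if capacity < len(vs) {
			capacity = len(vs)
		}
		e = &sizedEntry{values: make([]Value, 0, capacity)}
		c.store[key] = e
	}
	e.values = append(e.values, vs...)
}
```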
Performance
I've been mainly testing performance on a c4.8xlarge AWS instance, firing a combined 1.8M writes/sec at the instance using influx-stress. Specifically, 600M points are written in total.
This is a typical run, where the points have a single value. The case where points have multiple values shows a similar difference.