High tail latency around Index compactions #9506

Closed

jcalvert opened this issue Mar 28, 2018 · 4 comments

Comments

@jcalvert
Contributor
jcalvert commented Mar 28, 2018

While running a 3.3.1 cluster with millions of keys in etcd, we have seen very long durations (5-10 seconds) on the index compaction stat measured here. This coincided with increased request latency and client-side timeouts. We further observed that almost all of the time spent in this function was in the tree index compaction here. Looking into that function, we noticed the following comment. Since the function holds the lock for the duration of compacting the B-tree, this affects overall throughput. To validate the run time in isolation, we wrote the benchmark below; our results follow the code. Unfortunately, these results do not line up with the O(10ms) comment. We ran the benchmarks on a machine with 16 logical CPUs (Xeon E5-2670 v2 @ 2.50GHz). Is this an expected bottleneck?

package mvcc

import (
	"testing"
)

func BenchmarkIndexCompact1(b *testing.B)       { benchmarkIndexCompact(b, 1) }
func BenchmarkIndexCompact100(b *testing.B)     { benchmarkIndexCompact(b, 100) }
func BenchmarkIndexCompact10000(b *testing.B)   { benchmarkIndexCompact(b, 10000) }
func BenchmarkIndexCompact100000(b *testing.B)  { benchmarkIndexCompact(b, 100000) }
func BenchmarkIndexCompact1000000(b *testing.B) { benchmarkIndexCompact(b, 1000000) }

func benchmarkIndexCompact(b *testing.B, size int) {
	plog.SetLevel(0) // suppress log entries
	kvindex := newTreeIndex()

	// Populate the index with `size` keys of 64 bytes, one revision each.
	bytesN := 64
	keys := createBytesSlice(bytesN, size)
	for i := 1; i < size; i++ {
		kvindex.Put(keys[i], revision{main: int64(i), sub: int64(i)})
	}
	b.ResetTimer()
	for i := 1; i < b.N; i++ {
		kvindex.Compact(int64(i))
	}
}
BenchmarkIndexCompact1-16         2000000         666 ns/op
BenchmarkIndexCompact100-16        100000       18643 ns/op
BenchmarkIndexCompact10000-16        2000      866163 ns/op
BenchmarkIndexCompact100000-16        100    10112223 ns/op
BenchmarkIndexCompact1000000-16       100   385412535 ns/op
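
For reference, converting the last two rows: compacting with 100,000 keys takes about 10.1 ms per operation, while 1,000,000 keys takes about 385 ms, roughly a 38x increase for 10x the keys, and already well past the O(10ms) mentioned in the code comment.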
@xiang90
Contributor

xiang90 commented Mar 28, 2018

@jcalvert

You can do something similar to https://github.com/coreos/etcd/pull/9384/files#diff-d741eeb0ba73b4c9fdb36742f975395dR88 to fix the problem. I might have time to take a look in the next couple of weeks.
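
For readers following along, here is a minimal sketch of that idea (not the etcd code and not the linked diff, just an illustration built on github.com/google/btree, which the tree index uses): take a cheap copy-on-write clone of the B-tree while holding the lock, then do the expensive walk on the clone with the lock released.

package main

import (
	"fmt"
	"sync"

	"github.com/google/btree"
)

// item stands in for etcd's keyIndex; any btree.Item works for the sketch.
type item int

func (a item) Less(b btree.Item) bool { return a < b.(item) }

// index is a minimal stand-in for treeIndex: a B-tree guarded by a mutex.
type index struct {
	sync.Mutex
	tree *btree.BTree
}

// compactLocked mirrors the problematic pattern: the lock is held for the
// entire walk, so concurrent Put/Get/Range calls block until it finishes.
func (ti *index) compactLocked(rev int64) {
	ti.Lock()
	defer ti.Unlock()
	ti.tree.Ascend(func(i btree.Item) bool {
		// ... compact this keyIndex up to rev ...
		return true
	})
}

// compactClone sketches the suggested direction: clone under the lock
// (copy-on-write, cheap), then walk the clone with the lock released.
func (ti *index) compactClone(rev int64) {
	ti.Lock()
	clone := ti.tree.Clone()
	ti.Unlock()

	clone.Ascend(func(i btree.Item) bool {
		// ... compact this keyIndex up to rev; any updates to the live
		// tree would be applied under a short, per-item lock ...
		return true
	})
}

func main() {
	ti := &index{tree: btree.New(32)}
	for i := 0; i < 1000; i++ {
		ti.tree.ReplaceOrInsert(item(i))
	}
	ti.compactClone(500)
	fmt.Println("items in index:", ti.tree.Len())
}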

@xiang90
Contributor

xiang90 commented Mar 28, 2018

While running a 3.3.1 cluster with millions of keys in etcd, we have seen very long durations (5-10 seconds) on the index compaction stat measured here

The benchmark does not explain why the pause is at the level of seconds. Maybe there is something else going on?

@jcalvert
Contributor Author

The benchmark shows that compaction scales worse than linearly. Adding a benchmark for 5,000,000 on this same machine gives a result in excess of 2 seconds. Extrapolating to 10 million index entries would seem sufficient to explain a pause of that length.
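
For reference, the extra case was presumably added along these lines (the exact function name is my guess, following the pattern above):

func BenchmarkIndexCompact5000000(b *testing.B) { benchmarkIndexCompact(b, 5000000) }

Going from ~385 ms at 1,000,000 entries to over 2 seconds at 5,000,000, a tree with around 10 million entries would plausibly spend multiple seconds per compaction, matching the pauses observed in production.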

@xiang90
Contributor

xiang90 commented Mar 29, 2018

@jcalvert ok. makes sense if you have 10 million keys.
