Fix tx buffer inconsistency if there are unordered key writes in one tx. #17263

siyuanfoundation · 2024-01-16T22:32:55Z

Recommit #17228 with the buffer fix.

#17247 was fixed by reverting #17228. But the underlying problem with the buffer should still be fixed.

No failure locally with

EXPECT_DEBUG=true GO_TEST_FLAGS='-run=TestRobustness/Kubernetes/HighTraffic/ClusterOfSize3 --timeout=3000m --count=100 --failfast -v' RESULTS_DIR=./tmp/results make test-robustness

and just the [RaftAfterSaveSnapPanic, MemberReplace] Failpoints.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

siyuanfoundation · 2024-01-16T23:33:55Z

/retest

fuweid · 2024-01-17T02:41:36Z

Hi @siyuanfoundation you can try to limit cpu by the following command and use patch #17248

EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results taskset -c 0,1,2 make test-robustness

Hope it can help.

siyuanfoundation · 2024-01-17T04:45:20Z

> Hi @siyuanfoundation you can try to limit cpu by the following command and use patch #17248
> EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results taskset -c 0,1,2 make test-robustness
> Hope it can help.

Thank you @fuweid, it did help!

siyuanfoundation · 2024-01-17T05:47:01Z

cc @serathius @ahrtr @fuweid

server/storage/backend/tx_buffer.go

fuweid

LGTM!

server/storage/backend/batch_tx_test.go

ahrtr · 2024-01-18T17:56:59Z

Please read #17247 (comment)

ahrtr

#17263 (comment)

siyuanfoundation · 2024-01-26T02:41:58Z

/retest

server/storage/backend/batch_tx_test.go

fuweid

LGTM

serathius · 2024-01-29T16:05:42Z

The change looks good to me, it fixes a clear problem and adds regression tests, however #17228 did the same. Can we take some additional steps to ensure we prevent another regression?

I don't think we can depend on reviewers to analyse and find edge cases, as that failed before. One idea I have is to do comparison testing between buffered and unbuffered transactions. What do you think?
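The comparison-testing idea could be sketched roughly as follows. This is a toy model only, not etcd's backend code: the `buffered` type, the `compare` function, and the plain map used as the unbuffered reference are all hypothetical names made up for illustration. It replays the same random writes against both models and checks that they agree after every commit.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

type kv struct{ key, val string }

// buffered is a toy stand-in (an assumption, not etcd's txBuffer): writes are
// appended in arrival order and merged into a sorted, deduplicated view on commit.
type buffered struct {
	pending   []kv
	committed []kv // sorted by key, one entry per key
}

func (b *buffered) put(k, v string) { b.pending = append(b.pending, kv{k, v}) }

func (b *buffered) commit() {
	merged := make([]kv, 0, len(b.committed)+len(b.pending))
	merged = append(merged, b.committed...)
	merged = append(merged, b.pending...)
	// A stable sort keeps later writes after earlier ones for equal keys.
	sort.SliceStable(merged, func(i, j int) bool { return merged[i].key < merged[j].key })
	out := make([]kv, 0, len(merged))
	for _, e := range merged {
		if n := len(out); n > 0 && out[n-1].key == e.key {
			out[n-1] = e // last write wins
			continue
		}
		out = append(out, e)
	}
	b.committed, b.pending = out, nil
}

// snapshot returns the unbuffered reference state sorted by key.
func snapshot(m map[string]string) []kv {
	out := make([]kv, 0, len(m))
	for k, v := range m {
		out = append(out, kv{k, v})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].key < out[j].key })
	return out
}

func equal(a, b []kv) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// compare applies the same random, unordered writes to both models and
// reports the first commit at which their visible state differs.
func compare(seed int64, steps int) error {
	rng := rand.New(rand.NewSource(seed))
	buf, ref := &buffered{}, map[string]string{}
	for i := 0; i < steps; i++ {
		if rng.Intn(4) == 0 {
			buf.commit()
			if want := snapshot(ref); !equal(buf.committed, want) {
				return fmt.Errorf("step %d: buffered %v != unbuffered %v", i, buf.committed, want)
			}
			continue
		}
		k, v := fmt.Sprintf("k%d", rng.Intn(8)), fmt.Sprintf("v%d", i)
		buf.put(k, v)
		ref[k] = v
	}
	return nil
}

func main() {
	for seed := int64(0); seed < 100; seed++ {
		if err := compare(seed, 1000); err != nil {
			fmt.Println("mismatch:", err)
			return
		}
	}
	fmt.Println("buffered and unbuffered models agree")
}
```

A small key space (eight keys here) is a deliberate choice: it forces frequent overwrites and out-of-order writes of the same key, which is the kind of input this PR is about.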

siyuanfoundation · 2024-03-25T23:58:28Z

cc @ahrtr
Added a benchmark test for the writeback function.
Ran the test 3 times. The results are somewhat noisy, but there is no clear indication of a performance regression.
Before the PR:

BenchmarkWritebackSeqBatches1BatchSize10000-8      	     100	 132904368, 137525118, 134134904  ns/op
BenchmarkWritebackSeqBatches10BatchSize1000-8      	     100	 135975057, 144034473, 138393072 ns/op
BenchmarkWritebackSeqBatches100BatchSize100-8      	     100	 145159646, 147396648, 148373301 ns/op
BenchmarkWritebackSeqBatches1000BatchSize10-8      	     100	 225271045, 225629845, 229453516 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize1-8    	     100	  29684442, 29051643, 31323272 ns/op
BenchmarkWritebackNonSeqBatches10000BatchSize1-8   	     100	1011180750, 1012368726, 1013996570 ns/op
BenchmarkWritebackNonSeqBatches100BatchSize10-8    	     100	  22771697, 22785210, 22529997 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize10-8   	     100	 233131758, 233625789, 238527133 ns/op

After the PR:

BenchmarkWritebackSeqBatches1BatchSize10000-8      	     100	 136669713, 131847663, 130722632 ns/op
BenchmarkWritebackSeqBatches10BatchSize1000-8      	     100	 139880124, 139196486, 137805060 ns/op
BenchmarkWritebackSeqBatches100BatchSize100-8      	     100	 150674477, 146649422, 146416992 ns/op
BenchmarkWritebackSeqBatches1000BatchSize10-8      	     100	 226596374, 228097908, 223997282 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize1-8    	     100	  30098653, 30031651, 29341949 ns/op
BenchmarkWritebackNonSeqBatches10000BatchSize1-8   	     100	 990611851, 999573564, 994189677 ns/op
BenchmarkWritebackNonSeqBatches100BatchSize10-8    	     100	  22655668, 22641929, 21832902 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize10-8   	     100	 234623372, 233671979, 231825643 ns/op

siyuanfoundation · 2024-03-26T03:53:00Z

> The change looks good to me, it fixes a clear problem and adds regression tests, however #17228 did the same. Can we take some additional steps to ensure we prevent another regression?
>
> I don't think we can depend on reviewers to analyse and find edge cases, as that failed before. One idea I have is to do comparison testing between buffered and unbuffered transactions. What do you think?

cc @serathius
This bug depended on the coincidental timing of Commit() across two different buckets; in the tests I had to call tx.Commit() twice in a row to reproduce it. It is very hard to rule out a regression deterministically. In this case, we just have to be careful, add more unit tests, and rely on the non-deterministic robustness tests to detect regressions.

serathius · 2024-03-26T08:07:07Z

> Ran the test 3 times. The results are kind of noisy, but there is no clear indication of performance regression.
> Before the PR:

Can you use benchstat ? https://sourcegraph.com/blog/go/gophercon-2019-optimizing-go-code-without-a-blindfold

siyuanfoundation · 2024-03-26T22:37:00Z

> > Ran the test 3 times. The results are kind of noisy, but there is no clear indication of performance regression.
> > Before the PR:
>
> Can you use benchstat? https://sourcegraph.com/blog/go/gophercon-2019-optimizing-go-code-without-a-blindfold

Ok. I ran

go clean -testcache && go test -bench=. -count=10 -timeout 0 > benchmark_results.txt
benchstat benchmark_results.txt

Here are the results: (they are basically the same)
Before

WritebackSeqBatches1BatchSize10000-8               128.2m ± 1%
WritebackSeqBatches10BatchSize1000-8               130.7m ± 1%
WritebackSeqBatches100BatchSize100-8               139.5m ± 1%
WritebackSeqBatches1000BatchSize10-8               220.6m ± 4%
WritebackNonSeqBatches1000BatchSize1-8             28.34m ± 2%
WritebackNonSeqBatches10000BatchSize1-8             1.002 ± 1%
WritebackNonSeqBatches100BatchSize10-8             21.64m ± 2%
WritebackNonSeqBatches1000BatchSize10-8            226.3m ± 1%

After

WritebackSeqBatches1BatchSize10000-8              128.3m ± 1%
WritebackSeqBatches10BatchSize1000-8              130.8m ± 3%
WritebackSeqBatches100BatchSize100-8              140.0m ± 1%
WritebackSeqBatches1000BatchSize10-8              216.3m ± 4%
WritebackNonSeqBatches1000BatchSize1-8            27.49m ± 2%
WritebackNonSeqBatches10000BatchSize1-8           966.6m ± 1%
WritebackNonSeqBatches100BatchSize10-8            20.47m ± 2%
WritebackNonSeqBatches1000BatchSize10-8           221.3m ± 1%

ahrtr

/lgtm

Thanks

serathius · 2024-03-27T08:24:44Z

> Here are the results: (they are basically the same)

Benchstat does not change the results; it just makes them readable and makes it easy to check whether the differences are statistically significant.

If you provide both the old and the new benchmark results to benchstat, it will also report the performance delta and p-value.

serathius · 2024-03-27T08:31:06Z

server/storage/backend/batch_tx_test.go

+		}
+		if !isSeq {
+			shuffleList(ks)
+		}


Nit: When a benchmark needs some input data, it's best not to include its preparation in the benchmark's measurements, to avoid noise.

You can do that by preparing the input data first, and then calling b.ResetTimer() before running the code you want to benchmark.

serathius · 2024-03-27T08:41:38Z

server/storage/backend/batch_tx_test.go

It's great to have regression tests; however, the scenarios we are covering are so intricate and complicated that it scares me. If there is another issue hiding in the transaction logic, we don't have any chance of finding it using this approach.

fuweid

LGTM

Thanks for adding benchmark test suites.

siyuanfoundation · 2024-03-28T02:30:39Z

> > Here are the results: (they are basically the same)
>
> Benchstat does not change the results; it just makes them readable and makes it easy to check whether the differences are statistically significant.
>
> If you provide both the old and the new benchmark results to benchstat, it will also report the performance delta and p-value.

Reran benchstat twice with count=10,
run 1:

                                        │ benchmark_before.txt │        benchmark_after.txt         │
                                        │        sec/op        │   sec/op     vs base               │
WritebackSeqBatches1BatchSize10000-8               127.7m ± 1%   127.2m ± 3%       ~ (p=0.796 n=10)
WritebackSeqBatches10BatchSize1000-8               129.9m ± 6%   129.9m ± 1%       ~ (p=0.971 n=10)
WritebackSeqBatches100BatchSize100-8               138.6m ± 1%   137.0m ± 1%  -1.17% (p=0.029 n=10)
WritebackSeqBatches1000BatchSize10-8               215.7m ± 1%   212.7m ± 4%       ~ (p=0.052 n=10)
WritebackNonSeqBatches1000BatchSize1-8             28.48m ± 2%   28.05m ± 2%  -1.48% (p=0.035 n=10)
WritebackNonSeqBatches10000BatchSize1-8            981.0m ± 1%   949.6m ± 0%  -3.20% (p=0.000 n=10)
WritebackNonSeqBatches100BatchSize10-8             20.34m ± 1%   21.21m ± 5%  +4.27% (p=0.000 n=10)
WritebackNonSeqBatches1000BatchSize10-8            216.3m ± 1%   216.4m ± 4%       ~ (p=0.796 n=10)

run 2:

                                        │ benchmark_before_0.txt │       benchmark_after_0.txt        │
                                        │         sec/op         │   sec/op     vs base               │
WritebackSeqBatches1BatchSize10000-8                 127.8m ± 3%   130.0m ± 2%  +1.73% (p=0.023 n=10)
WritebackSeqBatches10BatchSize1000-8                 127.2m ± 1%   130.9m ± 3%  +2.95% (p=0.001 n=10)
WritebackSeqBatches100BatchSize100-8                 135.4m ± 1%   139.2m ± 3%  +2.76% (p=0.001 n=10)
WritebackSeqBatches1000BatchSize10-8                 212.4m ± 2%   213.1m ± 0%       ~ (p=0.165 n=10)
WritebackNonSeqBatches1000BatchSize1-8               27.82m ± 2%   28.80m ± 3%  +3.51% (p=0.001 n=10)
WritebackNonSeqBatches10000BatchSize1-8              970.1m ± 2%   951.8m ± 1%  -1.89% (p=0.000 n=10)
WritebackNonSeqBatches100BatchSize10-8               20.23m ± 1%   20.39m ± 5%       ~ (p=0.165 n=10)
WritebackNonSeqBatches1000BatchSize10-8              217.1m ± 1%   216.6m ± 1%       ~ (p=0.529 n=10)

serathius · 2024-03-28T11:07:49Z

So in the updated benchmark results you can see that it causes a ~2% regression in performance. Still, I think the impact on overall etcd performance should not be noticeable, and we should pick correctness here.

siyuanfoundation marked this pull request as draft January 16, 2024 22:59

k8s-ci-robot added the do-not-merge/work-in-progress label Jan 16, 2024

siyuanfoundation force-pushed the txBuf1 branch 2 times, most recently from 188163b to ea46154 Compare January 17, 2024 01:24

siyuanfoundation force-pushed the txBuf1 branch from ea46154 to a6132e5 Compare January 17, 2024 04:29

siyuanfoundation changed the title ~~move buffer dedupe before overwriting read buffer.~~ skip buffer dedupe for seq buffer. Jan 17, 2024

siyuanfoundation marked this pull request as ready for review January 17, 2024 05:18

k8s-ci-robot removed the do-not-merge/work-in-progress label Jan 17, 2024

fuweid reviewed Jan 17, 2024

server/storage/backend/tx_buffer.go

ahrtr mentioned this pull request Jan 17, 2024

Duplicated watch event detected in robustness test #17247

Closed

4 tasks

siyuanfoundation force-pushed the txBuf1 branch 3 times, most recently from 9816f82 to f6a5621 Compare January 17, 2024 22:24

siyuanfoundation changed the title ~~skip buffer dedupe for seq buffer.~~ fix buffer dedupe bug. Jan 17, 2024

siyuanfoundation force-pushed the txBuf1 branch 2 times, most recently from 1abf2f9 to 440a140 Compare January 18, 2024 00:32

fuweid approved these changes Jan 18, 2024

server/storage/backend/batch_tx_test.go

siyuanfoundation force-pushed the txBuf1 branch from 440a140 to 9b4acde Compare January 18, 2024 16:46

ahrtr requested changes Jan 18, 2024

siyuanfoundation marked this pull request as draft January 19, 2024 17:23

k8s-ci-robot added the do-not-merge/work-in-progress label Jan 19, 2024

siyuanfoundation force-pushed the txBuf1 branch from 9b4acde to 9d4d3af Compare January 25, 2024 21:23

siyuanfoundation changed the title ~~fix buffer dedupe bug.~~ Fix tx buffer inconsistency if there are unordered key writes in one tx. Jan 25, 2024

siyuanfoundation marked this pull request as ready for review January 25, 2024 23:40

k8s-ci-robot removed the do-not-merge/work-in-progress label Jan 25, 2024

serathius reviewed Jan 26, 2024

server/storage/backend/batch_tx_test.go

fuweid approved these changes Jan 29, 2024

serathius mentioned this pull request Jan 29, 2024

Add verification on keys: should be always monotonically increasing #17290

Closed

siyuanfoundation mentioned this pull request Feb 6, 2024

Add VerifyTxConsistency to backend. #17359

Merged

siyuanfoundation force-pushed the txBuf1 branch 4 times, most recently from 1cd20f4 to 003a32a Compare March 25, 2024 23:22

siyuanfoundation force-pushed the txBuf1 branch 2 times, most recently from 34de0c8 to 1667df0 Compare March 26, 2024 00:56

ahrtr approved these changes Mar 27, 2024

serathius reviewed Mar 27, 2024

serathius approved these changes Mar 27, 2024

siyuanfoundation force-pushed the txBuf1 branch from 1667df0 to 27fc1a0 Compare March 27, 2024 16:41

siyuanfoundation added 3 commits March 27, 2024 17:03

Add benchmark tests for buffer writeback function.

4346a43

Signed-off-by: Siyuan Zhang <sizhang@google.com>

Add tx buffer test case of unordered key writes.

7be3606

Signed-off-by: Siyuan Zhang <sizhang@google.com>

add key dedupe when a write buffer writeback to an empty read buffer bucket.

0a54362

Signed-off-by: Siyuan Zhang <sizhang@google.com>

siyuanfoundation force-pushed the txBuf1 branch from 27fc1a0 to 0a54362 Compare March 27, 2024 17:03

fuweid approved these changes Mar 27, 2024

serathius merged commit a22ae62 into etcd-io:main Mar 28, 2024
39 checks passed
