Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tx buffer inconsistency if there are unordered key writes in one tx. #17263

Merged
merged 3 commits into from
Mar 28, 2024

Conversation

siyuanfoundation
Copy link
Contributor

@siyuanfoundation siyuanfoundation commented Jan 16, 2024

Recommit #17228 with buf fix.

#17247 was fixed by reverting #17228. But the underlying problem with the buffer should still be fixed.

No failure locally with

EXPECT_DEBUG=true GO_TEST_FLAGS='-run=TestRobustness/Kubernetes/HighTraffic/ClusterOfSize3 --timeout=3000m --count=100 --failfast -v' RESULTS_DIR=./tmp/results make test-robustness

and just the [RaftAfterSaveSnapPanic, MemberReplace] Failpoints.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@siyuanfoundation
Copy link
Contributor Author

/retest

@fuweid
Copy link
Member

fuweid commented Jan 17, 2024

Hi @siyuanfoundation you can try to limit cpu by the following command and use patch #17248

EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results taskset -c 0,1,2 make test-robustness

Hope it can help.

@siyuanfoundation siyuanfoundation changed the title move buffer dedupe before overwriting read buffer. skip buffer dedupe for seq buffer. Jan 17, 2024
@siyuanfoundation
Copy link
Contributor Author

Hi @siyuanfoundation you can try to limit cpu by the following command and use patch #17248

EXPECT_DEBUG=true GO_TEST_FLAGS=${GO_TEST_FLAGS} RESULTS_DIR=/tmp/results taskset -c 0,1,2 make test-robustness

Hope it can help.

Thank you @fuweid, it did help!

@siyuanfoundation siyuanfoundation marked this pull request as ready for review January 17, 2024 05:18
@siyuanfoundation
Copy link
Contributor Author

cc @serathius @ahrtr @fuweid

@siyuanfoundation siyuanfoundation changed the title skip buffer dedupe for seq buffer. fix buffer dedupe bug. Jan 17, 2024
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

server/storage/backend/batch_tx_test.go Show resolved Hide resolved
@ahrtr
Copy link
Member

ahrtr commented Jan 18, 2024

Please read #17247 (comment)

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siyuanfoundation siyuanfoundation marked this pull request as draft January 19, 2024 17:23
@siyuanfoundation siyuanfoundation changed the title fix buffer dedupe bug. Fix tx buffer inconsistency if there are unordered key writes in one tx. Jan 25, 2024
@siyuanfoundation siyuanfoundation marked this pull request as ready for review January 25, 2024 23:40
@siyuanfoundation
Copy link
Contributor Author

/retest

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@serathius
Copy link
Member

The change looks good to me, it fixes a clear problem and adds regression tests, however #17228 did the same. Can we take some additional steps to ensure we prevent another regression?

I don't we can depend on reviewers trying to analyse and find edge cases as it failed before. One idea I have is to do comparison testing between buffered and unbuffered transaction. What do you think?

@siyuanfoundation
Copy link
Contributor Author

siyuanfoundation commented Mar 25, 2024

cc @ahrtr
Added benchmark test for writeback function
Ran the test 3 times. The results are kind of noisy, but there is no clear indication of performance regression.
Before the PR:

BenchmarkWritebackSeqBatches1BatchSize10000-8      	     100	 132904368, 137525118, 134134904  ns/op
BenchmarkWritebackSeqBatches10BatchSize1000-8      	     100	 135975057, 144034473, 138393072 ns/op
BenchmarkWritebackSeqBatches100BatchSize100-8      	     100	 145159646, 147396648, 148373301 ns/op
BenchmarkWritebackSeqBatches1000BatchSize10-8      	     100	 225271045, 225629845, 229453516 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize1-8    	     100	  29684442, 29051643, 31323272 ns/op
BenchmarkWritebackNonSeqBatches10000BatchSize1-8   	     100	1011180750, 1012368726, 1013996570 ns/op
BenchmarkWritebackNonSeqBatches100BatchSize10-8    	     100	  22771697, 22785210, 22529997 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize10-8   	     100	 233131758, 233625789, 238527133 ns/op

After the PR:

enchmarkWritebackSeqBatches1BatchSize10000-8      	     100	 136669713, 131847663, 130722632 ns/op
BenchmarkWritebackSeqBatches10BatchSize1000-8      	     100	 139880124, 139196486, 137805060 ns/op
BenchmarkWritebackSeqBatches100BatchSize100-8      	     100	 150674477, 146649422, 146416992 ns/op
BenchmarkWritebackSeqBatches1000BatchSize10-8      	     100	 226596374, 228097908, 223997282 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize1-8    	     100	  30098653, 30031651, 29341949 ns/op
BenchmarkWritebackNonSeqBatches10000BatchSize1-8   	     100	 990611851, 999573564, 994189677 ns/op
BenchmarkWritebackNonSeqBatches100BatchSize10-8    	     100	  22655668, 22641929, 21832902 ns/op
BenchmarkWritebackNonSeqBatches1000BatchSize10-8   	     100	 234623372, 233671979, 231825643 ns/op

@siyuanfoundation
Copy link
Contributor Author

siyuanfoundation commented Mar 26, 2024

The change looks good to me, it fixes a clear problem and adds regression tests, however #17228 did the same. Can we take some additional steps to ensure we prevent another regression?

I don't we can depend on reviewers trying to analyse and find edge cases as it failed before. One idea I have is to do comparison testing between buffered and unbuffered transaction. What do you think?

cc @serathius
This bug happened because of the coincidence of Commit() timing between two different buckets, like in the tests I had to do two consecutive tx.Commit() to reproduce the bug. It is very hard to ensure there is no regression deterministically. In this case, we just have to be careful with more unit tests and rely on non-deterministic robustness test to detect the regression.

@serathius
Copy link
Member

Ran the test 3 times. The results are kind of noisy, but there is no clear indication of performance regression.
Before the PR:

Can you use benchstat ? https://sourcegraph.com/blog/go/gophercon-2019-optimizing-go-code-without-a-blindfold

@siyuanfoundation
Copy link
Contributor Author

Ran the test 3 times. The results are kind of noisy, but there is no clear indication of performance regression.
Before the PR:

Can you use benchstat ? https://sourcegraph.com/blog/go/gophercon-2019-optimizing-go-code-without-a-blindfold

Ok. I ran

go clean -testcache && go test -bench=. -count=10 -timeout 0 > benchmark_results.txt
benchstat benchmark_results.txt

Here are the results: (they are basically the same)
Before

WritebackSeqBatches1BatchSize10000-8               128.2m ± 1%
WritebackSeqBatches10BatchSize1000-8               130.7m ± 1%
WritebackSeqBatches100BatchSize100-8               139.5m ± 1%
WritebackSeqBatches1000BatchSize10-8               220.6m ± 4%
WritebackNonSeqBatches1000BatchSize1-8             28.34m ± 2%
WritebackNonSeqBatches10000BatchSize1-8             1.002 ± 1%
WritebackNonSeqBatches100BatchSize10-8             21.64m ± 2%
WritebackNonSeqBatches1000BatchSize10-8            226.3m ± 1%

After

WritebackSeqBatches1BatchSize10000-8              128.3m ± 1%
WritebackSeqBatches10BatchSize1000-8              130.8m ± 3%
WritebackSeqBatches100BatchSize100-8              140.0m ± 1%
WritebackSeqBatches1000BatchSize10-8              216.3m ± 4%
WritebackNonSeqBatches1000BatchSize1-8            27.49m ± 2%
WritebackNonSeqBatches10000BatchSize1-8           966.6m ± 1%
WritebackNonSeqBatches100BatchSize10-8            20.47m ± 2%
WritebackNonSeqBatches1000BatchSize10-8           221.3m ± 1%

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks

@serathius
Copy link
Member

Here are the results: (they are basically the same)

Benchstat do not change the results, it just makes them readable and makes it easy to check if they are statistically significant

If you provide both old and new benchmark results to benchstat, it will provide also the performance delta and p-value

}
if !isSeq {
shuffleList(ks)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: When benchmark needs some input data, it's best to avoid not include it in the measurements of the benchmark to avoid noise.

You can do that by preparing the input data first, and then calling b.ResetTimer() before running the code you want to benchmark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to have a regression tests, however I'm thinking that the scenarios we are covering are so intricate and complicated that it scares me. If there is another issue hiding in the transaction logic, we don't have any chance of finding it using this approach.

Signed-off-by: Siyuan Zhang <sizhang@google.com>
Signed-off-by: Siyuan Zhang <sizhang@google.com>
…bucket.

Signed-off-by: Siyuan Zhang <sizhang@google.com>
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for adding benchmark test suites.

@siyuanfoundation
Copy link
Contributor Author

Here are the results: (they are basically the same)

Benchstat do not change the results, it just makes them readable and makes it easy to check if they are statistically significant

If you provide both old and new benchmark results to benchstat, it will provide also the performance delta and p-value

Reran benchstat twice with count=10,
run 1:

                                        │ benchmark_before.txt │        benchmark_after.txt         │
                                        │        sec/op        │   sec/op     vs base               │
WritebackSeqBatches1BatchSize10000-8               127.7m ± 1%   127.2m ± 3%       ~ (p=0.796 n=10)
WritebackSeqBatches10BatchSize1000-8               129.9m ± 6%   129.9m ± 1%       ~ (p=0.971 n=10)
WritebackSeqBatches100BatchSize100-8               138.6m ± 1%   137.0m ± 1%  -1.17% (p=0.029 n=10)
WritebackSeqBatches1000BatchSize10-8               215.7m ± 1%   212.7m ± 4%       ~ (p=0.052 n=10)
WritebackNonSeqBatches1000BatchSize1-8             28.48m ± 2%   28.05m ± 2%  -1.48% (p=0.035 n=10)
WritebackNonSeqBatches10000BatchSize1-8            981.0m ± 1%   949.6m ± 0%  -3.20% (p=0.000 n=10)
WritebackNonSeqBatches100BatchSize10-8             20.34m ± 1%   21.21m ± 5%  +4.27% (p=0.000 n=10)
WritebackNonSeqBatches1000BatchSize10-8            216.3m ± 1%   216.4m ± 4%       ~ (p=0.796 n=10)

run 2:

                                        │ benchmark_before_0.txt │       benchmark_after_0.txt        │
                                        │         sec/op         │   sec/op     vs base               │
WritebackSeqBatches1BatchSize10000-8                 127.8m ± 3%   130.0m ± 2%  +1.73% (p=0.023 n=10)
WritebackSeqBatches10BatchSize1000-8                 127.2m ± 1%   130.9m ± 3%  +2.95% (p=0.001 n=10)
WritebackSeqBatches100BatchSize100-8                 135.4m ± 1%   139.2m ± 3%  +2.76% (p=0.001 n=10)
WritebackSeqBatches1000BatchSize10-8                 212.4m ± 2%   213.1m ± 0%       ~ (p=0.165 n=10)
WritebackNonSeqBatches1000BatchSize1-8               27.82m ± 2%   28.80m ± 3%  +3.51% (p=0.001 n=10)
WritebackNonSeqBatches10000BatchSize1-8              970.1m ± 2%   951.8m ± 1%  -1.89% (p=0.000 n=10)
WritebackNonSeqBatches100BatchSize10-8               20.23m ± 1%   20.39m ± 5%       ~ (p=0.165 n=10)
WritebackNonSeqBatches1000BatchSize10-8              217.1m ± 1%   216.6m ± 1%       ~ (p=0.529 n=10)

@serathius
Copy link
Member

serathius commented Mar 28, 2024

So in the updated benchmark results you can see that it causes ~2% regression in performance. Still I think impact on overall etcd performance should be not noticable and we should pick correctness here.

@serathius serathius merged commit a22ae62 into etcd-io:main Mar 28, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants