Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix concurrent map panic when querying and committing concurrently #34

Merged
merged 1 commit into from
Mar 9, 2022

Conversation

p0mvn
Copy link
Member

@p0mvn p0mvn commented Mar 9, 2022

benchstat:

name                                                                  old time/op    new time/op    delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16    2.99µs ± 4%    2.98µs ± 7%      ~     (p=0.690 n=5+5)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16    9.93µs ± 5%    9.60µs ± 5%      ~     (p=0.095 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                     449ns ±12%     354ns ± 7%   -21.22%  (p=0.008 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                    13.3µs ± 1%    13.3µs ± 4%      ~     (p=1.000 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                     64.0ms ± 2%    64.7ms ± 5%      ~     (p=0.548 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      1.22s ±14%     1.22s ±10%      ~     (p=1.000 n=5+5)
Medium/goleveldb-100000-100-16-40/update-16                              160µs ±13%     189µs ±23%      ~     (p=0.095 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                              22.6ms ± 8%    23.5ms ± 4%      ~     (p=0.310 n=5+5)

name                                                                  old alloc/op   new alloc/op   delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16      814B ± 0%      814B ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16    1.41kB ± 1%    1.41kB ± 1%      ~     (p=0.952 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                    14.6B ±112%      0.0B       -100.00%  (p=0.008 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                    2.00kB ± 1%    1.99kB ± 0%      ~     (p=0.183 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                     29.3MB ± 0%    29.3MB ± 0%      ~     (p=0.690 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      276MB ± 0%     276MB ± 0%    -0.02%  (p=0.029 n=4+4)
Medium/goleveldb-100000-100-16-40/update-16                             48.8kB ± 5%    49.9kB ± 4%      ~     (p=0.421 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                              6.24MB ± 7%    6.20MB ± 6%      ~     (p=1.000 n=5+5)

name                                                                  old allocs/op  new allocs/op  delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16      16.0 ± 0%      16.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16      24.0 ± 0%      24.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                      0.00           0.00           ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                      34.0 ± 0%      34.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                       523k ± 0%      523k ± 0%      ~     (p=0.167 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      4.71M ± 0%     4.71M ± 0%      ~     (p=0.171 n=4+4)
Medium/goleveldb-100000-100-16-40/update-16                                527 ±12%       543 ±12%      ~     (p=0.421 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                               70.6k ± 7%     70.0k ± 5%      ~     (p=1.000 n=5+5)
roman@akhtariev:~/cosmos/iavl (dev/iavl_data_locality)$ git branch roman/concurrent-map-fix
roman@akhtariev:~/cosmos/iavl (dev/iavl_data_locality)$ git checkout concurrent-map-fix
error: pathspec 'concurrent-map-fix' did not match any file(s) known to git
roman@akhtariev:~/cosmos/iavl (dev/iavl_data_locality)$ git checkout roman/concurrent-map-fix
M       benchmarks/bench_test.go
M       mutable_tree.go
Switched to branch 'roman/concurrent-map-fix'
roman@akhtariev:~/cosmos/iavl (roman/concurrent-map-fix)$ /usr/local/go/bin/go test -count=5 -benchmem -run=^$ -bench ^BenchmarkMedium$ github.com/cosmos/iavl/benchmarks > new2.log
roman@akhtariev:~/cosmos/iavl (roman/concurrent-map-fix)$ benchstat old.log new2.logname                                                                  old time/op    new time/op    delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16    2.99µs ± 4%    2.92µs ± 4%      ~     (p=0.222 n=5+5)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16    9.93µs ± 5%    9.76µs ± 2%      ~     (p=0.222 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                     449ns ±12%     355ns ± 2%   -20.95%  (p=0.008 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                    13.3µs ± 1%    13.2µs ± 2%      ~     (p=0.548 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                     64.0ms ± 2%    62.8ms ± 4%      ~     (p=0.548 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      1.22s ±14%     1.21s ±12%      ~     (p=1.000 n=5+5)
Medium/goleveldb-100000-100-16-40/update-16                              160µs ±13%     172µs ±18%      ~     (p=0.095 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                              22.6ms ± 8%    24.0ms ± 4%      ~     (p=0.095 n=5+5)

name                                                                  old alloc/op   new alloc/op   delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16      814B ± 0%      814B ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16    1.41kB ± 1%    1.41kB ± 0%      ~     (p=0.183 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                    14.6B ±112%      0.0B       -100.00%  (p=0.008 n=5+5)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                    2.00kB ± 1%    1.99kB ± 0%      ~     (p=0.135 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                     29.3MB ± 0%    29.3MB ± 0%      ~     (p=0.881 n=5+5)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      276MB ± 0%     276MB ± 0%      ~     (p=0.200 n=4+4)
Medium/goleveldb-100000-100-16-40/update-16                             48.8kB ± 5%    51.3kB ± 8%      ~     (p=0.151 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                              6.24MB ± 7%    6.35MB ± 3%      ~     (p=0.841 n=5+5)

name                                                                  old allocs/op  new allocs/op  delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-16      16.0 ± 0%      16.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-16      24.0 ± 0%      24.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-fast-16                      0.00           0.00           ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-16                      34.0 ± 0%      34.0 ± 0%      ~     (all equal)
Medium/goleveldb-100000-100-16-40/iteration-fast-16                       523k ± 0%      523k ± 0%      ~     (p=0.571 n=5+4)
Medium/goleveldb-100000-100-16-40/iteration-slow-16                      4.71M ± 0%     4.71M ± 0%    -0.00%  (p=0.029 n=4+4)
Medium/goleveldb-100000-100-16-40/update-16                                527 ±12%       549 ±13%      ~     (p=0.690 n=5+5)
Medium/goleveldb-100000-100-16-40/block-16                               70.6k ± 7%     71.5k ± 2%      ~     (p=1.000 n=5+5)

Perf tested using https://github.com/p0mvn/perf-osmo with the following settings:

numConnections: 100
numCallsPerConnection: 10000
heightsToCover: 1000

Meaning, it was concurrently sending 10000 queries over the last 1000 heights chosen randomly through 100 connections. No failure has occurred.

Also tried the following:

  numConnections: 200
  numCallsPerConnection: 1000
  heightsToCover: 1000

Sending 1000 queries over the last 1000 heights through 200 connections.

Result: The deadlock has not occurred and the node handled load well

@p0mvn p0mvn requested a review from ValarDragon March 9, 2022 16:54
Copy link
Member

@ValarDragon ValarDragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this makes sense, so we update the fast node cache during commit not during processing, which already takes a mutex on the whole tree/cache

@ValarDragon ValarDragon merged commit d136ed6 into dev/iavl_data_locality Mar 9, 2022
@ValarDragon ValarDragon deleted the roman/concurrent-map-fix branch March 9, 2022 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants