Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance of UUID generation is slow – implementation uses getrandom syscall #109397

Closed
srosenberg opened this issue Aug 24, 2023 · 7 comments · Fixed by #124271
Closed

performance of UUID generation is slow – implementation uses getrandom syscall #109397

srosenberg opened this issue Aug 24, 2023 · 7 comments · Fixed by #124271
Assignees
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. E-quick-win Likely to be a quick win for someone experienced. E-starter Might be suitable for a starter project for new employees or team members. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@srosenberg
Copy link
Member

srosenberg commented Aug 24, 2023

UUID (gen_random_uuid() and uuid_v4()) [1] are currently implemented using crypto/rand. While this provides a strong guarantee wrt security, it's highly inefficient. UUID v4 is a randomly generated 128-bit value; nothing in the spec. [2] dictates that the implementation must use a cryptographically secure PRNG.

The current implementation effectively requires a system call (getrandom) for every new value of gen_random_uuid(). This was originally observed in [3]. I have independently stumbled upon it while straceing an execution of,

./workload fixtures import tpcc  --warehouses=100

which induced ~6 million syscalls to getrandom! The reason is we use gen_random_uuid() for automatically generating the rowid column in the history table [4].

Microbenchmarks

Using the existing microbenchmarks BenchmarkMakeV4 and BenchmarkFastMakeV4 on c5.9xlarge (with PMU), we can see that the performance is > 10x slower; cf. ~811 ns/op vs. 54 ns/op.

 perf stat -r 3 -C2 taskset -c 2 uuid_test -test.v --test.benchtime=10000000x -test.bench 'BenchmarkMakeV4' -test.run '^$'
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkMakeV4
BenchmarkMakeV4 	10000000	       811.3 ns/op
PASS
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkMakeV4
BenchmarkMakeV4 	10000000	       812.2 ns/op
PASS
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkMakeV4
BenchmarkMakeV4 	10000000	       812.3 ns/op
PASS

 Performance counter stats for 'CPU(s) 2' (3 runs):

           8128.12 msec cpu-clock                 #    1.000 CPUs utilized            ( +-  0.09% )
             48426      context-switches          #    5.953 K/sec                    ( +-  2.29% )
                 2      cpu-migrations            #    0.246 /sec
              1976      page-faults               #  242.898 /sec                     ( +- 19.25% )
       27698246183      cycles                    #    3.405 GHz                      ( +-  0.02% )
       32737716985      instructions              #    1.18  insn per cycle           ( +-  0.03% )
        3791218261      branches                  #  466.032 M/sec                    ( +-  0.06% )
          14589289      branch-misses             #    0.38% of all branches          ( +-  0.23% )

           8.12496 +- 0.00284 seconds time elapsed  ( +-  0.03% )
perf stat -r 3 -C2 taskset -c 2 uuid_test -test.v --test.benchtime=10000000x -test.bench 'BenchmarkFastMakeV4' -test.run '^$'
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkFastMakeV4
BenchmarkFastMakeV4 	10000000	        53.98 ns/op
PASS
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkFastMakeV4
BenchmarkFastMakeV4 	10000000	        53.98 ns/op
PASS
goos: linux
goarch: amd64
cpu: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
BenchmarkFastMakeV4
BenchmarkFastMakeV4 	10000000	        53.97 ns/op
PASS

 Performance counter stats for 'CPU(s) 2' (3 runs):

            544.35 msec cpu-clock                 #    0.999 CPUs utilized            ( +-  2.10% )
               254      context-switches          #  456.941 /sec                     ( +-  0.73% )
                 2      cpu-migrations            #    3.598 /sec
               798      page-faults               #    1.436 K/sec                    ( +-  0.11% )
        1875740389      cycles                    #    3.374 GHz                      ( +-  0.03% )
        4910290809      instructions              #    2.62  insn per cycle           ( +-  0.00% )
        1013553042      branches                  #    1.823 G/sec                    ( +-  0.00% )
            168504      branch-misses             #    0.02% of all branches          ( +-  0.91% )

          0.544744 +- 0.000349 seconds time elapsed  ( +-  0.06% )

[1] https://www.cockroachlabs.com/docs/stable/uuid
[2] https://www.ietf.org/rfc/rfc4122.txt
[3] #30236 (comment)
[4]

rowid uuid not null default gen_random_uuid(),

Jira issue: CRDB-30899

@srosenberg srosenberg added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Aug 24, 2023
@srosenberg
Copy link
Member Author

srosenberg commented Aug 24, 2023

NOTE: there might be wrong assumptions in the code which assume that the underlying implementation of uuid.MakeV4() will produce zero collisions. One such assumption is in [1]. There are likely others which would need to be audited. Also see the discussion in slack [2].

[1]

ret[i] = Var(hex.EncodeToString(uuid.MakeV4().GetBytes()))

[2] https://cockroachlabs.slack.com/archives/C4X2J0RH6/p1692725321047249

@srosenberg
Copy link
Member Author

@cockroachdb/sql-foundations

@knz knz added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Aug 24, 2023
@rafiss
Copy link
Collaborator

rafiss commented Aug 24, 2023

there might be wrong assumptions in the code which assume that the underlying implementation of uuid.MakeV4() will produce zero collisions.

We actually intentionally assume collision free UUIDs in some parts of the code. For example, the optimizer will skip certain uniqueness checks if you have a UUID primary key in some cases. @mgartner can describe that better, and also may have an opinion on this issue.

@mgartner
Copy link
Collaborator

Yes, that's an assumption we make by default for REGIONAL BY ROW tables that have unique keys with a default value of gen_random_uuid() or are updated/inserted with gen_random_uuid(). There's an option to opt out of this optimization if the risk is collisions is too high for a particular use case, but we believe it to be exceedingly unlikely that there will be a collision assuming a proper implementation of the UUID generator. If there's a known flaw, we need to do something about it.

@rafiss
Copy link
Collaborator

rafiss commented Sep 13, 2023

Also note that for PG compat, we may need to keep UUID generation through SQL as cryptographically secure https://security.stackexchange.com/questions/93902/is-postgress-uuid-generate-v4-securely-random/93905#93905

@rafiss
Copy link
Collaborator

rafiss commented Sep 19, 2023

We can make a cluster setting that determines which random source to use for gen_random_uuid.

@srosenberg
Copy link
Member Author

srosenberg commented May 8, 2024

As of 1.22, math/rand is based on a crypto block cipher, namely ChaCha8Rand [1],

Using Go 1.22, the new ChaCha8Rand generator is seeded from 256 bits of entropy and can generate 2²⁵⁶ possible first UUIDs. It does not need to worry about collisions.

Thus, we should be able to switch to math/rand without further changes, plus gain a non-trivial perf. boost.

[1] https://go.dev/blog/chacha8rand

@rafiss rafiss added E-starter Might be suitable for a starter project for new employees or team members. E-quick-win Likely to be a quick win for someone experienced. labels May 8, 2024
@rafiss rafiss self-assigned this May 15, 2024
@craig craig bot closed this as completed in 13dec2a May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. E-quick-win Likely to be a quick win for someone experienced. E-starter Might be suitable for a starter project for new employees or team members. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants