benchmarks are incorrect #91

Open · raulk opened this issue Nov 15, 2020 · 1 comment · May be fixed by #93

@raulk (Contributor) commented on Nov 15, 2020:

BenchmarkParallelCacheGet, BenchmarkCacheGetWithBuf, and BenchmarkCacheGetFn create 256MiB caches, then add as many elements as b.N. The problem with this setup is that b.N is very likely to overflow the 256MiB budget in some iteration, which means older elements get evicted, and the Get operations short-circuit with ErrNotFound (a miss) rather than incurring the full cost of a hit, which is what we're trying to measure.
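
For reference, the offending shape looks roughly like this (a standalone sketch against the public API, not the exact test code; the integer-string keys and the 64-byte value are assumptions):

```go
package freecache_test

import (
	"strconv"
	"testing"

	"github.com/coocood/freecache"
)

func BenchmarkCacheGet(b *testing.B) {
	cache := freecache.NewCache(256 * 1024 * 1024)
	var buf [64]byte

	// Populate with b.N entries. Once b.N grows past what 256MiB can hold,
	// the oldest entries are evicted.
	for i := 0; i < b.N; i++ {
		cache.Set([]byte(strconv.Itoa(i)), buf[:], 0)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Evicted keys return ErrNotFound immediately, so the benchmark
		// measures misses rather than the full cost of a hit.
		cache.Get([]byte(strconv.Itoa(i)))
	}
}
```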

Proof:

BenchmarkCacheGet
    cache_test.go:681: b.N: 1; hit rate: 1.000000
    cache_test.go:681: b.N: 100; hit rate: 1.000000
    cache_test.go:681: b.N: 10000; hit rate: 1.000000
    cache_test.go:681: b.N: 1000000; hit rate: 1.000000
    cache_test.go:681: b.N: 2904801; hit rate: 0.962555 <====

BenchmarkParallelCacheGet
    cache_test.go:725: b.N: 1; hit rate: 1.000000
    cache_test.go:725: b.N: 100; hit rate: 1.000000
    cache_test.go:725: b.N: 10000; hit rate: 1.000000
    cache_test.go:725: b.N: 1000000; hit rate: 1.000000
    cache_test.go:725: b.N: 21083162; hit rate: 0.000000 <====
    cache_test.go:725: b.N: 62143274; hit rate: 0.000000

The hard drop from 1 to 0 in BenchmarkParallelCacheGet occurs because:

  1. the benchmark populates the cache sequentially.
  2. with so many keys, only the most recent 256MiB worth of keys is retained.
  3. the benchmark then starts retrieving sequentially from 0, querying keys that have already been evicted.
  4. because Go partitions the iteration count, and all parallel goroutines start counting from 0, none of them ever reaches the key range whose entries were retained (see the sketch below).
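
Point 4 can be illustrated roughly like this (assuming the parallel benchmark keeps a per-goroutine counter that starts at 0, which is what the hit rates above suggest; same imports as the sketch above):

```go
func BenchmarkParallelCacheGet(b *testing.B) {
	cache := freecache.NewCache(256 * 1024 * 1024)
	var buf [64]byte
	for i := 0; i < b.N; i++ {
		cache.Set([]byte(strconv.Itoa(i)), buf[:], 0)
	}

	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		// Each goroutine starts its own counter at 0, so with a large b.N
		// every goroutine stays in the lowest key range, which has already
		// been evicted: the hit rate collapses to 0.
		i := 0
		for pb.Next() {
			cache.Get([]byte(strconv.Itoa(i)))
			i++
		}
	})
}
```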

In a nutshell, the benchmark numbers are not reliable. A fix would be to calculate the maximum number of entries the cache can hold without evicting, add exactly that many, and then iterate over them modulo that count when fetching.
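
A sketch of that fix (the entry count below is an assumed rough estimate; the real figure would be derived from freecache's per-entry overhead):

```go
func BenchmarkCacheGetFixed(b *testing.B) {
	cache := freecache.NewCache(256 * 1024 * 1024)
	var buf [64]byte

	// Assumed estimate of how many entries fit in 256MiB without eviction;
	// the real number depends on freecache's per-entry overhead.
	const maxEntries = 2_000_000
	for i := 0; i < maxEntries; i++ {
		cache.Set([]byte(strconv.Itoa(i)), buf[:], 0)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Wrap around with a modulo so every Get is a guaranteed hit.
		cache.Get([]byte(strconv.Itoa(i % maxEntries)))
	}
}
```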

@raulk (Contributor, Author) commented on Nov 15, 2020:

I'm working on a PR.

raulk added a commit to raulk/freecache referencing this issue on Nov 15, 2020.
raulk linked a pull request on Nov 15, 2020 that will close this issue.