Add support for xxhash #38

Merged
merged 1 commit into from
Oct 7, 2022

varunkumar

This PR adds support for xxhash, one of the fastest non-cryptographic hash functions. It replaces md5 as the hash function used when 128 hash bits are needed.
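To illustrate what a 128-bit hash provides to a sliced Bloom filter, here is a minimal sketch, not the module's actual code. It assumes the third-party `xxhash` package is installed and falls back to `hashlib.md5` otherwise. The helper name `slice_indexes` and the little-endian 4 x 32-bit split are assumptions made for illustration; the default sizes are taken from the benchmark output in this comment.

```python
import hashlib
import struct

try:
    import xxhash  # third-party "xxhash" package; assumed to be installed

    def digest_128(data: bytes) -> bytes:
        """Return a 16-byte xxh3_128 digest."""
        return xxhash.xxh3_128(data).digest()
except ImportError:
    def digest_128(data: bytes) -> bytes:
        """Fallback: 16-byte md5 digest from the standard library."""
        return hashlib.md5(data).digest()


def slice_indexes(key: str, num_slices: int = 4, bits_per_slice: int = 119814):
    """Hypothetical helper: derive one bit index per slice from 128 hash bits."""
    if num_slices > 4:
        raise ValueError("a 128-bit digest yields at most four 32-bit words")
    digest = digest_128(key.encode("utf-8"))
    # Four unsigned 32-bit integers cover all 16 digest bytes.
    words = struct.unpack("<4I", digest)
    # Reduce each word to a position inside its own slice.
    return [word % bits_per_slice for word in words[:num_slices]]


indexes = slice_indexes("example-key")
print(indexes)
```

Either digest function returns 16 bytes, so swapping the hash changes which bits are set (and how fast they are computed) but not the shape of the filter.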

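Benchmark runs are attached below. In these runs, xxh3_128 is consistently faster than md5 (roughly 12% more adds and 18% more checks per second) with a similar experimental false positive rate. As suggested in this issue, xxh3_128 can be made the default in new versions of this module.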

hashfn: openssl_md5

Iteration-1:
hashfn: openssl_md5
0.148 seconds to add to capacity,  674661.65 entries/second
Number of 1 bits: 271374
Number of 0 bits: 207882
Number of Filter Bits: 479256
Number of slices: 4
Bits per slice: 119814
------
Fraction of 1 bits at capacity: 0.566
0.111 seconds to check false positives,  897309.76 checks/second
Requested FP rate: 0.1000
Experimental false positive rate: 0.1048
Projected FP rate (Goel/Gupta): 0.102603
-------------------

Iteration-2:
hashfn: openssl_md5
0.154 seconds to add to capacity,  651165.07 entries/second
Number of 1 bits: 271374
Number of 0 bits: 207882
Number of Filter Bits: 479256
Number of slices: 4
Bits per slice: 119814
------
Fraction of 1 bits at capacity: 0.566
0.111 seconds to check false positives,  900389.84 checks/second
Requested FP rate: 0.1000
Experimental false positive rate: 0.1048
Projected FP rate (Goel/Gupta): 0.102603
-------------------
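The filter geometry and the projected rate in the output above can be reproduced with a short sketch. It assumes a capacity of 100,000 entries (not printed in the output, but consistent with the reported timings), the usual sliced Bloom filter sizing formulas, and the commonly cited Goel/Gupta upper bound. Whether the benchmark script uses exactly these expressions is an assumption; the function names are hypothetical.

```python
import math


def size_filter(capacity: int, error_rate: float):
    """Return (num_slices, bits_per_slice) for a sliced Bloom filter."""
    num_slices = math.ceil(math.log2(1.0 / error_rate))
    bits_per_slice = math.ceil(
        capacity * abs(math.log(error_rate)) / (num_slices * math.log(2) ** 2)
    )
    return num_slices, bits_per_slice


def projected_fp_rate(capacity: int, num_slices: int, num_bits: int) -> float:
    """Goel/Gupta upper bound on the false positive rate at capacity."""
    exponent = -num_slices * (capacity + 0.5) / (num_bits - 1)
    return (1.0 - math.exp(exponent)) ** num_slices


capacity = 100_000  # assumed; not shown in the benchmark output
k, bits_per_slice = size_filter(capacity, 0.1)
m = k * bits_per_slice

print("slices:", k)
print("bits per slice:", bits_per_slice)
print("filter bits:", m)
print("projected FP rate: {:.6f}".format(projected_fp_rate(capacity, k, m)))
print("fraction of 1 bits (md5 run): {:.3f}".format(271374 / 479256))
```

Because the projection depends only on capacity, slice count, and bit count, it is the same for both hash functions; only the experimental rate and the timings differ between them.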

hashfn: xxh3_128

Iteration-1:
hashfn: xxh3_128
0.133 seconds to add to capacity,  751715.88 entries/second
Number of 1 bits: 271006
Number of 0 bits: 208250
Number of Filter Bits: 479256
Number of slices: 4
Bits per slice: 119814
------
Fraction of 1 bits at capacity: 0.565
0.095 seconds to check false positives, 1049305.27 checks/second
Requested FP rate: 0.1000
Experimental false positive rate: 0.1025
Projected FP rate (Goel/Gupta): 0.102603
-------------------

Iteration-2:
hashfn: xxh3_128
0.137 seconds to add to capacity,  729703.06 entries/second
Number of 1 bits: 271006
Number of 0 bits: 208250
Number of Filter Bits: 479256
Number of slices: 4
Bits per slice: 119814
------
Fraction of 1 bits at capacity: 0.565
0.094 seconds to check false positives, 1061892.13 checks/second
Requested FP rate: 0.1000
Experimental false positive rate: 0.1025
Projected FP rate (Goel/Gupta): 0.102603
-------------------

Iteration-3:
hashfn: xxh3_128
0.135 seconds to add to capacity,  738980.23 entries/second
Number of 1 bits: 271006
Number of 0 bits: 208250
Number of Filter Bits: 479256
Number of slices: 4
Bits per slice: 119814
------
Fraction of 1 bits at capacity: 0.565
0.094 seconds to check false positives, 1060439.67 checks/second
Requested FP rate: 0.1000
Experimental false positive rate: 0.1025
Projected FP rate (Goel/Gupta): 0.102603
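
For a rough comparison, the throughput figures from the runs above can be averaged per hash function. This is a small arithmetic sketch over the numbers reported in this comment (two md5 runs, three xxh3_128 runs); the variable names are illustrative, and so few runs give only an indication, not a rigorous benchmark.

```python
from statistics import mean

# Throughput figures copied from the benchmark output in this comment.
adds = {
    "openssl_md5": [674661.65, 651165.07],
    "xxh3_128": [751715.88, 729703.06, 738980.23],
}
checks = {
    "openssl_md5": [897309.76, 900389.84],
    "xxh3_128": [1049305.27, 1061892.13, 1060439.67],
}

# Ratio of mean throughput: values above 1.0 mean xxh3_128 was faster.
add_speedup = mean(adds["xxh3_128"]) / mean(adds["openssl_md5"])
check_speedup = mean(checks["xxh3_128"]) / mean(checks["openssl_md5"])

print("add speedup: {:.2f}x".format(add_speedup))
print("check speedup: {:.2f}x".format(check_speedup))
```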

@joseph-fox
Owner

@varunkumar many thanks for this PR. Are you happy to add a brief note on replacing MD5 with xxh3_128 to the readme?

@joseph-fox merged commit f6e93c6 into joseph-fox:master on Oct 7, 2022
@varunkumar
Author

@joseph-fox Sure. Have you published a new version yet? Could you let me know the version number? I can then add a brief note about the breaking change in that version.

@joseph-fox
Owner

@varunkumar it would be very helpful if you could add the note before we release the new version, so that others can read your updates on PyPI. Thank you.

@varunkumar
Author

Would you mind adding the hacktoberfest label to this?
