-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade to rapidhash #17422
base: main
Are you sure you want to change the base?
Upgrade to rapidhash #17422
Conversation
@microsoft-github-policy-service agree |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gating on @lhecker as the final signoff, as he's the one who chose and integrated wyahsh
Thanks for all the effort here, Nicolas! I have some concerns, plus a couple "favorite things" about what we're doing today. Right now, our implementation of wyhash is fully contained and inlined into I'm concerned that we are also only replacing the 64-bit hash function, so we're not fully pulling wyhash from our code. We will forever have two OSS hash library dependencies rather than just the one. All that said, I'd still be interested... but we are also not using it for any data larger than, say, 1kb. I'm not certain the performance boost for our specific workload is worth switching, all other things taken into consideration. If we were going to be processing untold heaps of data I might feel differently, but... eh. It's written now, but so is the code we had before. 🤷🏻 Footnotes |
Hey @DHowett, thanks a lot for the review! The global variable should not be replicated accross translation units because rapidhash.h is only used by hash.h, which states '#pragma once' before including it. For extra security, I've added '#pragma once' to rapidhash.h :) I've also just updated the rapidhash header to a newer version, which has greater compatibility with C++. Having the 64-bit hash function undocked may be a favorable thing looking forward. wyhash has been deprecated by its owner, it may be counterproductive to use unmaintained code. |
Thanks!
I'm glad that it was easy to update. That is definitely a point in its favor! FWIW, though, on maintenance: does wyhash need maintenance? Will |
It is unclear if wyhash or rapidhash will benefit from maintenance. New language standards may introduce qualifiers or notions from which the hash functions could benefit. I do agree that this change is not absolutely necesary, although it does leave the codebase modernized and will produce a slight battery life improvement on devices using the terminal. I can also change the PR to implement rapidhash in-place within hash.h, if that's preferable. Thanks, |
Upgrading 64-bit from wyhash to rapidhash.
The wyhash owner has recognized rapidhash as the official successor in its repo.
rapidhash has shown to be faster and of higher quality than wyhash.
It is also officially compatible with MSVC across all architectures.
References and Relevant Issues
rapidhash is currently the fastest recommended hash function by SMHasher.
rapidhash is currently the fastest passing hash in SMHasher3.
Collision-based study showing a collision probability lower than wyhash and close to ideal.