-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace std::hash with our custom hash function #2952
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2952 +/- ##
=======================================
Coverage 93.51% 93.51%
=======================================
Files 1121 1121
Lines 42913 42919 +6
=======================================
+ Hits 40131 40137 +6
Misses 2782 2782 ☔ View full report in Codecov by Sentry. |
14b5d32
to
f127718
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -133,31 +134,52 @@ inline void Hash::operation( | |||
template<> | |||
inline void Hash::operation( | |||
const double& key, common::hash_t& result, common::ValueVector* /*keyVector*/) { | |||
result = std::hash<double>()(key); | |||
// 0 and -0 are not byte-equivalent, but should have the same hash | |||
if (key == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if adding 0.0 can also solve the "0 and -0 are not byte-equivalent" problem without an if/else
branch.
auto newKey = key + 0.0;
result = murmurhash64(*reinterpret_cast<const uint64_t*>(&newKey));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does appear to also work.
@benjaminwinger I merged this, so it can be in our dev build to be verified by the bug reporter. |
std::hash is implementation dependent and not guaranteed to be the same between different stdlibs, which means our current hash index layout is not always portable.
std::hash
also apparently is not required to produce the same result for the same input in different runs of the same program, and while there hasn't been any evidence of that yet, it could cause issues in future.From https://en.cppreference.com/w/cpp/utility/hash:
Fixes #2943.
I did a quick benchmark on a long string (first five paragraphs of lorem ipsum), and this took 123ns/iteration on average, compared to 117ns/iteration with std::hash, which is fairly reasonable.