-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compress amount of hashed bytes for isize
values in StableHasher
#93432
Compress amount of hashed bytes for isize
values in StableHasher
#93432
Conversation
This comment has been minimized.
This comment has been minimized.
ec697c8
to
7ee5c39
Compare
7ee5c39
to
720bbd6
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 720bbd6b51ab22a50553eaaac5cbdfd79b1d6694 with merge d656dfe0ca9bf66a22e275de70afd93922600fe2... |
☀️ Try build successful - checks-actions |
Queued d656dfe0ca9bf66a22e275de70afd93922600fe2 with parent 427eba2, future comparison URL. |
Finished benchmarking commit (d656dfe0ca9bf66a22e275de70afd93922600fe2): comparison url. Summary: This benchmark run shows 265 relevant improvements 🎉 to instruction counts.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. @bors rollup=never |
Performance looks great. Can you update the PR description? |
720bbd6
to
fc1b82a
Compare
Done. I also added one additional test case. Do you want me to remove |
This comment has been minimized.
This comment has been minimized.
fc1b82a
to
8de59be
Compare
By the way, I tried the same "trick" for |
I don't think it makes much of difference in practice. Just remarked on it for style reasons. @bors r+ |
📌 Commit 8de59be has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (1be5c8f): comparison url. Summary: This benchmark run shows 268 relevant improvements 🎉 to instruction counts.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression |
Did you verify that this gives the same results on big-endian and on little-endian platforms? We need stable hashes to be independent of endianess because of cross compilation scenarios where the standard library is compiled with on little-endian and then accessed by a compiler that uses big-endian. Unfortunately, I don't think we are testing this in CI anywhere. For the |
I didn't verify it, but the implementation converts the value to little-endian as a first step (same as before - https://github.com/rust-lang/rust/pull/93432/files#diff-4856ba39281cdd51c0a49fdce1eff68fac5a8ee6787b33bee702cda44b07c815R140). I suppose that if we always convert it to little-endian, it should work the same on big-endian systems? |
Hrrm, that's a good point. |
That's a very good point, I didn't realize that it has to be the same across big/little endian usages. So, something like this? fn write_isize(&mut self, i: isize) {
...
let value = i as u64;
if value < 0xFF {
self.write_u8(value as u8);
} else {
self.write_u8(0xFF);
hash_value(..., value.to_le());
}
} |
Yes, that looks good. The same sequence of bytes gets written, regardless of endianess. |
…r=the8472 Fix `isize` optimization in `StableHasher` for big-endian architectures This PR fixes a problem with the stable hash optimization introduced in rust-lang#93432. As `@michaelwoerister` has [found out](rust-lang#93432 (comment)), the original implementation wouldn't produce the same hash on little/big architectures. r? `@the8472`
This is another attempt to land #92103, this time hopefully with a correct implementation w.r.t. stable hashing guarantees. The previous PR was reverted because it could produce the same hash for different values even in quite simple situations. I have since added a basic test that should guard against that situation, I also added a new test in this PR, specialised for this optimization.
Why this optimization helps
Since the original PR, I have tried to analyze why this optimization even helps (and why it especially helps for
clap
). I found that the vast majority of stable-hashingi64
actually comes from hashingisize
(which is converted toi64
in the stable hasher). I only found a single place where is this datatype used directly in the compiler, and this place has also been showing up in traces that I used to find out when isisize
being hashed. This place isrustc_span::FileName::DocTest
, however, I suppose that isizes also come from other places, but they might not be so easy to find (there were some other entries in the trace).clap
hashes about 8.5 millionisize
s, and all of them fit into a single byte, which is why this optimization has helped it quite a lot.Now, I'm not sure if special casing
isize
is the correct solution here, maybe something could be done with thatisize
insideDocTest
or in other places, but that's for another discussion I suppose. In this PR, instead of hardcoding a special case insideSipHasher128
, I instead put it intoStableHasher
, and only used it forisize
(I tested that fori64
it doesn't help, or at least not forclap
and other few benchmarks that I was testing).New approach
Since the most common case is a single byte, I added a fast path for hashing
isize
values which positive value fits within a single byte, and a cold path for the rest of the values.To avoid the previous correctness problem, we need to make sure that each unique
isize
value will produce a unique hash stream to the hasher. By hash stream I mean a sequence of bytes that will be hashed (a different sequence should produce a different hash, but that is of course not guaranteed).We have to distinguish different values that produce the same bit pattern when we combine them. For example, if we just simply skipped the leading zero bytes for values that fit within a single byte,
(0xFF, 0xFFFFFFFFFFFFFFFF)
and(0xFFFFFFFFFFFFFFFF, 0xFF)
would send the same hash stream to the hasher, which must not happen.To avoid this situation, values
[0, 0xFE]
are hashed as a single byte. When we hash a larger (treatingisize
asu64
) value, we first hash an additional byte0xFF
. Since0xFF
cannot occur when we apply the single byte optimization, we guarantee that the hash streams will be unique when hashing two values(a, b)
and(b, a)
ifa != b
:a
andb
are within[0, 0xFE]
, their hash streams will be different.a
andb
are within[0, 0xFE]
, their hash streams will be different.a
is within[0, 0xFE]
andb
isn't, when we hash(a, b)
, the hash stream will definitely not begin with0xFF
. When we hash(b, a)
, the hash stream will definitely begin with0xFF
. Therefore the hash streams will be different.r? @the8472