-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid hashing more than is strictly necessary, in the compiler. #56308
Comments
I have a branch which adds functions for interning based on this PR. I suggest that further work be based on that. cc @nnethercote |
@Zoxc That is, change the implementation of the |
Sure. I think we have to change things to use a |
Oh, another place where we could take advantage of computing the hash only once is the query engine, in case we had to compute a query, and we're inserting the result. |
Another note, not directly related to interners: We have some things in the compiler that already contain high-quality hash values of themselves. E.g. it's unnecessary work to hash a rust/src/librustc/ty/context.rs Line 919 in 691a7f8
There are probably a few additional, similar cases. |
@michaelwoerister I think we can maybe use a wrapper for the hash state that uses specialization to avoid double-hashing, in that case, instead of relying on |
I think the raw_entry API would be the nicer way to do it. It seems to be tailored to support a case just like this (among other use cases). |
@michaelwoerister Yes but it would require making an entire wrapper of the |
What would your solution look like? |
Use raw_entry for more efficient interning Fixes rust-lang#56308 (comment)
I'll reopen this because the discussion mentioned things other than interners. Also, I still have to write those code examples for @michaelwoerister. |
Avoid rehashing Fingerprint as a map key This introduces a no-op `Unhasher` for map keys that are already hash- like, for example `Fingerprint` and its wrapper `DefPathHash`. For these we can directly produce the `u64` hash for maps. The first use of this is `def_path_hash_to_def_id: Option<UnhashMap<DefPathHash, DefId>>`. cc rust-lang#56308 r? @eddyb
The interners currently do use the raw entry APIs:
|
We should still look for other uses of hash maps with keys that are already hashes |
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang#56308
I ran some analysis to find the maps changed in #120076 by looking for things put into HashMaps that looked like they were already hashes, using a few heuristics (e.g., similar number of 1 and 0 bits). I think that found most or all of the problem places. I don't see a good way to assert we don't regress there -- with a TypeId for non 'static things, we could assert in Hash64/Hash128 etc impls that we're not hashing them with FxHash (or that we're only hashing them with Unhash). But that assert doesn't exist today, and I don't think adding e.g. specialization or a doc(hidden) method on Hasher in std makes sense for this at this time. I think the next big opportunity for reducing hashing would be maps with dense, relatively small integer valued keys, where we actually don't need to hash because they're sufficiently dense that a vec makes more sense. We saw big wins with #119977 from both (a) removing the hashing and (b) improved cache locality due to keeping things closer together. But those typically aren't as easy of a transformation to make, and to some extent are a time/space tradeoff. |
WIP: vendor rustc-hash and override str/length_prefix methods Locally avoiding length and str sentinel hashing adds 2,611,200 cases of hashes with just one add_to_hash call (out of roughly 43,063,296 total cases). Unclear what kind of effect this will have on runtime performance at this time, but seems like it's worth trying! cc rust-lang#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang#56308
WIP: vendor rustc-hash and override str/length_prefix methods Locally avoiding length and str sentinel hashing adds 2,611,200 cases of hashes with just one add_to_hash call (out of roughly 43,063,296 total cases). Unclear what kind of effect this will have on runtime performance at this time, but seems like it's worth trying! cc rust-lang#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang/rust#56308
Use UnhashMap for a few more maps This avoids a few cases of hashing data that's already hashed. cc rust-lang/rust#56308
It should be possible to build cheaper interners than what we currently have, using
raw_entry
.cc @rust-lang/compiler @gankro @Amanieu
The text was updated successfully, but these errors were encountered: