Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make FxHash the default HashMap hasher #7107

Closed
wants to merge 6 commits into from

Conversation

james7132
Copy link
Member

@james7132 james7132 commented Jan 6, 2023

Objective

Speed up all HashMap queries in the engine. Shrink the dependency tree.

Solution

Instead of using aHash as the default hasher for all of the engine's HashMap and HashSets, use FxHash instead. This collapses StableHashMap and StableHashSet into the normal HashMap/HashSet.

This also removes the fxhash dependency on bevy_ecs, instead having bevy_utils depend on rustc-hash instead, which is better maintained, hit 1.0 already, and has zero dependencies.

This does mean that none of the types should now be exposed to external untrusted inputs (i.e. networking), since FxHash is not HashDOS resistant.

Performance

From an external source, performance could increase as much as 4% in HashMap bound code.


Changelog

Removed: bevy::utils::AHash.
Removed: bevy::utils::StableHashMap.
Removed: bevy::utils::StableHashSet.

Migration Guide

If you were relying on bevy::utils::StableHashMap or bevy::utils::StableHashSet, use bevy::utils::HashMap and bevy::utils::HashSet instead. The default hash map and set now have a stable iteration order.

@james7132 james7132 added C-Performance A change motivated by improving speed, memory usage or compile times A-Utils Utility functions and types labels Jan 6, 2023
@cart
Copy link
Member

cart commented Jan 6, 2023

In the early days of Bevy I chose aHash over FXHash because it performed measurably better for small index-sized keys (this was back when we were using HashMap<Entity, EntityLocation> with "stable entity uuids")
The scope of the benchmarks was limited to bevy_ecs. Now that this is piped through "everywhere", I'm curious to see what the perf difference is.

@james7132
Copy link
Member Author

From a gut instinct, the only cases where we weren't using FxHash already is in reflection and assets, and by proxy rendering. The only major use in a hot location in ECS now is just Components, which already uses FxHash.

Rendering, in particular, I don't expect large changes since the bottleneck lies elsewhere.

@AxiomaticSemantics
Copy link
Contributor

AxiomaticSemantics commented Jan 6, 2023

In the early days of Bevy I chose aHash over FXHash because it performed measurably better for small index-sized keys (this was back when we were using HashMap<Entity, EntityLocation> with "stable entity uuids") The scope of the benchmarks was limited to bevy_ecs. Now that this is piped through "everywhere", I'm curious to see what the perf difference is.

I've noticed no change at all in code that is mutating 1_000_000 entities in 500 systems, if that means anything. rustc_hash is used by the rust compiler internally, it was specifically created for it's speed without relying on architecture specific instruction like AES.

@recatek
Copy link
Contributor

recatek commented Jan 6, 2023

Adding some context from the discord discussion the other day around this: Gut check from reading the related material is that FxHash is faster for the act of hashing on small keys (not necessarily larger keys), whereas aHash is more likely to aid in avoiding collisions in storage and lookup. The difference between the two is minor and it's difficult to determine the best without profiling. Consolidating dependencies would also be good in its own right.

If we do end up doing benchmark comparisons on this, might be worth throwing xxhash-rust into the mix.

@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jan 8, 2023
@alice-i-cecile
Copy link
Member

bors try

bors bot added a commit that referenced this pull request Jan 8, 2023
@bors
Copy link
Contributor

bors bot commented Jan 8, 2023

try

Build failed:

@alice-i-cecile alice-i-cecile removed the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jan 8, 2023
@alice-i-cecile
Copy link
Member

I'm happy to merge this without further benchmarking based on @AxiomaticSemantics's experiments once CI is passing.

@james7132
Copy link
Member Author

I actually want to fully test this on some stress tests outside of pure ECS tests, especially for rendering, since the only real API heavily reliant on HashMaps in the ECS is World/EntityRef::get and friends, and those are already using FxHash. I just haven't had the time to run them.

@james7132
Copy link
Member Author

james7132 commented Jan 17, 2023

Finally had some time to dig into this. Compared this against main with the typical many_foxes stress test with the stress-test profile.

On a frame level, there seems to be a ~2.9% (4.08ms -> 3.96ms) performance improvement.

image

When looking at main_opaque_pass_3d, the render graph node in particular that is fairly heavily reliant on faster HashMap lookups, there is a similar 2% improvement (341.64us -> 334.54us)

image

queue_material_meshes, which is also reliant on the same hashmap lookups, improved ~7% (84.93us -> 79.55%).

These results are in line with the expectations in the aforementioned performance guidance about hashing (single digit percentage improvements in use cases bound by HashMap lookup.

With that said, this is still focused primarily on a case where there's really only one entry in the HashMap, so it's probably benefiting from the faster hashing, but it's unclear if the quality of it's output will have a more pronounced impact on more varied scenes.

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those benchmarks have convinced me; feel free to merge once you pass CI.

@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jan 18, 2023
@james7132 james7132 removed the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jan 18, 2023
@james7132
Copy link
Member Author

Holding off from merging this since:

  • Recent tests against main show no reasonable difference, and the lack of the default BuildHasher impl from hashbrown is a usability regression.
  • We cannot remove ahash from our dependency tree as gpu-descriptor is reliant on it.

As it currently stands, that's probably not enough to justify merging this.

@alice-i-cecile
Copy link
Member

Agreed. I'm going to close this out for now. Dear reader, feel free to revive this PR if you have a good argument for why it should be merged.

@james7132 james7132 deleted the fxhash-default branch March 14, 2023 04:18
github-merge-queue bot pushed a commit that referenced this pull request Feb 5, 2024
# Objective

We currently over/underpromise hash stability:
- `HashMap`/`HashSet` use `BuildHasherDefault<AHasher>` instead of
`RandomState`. As a result, the hash is stable within the same run.
- [aHash isn't stable between devices (and
versions)](https://github.com/tkaitchuck/ahash?tab=readme-ov-file#goals-and-non-goals),
yet it's used for `StableHashMap`/`StableHashSet`
- the specialized hashmaps are stable

Interestingly, `StableHashMap`/`StableHashSet` aren't used by Bevy
itself (anymore).

## Solution
Add/fix documentation

## Alternatives
For `StableHashMap`/`StableHashSet`:
- remove them
- revive #7107

---

## Changelog
- added iteration stability guarantees for different hashmaps
tjamaan pushed a commit to tjamaan/bevy that referenced this pull request Feb 6, 2024
# Objective

We currently over/underpromise hash stability:
- `HashMap`/`HashSet` use `BuildHasherDefault<AHasher>` instead of
`RandomState`. As a result, the hash is stable within the same run.
- [aHash isn't stable between devices (and
versions)](https://github.com/tkaitchuck/ahash?tab=readme-ov-file#goals-and-non-goals),
yet it's used for `StableHashMap`/`StableHashSet`
- the specialized hashmaps are stable

Interestingly, `StableHashMap`/`StableHashSet` aren't used by Bevy
itself (anymore).

## Solution
Add/fix documentation

## Alternatives
For `StableHashMap`/`StableHashSet`:
- remove them
- revive bevyengine#7107

---

## Changelog
- added iteration stability guarantees for different hashmaps
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Utils Utility functions and types C-Performance A change motivated by improving speed, memory usage or compile times
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants