-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace HashMap implementation with SwissTable (as an external crate) #58623
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Is this blocked on the review of #56241 ? |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Can we do a crater run to see if there are any regressions? |
Sort-of. I've been including most of the feedback from that thread into hashbrown itself. |
This comment has been minimized.
This comment has been minimized.
@bors: try While review progresses let's in parallel start some programmatic analysis! (aka perf runs, crater runs, etc) |
This comment has been minimized.
This comment has been minimized.
This issue should probably be addressed before merging: https://github.com/Amanieu/hashbrown/issues/47 |
@bors: try |
⌛ Trying commit 76003ba0c21df9b8bc889192d26c04f2568623d5 with merge abade53a649583e40ed07c26ee10652703f09b58... |
💥 Test timed out |
@rust-timer build abade53a649583e40ed07c26ee10652703f09b58 |
@rust-lang/infra shouldn't it have created an issue for the Miri toolstate failure? |
@RalfJung "test-fail" won't create an issue since there could be false positive, only "build-fail" would create an issue. |
25: Release 0.5.0, deprecated r=cuviper a=cuviper This is the last subtree merge before `hashbrown` replaced the `std` types in rust-lang/rust#58623. I am also marking this crate deprecated -- folks who want to parallelize these types going forward should use `hashbrown`'s own "rayon" feature. Co-authored-by: Florian Hartwig <florian.j.hartwig@gmail.com> Co-authored-by: Meltinglava <meltinglavaoutland@gmail.com> Co-authored-by: Mazdak Farrokhzad <twingoow@gmail.com> Co-authored-by: Andy Russell <arussell123@gmail.com> Co-authored-by: bors <bors@rust-lang.org> Co-authored-by: Guillaume Gomez <guillaume1.gomez@gmail.com> Co-authored-by: Steven Fackler <sfackler@gmail.com> Co-authored-by: John Kåre Alsaker <john.kare.alsaker@gmail.com> Co-authored-by: Hidehito Yabuuchi <hdht.ybuc@gmail.com> Co-authored-by: Corey Farwell <coreyf@rwell.org> Co-authored-by: kennytm <kennytm@gmail.com> Co-authored-by: Alexander Regueiro <alexreg@me.com> Co-authored-by: Stein Somers <git@steinsomers.be> Co-authored-by: Ryan Marcus <ryan@ryanmarc.us> Co-authored-by: Mark Rousskov <mark.simulacrum@gmail.com> Co-authored-by: Wiktor Kuchta <wiktorkuchta@protonmail.com> Co-authored-by: Anthony Ramine <n.oxyde@gmail.com> Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>
This includes a `size_of` regression for a few DOM types, due to rust-lang/rust#58623 which replaces the implementation of `HashMap` in the standard library to Hashbrown. Although `size_of<HashMap>` grows, it’s not obvious how total memory usage is going to be impacted: Hashbrown only has one `u8` instead of one `usize` of overhead per hash table bucket for storing (part of) a hash, and so might allocate less memory itself. Hashbrown also typically has better run time performance: https://github.com/rust-lang/hashbrown#performance Still, I’ve filed rust-lang/hashbrown#69 about potentially reducing the `size_of<HashMap>` regression.
Upgrade to rustc 1.36.0-nightly (e305df184 2019-04-24) This includes a `size_of` regression for a few DOM types, due to rust-lang/rust#58623 which replaces the implementation of `HashMap` in the standard library to Hashbrown. Although `size_of<HashMap>` grows, it’s not obvious how total memory usage is going to be impacted: Hashbrown only has one `u8` instead of one `usize` of overhead per hash table bucket for storing (part of) a hash, and so might allocate less memory itself. Hashbrown also typically has better run time performance: https://github.com/rust-lang/hashbrown#performance Still, I’ve filed rust-lang/hashbrown#69 about potentially reducing the `size_of<HashMap>` regression. <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/23263) <!-- Reviewable:end -->
= note: required because it appears within the type `std::marker::PhantomData<(std::rc::Rc<()>, std::rc::Rc<()>)>` | ||
= note: required because it appears within the type `std::collections::hash::table::RawTable<std::rc::Rc<()>, std::rc::Rc<()>>` | ||
= note: required because of the requirements on the impl of `std::marker::Send` for `hashbrown::raw::RawTable<(std::rc::Rc<()>, std::rc::Rc<()>)>` | ||
= note: required because it appears within the type `hashbrown::map::HashMap<std::rc::Rc<()>, std::rc::Rc<()>, std::collections::hash_map::RandomState>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand from this UI test update that the compiler may "leak" to the user that std::collections::HashMap
is actually implemented in the hashbrown
crate? I think this might be confusion to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s not especially different from the current situation of exposing implementation types like RawTable or BTreeNode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except those are all types from std
itself. So the user might see a private type name but at least they know where it's coming from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kind of thing already caused a lot of confusion with the "simple" issue of re-exporting of core
via std
. I feel like I haven't seen that problem in a while; is there some existing solution we can use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a way to do this reexport that produces a path that starts with std
, I just don't know how. There are reexports of core
/ alloc
items in std
that handle this correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::HashMap isn't a re-export, though. It's a newtype around hashbrown::Hashmap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also reexport hashbrown
as an "implementation detail" or something, but I don't know if you can get rustc
to use std
paths in error messages without it being publicly reachable, which we'd want to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also, for this error in particular, have unsafe impl<K: Send, V: Send, S: Send> Send for HashMap<K, V, S> {}
but I'm not sure that's a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if you can get rustc to use std paths in error messages without it being publicly reachable, which we'd want to avoid.
What if you re-export them as pub items, but the re-export itself is not actually publicly reachable?
Re: the fxhash crate @archseer. The maintainer seems to be inactive, so no PRs are being reviewed or accepted. Currently it also does not support no_std. |
Pkgsrc changes: * NetBSD/sparc64 disabling of "packed" removed ("packed" removed upstream) * Adapt src_libstd_build.rs patch, update sed'ing of .cargo-checksum.json Build verified on NetBSD 8.0/amd64. Upstream changes: Version 1.36.0 (2019-07-04) ========================== Language -------- - [Non-Lexical Lifetimes are now enabled on the 2015 edition.][59114] - [The order of traits in trait objects no longer affects the semantics of that object.][59445] e.g. `dyn Send + fmt::Debug` is now equivalent to `dyn fmt::Debug + Send`, where this was previously not the case. Libraries --------- - [`HashMap`'s implementation has been replaced with `hashbrown::HashMap` implem entation.][58623] - [`TryFromSliceError` now implements `From<Infallible>`.][60318] - [`mem::needs_drop` is now available as a const fn.][60364] - [`alloc::Layout::from_size_align_unchecked` is now available as a const fn.][6 0370] - [`String` now implements `BorrowMut<str>`.][60404] - [`io::Cursor` now implements `Default`.][60234] - [Both `NonNull::{dangling, cast}` are now const fns.][60244] - [The `alloc` crate is now stable.][59675] `alloc` allows you to use a subset of `std` (e.g. `Vec`, `Box`, `Arc`) in `#![no_std]` environments if the environment has access to heap memory allocation. - [`String` now implements `From<&String>`.][59825] - [You can now pass multiple arguments to the `dbg!` macro.][59826] `dbg!` will return a tuple of each argument when there is multiple arguments. - [`Result::{is_err, is_ok}` are now `#[must_use]` and will produce a warning if not used.][59648] Stabilized APIs --------------- - [`VecDeque::rotate_left`] - [`VecDeque::rotate_right`] - [`Iterator::copied`] - [`io::IoSlice`] - [`io::IoSliceMut`] - [`Read::read_vectored`] - [`Write::write_vectored`] - [`str::as_mut_ptr`] - [`mem::MaybeUninit`] - [`pointer::align_offset`] - [`future::Future`] - [`task::Context`] - [`task::RawWaker`] - [`task::RawWakerVTable`] - [`task::Waker`] - [`task::Poll`] Cargo ----- - [Cargo will now produce an error if you attempt to use the name of a required dependency as a feature.][cargo/6860] - [You can now pass the `--offline` flag to run cargo without accessing the netw ork.][cargo/6934] You can find further change's in [Cargo's 1.36.0 release notes][cargo-1-36-0]. Clippy ------ There have been numerous additions and fixes to clippy, see [Clippy's 1.36.0 rel ease notes][clippy-1-36-0] for more details. Misc ---- Compatibility Notes ------------------- - [`std::arch::x86::_rdtsc` returns `u64` instead of `i64`][stdsimd/559] - [`std::arch::x86_64::_mm_shuffle_ps` takes an `i32` instead of `u32` for `mask `][stdsimd/522] - With the stabilisation of `mem::MaybeUninit`, `mem::uninitialized` use is no longer recommended, and will be deprecated in 1.38.0. [60318]: rust-lang/rust#60318 [60364]: rust-lang/rust#60364 [60370]: rust-lang/rust#60370 [60404]: rust-lang/rust#60404 [60234]: rust-lang/rust#60234 [60244]: rust-lang/rust#60244 [58623]: rust-lang/rust#58623 [59648]: rust-lang/rust#59648 [59675]: rust-lang/rust#59675 [59825]: rust-lang/rust#59825 [59826]: rust-lang/rust#59826 [59445]: rust-lang/rust#59445 [59114]: rust-lang/rust#59114 [cargo/6860]: rust-lang/cargo#6860 [cargo/6934]: rust-lang/cargo#6934 [`VecDeque::rotate_left`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.rotate_left [`VecDeque::rotate_right`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.rotate_right [`Iterator::copied`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#tymethod.copied [`io::IoSlice`]: https://doc.rust-lang.org/std/io/struct.IoSlice.html [`io::IoSliceMut`]: https://doc.rust-lang.org/std/io/struct.IoSliceMut.html [`Read::read_vectored`]: https://doc.rust-lang.org/std/io/trait.Read.html#method.read_vectored [`Write::write_vectored`]: https://doc.rust-lang.org/std/io/trait.Write.html#method.write_vectored [`str::as_mut_ptr`]: https://doc.rust-lang.org/std/primitive.str.html#method.as_mut_ptr [`mem::MaybeUninit`]: https://doc.rust-lang.org/std/mem/union.MaybeUninit.html [`pointer::align_offset`]: https://doc.rust-lang.org/std/primitive.pointer.html#method.align_offset [`future::Future`]: https://doc.rust-lang.org/std/future/trait.Future.html [`task::Context`]: https://doc.rust-lang.org/beta/std/task/struct.Context.html [`task::RawWaker`]: https://doc.rust-lang.org/beta/std/task/struct.RawWaker.html [`task::RawWakerVTable`]: https://doc.rust-lang.org/beta/std/task/struct.RawWakerVTable.html [`task::Waker`]: https://doc.rust-lang.org/beta/std/task/struct.Waker.html [`task::Poll`]: https://doc.rust-lang.org/beta/std/task/enum.Poll.html [clippy-1-36-0]: https://github.com/rust-lang/rust-clippy/blob/master/CHANGELOG.md#rust-136 [cargo-1-36-0]: https://github.com/rust-lang/cargo/blob/master/CHANGELOG.md#cargo-136-2019-07-04 [stdsimd/522]: rust-lang/stdarch#522 [stdsimd/559]: rust-lang/stdarch#559
This is the hash map implementation now used in the Rust standard library: * https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html#a-new-hashmapk-v-implementation * rust-lang/rust#58623 * https://crates.io/crates/hashbrown Differential Revision: https://phabricator.services.mozilla.com/D71740
This is the hash map implementation now used in the Rust standard library: * https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html#a-new-hashmapk-v-implementation * rust-lang/rust#58623 * https://crates.io/crates/hashbrown Differential Revision: https://phabricator.services.mozilla.com/D71740
This is the hash map implementation now used in the Rust standard library: * https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html#a-new-hashmapk-v-implementation * rust-lang/rust#58623 * https://crates.io/crates/hashbrown Differential Revision: https://phabricator.services.mozilla.com/D71740 UltraBlame original commit: 06dcf8ddbfdefed44cb653c7bf533196ae360da3
This is the hash map implementation now used in the Rust standard library: * https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html#a-new-hashmapk-v-implementation * rust-lang/rust#58623 * https://crates.io/crates/hashbrown Differential Revision: https://phabricator.services.mozilla.com/D71740 UltraBlame original commit: 06dcf8ddbfdefed44cb653c7bf533196ae360da3
This is the hash map implementation now used in the Rust standard library: * https://blog.rust-lang.org/2019/07/04/Rust-1.36.0.html#a-new-hashmapk-v-implementation * rust-lang/rust#58623 * https://crates.io/crates/hashbrown Differential Revision: https://phabricator.services.mozilla.com/D71740 UltraBlame original commit: 06dcf8ddbfdefed44cb653c7bf533196ae360da3
This is the same as #56241 except that it imports
hashbrown
as an external crate instead of copying the implementation into libstd.This includes a few API changes (all unstable):
try_reserve
is added toHashSet
.raw_entry
API.search_bucket
has been removed from theraw_entry
API (doesn't work with SwissTable).