-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking issue for HashSet
entry APIs
#60896
Comments
(For Consider |
For |
That might not be too useful if |
AIUI generic code gets the debug-assertion setting of the place where it's monomorphized. |
@cuviper that doesn't sound right because macros are expanded way before monomorphozation |
Oh, maybe it's only overflow checks that are affected that way. Nevermind then. |
It seems like a misuse of the entry, to insert a different key than was looked up for (a misuse std should not allow). But I can't find a way to "break" this using some test code. From a cursory look I think the current implementation can't be fooled, it hashes and inserts it as if it is a new key. Is that intended? It seems like that fact supercharges this API - |
You can abuse this to insert the same element multiple times in a set: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=c4d27e6b11b646030d1370351887ebc1 Although this may arguably be a bug in |
We have two options here:
|
The current implementation has a benign behaviour in this scenario, and that's a problem, because users can rely on it, and then we are locked in to the implementation specific behaviour (and can no longer change the hashmap impl). The following should also be an option on the table:
An |
Note that |
I agree. There are several places misuse of |
I expect most of the time folks would just want a lazy pub fn get_or_insert_owned<Q: ?Sized>(&mut self, value: &Q) -> &T
where T: Borrow<Q>,
Q: Hash + Eq + ToOwned<Owned = T>,
{
self.map.raw_entry_mut().from_key(value)
.or_insert_with(|| (value.to_owned(), ())).0
} We would only be relying on the contract of equality between borrowed and owned values. |
There are some |
There are a few missing features if the goal is to emulate Consider the following entry-based match map.raw_entry_mut().from_hash(hash, |k| is_same(k)) {
RawMutEntry::Occupied(entry) =>
entry.get(),
RawMutEntry::Vacant(entry) =>
entry.insert(produce_key()?, produce_value()?).0
} Versus the following entry-based set.get_or_insert_with(like_a_value, |v| produce_value().unwrap()) In the In the Allowing for graceful failure could be as simple as the following slight changes. fn get_or_insert_with<Q: ?Sized, F>(&mut self, value: &Q, f: F) -> Option<&T>
where
T: Borrow<Q>,
Q: Hash + Eq,
F: FnOnce(&Q) -> Option<T> -or- fn get_or_insert_with<Q: ?Sized, F, E>(&mut self, value: &Q, f: F) -> Result<&T, E>
where
T: Borrow<Q>,
Q: Hash + Eq,
F: FnOnce(&Q) -> Result<T, E> Spoofing |
Note that the |
That is a very good point; spoofing is clearly out of scope. The ability to fail gracefully, however, is part of match map.entry(key) {
Entry::Occupied(entry) =>
entry.get(),
Entry::Vacant(entry) =>
entry.insert(produce_value()?)
} One might expect the following to work: match set.entry(like_a_value) {
Entry::Occupied(entry) =>
entry.get(),
Entry::Vacant(entry) =>
entry.insert(produce_value()?)
} It would be much closer to the |
I think that will also fall afoul of the concerns about |
It seems that the presence of |
…sKalbertodt Add HashSet::get_or_insert_owned This is an extension for tracking issue rust-lang#60896. The more-general `get_or_insert_with` has potential for misuse, so we might remove it, but I think `get_or_insert_owned` covers most use cases.
…sKalbertodt Add HashSet::get_or_insert_owned This is an extension for tracking issue rust-lang#60896. The more-general `get_or_insert_with` has potential for misuse, so we might remove it, but I think `get_or_insert_owned` covers most use cases.
It seems that these methods do not handle the following case without looking up
|
The current signature pub fn get_or_insert(&mut self, value: T) -> &T feels a bit odd. It at least needs to be pub fn get_or_insert(&mut self, value: T) -> &mut T Even in this form, that's an artificially restricted API. There's a fully elaborate, albeit unwieldy form, which both indicates whether insert happened and returns pub fn get_or_insert(&mut self, value: T) -> (&mut T, Option<T>) |
@matklad I think you forget it's a
but I agree it's should return the previous value:
Is there any function in std that do something similar to copy the order semantic in the tuple ? |
There are cases where mutating an item doesn't change it's equality. Something like this: struct Person {
/// Unique integer id, Eq, Hash and Ord are defined in terms of ids
id: u32,
email: String,
}
impl PartialEq for Person {
fn eq(&self, other: &Person) -> bool { self.id == other.id }
} Storing a Though, I see that we never expose |
MmMmMmM, good point. But look like very error prone, user could easily make mistake since they got a mut ref so easily, the only reason to do that would be to be able to use
I would really advice to just split or just clone the |
I think having an |
I think an I have a use a case where I'd like to check for an existing equivalent entry in the |
FYI, I've opened a pull request for a set |
I would really like a non-getting variant of pub fn insert_owned<Q: ?Sized>(&mut self, value: &Q) -> bool
where T: Borrow<Q>,
Q: Hash + Eq + ToOwned<Owned = T>, and if the alternative of a full |
@Nemo157 I think that’s the usual use case for https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.raw_entry_mut, right? That would indicate that there should be a “raw entry” equivalent for sets as well as a “regular entry” equivalent. |
Looking at rust-lang/hashbrown#342 it seems to me that the proposed entry API doesn't solve the issue of having to clone the element if you want to obtain a reference to it after inserting it. |
In #123657 I am removing This method is unsound because it allows inserting a key at the "wrong" position in a Instead, |
We need a better word than "unsound", because it doesn't expose memory unsafety in itself. It does however pose a problem if |
What about: |
The #[inline]
fn add_and_already_there(set_: &mut HashSet<String>, item: &str) -> bool {
let initial_length = set_.len();
set_.get_or_insert_owned(item);
set_.len() != initial_length
} I'm implementing the CVM counting algorithm that has been getting much attention. |
I've found the signature converts a Such code should be fine, but compiler refused to compile it, since we cannot call #![feature(hash_set_entry)]
use std::collections::HashSet;
fn main() {
let mut hs = HashSet::new();
let str1 = "hello";
let a = hs.get_or_insert(str1);
if hs.contains(str1) {
println!("{a}")
}
} I'm requesting a hybrid borrow mode in IRLO, which allow both borrow self exclusively or normally. |
Copying my reply here too. I believe this problem could be fixed by providing a loss-less /// Inserts a value in the hash set, if not already present.
///
/// Always returns a shared reference to the set and a shared reference to the entry.
/// The entry reference is either the inserted one when the result is `Ok`, or the existing
/// one when the result is `Err`. In this last case, the non-inserted entry is returned too.
fn get_or_insert(&mut self, value: T) -> (&Self, Result<&T, (&T, T)>); |
Fix accidental $O(n^2)$ in `collect_columns`. There are the following ways to insert a clone into a hash set: - **clone before check:** Basically `set.insert(x.clone())`. That's rather expensive if you have a lot of duplicates. - **iter & clone:** That's what we do right now, but that's in $O(n^2)$. - **check & insert:** Basically `if !set.contains(x) {set.insert(x.clone())}` That requires two hash probes though. - **entry-based API:** Sadly the stdlib doesn't really offer any handle/entry-based APIs yet (see rust-lang/rust#60896 ), but `hashbrown` does, so we can use the nice `set.get_or_insert_owned(x)` which will only clone the reference if it doesn't exists yet and only hashes once. We now use the last approach.
Fix accidental $O(n^2)$ in `collect_columns`. There are the following ways to insert a clone into a hash set: - **clone before check:** Basically `set.insert(x.clone())`. That's rather expensive if you have a lot of duplicates. - **iter & clone:** That's what we do right now, but that's in $O(n^2)$. - **check & insert:** Basically `if !set.contains(x) {set.insert(x.clone())}` That requires two hash probes though. - **entry-based API:** Sadly the stdlib doesn't really offer any handle/entry-based APIs yet (see rust-lang/rust#60896 ), but `hashbrown` does, so we can use the nice `set.get_or_insert_owned(x)` which will only clone the reference if it doesn't exists yet and only hashes once. We now use the last approach.
Hashbrown 0.15 now has an assert in |
Rollup merge of rust-lang#120077 - SUPERCILEX:set-entry, r=Amanieu Add Set entry API See https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/HashSet.3A.3Aentry/near/413224639 and rust-lang#60896 (comment) Closes rust-lang/rfcs#1490
HashSet
entry APIs
Feature gate:
#![feature(hash_set_entry)]
This is a tracking issue for
Entry
and entry-like methods onHashSet
.Public API
The
get_or_insert[_with]
methods provide a simplification of theEntry
API forHashSet
, withnames chosen to match the similar methods on
Option
. The fullEntry
API mimics thatof
HashMap
, though without methods for the map value (which is just()
in a set).Steps / History
Entry
API: Add Set entry API #120077Unresolved Questions
See also #133549 for
BTreeSet
.Footnotes
https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html ↩
The text was updated successfully, but these errors were encountered: