Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BtreeMap range_search spruced up #68499

Merged
merged 2 commits into from
Feb 7, 2020
Merged

Conversation

ssomers
Copy link
Contributor

@ssomers ssomers commented Jan 24, 2020

#39457 created a lower level entry point for range_search to operate on, but it's really not hard to move it up a level of abstraction, making it somewhat shorter and reusing existing unsafe code (new_edge is unsafe although it is currently not tagged as such).

Benchmark added. Comparison says there's no real difference:

>cargo benchcmp old3.txt new3.txt --threshold 5
 name                                           old3.txt ns/iter  new3.txt ns/iter  diff ns/iter   diff %  speedup
 btree::map::find_seq_100                       19                21                           2   10.53%   x 0.90
 btree::map::range_excluded_unbounded           3,117             2,838                     -279   -8.95%   x 1.10
 btree::map::range_included_unbounded           1,768             1,871                      103    5.83%   x 0.94
 btree::set::intersection_10k_neg_vs_10k_pos    35                37                           2    5.71%   x 0.95
 btree::set::intersection_staggered_100_vs_10k  2,488             2,314                     -174   -6.99%   x 1.08
 btree::set::is_subset_10k_vs_100               3                 2                           -1  -33.33%   x 1.50

r? @Mark-Simulacrum

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 24, 2020
@bors
Copy link
Contributor

bors commented Jan 30, 2020

☔ The latest upstream changes (presumably #68659) made this pull request unmergeable. Please resolve the merge conflicts.

@ssomers ssomers force-pushed the btree_search_tidying branch 2 times, most recently from 93b3c83 to 9c44443 Compare January 30, 2020 09:04
Copy link
Member

@Mark-Simulacrum Mark-Simulacrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me, just let me know what you think of my comment(s) and whether you want to do them here or not

src/liballoc/collections/btree/map.rs Show resolved Hide resolved
@@ -51,7 +51,7 @@ where
/// exist in the node itself, it may exist in the subtree with that index
/// (if the node has subtrees). If the key doesn't exist in node or subtree,
/// the returned index is the position or subtree to insert at.
pub fn search_linear<BorrowType, K, V, Type, Q: ?Sized>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move/copy the doc comment here onto search_range since that's the common "entry point" now?

Doesn't have to be in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, though I couldn't find any other documentation describing a returned enum (other than Result or Option).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, not sure how that's relevant? We rarely encounter returned enums in the wild, I think (though I'm sure now that I've said it I'll think of dozens of examples once I post this).

If you're looking for a format or so to follow, I wouldn't worry that much about that. I'd be far more interested in just some docs pointing out what the variants mean; basically same as the docs here do wrt to false/true and the index, but applied to the enum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it hard to describe without mentioning the variant (or at least, the GoDown variant which is somewhat confusing).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioning the variant is good (and indeed I would expect the docs to do so).

@ssomers ssomers force-pushed the btree_search_tidying branch from 9c44443 to 43ad9ed Compare February 7, 2020 01:05
@ssomers
Copy link
Contributor Author

ssomers commented Feb 7, 2020

whether you want to do

Oops, I wanted and I did.

/// Looks up a given key in a (sub)tree headed by the given node, recursively.
/// Returns a `Found` with the handle of the matching KV, if any. Otherwise,
/// returns a `GoDown` with the handle of the possible leaf edge at which to insert
/// the key. The `GoDown` handle is irrelevant if `BorrowType` is `marker::Immut`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ Thanks for adding some docs here too :)

/// returns a `GoDown` with the handle of the edge where the key might be found.
/// If the node is a leaf, a `GoDown` edge is not an actual edge but a possible edge;
/// in other words, it is the location at which to insert the key, or irrelevant if
/// `BorrowType` is `marker::Immut`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I imagine it's not irrelevant if the type is immut -- it's still that edge, right? e.g. a hypothetical [T]::binary_search-like API on BTree would want that index/edge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. It's all in the "node is a leaf" case, why would you care if you won't insert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maybe it's not my business what someone wants to do with it, but it feels wrong to suggest it's the place to insert if it's an immutable handle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's (eventual) plans to add something like the LinkedList Cursor API to BTreeMap so that you could actually navigate in the conceptual "Array" that the map represents. (In fact, the linked list cursors are considered as a proving ground for that API design).

As another example -- though also an API that I believe BTree's don't provide today -- say I have a non-existent needle x, and I want to find the predecessor/successor of that x without traversing O(n) elements (as the iterator API would force me to). With this index, I could return that information fairly easily with O(log(n)) accesses, right? (where n is the length/size of the map).

Copy link
Contributor Author

@ssomers ssomers Feb 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I didn't want to say "if BorrowType is marker::Mut", because I guess you can use it for "marker::Owned", but I'm not sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the way binary_search on slices describes it, and this is the same API (I believe) so I think it's reasonable to do so here too.

If the value is found then Result::Ok is returned, containing the index of the matching element. If there are multiple matches, then any one of the matches could be returned. If the value is not found then Result::Err is returned, containing the index where a matching element could be inserted while maintaining sorted order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. How about "belongs"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see that comment. Yes, "could be inserted" sounds good, and "place to insert" also doesn't say the handle can be used to insert, it's just a location.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, going offline now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The quote was from the binary search docs, to be clear).

Could be inserted is fine, or some other wording. Anything works :)

@ssomers ssomers force-pushed the btree_search_tidying branch from 43ad9ed to ae03e16 Compare February 7, 2020 01:41
@Mark-Simulacrum
Copy link
Member

@bors r+

Great, thanks!

@bors
Copy link
Contributor

bors commented Feb 7, 2020

📌 Commit ae03e16 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 7, 2020
@bors
Copy link
Contributor

bors commented Feb 7, 2020

⌛ Testing commit ae03e16 with merge b5e21db...

bors added a commit that referenced this pull request Feb 7, 2020
BtreeMap range_search spruced up

#39457 created a lower level entry point for `range_search` to operate on, but it's really not hard to move it up a level of abstraction, making it somewhat shorter and reusing existing unsafe code (`new_edge` is unsafe although it is currently not tagged as such).

Benchmark added. Comparison says there's no real difference:
```
>cargo benchcmp old3.txt new3.txt --threshold 5
 name                                           old3.txt ns/iter  new3.txt ns/iter  diff ns/iter   diff %  speedup
 btree::map::find_seq_100                       19                21                           2   10.53%   x 0.90
 btree::map::range_excluded_unbounded           3,117             2,838                     -279   -8.95%   x 1.10
 btree::map::range_included_unbounded           1,768             1,871                      103    5.83%   x 0.94
 btree::set::intersection_10k_neg_vs_10k_pos    35                37                           2    5.71%   x 0.95
 btree::set::intersection_staggered_100_vs_10k  2,488             2,314                     -174   -6.99%   x 1.08
 btree::set::is_subset_10k_vs_100               3                 2                           -1  -33.33%   x 1.50
```

r? @Mark-Simulacrum
@bors
Copy link
Contributor

bors commented Feb 7, 2020

☀️ Test successful - checks-azure
Approved by: Mark-Simulacrum
Pushing b5e21db to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 7, 2020
@bors bors merged commit ae03e16 into rust-lang:master Feb 7, 2020
@ssomers ssomers deleted the btree_search_tidying branch March 27, 2020 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants