Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite BTreeMap to use parent pointers. #30426

Merged
merged 2 commits into from
Jan 17, 2016
Merged

Conversation

gereeter
Copy link
Contributor

Despite being over 700 lines shorter, this implementation should use less memory than the previous one and is faster on at least insertions and iteration, the latter improving approximately 5x.

Technically a [breaking-change] due to removal of deprecated functions.

cc @apasel422 @gankro @goyox86

Fixes #27865.

Review on Reviewable

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@Gankra
Copy link
Contributor

Gankra commented Dec 17, 2015

🙀 🎆 🎊

@Gankra Gankra assigned Gankra and unassigned alexcrichton Dec 17, 2015
@Gankra
Copy link
Contributor

Gankra commented Dec 17, 2015

wtf github is censoring the two most important files because this PR is too epic.

@Gankra
Copy link
Contributor

Gankra commented Dec 17, 2015

Note to self: make sure that we don't actually guarantee the no-allocation property.

@goyox86
Copy link
Contributor

goyox86 commented Dec 17, 2015

🎉 🎉 🎉

@bors
Copy link
Contributor

bors commented Dec 17, 2015

☔ The latest upstream changes (presumably #30325) made this pull request unmergeable. Please resolve the merge conflicts.

@gereeter
Copy link
Contributor Author

Make tidy fixed, but it is really ugly. There is no pretty way to format a very long function signature under 100 characters.

@gereeter
Copy link
Contributor Author

Rebased.

@arthurprs
Copy link
Contributor

I'm a little concerned about allocating on creation, is there any way around it?

Edit: I believe using an Option on the root is a small price to pay for allocation free creation.

@Gankra
Copy link
Contributor

Gankra commented Dec 18, 2015

Note that the PR description is incorrect in describing this as a regression; this has always been the case.

@arthurprs
Copy link
Contributor

I see but the reasoning is the same. Adding an option around the root node adds a single highly predictable branch, it shouldn't add more than a nanosecond or two to the timings.

@bors
Copy link
Contributor

bors commented Dec 18, 2015

☔ The latest upstream changes (presumably #29973) made this pull request unmergeable. Please resolve the merge conflicts.

@bluss
Copy link
Member

bluss commented Dec 18, 2015

The old BTreeMap code seemed to allocate in ::new() as well, calling Node::make_leaf_root(6). Is there an allocation vs no allocation on empty change?

@gereeter
Copy link
Contributor Author

Oh, whoops - I misremembered the BTreeMap behavior. There is no allocation on empty change.

@eddyb
Copy link
Member

eddyb commented Dec 19, 2015

@gereeter You might want to take it out of the PR description then (as @bors would immortalize it).

@Gankra
Copy link
Contributor

Gankra commented Dec 19, 2015

@arthurprs your issue is orthogonal to this PR. Changing the map to use Option (or whatevs) can be done at a later date if desirable.

@arthurprs
Copy link
Contributor

@gankro fair enough

@Gankra
Copy link
Contributor

Gankra commented Dec 21, 2015

@gereeter needs rebase :(

Will start review once rebase is done (sorry I've been cleaning up some loose ends at home).

@gereeter
Copy link
Contributor Author

Rebased. I'm still working on figuring out the segfault, which is especially annoying because it isn't triggering in my simplified tests and isn't hitting any debug assertions.

@Gankra
Copy link
Contributor

Gankra commented Dec 22, 2015

Argh, Github is actually making it impossible to comment on the code!

@Gankra
Copy link
Contributor

Gankra commented Dec 22, 2015

I'm reviewing this on Reviewable.


Comments from the review on Reviewable.io

@Gankra
Copy link
Contributor

Gankra commented Dec 30, 2015

Review status: 0 of 5 files reviewed at latest revision, 69 unresolved discussions, some commit checks failed.


src/libcollections/btree/map.rs, line 69 [r2] (raw file):
This for is redundant, no?

Note to self: drain here isn't totally sufficient; still need to free the root.


src/libcollections/btree/node.rs, line 32 [r2] (raw file):
Genuinely curious if there is any language where this could actually be expressed (without unnecessary indirection).


src/libcollections/btree/node.rs, line 43 [r2] (raw file):
Why for the love of god is this not B. Are you trying to kill a researcher?


src/libcollections/btree/node.rs, line 48 [r2] (raw file):
Should this be *const for variance..? Gonna make the code even more brutal. 😿 Maybe later...


src/libcollections/btree/node.rs, line 53 [r2] (raw file):
I think this file would benefit from hoisting all struct definitions to the top, so that someone reading this code can get an understanding of all the types in action.


src/libcollections/btree/node.rs, line 60 [r2] (raw file):
🤘


src/libcollections/btree/node.rs, line 83 [r2] (raw file):
Style nit: comments on top of field


src/libcollections/btree/node.rs, line 108 [r2] (raw file):
Greatest collection of no-ops I ever did see.


src/libcollections/btree/node.rs, line 114 [r2] (raw file):
I would appreciate a comment indicating height is necessary to track because a Leaf/Internal cannot be otherwise distinguished.


src/libcollections/btree/node.rs, line 118 [r2] (raw file):
Any reason not to punt this to BTreeMap itself? Iterators?


src/libcollections/btree/node.rs, line 129 [r2] (raw file):
NOTE: We should run rustfmt over this file when review complete.


src/libcollections/btree/node.rs, line 153 [r2] (raw file):
Please explain why this is like this. (I assume: because it's owned, and therefore its address isn't stable)


src/libcollections/btree/node.rs, line 160 [r2] (raw file):
doc return value. Unclear why.

Also: opposite of shrink is grow?

Bikeshed: push_level, pop_level


src/libcollections/btree/node.rs, line 174 [r2] (raw file):
This is So Very close to self.as_mut()...


src/libcollections/btree/node.rs, line 184 [r2] (raw file):
"any of the other children are elements of the root" -- messed up sentence?

Also: this assumes this node stores no keys/vals.


src/libcollections/btree/node.rs, line 208 [r2] (raw file):
Note: shrink/enlarge do not adjust any lens, so this is assumed to be done by the user?

Note to self: verify that this is done at usage sites!


src/libcollections/btree/node.rs, line 224 [r2] (raw file):
Jesus Tap Dancing Christ


src/libcollections/btree/node.rs, line 225 [r2] (raw file):
Bikeshed: NodeRef<Lifetime, Mutability, Type, K, V> ~= &'a mut Internal<K, V>


src/libcollections/btree/node.rs, line 228 [r2] (raw file):
I would appreciate dox on what this is for


src/libcollections/btree/node.rs, line 229 [r2] (raw file):
Note to self:

Owned -> Mut

Decoupling Lifetime and Mutability over

Ownership = Owned | Immut<'a> | Mut<'a>

has two benefits:

  1. NodeRef<*, Mut> can always (&mut self) -> &mut
  2. NodeRef<Borrow<'a>, *> can always (self) -> &'a

Otherwise, the two applicable impls would need to be selected manually and duplicated. However this makes everything just a bit more complicated. Alternatively, one could take the Ownership route and select subsets with trait dispatch (Ownership: Mutable; Ownership: Lifetime<'a>).


src/libcollections/btree/node.rs, line 232 [r2] (raw file):
Anywhere this is used? The impl of merge?


src/libcollections/btree/node.rs, line 240 [r2] (raw file):
Note to self: I believe this is correct for Mutability = Mutable, because this is basically saying &&mut T is Send when T is Sync. &&mut T ~= &&T, so this fine.

Can't pull an &mut T out and boot-strap a Send of T.


src/libcollections/btree/node.rs, line 247 [r2] (raw file):
nods... this seems right.

Interestingly, I think you could use impl<marker::Borrowed<'a>, K, V, marker::Immut, Type> Send for NodeRef where &'a (K, V) Send


src/libcollections/btree/node.rs, line 281 [r2] (raw file):
This sort of thing could be factored to another const to be DRY.


src/libcollections/btree/node.rs, line 290 [r2] (raw file):
I hate that this is how we have to write this sort of thing.


src/libcollections/btree/node.rs, line 293 [r2] (raw file):
GASP

Not making full aggressive use of elision?! (this is reasonable)

Would be interested in a comment explaining wtf this is useful for (guessing Owned -> Borrowed, but is &'old -> &'a desirable)?.

Note to self: 'old: 'a is implied by the existence (WF'ness) of &'a self


src/libcollections/btree/node.rs, line 316 [r2] (raw file):
👼

(also doc plz)


src/libcollections/btree/node.rs, line 324 [r2] (raw file):
Why do we ascend into an Edge?


src/libcollections/btree/node.rs, line 326 [r2] (raw file):
🙈 generriiiiiiics


src/libcollections/btree/node.rs, line 335 [r2] (raw file):
*mut _ seems slightly less distracting here to me, for whatever reason


src/libcollections/btree/node.rs, line 348 [r2] (raw file):
Always succeeds because nodes are assumed to be non-empty?


src/libcollections/btree/node.rs, line 399 [r2] (raw file):
bikeshed: yolo, heart_of_the_cards, jesus_take_the_wheel


src/libcollections/btree/node.rs, line 410 [r2] (raw file):
I am intrigued by why this is unsafe! (seems legit)


src/libcollections/btree/node.rs, line 449 [r2] (raw file):
I swear to god 95% of this file is type declarations and mostly-no-op pointer mangles.

Look I'm not even reading these anymore. If the type system isn't catching all possible errors, I don't even know what to believe in (actually this is all hilariously brittle to incorrect type bounds and global assumptions about state).


src/libcollections/btree/node.rs, line 477 [r2] (raw file):
#DealWithIt 😎


src/libcollections/btree/node.rs, line 503 [r2] (raw file):
Interesting that this is a Root...


src/libcollections/btree/node.rs, line 513 [r2] (raw file):
Desugarred: Compute both slices, then discard one. Then ignore the len field. Twice. 😉


src/libcollections/btree/node.rs, line 518 [r2] (raw file):
It seems odd to constantly convert to leaf_mut, internal_mut, etc. Can this conversion just be cached?


src/libcollections/btree/node.rs, line 531 [r2] (raw file):
One convenience too far, eh?


src/libcollections/btree/node.rs, line 532 [r2] (raw file):
rustfmt


src/libcollections/btree/node.rs, line 551 [r2] (raw file):
This procedure doesn't work for roots...?


src/libcollections/btree/node.rs, line 556 [r2] (raw file):
Bikeshed: I can't help but feel like a lot of these would be conceptually leaner as self.key_ptr().offset(idx).

Let's Just Be C?


src/libcollections/btree/node.rs, line 561 [r2] (raw file):
I guess these wouldn't match. But this asymmetry is already kinda weird...


src/libcollections/btree/node.rs, line 615 [r2] (raw file):
Um, technically the type of ForceResult could only take the L, K, V, M args and imply the rest. :neckbeard:


src/libcollections/btree/node.rs, line 637 [r2] (raw file):
WHAT CAN YOU BE? 🙀


src/libcollections/btree/node.rs, line 640 [r2] (raw file):
derive(Copy, Clone)?


src/libcollections/btree/node.rs, line 648 [r2] (raw file):
doc: unsafe because...


src/libcollections/btree/node.rs, line 697 [r2] (raw file):
Officially skimmed. 💯


src/libcollections/btree/node.rs, line 728 [r2] (raw file):
It took me an unreasonable amount of time to realize what this even means. So many types. Getting woozy.

No wait, wtf is an edge from a leaf?! (I guess this is the generic term for "between keys")


src/libcollections/btree/node.rs, line 746 [r2] (raw file):
Small panic attack that Rust had blurred the line of types and values more than I thought. T=B... which is where you split at.

It just occured to me that there's no logic docs in this file... lost in the rewrite, or did we really never have any? (or is it all hoisted up into high-level BTree stuff?)

Also: spooky inference of the handle type, interesting!


src/libcollections/btree/node.rs, line 750 [r2] (raw file):
What does this reborrow accomplish?


src/libcollections/btree/node.rs, line 755 [r2] (raw file):
This seems like a smell that outputting a Root is undesirable.


src/libcollections/btree/node.rs, line 756 [r2] (raw file):
idx - (T + 1) is ever so slightly clearer to me.


src/libcollections/btree/node.rs, line 769 [r2] (raw file):
Unconditional success, 👍 to type systems.


src/libcollections/btree/node.rs, line 793 [r2] (raw file):
The amount of temporary handles that are being created "from scratch" makes me a bit suspicious of this interface. I suppose it makes it "obvious" what the arguments to these functions are...?


src/libcollections/btree/node.rs, line 826 [r2] (raw file):
Skimmed; looks identical to the leaf case. insert_unchecked does all the heavy-lifting (ah, if only we had duck typing...)


src/libcollections/btree/node.rs, line 844 [r2] (raw file):
Bikeshed: This naming is a bit dubious... It's accurate, but handle types are unclear throughout actual usage. Probably fine?


src/libcollections/btree/node.rs, line 929 [r2] (raw file):
Again len - (idx + 1) is semantically closer to what we're doing.


src/libcollections/btree/node.rs, line 945 [r2] (raw file):
I drew a diagram on a white board, so this must be right.


src/libcollections/btree/node.rs, line 967 [r2] (raw file):
doc


src/libcollections/btree/node.rs, line 978 [r2] (raw file):
This is so weird to have cap as a method (cuz we're not gonna merge them into this node, right?)


src/libcollections/btree/node.rs, line 984 [r2] (raw file):
????????????????????????


src/libcollections/btree/node.rs, line 992 [r2] (raw file):
I like that you decided to live a lie for like 7 lines and pretend this isn't all ultra-unsafe


src/libcollections/btree/node.rs, line 1001 [r2] (raw file):
!!!!!

I think these are both supposed to be ptr::write!

(this might be your segfault!)


src/libcollections/btree/node.rs, line 1009 [r2] (raw file):
Note: this is just removing a raw ptr, won't cause any mangling of the deinit node.


src/libcollections/btree/node.rs, line 1015 [r2] (raw file):
Note: "my children are internal"


src/libcollections/btree/node.rs, line 1026 [r2] (raw file):
👍 👍


src/libcollections/btree/node.rs, line 1096 [r2] (raw file):
10/10 module


src/libcollections/btree/node.rs, line 1116 [r2] (raw file):
I MADE IT TO THE END

🎊


Comments from the review on Reviewable.io

@Gankra
Copy link
Contributor

Gankra commented Dec 30, 2015

OK. I have done a first pass of node.rs.

Sorry this took so long. I like to be reasonably thorough for this kind of stuff, and the holidays slammed me. Thankfully, Reviewable allowed me to save my comments up over the past few weeks! Note that I reviewed just the r2 file, and not the diff (because the diff is pointless).

I believe I found what could be your segfault in the impl of merge (see comments).


Comments from the review on Reviewable.io

@gereeter
Copy link
Contributor Author

Oh - I think they are just unused, while the previous BTree implementation used them.

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

Hmm, I can't tell what's causing Travis to fail now.

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

Ah:


failures:
---- btree::set::test_recovery stdout ----
    thread 'btree::set::test_recovery' panicked at 'assertion failed: `(left == right)` (left: `0`, right: `1`)', src/libcollectionstest/btree/set.rs:238

@gereeter
Copy link
Contributor Author

Yup - I'm not sure why I assumed that replace worked the way I implemented it. However, Recover could definitely do with docs.

@gereeter
Copy link
Contributor Author

It passed!

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

Niiice. Are there anymore docs you want to write for this, or are we good to go?

@gereeter
Copy link
Contributor Author

There are more extensive docs that I would like to write eventually, but those will take a while and can always go in a separate PR.

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

Ok then, r=me with squash

@gereeter
Copy link
Contributor Author

Squashed.

@gereeter
Copy link
Contributor Author

Oh, should this be marked [breaking-change] since I deleted at least one deprecated function?

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

I don't think so, but I'll double check (I'm awful at this sort of thing).

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

@huonw says we "might as well" just to be maximally helpful. I don't reckon anyone ever used with_b, but hey, why not.

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

Oh: is clone properly tested at all? Since this is adding some complexity there, I think that would be worth hardening the testing for.

@gereeter
Copy link
Contributor Author

I added a test for clone.

@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

@bors r+

🎊 WE DID IT 🎊

@bors
Copy link
Contributor

bors commented Jan 17, 2016

📌 Commit be4128d has been approved by Gankro

@Gankra Gankra added the relnotes Marks issues that should be documented in the release notes of the next release. label Jan 17, 2016
@Gankra
Copy link
Contributor

Gankra commented Jan 17, 2016

@brson nominating for relnotes because this is a huge improvement.

@bors
Copy link
Contributor

bors commented Jan 17, 2016

⌛ Testing commit be4128d with merge 8846336...

bors added a commit that referenced this pull request Jan 17, 2016
Despite being over 700 lines shorter, this implementation should use less memory than the previous one and is faster on at least insertions and iteration, the latter improving approximately 5x.

Technically a [breaking-change] due to removal of deprecated functions.

cc @apasel422 @gankro @goyox86

Fixes #27865.

<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/rust-lang/rust/30426)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants