-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTreeMap insert does not honor the B-tree invariant #74834
Comments
Let's say we want to insert in a full node (len() = CAPACITY = 2B-1) at edge-index IN (which is in 0..=CAPACITY), and we accomplish this by splitting the node at kv-index S (which is in 0..CAPACITY), moving the element (key/value pair) at kv-index S into a parent node, keeping the S elements on the left in the left node, and moving the (CAPACITY-S-1) elements on the right to a new right child. The current code chooses S = B leaving the B elements on the left in a node and shipping out the B-2 elements on the right to another node, and then inserts the new element either left or right. The simplest, symmetric rule is to choose S = B-1, the element right in the middle, leaving B-1 elements in the left child and B-1 elements in the right child, and then inserting in either of the children. Possible exceptions
Each extra rule has at least two more disadvantages:
My conclusion is that, beyond the simple, symmetric rule, only exception 1 might be worth implementing. |
Inserting into a BTreeMap, where the key lands in a node that is full (has 11 elements), cause that node to keep 6 elements and split off 5 elements, of which one ends up in the parent node, and the other 4 in a new node. If the key inserted is not big and belongs among the left side of the keys in the node, the left child comes out with 7 elements and the right child has only 4 elements.
I don't think(*) this causes any observable ugliness so it's impossible to demonstrate using outsider testing code. Maybe through memory footprint or performance analysis. I have a commit peeking at the internals to prove it.
Is this really a bug at all?
(where B-1 = MIN_LEN = 5).
(*) For instance, you can't reduce the number of elements further.
remove_kv_tracking
could temporarily reduce it down to 3, but it will then either merge with a sibling or be stocked back up to 4 elements.The text was updated successfully, but these errors were encountered: