-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make functional BST more-functional. #6
Conversation
7cee4c3
to
cce27e3
Compare
//! # Examples | ||
//! | ||
//! ``` | ||
//! use bst::functional::Tree; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sort of duplicative with other doctests but it makes for nice docs.
where | ||
K: cmp::Ord, | ||
{ | ||
match self { | ||
Tree::Leaf => self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have to return an owned Self
and self
is a borrowed Self
. We could add a Clone
implementation to Tree
(I think we'll want it later anyway) but this seems fine for now.
left: Box<Tree<K, V>>, | ||
right: Box<Tree<K, V>>, | ||
key: Rc<K>, | ||
value: Rc<V>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm conflicted here. I have this here because of how delete_largest
works. It returns an owned key and value and we can't do that unless we Rc
each individual field. I guess it could instead return Rc<Node>
? I'm guessing we'd basically clone the predecessor node and return that. But it'd have to have a different subtree and we can't mutate it 🤔
Going to think on it more but I'm also guessing one of Chris's patented Good Questions™️ will clear this up presently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still going to wait for Chris's wisdom but I'm semi-convinced this is correct (reference counting all the fields of the Node
, that is) - if a new Node
is inserted with the same key as another, we must make a new node that references the key and subtrees of the original Node and I think that's only possible by cloning each field.
**Why?** I didn't think I could do this but the issue was that I had `derive`d `Clone` on `Node` when I should've implemented it directly because it's generic and leverages reference counting. See #6 (comment)
@patientstreetlight bump <3 |
src/functional.rs
Outdated
left: Box::new(n.left.insert(key, value)), | ||
..n | ||
left: Rc::new(n.left.insert(key, value)), | ||
..n.clone() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@patientstreetlight wondered if this syntax was wasteful - if we were incrementing all the reference counts just to have to decrement the unused left
's reference count after. The docs seem to indicate this would be true:
The entire expression uses the given values for the fields that were specified and moves or copies the remaining fields from the base expression.
(since here the base expression, which is n.clone()
would need to be evaluated fully and then the relevant fields moved out). A test in godbolt confirms. Our options are:
- explicitly clone the three needed fields every time to save on cloning the one field we don't need unnecessarily
- accept the performance hit of incrementing and decrementing the counter for the readability
I lean towards the first one but won't die on either hill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some notes on performance - while we’re just incrementing and decrementing a counter the compiler does emit conditional instructions since it has to check:
- Will this increment overflow? If so, I should panic
- Will this decrement result in zero in which case we should drop the underlying value
so those may have a realistic performance impact (we could even leave it in for fun and remove it after adding the benchmarks to see!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'd lean towards (1) as well just because it feels semantically cleaner to me to not do the unneeded increments/decrements, even if it is syntactically uglier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**This Commit** Removes instances of instantiating new `Node`s using the functional update syntax of a clone of an existing `Node`. That is it replaces: ```rust Node { some_field: some_value, ..existing_node.clone() } ``` with ```rust Node { some_field: some_value, other_field: existing_node.other_field.clone(), // etc. } ``` **Why?** For clarity - when using `..node.clone()` what's happening is we are first cloning the node in its entirety (which results in bumping the reference count for all its fields), moving the fields we care about to the new `Node`, and then dropping the remaining fields (and decrementing their reference count). This is a surprise to the reader and is needless and not what we want to express. It also may have a performance impact but that isn't measured. For more details see [this comment thread][0]. [0]: #6 (comment)
src/functional.rs
Outdated
..n | ||
left: Rc::new(n.left.insert(key, value)), | ||
key: n.key.clone(), | ||
right: n.right.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there is some convention of preferring Rc::clone(&rcVal)
over rcVal.clone()
for reference counted pointers.
The Rust Book also says
We could have called a.clone() rather than Rc::clone(&a), but Rust’s convention is to use Rc::clone in this case. The implementation of Rc::clone doesn’t make a deep copy of all the data like most types’ implementations of clone do. The call to Rc::clone only increments the reference count, which doesn’t take much time. Deep copies of data can take a lot of time. By using Rc::clone for reference counting, we can visually distinguish between the deep-copy kinds of clones and the kinds of clones that increase the reference count. When looking for performance problems in the code, we only need to consider the deep-copy clones and can disregard calls to Rc::clone.
I'd say it's worth following that convention here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! 18f0d44
**This Commit** Denies the use of `.clone()` on reference counted pointers as advised by [this `clippy` lint][0] and [The Rust Book][1]. It also addresses instances where the lint was violated. **Why?** It's best practice to make clear the clone is cheap. There is [some debate][2] on the issue and the [standard library no longer explicitly recommends it][3] but it is still noted as more idiomatic. In any case, for us there are no issues making the change and if there is a possibility for being more expressive, we should take it. See #6 (comment) for more details. [0]: https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_ref_ptr [1]: https://doc.rust-lang.org/book/ch15-04-rc.html#using-rct-to-share-data [2]: rust-lang/rust-clippy#2048 [3]: rust-lang/rust#76138
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢
(Tree::Leaf, Tree::Node(right)) => Tree::Node(right.clone()), | ||
(Tree::Node(left), Tree::Leaf) => Tree::Node(left.clone()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More just curious - are these 2 .clone()
calls intentionally not Rc::clone()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct - right
and left
are of type Node
instead of Rc
so we can't use that syntax. We could use Node::clone(left)
but that isn't obviously clearer (in my opinion).
**This Commit** Adds history to the functional BST and no longer requires ownership to `insert`/`delete`. **Why?** Mainly to make benchmarking easier but this has the benefit of more closely matching a functional data structure in that, when new trees are created from "mutating" operations, the history of trees/operations is still accessible. Use struct update syntax for conciseness **Why?** I didn't think I could do this but the issue was that I had `derive`d `Clone` on `Node` when I should've implemented it directly because it's generic and leverages reference counting. See #6 (comment) Explicitly clone needed fields of nodes **This Commit** Removes instances of instantiating new `Node`s using the functional update syntax of a clone of an existing `Node`. That is it replaces: ```rust Node { some_field: some_value, ..existing_node.clone() } ``` with ```rust Node { some_field: some_value, other_field: existing_node.other_field.clone(), // etc. } ``` **Why?** For clarity - when using `..node.clone()` what's happening is we are first cloning the node in its entirety (which results in bumping the reference count for all its fields), moving the fields we care about to the new `Node`, and then dropping the remaining fields (and decrementing their reference count). This is a surprise to the reader and is needless and not what we want to express. It also may have a performance impact but that isn't measured. For more details see [this comment thread][0]. [0]: #6 (comment) Deny `clippy::clone_on_ref_ptr` **This Commit** Denies the use of `.clone()` on reference counted pointers as advised by [this `clippy` lint][0] and [The Rust Book][1]. It also addresses instances where the lint was violated. **Why?** It's best practice to make clear the clone is cheap. There is [some debate][2] on the issue and the [standard library no longer explicitly recommends it][3] but it is still noted as more idiomatic. In any case, for us there are no issues making the change and if there is a possibility for being more expressive, we should take it. See #6 (comment) for more details. [0]: https://rust-lang.github.io/rust-clippy/master/index.html#clone_on_ref_ptr [1]: https://doc.rust-lang.org/book/ch15-04-rc.html#using-rct-to-share-data [2]: rust-lang/rust-clippy#2048 [3]: rust-lang/rust#76138
18f0d44
to
42dd0c6
Compare
This Commit
Adds history to the functional BST and no longer requires ownership to
insert
/delete
.Why?
Mainly to make benchmarking easier but this has the benefit of more
closely matching a functional data structure in that, when new trees are
created from "mutating" operations, the history of trees/operations is
still accessible.