Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add m-ary trees #101

Open
czgdp1807 opened this issue Feb 26, 2020 · 28 comments
Open

Add m-ary trees #101

czgdp1807 opened this issue Feb 26, 2020 · 28 comments
Labels
Milestone

Comments

@czgdp1807
Copy link
Member

czgdp1807 commented Feb 26, 2020

Description of the problem

Adding m-ary trees will be a nice idea. May be a array based(allows printing the trees) or pointer(better space utilisation) or both would be better.

Example of the problem

References/Other comments

.. [1] https://en.wikipedia.org/wiki/M-ary_tree

@vibhu18116
Copy link
Contributor

vibhu18116 commented Mar 2, 2020

Hi, can I start working on this issue as a part of RGSoC20?

@czgdp1807
Copy link
Member Author

Hi, @vibhu18116 Please start by designing an API for m-ary trees. Please let us know if you face any problem/queries.

@vibhu18116
Copy link
Contributor

Thank you for allowing me to work on this.

I believe the API design would closely follow that of binary trees. Hence these are a few APIs I think should form a basis:

  • __new__(cls, m_value = 2, array_flag = 0, key=None, root_data=None, comp=None)

This would initialise a new m-ary tree. m_value will be used to determine the maximum value of m (the number of children of a node), default to 2 (binary tree in that case). array_flag can be set to indicate that an array implementation is required and it can remain 0 for pointer implementation.

Rest of the parameters will be same as binary tree implementation.

  • insert(self, key, data)
  • delete(self, key, **kwargs)
  • search(self, key, **kwargs)
  • __str__(self)

The above four will behave in the same way as in the binary tree. Implementation of the __str__ function will vary depending upon if the tree is implemented through array or pointers.

  • height(self)
    Will return the height of the tree.

  • width(self)
    Will return the width of the tree.

  • convert_to_binary(self, array_flag = 0)
    This will convert an m-ary tree to binary tree and return the pointer to root if array_flag is not set and the array if array_flag is set.

Kindly check if these APIs work.

@czgdp1807
Copy link
Member Author

__new__(cls, m_value = 2, array_flag = 0, key=None, root_data=None, comp=None)

IMO, using only array based implementation will be better here as pointers are somewhat unsafe, so avoid them where ever possible. Instead of m_value, a better name would be max_children. This suggests a nice way to implement using arrays.

Can you please explain the use of the values returned by height and width methods?

convert_to_binary(self, array_flag = 0)

IMO, just to_binary_tree(self) would better.

You may refer, https://github.com/codezonediitj/pydatastructs/blob/master/pydatastructs/trees/binary_trees.py while implementing the above APIs.

@czgdp1807
Copy link
Member Author

Note that m-ary tree is not a search tree, so we don't need to worry about the manner in which nodes are stored. We need a creative way to avoid the exponential space complexity of array based implementation. May be a new type of node inheriting TreeNode can be made which will be having an array storing the indices of it's children node in the array of m-ary trees. Please let me know if you are unable to get my point.

@vibhu18116
Copy link
Contributor

I thought height and width of a tree can provide additional functionality to the user, however other than from feature point of view, I don't find any suitable use case.
So, if you suggest I would drop them.

I would update the names of functions as suggested.

@vibhu18116
Copy link
Contributor

You may refer, https://github.com/codezonediitj/pydatastructs/blob/master/pydatastructs/trees/binary_trees.py while implementing the above APIs.

I checked this implementation, but a lot of functions like insert, delete are not implemented, probably because they depend on comparision key. Since the insert, delete API would follow the same pattern, should I first implement them?

@czgdp1807
Copy link
Member Author

Actually, binary trees are quite abstract, hence their insert and delete operations are left unimplemented. What logic should be followed while making insertions to a binary tree? Both the operations are well defined for binary search trees.
IMO, we can keep insert, delete operations undefined for m-ary trees. We can a complementary m-ary search tree inheriting m-ary tree with insert and delete well defined for them. Here is defined for these kind of trees.
You can do the work in two separate PRs for m-ary and m-ary search trees, one after the previous one.

@vibhu18116
Copy link
Contributor

I get your point. I will try to come up with actual implementation and try to create a PR in a day or two for m-ary tree.

@vibhu18116
Copy link
Contributor

Can I keep to_binary_tree method abstract in m-ary tree, or a concrete implementation is required?

If a concrete implementation is required, then should I modify the m-ary tree itself or create a new binary tree with nodes of type TreeNode?

Currently, I have made a new node type extending TreeNode as suggested earlier which form nodes of m-ary tree.

@czgdp1807
Copy link
Member Author

Can I keep to_binary_tree method abstract in m-ary tree, or a concrete implementation is required?

We can keep it as abstraction in m-ary tree but a concrete implementation can be added in m-ary search tree.

If a concrete implementation is required, then should I modify the m-ary tree itself or create a new binary tree with nodes of type TreeNode?

IMO, a new tree should returned.

vibhu18116 added a commit to vibhu18116/pydatastructs that referenced this issue Mar 12, 2020
Added M_aryTreeNode for use in M_ary tree and an implementation for
M_ary Tree. The work is in reference with issue codezonediitj#101.
@vibhu18116
Copy link
Contributor

I created a PR. However, the Travis checks failed due to some trailing whitespaces. Can you please guide how do I ensure there are no trailing white spaces?

Also can you review the code once to see if it is the desired implementation?

@czgdp1807
Copy link
Member Author

It's time to make a PR for MArySearchTree and the relevant tests.

@vibhu18116
Copy link
Contributor

Okay. I will start work on it soon :)

@czgdp1807
Copy link
Member Author

And I hope you are working on your RGSoC application. The deadline is 30th March according to the latest information I have.
See, https://railsgirlssummerofcode.org/students/application/#apply

@vibhu18116
Copy link
Contributor

Yes. I will start working on application as well.

@vibhu18116
Copy link
Contributor

It's time to make a PR for MArySearchTree and the relevant tests.

I was looking through MArySearchTree on the link you referenced earlier.

However, I don't see a determinate way of insertion.

Say, if the nodes can have 4 children, and root has currently 3 only as a data item, then on trying to insert 5, it can become a part of the root, or go in a child node to the right of 3.

Can you clarify this point once?

@czgdp1807
Copy link
Member Author

I see, the current implementation in m_ary_trees.py doesn't align with the description given here. It should contain a group of keys but it just contains a single key. A minor update is needed on it.

I was looking through MArySearchTree on the link you referenced earlier.

However, I don't see a determinate way of insertion.

Some research is needed.

@vibhu18116
Copy link
Contributor

I see, the current implementation in m_ary_trees.py doesn't align with the description given here. It should contain a group of keys but it just contains a single key. A minor update is needed on it.

I think adding a method to insert a value in the node in MAryTreeNode should work. The value will be added if currently the number of values in the node are less than (max_children - 1), else, error can be thrown.

Does this work?

@czgdp1807
Copy link
Member Author

I think adding a method to insert a value in the node in MAryTreeNode should work.

How it will help in having multiple keys in the node?

@vibhu18116
Copy link
Contributor

So, the node can store the (data, key) as a tuple in a dynamically sized array which can grow up to (m-1) size where m is the maximum number of children a node can have.

The information about the children nodes is already stored in the children array.

@czgdp1807
Copy link
Member Author

Here's what I have thought of,

MAryTreeNode

  1. key_list - A SinglyLinkedList with key, data in LinkedListNode. The framework for linked lists is in linear_data_structures. We can keep array as well but for consistency, tuple might not be a good idea instead, I'd suggest to factor out the key and data part to the abstract node class as almost every node has these two members and then just use the abstract Node class as common type for array of keys.
  2. children - An array of child indices with i-th one pointing to a node whose keys are between key_list[i] and key_list[i-1].

The children array is already there so only 1 needs to be added.

@vibhu18116
Copy link
Contributor

This works. But the array could have been subjected to binary search to search a particular element in that node. However, this won't be possible in the linked list implementation. So, what is your opinion about this?

@czgdp1807
Copy link
Member Author

But the array could have been subjected to binary search to search a particular element in that node.

Well, the maximum size of the key_list won't be very large so, O(log(m)) or O(m) will not make much of a difference. Over optimisation can be avoided here. If m is large then MAryTree will not be of any practical use.

@czgdp1807
Copy link
Member Author

If m is large then MAryTree will not be of any practical use.

Though if you are aware of any use case where m is very large then we can think of using arrays.

@vibhu18116
Copy link
Contributor

I didn't find any specifics about the node size in different applications. So, if you say, I can proceed with a linked list.

@czgdp1807
Copy link
Member Author

Sure, feel free to proceed.

@czgdp1807 czgdp1807 added this to the v0.0.1 milestone May 30, 2020
@czgdp1807 czgdp1807 removed the rgsoc20 label Jun 20, 2020
@Arvind-raj06
Copy link
Member

@czgdp1807 I hope this tree has already been implemented why not close this one!

@czgdp1807 czgdp1807 removed the level3 label Dec 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants