Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add header only cuspatial::quadtree_on_points #639

Merged

Conversation

trxcllnt
Copy link
Contributor

Closes #566.

@trxcllnt trxcllnt requested review from a team as code owners August 11, 2022 05:11
@github-actions github-actions bot added cmake Related to CMake code or build configuration libcuspatial Relates to the cuSpatial C++ library labels Aug 11, 2022
@trxcllnt trxcllnt added 3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Aug 11, 2022
@harrism harrism added this to the header-only C++ API milestone Aug 16, 2022
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work so far, this is a big task. I didn't do a complete review. I will re-review after you go over these comments.

* @note `max_depth` should be less than 16, since Morton codes are represented as `uint32_t`. The
* eventual number of levels may be less than `max_depth` if the number of points is small or
* `min_size` is large.
* @note All quadtree nodes should have fewer than `min_size` number of points except leaf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note is confusing because it seems to contradict below where it says "@param min_size Minimum number of points for a non-leaf quadtree node." If it's the minimum, then how come the non-leaf nodes "should" have fewer than min_size points? Either the parameter is misnamed or the docs need rewritten?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right -- reading through the code, I think this is the max number of points allowed in a node before the node is subdivided into 4 new leaves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harrism @trxcllnt It is the "min_size" to qualify as a non-leaf node, which should be the same as the "max_size" as a leaf-node. Just to want to add that the numbers of nodes for the last level nodes (which need to be leaf nodes) is not limited by the threshold. When min_size/and or max_depth is small, many (could be most) last-level nodes will have large numbers of points under them.

cpp/include/cuspatial/experimental/point_quadtree.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial/experimental/point_quadtree.cuh Outdated Show resolved Hide resolved
cpp/include/cuspatial/experimental/point_quadtree.cuh Outdated Show resolved Hide resolved
@zhangjianting
Copy link
Contributor

To answer Mark's question directly, non-leaf nodes need to be split and non-last-level nodes after the split should have fewer nodes than min_size. I agree changing it to "max_size" to indicate maximum number of non-last-level leaf nodes might be easier to parse.

@trxcllnt trxcllnt requested a review from harrism August 30, 2022 17:11
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more things.

*
* @see http://www.adms-conf.org/2019-camera-ready/zhang_adms19.pdf for details.
*
* @note `scale` is applied to (x - min.x) and (y - min.y) to convert coordinates into a Morton code
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"applied" doesn't reflect that those differences are divided by scale... If there is an intuitive explanation, let's add that. And if not, then put the exact math in here. See #397

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, should probably use a reciprocal of the current scale to make the scale factor actually a directly proportional scale factor (which would be more intuitive).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further discussion: let's save any semantic / breaking change for a later PR if necessary.

@trxcllnt trxcllnt force-pushed the fea/header-only-quadtree-on-points branch from bd72eac to 65a6b8a Compare September 5, 2022 23:26
@harrism harrism requested review from isVoid and removed request for cwharris and zhangjianting September 6, 2022 22:24
Copy link
Contributor

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the late review. I delved into the reference in depth but still got lost in the remove_unqualified_quadtree (and Phase 2 in general, probably running some numbers will help). According to this comment: #639 (comment). It feels like max_size still should be called min_size, otherwise it makes remove_unqualified_quads strange to read. We should give it a more verbose name e.g. min_non_leaf_node_size.

@zhangjianting Is it possible to have a leaf node that's not on the last-level? aka is quadtree full?

I find it fascinating to compute the proximity between points using dilated integer and cluster them simply by dividing the Morton code by 4. Apparently scale is an important factor here. How to pick scale?

Not a lot of code smell in terms of the header only refactoring except for a few small issue (IIFE and structure binding, not all are labeled).

@harrism
Copy link
Member

harrism commented Oct 5, 2022

@gpuccibot merge

@harrism
Copy link
Member

harrism commented Oct 5, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 37d3db5 into rapidsai:branch-22.10 Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team cmake Related to CMake code or build configuration improvement Improvement / enhancement to an existing function libcuspatial Relates to the cuSpatial C++ library non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[FEA] refactor point quadtree API to header-only
4 participants