
Spatial index - fine tune predict_offsets when $conditions->get_index_max_dist is detected #550

Closed
shawnlaffan opened this issue May 17, 2015 · 3 comments

@shawnlaffan
Owner

Biodiverse::Index::predict_offsets currently has an early return when the condition it is passed has a value for get_index_max_dist.

In such cases, the system generates a box of index offsets sized by the max dist. The problem with this approach is that the proportion of redundant offsets grows as the max dist increases, particularly for ellipses and other non-rectangular shapes.

The likely solution is to run the predict process, but restricted to the candidate elements within the max distance box.
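To illustrate the redundancy, here is a minimal Python sketch (not the Biodiverse Perl code). It assumes a 2D index with a single hypothetical resolution `index_res` and an axis-aligned ellipse. The function names are invented for the example. An offset is kept only if some pair of points, one in the origin index cell and one in the offset cell, could fall within the ellipse.

```python
import math

def box_offsets(max_dist, index_res):
    """All index offsets in the square box implied by max_dist (assumed 2D)."""
    n = math.ceil(max_dist / index_res)
    return [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)]

def ellipse_offsets(major, minor, index_res):
    """Subset of the box offsets that an axis-aligned ellipse could reach."""
    kept = []
    for i, j in box_offsets(major, index_res):
        # Smallest possible displacement between a point in the origin
        # cell and a point in the offset cell, along each axis.
        min_dx = max(0, abs(i) - 1) * index_res
        min_dy = max(0, abs(j) - 1) * index_res
        if (min_dx / major) ** 2 + (min_dy / minor) ** 2 <= 1.0:
            kept.append((i, j))
    return kept

box = box_offsets(10, 1)
kept = ellipse_offsets(10, 5, 1)
print(len(kept), "of", len(box), "box offsets are needed")
```

The corner offsets of the box are the first to drop out, and the narrower the ellipse relative to its major radius, the larger the share of the box that is redundant.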

shawnlaffan added a commit that referenced this issue May 17, 2015
…max_dist.

Currently this still returns early when the distance is less than twice the minimum index resolution. That ratio may need tuning.

If we do the subset search, we now also return the initial subset of offsets whenever the search finds more offsets than that subset contains.
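The decision logic described above might look roughly like the following Python sketch. This is one reading of the commit message, not the actual Perl implementation. The function name, the `search` callback and the `ratio` parameter are all hypothetical.

```python
def choose_offsets(max_dist, min_index_res, box, search, ratio=2.0):
    """Pick between the box offsets and the searched offsets (illustrative)."""
    # Small distances: the box is already tiny, so skip the search entirely.
    if max_dist < ratio * min_index_res:
        return list(box)
    found = search(box)
    # Fall back to the box if the search did not reduce the offset count.
    if len(found) > len(box):
        return list(box)
    return found
```

Making `ratio` a parameter reflects the note that the threshold of two may need tuning.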

There is still old code, predating the shortcut, which does effectively the same thing. It can be cleaned up next.

Updates issue #550

Signed-off-by: Shawn Laffan <shawnlaffan@gmail.com>
shawnlaffan added a commit that referenced this issue May 17, 2015
Apart from not being used in a long while, it also generated the search set repeatedly.

Updates issue #550

Signed-off-by: Shawn Laffan <shawnlaffan@gmail.com>
@shawnlaffan
Owner Author

The changes so far actually make the test suite slower, but the benefit accrues across multiple groups in a spatial analysis, whereas the spatial tests only check one group.

The block and rectangle cases also need checking: they could probably always return early, since their shape will match the initial subset block. This requires adding a shape parameter to the spatial conditions metadata.

@shawnlaffan shawnlaffan added this to the Release_1.01 milestone May 17, 2015
@shawnlaffan shawnlaffan self-assigned this May 17, 2015
shawnlaffan added a commit that referenced this issue May 19, 2015
If the shape is square then the index predict_offsets process can skip the comparison step.

Applies to the sp_rectangle, sp_block and sp_square subs at the moment, with checks for odd cases such as offsets and non-numeric sizes.
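As a rough illustration of those checks, the following Python sketch classifies a condition's shape from its arguments. It is not the Perl code in the commit. The function name, the argument layout and the `"square"` / `"complex"` labels are hypothetical.

```python
from numbers import Real

def shape_type(sizes, offsets=None):
    """Return "square" only when the box shortcut looks safe (illustrative)."""
    # Non-numeric sizes (e.g. expressions evaluated later) cannot be trusted.
    if not all(isinstance(s, Real) and not isinstance(s, bool) for s in sizes):
        return "complex"
    # A shifted window no longer matches the box centred on the group.
    if offsets and any(o != 0 for o in offsets):
        return "complex"
    return "square"
```

A caller could then skip the comparison step whenever the result is `"square"` and fall back to the full search otherwise.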

Updates issue #550

Signed-off-by: Shawn Laffan <shawnlaffan@gmail.com>
@shawnlaffan
Owner Author

Mark as fixed.

Any related problems can be listed under their own issues.

@shawnlaffan
Owner Author

As an addendum:
A quick benchmark of a basedata set with 51 x 201 groups using sp_ellipse(major_radius => 10, minor_radius => 5) takes 189 seconds under the new regime, compared with 301.6 seconds using version 1.0 (PAR executable version), a reduction of roughly 37%.
