Create GeoSeries.contains_properly method using point_in_polygon. (#749)

This PR closes the above named issues relating to creating a .contains method and, more importantly, resolving boundary case inconsistency with `point_in_polygon`.
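To make the boundary case concrete, here is a minimal pure-Python sketch of a "contains properly" test for a single closed ring. It is an illustration only, not the CUDA code in this PR: the helper names (`orientation`, `on_segment`, `ring_contains_properly`) are hypothetical, and the exact `!= 0` colinearity comparison is only reliable for simple coordinates like the ones used here.

```python
def orientation(ax, ay, bx, by, px, py):
    """Cross product of AB and AP; zero means P is colinear with A and B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def on_segment(ax, ay, bx, by, px, py):
    """True if P lies on the closed segment AB."""
    if orientation(ax, ay, bx, by, px, py) != 0:
        return False
    return min(ax, bx) <= px <= max(ax, bx) and min(ay, by) <= py <= max(ay, by)


def ring_contains_properly(ring, px, py):
    """Ray-casting test that treats boundary points as not contained.

    `ring` is a closed list of (x, y) vertices (first vertex == last vertex).
    """
    inside = False
    for (ax, ay), (bx, by) in zip(ring, ring[1:]):
        if on_segment(ax, ay, bx, by, px, py):
            return False  # on an edge or vertex: not properly contained
        # Toggle for each edge crossed by a horizontal ray from P toward +x.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(ring_contains_properly(square, 0.5, 0.5))  # interior point
print(ring_contains_properly(square, 0.0, 0.5))  # point on an edge
print(ring_contains_properly(square, 0.0, 0.0))  # point on a vertex
```

Without the `on_segment` check, plain ray casting reports some boundary points as inside and others as outside depending on which edge they touch. That is the kind of inconsistency a boundary-excluding predicate removes.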

~As it stands the colinearity test I've added to `is_point_in_polygon` doubles the runtime of brute-force `point_in_polygon` and has no visible effect on the runtime of `quadtree_point_in_polygon`.~

~- Note I need to double check the above benchmark, having set this project down for the last few weeks.~

This depends on #750. Please do not review the C++ code here until that PR is merged, but please do review the Python code.

## Benchmark

Benchmark results are in. There is no measurable speed difference between `branch-22.12` (before boundary exclusion) and the current implementation:

```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================================== test session starts ===================================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                                                         

api/bench_api.py .                                                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9636  1.9749  1.9678  0.0043  1.9660  0.0045       1;0  0.5082       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================================== 1 passed in 16.28s ===================================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch feature/GeoSeries.contains
```
vs `branch-22.12`
```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================== test session starts ===================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                         

api/bench_api.py .                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9516  1.9843  1.9730  0.0126  1.9760  0.0127       1;0  0.5068       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================== 1 passed in 16.61s ===================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch benchmark/branch-22.12
```
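As a quick check of the "no measurable difference" reading, the two runs can be compared numerically. This is a small sketch using only the mean and standard deviation printed in the tables above; the dictionary names are illustrative.

```python
# Mean and standard deviation in seconds, copied from the two tables above.
feature = {"mean": 1.9678, "stddev": 0.0043}   # feature/GeoSeries.contains
baseline = {"mean": 1.9730, "stddev": 0.0126}  # benchmark/branch-22.12

diff = feature["mean"] - baseline["mean"]
relative = diff / baseline["mean"]
# Treat the change as noise if it is smaller than the baseline's own spread.
within_noise = abs(diff) <= baseline["stddev"]

print(f"difference: {diff:+.4f} s ({relative:+.2%})")
print(f"within one baseline stddev: {within_noise}")
```

With five rounds per run this is only a rough sanity check, not a statistical test.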

## Still adding:
- [x] Detailed description of xfail result.
- [x] Self-review existing `.contains` implementation in python.
- [x] Update `.contains` docs when necessary.
- [x] Benchmark again and document here.
- [x] Move binops_with_quadtree.py to next branch.
- [x] `.contains` Examples

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

URL: #749
thomcom authored Nov 30, 2022
1 parent 924b570 commit 4ca88ff
Showing 10 changed files with 759 additions and 14 deletions.
19 changes: 8 additions & 11 deletions python/cuspatial/benchmarks/api/bench_api.py
@@ -118,12 +118,8 @@ def bench_pairwise_linestring_distance(benchmark, gpu_dataframe):
     geometry = gpu_dataframe["geometry"]
     benchmark(
         cuspatial.pairwise_linestring_distance,
-        geometry.polygons.ring_offset,
-        geometry.polygons.x,
-        geometry.polygons.y,
-        geometry.polygons.ring_offset,
-        geometry.polygons.x,
-        geometry.polygons.y,
+        geometry,
+        geometry,
     )


@@ -165,8 +161,8 @@ def bench_quadtree_on_points(benchmark, gpu_dataframe):

 def bench_quadtree_point_in_polygon(benchmark, polygons):
     polygons = polygons["geometry"].polygons
-    x_points = (cupy.random.random(10000000) - 0.5) * 360
-    y_points = (cupy.random.random(10000000) - 0.5) * 180
+    x_points = (cupy.random.random(50000000) - 0.5) * 360
+    y_points = (cupy.random.random(50000000) - 0.5) * 180
     scale = 5
     max_depth = 7
     min_size = 125
@@ -263,15 +259,16 @@ def bench_quadtree_point_to_nearest_linestring(benchmark):


 def bench_point_in_polygon(benchmark, gpu_dataframe):
-    x_points = (cupy.random.random(10000000) - 0.5) * 360
-    y_points = (cupy.random.random(10000000) - 0.5) * 180
+    x_points = (cupy.random.random(50000000) - 0.5) * 360
+    y_points = (cupy.random.random(50000000) - 0.5) * 180
     short_dataframe = gpu_dataframe.iloc[0:32]
     geometry = short_dataframe["geometry"]
+    polygon_offset = cudf.Series(geometry.polygons.geometry_offset[0:31])
     benchmark(
         cuspatial.point_in_polygon,
         x_points,
         y_points,
-        geometry.polygons.geometry_offset[0:31],
+        polygon_offset,
         geometry.polygons.ring_offset,
         geometry.polygons.x,
         geometry.polygons.y,
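
Both point-in-polygon benchmarks in this file draw their test points as `(random - 0.5) * 360` and `(random - 0.5) * 180`. The sketch below reproduces that mapping on the CPU with the standard library instead of `cupy`, to show that it yields longitude/latitude-style coordinates. The function name `random_lonlat` and the seed are illustrative and not part of the benchmark.

```python
import random


def random_lonlat(n, seed=0):
    """Draw n points uniformly over [-180, 180) x [-90, 90)."""
    rng = random.Random(seed)
    # random() is uniform on [0, 1); subtracting 0.5 centers it on zero,
    # and scaling by 360 (or 180) stretches it to the x (or y) range.
    xs = [(rng.random() - 0.5) * 360 for _ in range(n)]
    ys = [(rng.random() - 0.5) * 180 for _ in range(n)]
    return xs, ys


xs, ys = random_lonlat(1000)
print(min(xs), max(xs))
print(min(ys), max(ys))
```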
1 change: 1 addition & 0 deletions python/cuspatial/cuspatial/_lib/CMakeLists.txt
@@ -18,6 +18,7 @@ set(cython_sources
     interpolate.pyx
     nearest_points.pyx
     point_in_polygon.pyx
+    pairwise_point_in_polygon.pyx
     polygon_bounding_boxes.pyx
     linestring_bounding_boxes.pyx
     quadtree.pyx
17 changes: 17 additions & 0 deletions python/cuspatial/cuspatial/_lib/cpp/pairwise_point_in_polygon.pxd
@@ -0,0 +1,17 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from libcpp.memory cimport unique_ptr

from cudf._lib.column cimport column, column_view


cdef extern from "cuspatial/pairwise_point_in_polygon.hpp" \
        namespace "cuspatial" nogil:
    cdef unique_ptr[column] pairwise_point_in_polygon(
        const column_view & test_points_x,
        const column_view & test_points_y,
        const column_view & poly_offsets,
        const column_view & poly_ring_offsets,
        const column_view & poly_points_x,
        const column_view & poly_points_y
    ) except +
42 changes: 42 additions & 0 deletions python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx
@@ -0,0 +1,42 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from libcpp.memory cimport unique_ptr
from libcpp.utility cimport move

from cudf._lib.column cimport Column, column, column_view

from cuspatial._lib.cpp.pairwise_point_in_polygon cimport (
    pairwise_point_in_polygon as cpp_pairwise_point_in_polygon,
)


def pairwise_point_in_polygon(
    Column test_points_x,
    Column test_points_y,
    Column poly_offsets,
    Column poly_ring_offsets,
    Column poly_points_x,
    Column poly_points_y
):
    cdef column_view c_test_points_x = test_points_x.view()
    cdef column_view c_test_points_y = test_points_y.view()
    cdef column_view c_poly_offsets = poly_offsets.view()
    cdef column_view c_poly_ring_offsets = poly_ring_offsets.view()
    cdef column_view c_poly_points_x = poly_points_x.view()
    cdef column_view c_poly_points_y = poly_points_y.view()

    cdef unique_ptr[column] result

    with nogil:
        result = move(
            cpp_pairwise_point_in_polygon(
                c_test_points_x,
                c_test_points_y,
                c_poly_offsets,
                c_poly_ring_offsets,
                c_poly_points_x,
                c_poly_points_y
            )
        )

    return Column.from_unique_ptr(move(result))
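
The wrapper above receives polygons as two offset arrays plus flat coordinate arrays. Below is a pure-Python sketch of how such a layout can be decoded. It follows the `contains_properly` docstring in this commit, where `poly_offsets` gives the first ring of each polygon and `poly_ring_offsets` gives the first point of each ring. It also assumes each offsets array holds start indices only, with no trailing end offset. The helper name `decode_polygons` is hypothetical.

```python
def decode_polygons(poly_offsets, ring_offsets, xs, ys):
    """Rebuild polygons -> rings -> (x, y) vertices from flat offset arrays.

    Assumes start-only offsets: the last ring runs to the end of the
    coordinate arrays and the last polygon runs to the end of the ring list.
    """
    ring_ends = list(ring_offsets[1:]) + [len(xs)]
    rings = [
        list(zip(xs[start:end], ys[start:end]))
        for start, end in zip(ring_offsets, ring_ends)
    ]
    poly_ends = list(poly_offsets[1:]) + [len(rings)]
    return [rings[start:end] for start, end in zip(poly_offsets, poly_ends)]


# Two single-ring polygons: a unit square followed by a triangle.
poly_offsets = [0, 1]
ring_offsets = [0, 5]
xs = [0, 1, 1, 0, 0, 2, 3, 2, 2]
ys = [0, 0, 1, 1, 0, 0, 0, 1, 0]

polygons = decode_polygons(poly_offsets, ring_offsets, xs, ys)
for i, rings in enumerate(polygons):
    print(i, rings)
```

Each decoded ring is closed (first vertex equals last), matching the requirement stated in the docstring.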
Empty file.
89 changes: 89 additions & 0 deletions python/cuspatial/cuspatial/core/binops/contains.py
@@ -0,0 +1,89 @@
# Copyright (c) 2022, NVIDIA CORPORATION.

from cudf import Series
from cudf.core.column import as_column

from cuspatial._lib.pairwise_point_in_polygon import (
    pairwise_point_in_polygon as cpp_pairwise_point_in_polygon,
)
from cuspatial._lib.point_in_polygon import (
    point_in_polygon as cpp_point_in_polygon,
)
from cuspatial.utils.column_utils import normalize_point_columns


def contains_properly(
    test_points_x,
    test_points_y,
    poly_offsets,
    poly_ring_offsets,
    poly_points_x,
    poly_points_y,
):
    """Compute from a series of points and a series of polygons which points
    are properly contained within the corresponding polygon. Polygon A contains
    Point B properly if B intersects the interior of A but not the boundary (or
    exterior).

    Note that polygons must be closed: the first and last vertex of each
    polygon must be the same.

    Parameters
    ----------
    test_points_x
        x-coordinate of points to test for containment
    test_points_y
        y-coordinate of points to test for containment
    poly_offsets
        beginning index of the first ring in each polygon
    poly_ring_offsets
        beginning index of the first point in each ring
    poly_points_x
        x-coordinates of polygon vertices
    poly_points_y
        y-coordinates of polygon vertices

    Returns
    -------
    result : cudf.Series
        A Series of boolean values indicating whether each point falls
        within its corresponding polygon.
    """

    if len(poly_offsets) == 0:
        return Series()
    (
        test_points_x,
        test_points_y,
        poly_points_x,
        poly_points_y,
    ) = normalize_point_columns(
        as_column(test_points_x),
        as_column(test_points_y),
        as_column(poly_points_x),
        as_column(poly_points_y),
    )
    poly_offsets_column = as_column(poly_offsets, dtype="int32")
    poly_ring_offsets_column = as_column(poly_ring_offsets, dtype="int32")

    if len(test_points_x) == len(poly_offsets):
        pip_result = cpp_pairwise_point_in_polygon(
            test_points_x,
            test_points_y,
            poly_offsets_column,
            poly_ring_offsets_column,
            poly_points_x,
            poly_points_y,
        )
    else:
        pip_result = cpp_point_in_polygon(
            test_points_x,
            test_points_y,
            poly_offsets_column,
            poly_ring_offsets_column,
            poly_points_x,
            poly_points_y,
        )

    result = Series(pip_result, dtype="bool")
    return result
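
The length-based dispatch in `contains_properly` can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the library's behavior. Axis-aligned boxes stand in for polygons, `strictly_inside` stands in for the real predicate, and the nested list returned by the all-pairs path is only this sketch's choice, since the shape returned by the real `point_in_polygon` binding is not shown in this diff.

```python
def strictly_inside(box, point):
    """Stand-in predicate: point strictly inside an axis-aligned box."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    return xmin < x < xmax and ymin < y < ymax


def pairwise(points, boxes):
    """Test point i against box i only."""
    return [strictly_inside(b, p) for p, b in zip(points, boxes)]


def all_pairs(points, boxes):
    """Test every point against every box (one row per point)."""
    return [[strictly_inside(b, p) for b in boxes] for p in points]


def contains_properly_sketch(points, boxes):
    """Take the pairwise path when the lengths match, else all-pairs."""
    if len(points) == len(boxes):
        return pairwise(points, boxes)
    return all_pairs(points, boxes)


boxes = [(0, 0, 1, 1), (2, 2, 3, 3)]
same_length = contains_properly_sketch([(0.5, 0.5), (2, 2.5)], boxes)
different_length = contains_properly_sketch([(0.5, 0.5)], boxes)
print(same_length)
print(different_length)
```

One consequence of dispatching on length alone is that equal-length inputs always take the pairwise path, so an all-pairs query over n points and n polygons cannot be requested through this branch.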