Create GeoSeries.contains_properly method using point_in_polygon. (#…

…749)

This PR closes the above named issues relating to creating a .contains method and, more importantly, resolving boundary case inconsistency with `point_in_polygon`.

~As it stands the colinearity test I've added to `is_point_in_polygon` doubles the runtime of brute-force `point_in_polygon` and has no visible effect on the runtime of `quadtree_point_in_polygon`.~

~- Note I need to double check the above benchmark, having set this project down for the last few weeks.~

This depends on #750, please do not review the C++ code here until that PR is merged. Please do review the python code.

## Benchmark

Benchmark results are in, looks like there's no measurable speed difference between 22.12 pre-boundary exclusion and our current implementation:

```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================================== test session starts ===================================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item

api/bench_api.py .                                                                                                  [100%]

---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9636  1.9749  1.9678  0.0043  1.9660  0.0045       1;0  0.5082       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================================== 1 passed in 16.28s ===================================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch feature/GeoSeries.contains
```

vs `branch-22.12`

```
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================== test session starts ===================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item

api/bench_api.py .                                                                  [100%]

---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9516  1.9843  1.9730  0.0126  1.9760  0.0127       1;0  0.5068       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================== 1 passed in 16.61s ===================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch benchmark/branch-22.12
```

## Still adding:

- [x] Detailed description of xfail result.
- [x] Self-review existing `.contains` implementation in python.
- [x] Update `.contains` docs when necessary.
- [x] Benchmark again and document here.
- [x] Move binops_with_quadtree.py to next branch.
- [x] `.contains` Examples

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

URL: #749
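
For readers unfamiliar with the boundary case the description refers to, the following is a minimal pure-Python sketch of a "contains properly" test: an even-odd ray-casting check that first rejects any point lying on an edge or vertex. It is not the C++ `is_point_in_polygon` code from this PR. The names `on_segment` and `contains_properly_sketch`, and the absolute tolerance `eps`, are assumptions made for illustration only.

```python
def on_segment(px, py, ax, ay, bx, by, eps=1e-12):
    """Return True if point P lies on the closed segment AB.

    `eps` is an assumed absolute tolerance for this sketch.
    """
    # The cross product is zero when P, A and B are collinear.
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False
    # Collinear: P is on the segment only if it is inside AB's bounding box.
    return (
        min(ax, bx) - eps <= px <= max(ax, bx) + eps
        and min(ay, by) - eps <= py <= max(ay, by) + eps
    )


def contains_properly_sketch(ring, px, py):
    """Even-odd ray casting that excludes the boundary.

    `ring` is a closed list of (x, y) vertices, i.e. ring[0] == ring[-1].
    """
    inside = False
    for (ax, ay), (bx, by) in zip(ring, ring[1:]):
        if on_segment(px, py, ax, ay, bx, by):
            return False  # boundary points are not properly contained
        # Count edges crossed by a horizontal ray going right from P.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
print(contains_properly_sketch(square, 0.5, 0.5))  # interior point
print(contains_properly_sketch(square, 1.0, 0.5))  # point on an edge
print(contains_properly_sketch(square, 0.0, 0.0))  # point on a vertex
```

The explicit collinearity check is what makes edge and vertex results consistent. Without it, a plain crossing count can classify boundary points differently depending on which edge they fall on.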
rapidsai · Nov 30, 2022 · 4ca88ff · 4ca88ff
1 parent 924b570
commit 4ca88ff
Showing 10 changed files with 759 additions and 14 deletions.
diff --git a/python/cuspatial/benchmarks/api/bench_api.py b/python/cuspatial/benchmarks/api/bench_api.py
@@ -118,12 +118,8 @@ def bench_pairwise_linestring_distance(benchmark, gpu_dataframe):
     geometry = gpu_dataframe["geometry"]
     benchmark(
         cuspatial.pairwise_linestring_distance,
-        geometry.polygons.ring_offset,
-        geometry.polygons.x,
-        geometry.polygons.y,
-        geometry.polygons.ring_offset,
-        geometry.polygons.x,
-        geometry.polygons.y,
+        geometry,
+        geometry,
     )
 
 
@@ -165,8 +161,8 @@ def bench_quadtree_on_points(benchmark, gpu_dataframe):
 
 def bench_quadtree_point_in_polygon(benchmark, polygons):
     polygons = polygons["geometry"].polygons
-    x_points = (cupy.random.random(10000000) - 0.5) * 360
-    y_points = (cupy.random.random(10000000) - 0.5) * 180
+    x_points = (cupy.random.random(50000000) - 0.5) * 360
+    y_points = (cupy.random.random(50000000) - 0.5) * 180
     scale = 5
     max_depth = 7
     min_size = 125
@@ -263,15 +259,16 @@ def bench_quadtree_point_to_nearest_linestring(benchmark):
 
 
 def bench_point_in_polygon(benchmark, gpu_dataframe):
-    x_points = (cupy.random.random(10000000) - 0.5) * 360
-    y_points = (cupy.random.random(10000000) - 0.5) * 180
+    x_points = (cupy.random.random(50000000) - 0.5) * 360
+    y_points = (cupy.random.random(50000000) - 0.5) * 180
     short_dataframe = gpu_dataframe.iloc[0:32]
     geometry = short_dataframe["geometry"]
+    polygon_offset = cudf.Series(geometry.polygons.geometry_offset[0:31])
     benchmark(
         cuspatial.point_in_polygon,
         x_points,
         y_points,
-        geometry.polygons.geometry_offset[0:31],
+        polygon_offset,
         geometry.polygons.ring_offset,
         geometry.polygons.x,
         geometry.polygons.y,

diff --git a/python/cuspatial/cuspatial/_lib/CMakeLists.txt b/python/cuspatial/cuspatial/_lib/CMakeLists.txt
@@ -18,6 +18,7 @@ set(cython_sources
     interpolate.pyx
     nearest_points.pyx
     point_in_polygon.pyx
+    pairwise_point_in_polygon.pyx
     polygon_bounding_boxes.pyx
     linestring_bounding_boxes.pyx
     quadtree.pyx

diff --git a/python/cuspatial/cuspatial/_lib/cpp/pairwise_point_in_polygon.pxd b/python/cuspatial/cuspatial/_lib/cpp/pairwise_point_in_polygon.pxd
@@ -0,0 +1,17 @@
+# Copyright (c) 2022, NVIDIA CORPORATION.
+
+from libcpp.memory cimport unique_ptr
+
+from cudf._lib.column cimport column, column_view
+
+
+cdef extern from "cuspatial/pairwise_point_in_polygon.hpp" \
+        namespace "cuspatial" nogil:
+    cdef unique_ptr[column] pairwise_point_in_polygon(
+        const column_view & test_points_x,
+        const column_view & test_points_y,
+        const column_view & poly_offsets,
+        const column_view & poly_ring_offsets,
+        const column_view & poly_points_x,
+        const column_view & poly_points_y
+    ) except +
diff --git a/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx b/python/cuspatial/cuspatial/_lib/pairwise_point_in_polygon.pyx
@@ -0,0 +1,42 @@
+# Copyright (c) 2022, NVIDIA CORPORATION.
+
+from libcpp.memory cimport unique_ptr
+from libcpp.utility cimport move
+
+from cudf._lib.column cimport Column, column, column_view
+
+from cuspatial._lib.cpp.pairwise_point_in_polygon cimport (
+    pairwise_point_in_polygon as cpp_pairwise_point_in_polygon,
+)
+
+
+def pairwise_point_in_polygon(
+    Column test_points_x,
+    Column test_points_y,
+    Column poly_offsets,
+    Column poly_ring_offsets,
+    Column poly_points_x,
+    Column poly_points_y
+):
+    cdef column_view c_test_points_x = test_points_x.view()
+    cdef column_view c_test_points_y = test_points_y.view()
+    cdef column_view c_poly_offsets = poly_offsets.view()
+    cdef column_view c_poly_ring_offsets = poly_ring_offsets.view()
+    cdef column_view c_poly_points_x = poly_points_x.view()
+    cdef column_view c_poly_points_y = poly_points_y.view()
+
+    cdef unique_ptr[column] result
+
+    with nogil:
+        result = move(
+            cpp_pairwise_point_in_polygon(
+                c_test_points_x,
+                c_test_points_y,
+                c_poly_offsets,
+                c_poly_ring_offsets,
+                c_poly_points_x,
+                c_poly_points_y
+            )
+        )
+
+    return Column.from_unique_ptr(move(result))
diff --git a/python/cuspatial/cuspatial/core/binops/__init__.py b/python/cuspatial/cuspatial/core/binops/__init__.py
diff --git a/python/cuspatial/cuspatial/core/binops/contains.py b/python/cuspatial/cuspatial/core/binops/contains.py
@@ -0,0 +1,89 @@
+# Copyright (c) 2022, NVIDIA CORPORATION.
+
+from cudf import Series
+from cudf.core.column import as_column
+
+from cuspatial._lib.pairwise_point_in_polygon import (
+    pairwise_point_in_polygon as cpp_pairwise_point_in_polygon,
+)
+from cuspatial._lib.point_in_polygon import (
+    point_in_polygon as cpp_point_in_polygon,
+)
+from cuspatial.utils.column_utils import normalize_point_columns
+
+
+def contains_properly(
+    test_points_x,
+    test_points_y,
+    poly_offsets,
+    poly_ring_offsets,
+    poly_points_x,
+    poly_points_y,
+):
+    """Compute from a series of points and a series of polygons which points
+    are properly contained within the corresponding polygon. Polygon A contains
+    Point B properly if B intersects the interior of A but not the boundary (or
+    exterior).
+
+    Note that polygons must be closed: the first and last vertex of each
+    polygon must be the same.
+
+    Parameters
+    ----------
+    test_points_x
+        x-coordinate of points to test for containment
+    test_points_y
+        y-coordinate of points to test for containment
+    poly_offsets
+        beginning index of the first ring in each polygon
+    poly_ring_offsets
+        beginning index of the first point in each ring
+    poly_points_x
+        x-coordinates of polygon vertices
+    poly_points_y
+        y-coordinates of polygon vertices
+
+    Returns
+    -------
+    result : cudf.Series
+        A Series of boolean values indicating whether each point falls
+        within its corresponding polygon.
+    """
+
+    if len(poly_offsets) == 0:
+        return Series()
+    (
+        test_points_x,
+        test_points_y,
+        poly_points_x,
+        poly_points_y,
+    ) = normalize_point_columns(
+        as_column(test_points_x),
+        as_column(test_points_y),
+        as_column(poly_points_x),
+        as_column(poly_points_y),
+    )
+    poly_offsets_column = as_column(poly_offsets, dtype="int32")
+    poly_ring_offsets_column = as_column(poly_ring_offsets, dtype="int32")
+
+    if len(test_points_x) == len(poly_offsets):
+        pip_result = cpp_pairwise_point_in_polygon(
+            test_points_x,
+            test_points_y,
+            poly_offsets_column,
+            poly_ring_offsets_column,
+            poly_points_x,
+            poly_points_y,
+        )
+    else:
+        pip_result = cpp_point_in_polygon(
+            test_points_x,
+            test_points_y,
+            poly_offsets_column,
+            poly_ring_offsets_column,
+            poly_points_x,
+            poly_points_y,
+        )
+
+    result = Series(pip_result, dtype="bool")
+    return result
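
As read from the diff above, `contains_properly` uses the pairwise kernel when the number of test points equals the number of polygon offsets, and otherwise falls back to `point_in_polygon`, which tests every point against every polygon. The sketch below illustrates only the difference between those two evaluation modes. It uses a stand-in predicate (strict containment in an axis-aligned box) rather than real polygons, does not reproduce the cuSpatial return format, and all function names in it are hypothetical.

```python
def strictly_inside_box(point, box):
    """Stand-in predicate: True if `point` is strictly inside `box`.

    `box` is (xmin, ymin, xmax, ymax); points on the border return False.
    """
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin < x < xmax and ymin < y < ymax


def pairwise(points, boxes):
    """Test point i against box i only: one result per pair."""
    return [strictly_inside_box(p, b) for p, b in zip(points, boxes)]


def all_pairs(points, boxes):
    """Test every point against every box: one row of results per point."""
    return [[strictly_inside_box(p, b) for b in boxes] for p in points]


points = [(0.5, 0.5), (5.0, 5.0)]
boxes = [(0.0, 0.0, 1.0, 1.0), (4.0, 4.0, 6.0, 6.0)]

print(pairwise(points, boxes))   # one boolean per (point, box) pair
print(all_pairs(points, boxes))  # one row per point, one column per box
```

The pairwise mode matches the docstring's "corresponding polygon" wording. The all-pairs mode yields a result per point and polygon combination, so the two paths do not carry the same meaning per output element.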