Cast result of pip_bitmap_column_to_binary_array to cupy array #14

Closed
wants to merge 713 commits into from

Conversation

isVoid
Owner

@isVoid isVoid commented Jul 30, 2024

Description

The performance regression in rapidsai#1413 occurs because slicing numba's DeviceNDArray
is slow. Recent versions of cudf simplified DataFrame construction and delegate it
to logic similar to that which handles __cuda_array_interface__. Since that construction involves slicing the array,
slicing needs to be fast. We should therefore cast the DeviceNDArray to a cupy array, which supports fast
slicing.
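
A minimal sketch of the cast described above, assuming cupy is installed and a CUDA device is available. `as_cupy` is a hypothetical helper name, not a cuspatial function; it relies only on `cupy.asarray`, which accepts objects that expose `__cuda_array_interface__` and wraps their device memory without copying.

```python
def as_cupy(device_array):
    """Wrap an object exposing __cuda_array_interface__ as a cupy array."""
    # Validate first so the error is clear even where cupy is not installed.
    if not hasattr(device_array, "__cuda_array_interface__"):
        raise TypeError("expected an object exposing __cuda_array_interface__")
    import cupy as cp  # imported lazily; needs a CUDA-capable environment
    # cupy.asarray shares the underlying device buffer with the input.
    return cp.asarray(device_array)
```

Given a numba `DeviceNDArray` named `d_arr`, `as_cupy(d_arr)[10:20]` would then slice through cupy instead of numba.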

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

harrism and others added 30 commits April 5, 2023 21:48
Closes rapidsai#1006
Closes rapidsai#1007

I realized that the `join_quadtree_and_bounding_boxes()` tests are redundant to the `quadtree_point_in_polygon` tests and `quadtree_point_to_nearest_linestring` tests since the "small" test is duplicated in those. So I eliminated that test and added expectations for `join_quadtree_and_bounding_boxes` in those tests.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Paul Taylor (https://github.com/trxcllnt)

URL: rapidsai#1019
Forward-merge branch-23.04 to branch-23.06
This PR adds `linestring-polygon` distance API. This API divides up the work into two parts: point-in-polygon test and a load-balanced all-pairs segment-segment distance compute kernel.
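
The two-part split can be illustrated with a serial Python reference. This is only a sketch under stated assumptions: all function names are hypothetical, the polygon is a single ring without holes, and the nested loop stands in for the load-balanced CUDA kernel, which this code does not attempt to reproduce.

```python
import math

def _cross(o, a, b):
    # z component of (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def point_segment_distance(p, a, b):
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment
        return math.dist(p, a)
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))

def segments_cross(a, b, c, d):
    # Proper crossing only; touching cases yield a zero endpoint distance.
    return (_cross(c, d, a) * _cross(c, d, b) < 0
            and _cross(a, b, c) * _cross(a, b, d) < 0)

def segment_distance(a, b, c, d):
    if segments_cross(a, b, c, d):
        return 0.0
    return min(point_segment_distance(a, c, d), point_segment_distance(b, c, d),
               point_segment_distance(c, a, b), point_segment_distance(d, a, b))

def point_in_polygon(p, ring):
    # Ray casting: toggle on every edge crossed by a ray going in +x.
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > p[1]) != (y2 > p[1]):
            x_cross = x1 + (p[1] - y1) * (x2 - x1) / (y2 - y1)
            if p[0] < x_cross:
                inside = not inside
    return inside

def linestring_polygon_distance(line, ring):
    # Part 1: a vertex inside the polygon means the distance is zero.
    if any(point_in_polygon(p, ring) for p in line):
        return 0.0
    # Part 2: all-pairs segment-segment distance against the ring edges.
    n = len(ring)
    return min(segment_distance(line[i], line[i + 1], ring[j], ring[(j + 1) % n])
               for i in range(len(line) - 1) for j in range(n))
```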

Closes rapidsai#1027 
Depends on rapidsai#1026 
Contributes to rapidsai#757

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: rapidsai#1011
Forward-merge branch-23.04 to branch-23.06
Closes rapidsai#1015
Depends on rapidsai#1009 

This PR implements `intersects` and all of the feature combinations that depend exclusively on intersects, as listed in rapidsai#1015.

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1016
Forward-merge branch-23.04 to branch-23.06
…functor (rapidsai#1043)

This PR fixes rapidsai#1042 . In `point_polygon_intersects`, the key comparator incorrectly uses the binary operator that's used for values.

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Paul Taylor (https://github.com/trxcllnt)

URL: rapidsai#1043
Forward-merge branch-23.04 to branch-23.06
Closes rapidsai#1014

This PR simply reduces the size of some of the equals tests from 10000 to 100. Long run time had to do with serializing 10000x Shapely objects.

It also fixes a bug in `MultiPointMultiPointEquals` that I guess was able to exist due to the size of the old test and the seed of the random generator.

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1051
…ai#1048)

Updates tests to use `thrust::host_vector<bool>` instead of `std::vector<bool>` and in one case `std::vector<int>` where int was logically bool.

closes rapidsai#823

Authors:
  - Christopher Harris (https://github.com/cwharris)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1048
use make_device_vector in pairwise_point_in_polygon_test

closes rapidsai#825

Authors:
  - Christopher Harris (https://github.com/cwharris)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1049
This PR changes the C++ hausdorff distance interface to return the owning item that contains the interface and a table view into the result. This avoids a numba reshape that slows performance when there are many columns. 

Python API benchmark (10K input spaces)
branch-23.06:
```
--------------------------------------------------------- benchmark: 1 tests ---------------------------------------------------------
Name (time in s)                                     Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------
bench_directed_hausdorff_distance_many_spaces     1.4473  1.4938  1.4688  0.0168  1.4655  0.0175       2;0  0.6808       5           1
--------------------------------------------------------------------------------------------------------------------------------------
```

this pr:
```
-------------------------------------------------------------- benchmark: 1 tests --------------------------------------------------------------
Name (time in ms)                                      Min       Max      Mean   StdDev    Median      IQR  Outliers     OPS  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------
bench_directed_hausdorff_distance_many_spaces     221.5014  288.3597  237.1724  28.7577  223.4605  21.0320       1;1  4.2163       5           1
------------------------------------------------------------------------------------------------------------------------------------------------
```

6.5x speedup.
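
The speedup can be recomputed from the two benchmark tables above. The snippet below is just that arithmetic, using the reported mean and median (seconds before, milliseconds after); the quoted 6.5x corresponds to the median, while the mean gives roughly 6.2x.

```python
# Values copied from the two benchmark tables above.
before_s = {"mean": 1.4688, "median": 1.4655}       # branch-23.06, seconds
after_ms = {"mean": 237.1724, "median": 223.4605}   # this PR, milliseconds

# Convert milliseconds to seconds before dividing.
speedup = {key: before_s[key] / (after_ms[key] / 1000.0) for key in before_s}

for key, value in speedup.items():
    print(f"{key}: {value:.2f}x")
```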

contributes to rapidsai#1013 

The C++ benchmarks show no significant performance regression:
```
Comparing branch2304.json to after.json
Benchmark                                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------
HausdorffBenchmark/hausdorff/32/4/manual_time                    -0.0369         -0.0519             0             0             1             0
HausdorffBenchmark/hausdorff/64/4/manual_time                    -0.0245         -0.0480             0             0             1             1
HausdorffBenchmark/hausdorff/512/4/manual_time                   -0.0173         -0.0125             1             1             1             1
HausdorffBenchmark/hausdorff/4096/4/manual_time                  +0.0603         +0.1068             8             8             8             9
HausdorffBenchmark/hausdorff/8192/4/manual_time                  -0.0256         -0.0240            18            18            18            18
HausdorffBenchmark/hausdorff/32/8/manual_time                    -0.0068         +0.0780             0             0             1             1
HausdorffBenchmark/hausdorff/64/8/manual_time                    -0.0168         -0.0213             0             0             1             1
HausdorffBenchmark/hausdorff/512/8/manual_time                   -0.0216         -0.0242             1             1             2             2
HausdorffBenchmark/hausdorff/4096/8/manual_time                  -0.0296         -0.0092            12            12            13            13
HausdorffBenchmark/hausdorff/8192/8/manual_time                  +0.0314         +0.0328            28            29            28            29
HausdorffBenchmark/hausdorff/32/64/manual_time                   -0.0139         -0.0879             0             0             1             1
HausdorffBenchmark/hausdorff/64/64/manual_time                   -0.0001         +0.0343             0             0             1             1
HausdorffBenchmark/hausdorff/512/64/manual_time                  -0.0935         -0.0816             5             5             6             6
HausdorffBenchmark/hausdorff/4096/64/manual_time                 -0.0299         -0.0295           186           180           187           181
HausdorffBenchmark/hausdorff/8192/64/manual_time                 -0.1097         -0.1094           685           610           685           610
HausdorffBenchmark/hausdorff/32/128/manual_time                  +0.0065         +0.1026             0             0             1             1
HausdorffBenchmark/hausdorff/64/128/manual_time                  +0.0035         -0.0169             1             1             1             1
HausdorffBenchmark/hausdorff/512/128/manual_time                 -0.0312         -0.0299            10            10            11            10
HausdorffBenchmark/hausdorff/4096/128/manual_time                -0.0388         -0.0388           613           589           613           589
HausdorffBenchmark/hausdorff/8192/128/manual_time                -0.0080         -0.0080          2353          2335          2354          2335

```

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: rapidsai#916
Addressing upstream rapids-cmake change: rapidsai/rapids-cmake#397

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Robert Maynard (https://github.com/robertmaynard)

URL: rapidsai#1070
This PR updates the runner labels to use ARC V2 self-hosted runners for GPU jobs. This is needed to resolve the auto-scaling issues.

Authors:
  - Jordan Jacobelli (https://github.com/jjacobelli)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#1066
This PR fixes a bug in `pairwise_linestring_intersection`, where `point_flags` is incorrectly used to remove points in points array due to leftover results from find_duplicate_points.

Previously, the code incorrectly assumed that `point_flags` is always written to before the `merge_point_on_segment` kernel runs. That is not the case when the result segment array is empty: the kernel launch can be skipped, so the second `remove_if` operates on the leftover `point_flags` remnant from the previous step.

The fix is to run the segment cleanup kernels only if there is at least one segment result in the output. This not only fixes rapidsai#1067, but also serves as an optimization by avoiding unnecessary kernel launches.

Note that the result from Python does not match geopandas one-to-one. geopandas returns
```python
0    MULTIPOINT (1.00000 1.00000, 0.00000 0.00000)
1    MULTIPOINT (1.00000 1.00000, 0.00000 0.00000)
2    MULTIPOINT (1.00000 1.00000, 0.00000 0.00000)
```
whereas cuspatial returns the result as 6 point rows with geometry offsets.

Authors:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1069
This PR updates the clang-format version used by pre-commit.  Fixes rapidsai#1078.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Mark Harris (https://github.com/harrism)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1072
Instead of using `rapids-get-rapids-version-from-git` we can just hardcode the version and use `update-version.sh` to update it

Authors:
  - Jordan Jacobelli (https://github.com/jjacobelli)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#1088
This PR adds linestring-polygon column API.

closes rapidsai#1028

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1030
…apidsai#1081)

Fixes rapidsai#1083.

* Consolidates all headers into `include/cuspatial/`. This works without conflicts because header-only API headers have the `.cuh` extension while column-based API headers have the `.hpp` extension.
* Consolidates and reorganizes tests. Header-only and column-based tests are stored together.
* Updates documentation to remove references to `experimental` and also renames the `REFACTORING_GUIDE.md` to `HEADER_ONLY_API_GUIDE.md` and updates it.

Authors:
  - Mark Harris (https://github.com/harrism)

Approvers:
  - H. Thomson Comer (https://github.com/thomcom)
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1081
This PR contains 2 major additions:
1. Range casting methods. Developer can now cast a `multipolygon_range` to a `multilinestring_range` or a `multipoint_range`. This change is included in `multipolygon_range.cuh` and `multipolygon_range_test.cu`.
2. Pairwise polygon-polygon distance. This change is separated in two parts:
    1. The linestring-linestring compute kernel is refactored into algorithm/linestring_distance.cuh.
    2. This kernel is then reused to compute polygon ring distances.

Closes rapidsai#1052

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1065
This PR refreshes the hausdorff clustering example as a notebook.
closes rapidsai#1013

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Mark Harris (https://github.com/harrism)
  - H. Thomson Comer (https://github.com/thomcom)

URL: rapidsai#922
Closes rapidsai#1046
Closes rapidsai#1036

This PR adds a new test dispatch system for binary predicates. The test dispatcher creates one test for each ordered pair of features from the set (point, linestring, polygon) and each predicate that feature tuple can be applied to:

- contains
- covers
- crosses
- disjoint
- geom_equals
- intersects
- overlaps
- touches
- within

The combination of 9 predicates and 33 tests creates 297 tests that cover all possible combinations of simple features and their predicates.

While development is underway, the test dispatcher automatically `xfails` any test that fails or hasn't been implemented yet in order to pass CI.

The test dispatcher outputs diagnostic results during each test run. An output file `test_binpred_test_dispatch.log` is created containing all of the failing tests, including their name, a visual description of the feature tuple and relationship, the shapely objects used to create the test, and the runtime name of the test so it is easy for a developer (myself) to identify which test failed and rerun it. It also creates four .csv files during runtime that record whether each test passed or failed, grouped by the predicate being run and by the pair of features being tested. These .csv files can be displayed using `tests/binpreds/summarize_binpred_test_dispatch_results.py`, which outputs a dataframe for each CSV file, as follows:

```
(rapids) coder ➜ ~/cuspatial/python/cuspatial/cuspatial $ python tests/binpreds/summarize_binpred_test_dispatch_results.py 
     predicate  predicate_passes
0  geom_equals                20
1   intersects                11
2       covers                17
3      crosses                 6
4     disjoint                 9
5     overlaps                20
6       within                14
     predicate  predicate_fails
0     contains               33
1  geom_equals               13
2   intersects               22
3       covers               16
4      crosses               27
5     disjoint               24
6     overlaps               13
7      touches               33
8       within               19
                                             feature  feature_passes
0     (<ColumnType.POINT: 1>, <ColumnType.POINT: 1>)              14
1   (<ColumnType.POINT: 1>, <ColumnType.POLYGON: 4>)              22
2  (<ColumnType.LINESTRING: 3>, <ColumnType.LINES...              16
3  (<ColumnType.LINESTRING: 3>, <ColumnType.POLYG...              38
4  (<ColumnType.POINT: 1>, <ColumnType.LINESTRING...               7
                                             feature  feature_fails
0     (<ColumnType.POINT: 1>, <ColumnType.POINT: 1>)              4
1  (<ColumnType.POINT: 1>, <ColumnType.LINESTRING...             20
2   (<ColumnType.POINT: 1>, <ColumnType.POLYGON: 4>)             14
3  (<ColumnType.LINESTRING: 3>, <ColumnType.LINES...             20
4  (<ColumnType.LINESTRING: 3>, <ColumnType.POLYG...             52
5  (<ColumnType.POLYGON: 4>, <ColumnType.POLYGON:...             90
```
Without additional modifications, the test dispatcher produces 97 passing results and 200 xfailing results. In an upcoming PR, updates to the basic predicates and binpred test architecture increase the number of passing results to 274, with 23 xfails. These PRs are separated for ease of review.
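
As a quick consistency check, the per-predicate counts in the summary output above can be added up. This sketch only copies those numbers into dictionaries and confirms the totals of 97 passes and 200 xfails, and that every predicate accounts for 33 tests.

```python
# Counts copied from the summary output above.
predicate_passes = {
    "geom_equals": 20, "intersects": 11, "covers": 17, "crosses": 6,
    "disjoint": 9, "overlaps": 20, "within": 14,
}
predicate_fails = {
    "contains": 33, "geom_equals": 13, "intersects": 22, "covers": 16,
    "crosses": 27, "disjoint": 24, "overlaps": 13, "touches": 33, "within": 19,
}

total_passes = sum(predicate_passes.values())
total_fails = sum(predicate_fails.values())

# Predicates with no passing tests are absent from the passes table.
per_predicate = {
    name: predicate_passes.get(name, 0) + fails
    for name, fails in predicate_fails.items()
}

print(total_passes, total_fails, total_passes + total_fails)
```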

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1085
…nt touch at endpoints is miscomputed as a degenerate segment (rapidsai#1093)

Fixes rapidsai#1091

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1093
This contribution adds `pairwise_multipoint_equals_count` to the column and header-only APIs. `pairwise_multipoint_equals_count` counts the number of times that each point in the lhs occurs in the rhs.

```
auto result = pairwise_multipoint_equals_count(
    {{{0, 0}}, {{1, 1}, {2, 2}}, {{0, 0}, {1, 1}, {2, 2}}},
    {
        {{0, 0}, {1, 1}, {2, 2}},
        {{0, 0}, {1, 1}, {2, 2}},
        {{0, 0}, {1, 1}, {2, 2}}
    }
);
// result == {1, 2, 3}
```
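
The counting rule can be illustrated with a plain Python reference. This is a sketch, not the cuspatial implementation: the function name is hypothetical, multipoints are modeled as lists of `(x, y)` tuples, and how duplicate points within one multipoint are treated is an assumption (each lhs point counts once if it appears anywhere in the paired rhs).

```python
def multipoint_equals_count_reference(lhs, rhs):
    """For each pair of multipoints, count lhs points that occur in rhs."""
    if len(lhs) != len(rhs):
        raise ValueError("lhs and rhs must contain the same number of multipoints")
    counts = []
    for left, right in zip(lhs, rhs):
        right_points = set(right)  # hashable (x, y) tuples
        counts.append(sum(1 for point in left if point in right_points))
    return counts

lhs = [[(0, 0)], [(1, 1), (2, 2)], [(0, 0), (1, 1), (2, 2)]]
rhs = [[(0, 0), (1, 1), (2, 2)]] * 3
result = multipoint_equals_count_reference(lhs, rhs)
```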

Written while pairing with @isVoid.

Authors:
  - H. Thomson Comer (https://github.com/thomcom)
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Michael Wang (https://github.com/isVoid)
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1022
This change passes through the value of `SCCACHE_S3_NO_CREDENTIALS` to our `conda` builds, enabling devs to utilize the `sccache` cache that's populated by CI when they are reproducing build issues locally as per [these](https://docs.rapids.ai/resources/reproducing-ci/) instructions.

Authors:
  - Jake Awe (https://github.com/AyodeAwe)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#1109
…idsai#1110)

This PR pins cuml dependency in notebook testing environment to nightlies as required in the CI environment.

Authors:
  - Michael Wang (https://github.com/isVoid)
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#1110
…an edge, result is asserted false (rapidsai#1108)

Fixes rapidsai#1103. The current algorithm tests whether the test point is collinear with an edge of the polygon and, if so, asserts that the point is on the edge. This is incorrect: for a point to be on the edge, it must also lie within the range that the edge covers. The collinearity test only checks whether the point lies on the line that the edge coincides with.

This PR fixes the bug by adding tests that guarantee a point is on an edge iff it is collinear with the line the edge lies on and its x coordinate is within the closed range of the edge's x coordinates.

This PR also fixes an additional bug where the collinearity flag is not reset after each iteration over a ring.
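
The corrected condition can be sketched in Python as follows. The function name and tolerance are hypothetical. The PR description mentions only the x range; the y-range check here is an added assumption so that vertical edges, whose x range collapses to a single value, are also handled.

```python
def is_point_on_edge(p, a, b, tol=1e-12):
    """Return True if point p lies on the closed segment from a to b."""
    # Collinearity: cross product of (b - a) and (p - a) is (near) zero.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > tol:
        return False
    # Range check: collinear points beyond the endpoints are not on the edge.
    in_x = min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
    in_y = min(a[1], b[1]) <= p[1] <= max(a[1], b[1])  # assumption, see above
    return in_x and in_y
```

For the edge from `(0, 0)` to `(2, 2)`, the point `(3, 3)` is collinear but outside the edge, which is the case the original collinearity-only test got wrong.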

Authors:
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1108
trxcllnt and others added 29 commits May 6, 2024 20:45
The `JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP` test isn't returning any points because they aren't evenly distributed.

The first commit adds a check to ensure the test fails when no results are returned. The second commit fixes the `JOIN_POINT_IN_POLYGON_LARGE_TEST_EXP` test.

Fixes rapidsai#1380.

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1346
* Update the `cuda11.8-conda` devcontainer's base image
* Remove the devcontainer when the VSCode window closes
* Adds a descriptive name to the running container:
  ```shell
  $ docker ps -a
  CONTAINER ID   IMAGE              ...  NAMES
  0dbb364fe544   vsc-cuspatial-...  ...  rapids-cuspatial-24.06-cuda12.2-conda
  
  $ docker rm -f rapids-cuspatial-24.06-cuda12.2-conda
  ```

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#1375
…pidsai#1381)

Followup to rapidsai#1346.

* Fixes some typos/omissions in types and CMake.
* Adds a new test that OOMs when quadtree_point_in_polygon is passed too many input polygons.
* Fixes quadtree spatial join to handle overflow while counting and more conservatively allocate output buffers.

Fixes rapidsai#890.

* [Failing test run](https://github.com/rapidsai/cuspatial/actions/runs/8979838628/job/24662981350#step:7:840)
* [Passing test run](https://github.com/rapidsai/cuspatial/actions/runs/8981106226/job/24666403165#step:7:840)

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1381
Similar to rapidsai/cudf#15552, we are testing [building RAPIDS with CCCL's main branch](NVIDIA/cccl#1667) to get ahead of any breaking changes.

Authors:
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#1382
Fix forward-merge `branch-24.06` into `branch-24.08`
Contributes to rapidsai/build-planning#62.

It looks like this project's wheels and conda recipes have unnecessary dependencies on `setuptools`. I suspect those are left over from before the project was cut over to `scikit-build-core`.

This proposes removing those.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Jake Awe (https://github.com/AyodeAwe)

URL: rapidsai#1389
Forward-merge branch-24.06 into branch-24.08
closes rapidsai#1395

`pairwise_linestring_intersection`, tested in this file, returns a `cudf.Column` for one of its arguments, and the test used `to_pandas` to check its output. In 24.08, the output of `Column.to_pandas` was changed to a `pandas.Index` instead of a `pandas.Series`, so this PR modifies the test accordingly.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Paul Taylor (https://github.com/trxcllnt)
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1398
Recently devcontainer names were updated to include the current user's name. However, in GitHub Codespaces, the username is not defined. As a result, the container name starts with a dash. This is not allowed by GitHub Codespaces, so it fails to launch.

This PR adds a default value of `anon` to the devcontainer username.
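
The failure mode and the fix can be illustrated in Python. The name format below (`<user>-rapids-<project>`) and the `USER` variable are assumptions for illustration only; the actual devcontainer configuration is not shown in this PR description.

```python
def container_name(env, project="cuspatial", default_user="anon"):
    """Build a container name prefixed with the user name (hypothetical format)."""
    # Fall back to a default when the variable is missing or empty,
    # so the name never starts with a dash.
    user = env.get("USER") or default_user
    return f"{user}-rapids-{project}"

print(container_name({"USER": "alice"}))
print(container_name({}))
```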

See rapidsai/cudf#15784 for more information.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Paul Taylor (https://github.com/trxcllnt)
  - Mark Harris (https://github.com/harrism)
  - James Lamb (https://github.com/jameslamb)

URL: rapidsai#1396
This PR removes text builds of the documentation, which we do not currently use for anything. Contributes to rapidsai/build-planning#71.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Mark Harris (https://github.com/harrism)
  - Jake Awe (https://github.com/AyodeAwe)

URL: rapidsai#1394
Contributes to rapidsai/build-planning#31
Contributes to rapidsai/dependency-file-generator#89

Proposes introducing `rapids-build-backend` as this project's build backend, to reduce the complexity of various CI/build scripts.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#1393
…#1400)

An upstream change in cudf uncovered a bug in `_GeoSeriesUtility._from_data` where `False` was being passed to `cudf.Series(index=)` which is an invalid value. The only valid `index` values are an actual `cudf.Index` or `None`, so setting to `None` as a more appropriate default value

Also updates the location of `assert_eq` which moved in rapidsai/cudf#16063

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Paul Taylor (https://github.com/trxcllnt)
  - Mark Harris (https://github.com/harrism)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#1400
This PR fixes a bug in the multipolygon geometry iterator. The geometry iterator was returning part iterators by mistake, which masked an issue that we were patching out in rapids-cmake's CCCL. See rapidsai/rapids-cmake#511.

With this fix, we can remove that patch from rapids-cmake's CCCL: rapidsai/rapids-cmake#640

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Michael Schellenberger Costa (https://github.com/miscco)
  - Mark Harris (https://github.com/harrism)
  - Michael Wang (https://github.com/isVoid)
  - Paul Taylor (https://github.com/trxcllnt)

URL: rapidsai#1402
Contributes to rapidsai/build-planning#80

Adds constraints to avoid pulling in CMake 3.30.0, for the reasons described in that issue.

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Paul Taylor (https://github.com/trxcllnt)

Approvers:
  - Paul Taylor (https://github.com/trxcllnt)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#1401
Exploring notebook fixes with geopandas 1.0.1

Authors:
  - Benjamin Zaitlen (https://github.com/quasiben)
  - Michael Wang (https://github.com/isVoid)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#1404
…, skip long-running notebook, fix some docs (rapidsai#1407)

Unblocks CI.

CI in https://github.com/rapidsai/docker has been failing because of errors in the `cuspatial_api_examples.ipynb` notebook. Specifically, that notebook refers unconditionally to local files that are not guaranteed to exist.

To fix this, this PR proposes the following:

* in the notebook, conditionally recreate those files if they don't yet exist
* always test the notebooks in `docs/` in CI here

CI **here** is failing because the `nyc_taxi_years_correlation` notebook recently started taking a prohibitively long time to run. This proposes:

* skipping `nyc_taxi_years_correlation` notebook in CI

And some other small things noticed while doing all this:

* fixing typo (`rint -> ring`) and argument ordering in `GeoSeries.from_polygons_xy()` docs

Authors:
   - James Lamb (https://github.com/jameslamb)

Approvers:
   - Ray Douglass (https://github.com/raydouglass)
This PR updates the latest CUDA build/test version 12.2.2 to 12.5.1.

Contributes to rapidsai/build-planning#73

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - https://github.com/jakirkham

URL: rapidsai#1405
Previously, `cmake` was added to `requirements/host`. However, it is a build tool, so it should be placed in `requirements/build`. This PR makes that change in the relevant recipes.

Authors:
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: rapidsai#1409
rapidsai/cudf#16285 makes `_from_data` explicitly require all of the `data.values()` to be a `ColumnBase`. This PR either ensures they are columns or just goes through the normal `GeoDataFrame`/`DataFrame` constructor if they are not.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#1415
Contributes to rapidsai/build-planning#31

In short, RAPIDS DLFW builds want to produce wheels with unsuffixed dependencies, e.g. `cudf` depending on `rmm`, not `rmm-cu12`.

This PR is part of a series across all of RAPIDS to try to support that type of build by setting up CUDA-suffixed and CUDA-unsuffixed dependency lists in `dependencies.yaml`.

For more details, see:
* rapidsai/build-planning#31 (comment)
* rapidsai/cudf#16183

## Notes for Reviewers

### Why target 24.08?

This is targeting 24.08 because:

1. it should be very low-risk
2. getting these changes into 24.08 prevents the need to carry around patches for every library in DLFW builds using RAPIDS 24.08

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#1414
@isVoid isVoid closed this Jul 30, 2024