Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken hausdorff test #1295

Merged
merged 3 commits into from
Nov 16, 2023
Merged

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Nov 15, 2023

This PR revises the gold result of hausdorff empty input test to address
an upstream change in cudf.

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@isVoid isVoid requested a review from a team as a code owner November 15, 2023 02:23
@isVoid isVoid requested review from trxcllnt and thomcom November 15, 2023 02:23
@github-actions github-actions bot added the Python Related to Python code label Nov 15, 2023
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix seems fine to me, although I wonder if maybe it should be changed the other way in the long term i.e. have cuspatial return an empty RangeIndex rather than force cudf to produce an empty Index.

@harrism
Copy link
Member

harrism commented Nov 15, 2023

Indeed that's what chatGPT thinks. https://chat.openai.com/share/43a1b380-e356-4b5e-8c87-08b07003435b

How hard would it be to fix cuSpatial to match cuDF here?

@isVoid
Copy link
Contributor Author

isVoid commented Nov 15, 2023

Maybe not that hard - probably need one more argument for _from_columns call.

@isVoid
Copy link
Contributor Author

isVoid commented Nov 15, 2023

Fundamentally these two calls are different:

In [4]: print(cudf.DataFrame([]).columns)
RangeIndex(start=0, stop=0, step=1)

In [5]: print(cudf.DataFrame._from_columns([], range(0)).columns)
Index([], dtype='object')

One is the public API that cuspatial generate the gold result from, and the other is the internal factory function that cuspatial depends upon. I think the proper way forward is that cuspatial shouldn't rely on using internal function from cuDF and attempt to reconstruct the result using public API.

@isVoid
Copy link
Contributor Author

isVoid commented Nov 15, 2023

Ooops, I looked at the wrong place - empty input has a shortcut that directly return an empty dataframe. So the actual cause of error is:

In [3]: cudf.DataFrame([]).columns
Out[3]: RangeIndex(start=0, stop=0, step=1)

In [4]: cudf.DataFrame().columns
Out[4]: Index([], dtype='object')

@isVoid
Copy link
Contributor Author

isVoid commented Nov 15, 2023

Ok I think I prefer this fix - not having to fix the golden result seems a better way forward.

@bdice bdice added bug Something isn't working improvement Improvement / enhancement to an existing function labels Nov 15, 2023
@harrism harrism added non-breaking Non-breaking change and removed improvement Improvement / enhancement to an existing function labels Nov 15, 2023
@vyasr
Copy link
Contributor

vyasr commented Nov 16, 2023

I agree I prefer this fix. Thanks for indulging my question!

@raydouglass raydouglass merged commit 252165c into rapidsai:branch-23.12 Nov 16, 2023
57 of 58 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change Python Related to Python code
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

6 participants