Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change IPv4 convert APIs to support UINT32 instead of INT64 #16489

Merged
merged 3 commits into from
Aug 8, 2024

Conversation

davidwendt
Copy link
Contributor

@davidwendt davidwendt commented Aug 2, 2024

Description

Changes the integer type for cudf::strings::ipv4_to_integers and cudf::strings::integers_to_ipv4 to use UINT32 types instead of INT64. The INT64 type was originally chosen because libcudf did not support unsigned types at the time.
This is a breaking change since the basic input/output type is changed.

Closes #16324

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added feature request New feature or request 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Aug 2, 2024
@davidwendt davidwendt self-assigned this Aug 2, 2024
@github-actions github-actions bot added the Python Affects Python cuDF API. label Aug 2, 2024
Copy link
Contributor

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python changes look good

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Aug 5, 2024
@davidwendt davidwendt marked this pull request as ready for review August 6, 2024 16:44
@davidwendt davidwendt requested review from a team as code owners August 6, 2024 16:44
@davidwendt davidwendt added breaking Breaking change and removed non-breaking Non-breaking change labels Aug 6, 2024
if self.dtype != cudf.dtype("int64"):
raise TypeError("Only int64 type can be converted to ip")
if self.dtype != cudf.dtype("uint32"):
raise TypeError("Only uint32 type can be converted to ip")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this method be public or private? It seems to be a superset of the pandas API, which makes me wonder if it should be exposed here. It's already only accessible via the column class which is not really user facing. We could instead expose the functionality through pylibcudf which would keep an option open for anyone who needs the feature to use it without depending on cudf python.

If we do want it to be usable through cudf python, I think it should be promoted to a series method and given docs in the python layer, etc. Otherwise I think we should consider deprecating the python api and moving towards an approach more like the above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exposed through pylibcudf as far as I know

def int2ip(Column input_col):

I'm not able to assess promoting to a series method, etc. It seems a reasonable suggestion to me but probably beyond my skill. So that may need to be done as a separate PR by someone who knows what entails I think.

Copy link
Contributor

@brandon-b-miller brandon-b-miller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python approval, one non blocking question

Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@davidwendt
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit cc75b05 into rapidsai:branch-24.10 Aug 8, 2024
82 checks passed
@davidwendt davidwendt deleted the ipv4-uint32 branch August 8, 2024 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team breaking Breaking change feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[FEA] Support UINT32 columns as an input type for integers_to_ipv4
4 participants