-
Notifications
You must be signed in to change notification settings - Fork 934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure dtype objects are passed within Column.astype #17978
Conversation
I broke off some changes into #18008 as I think there might need to be some bug/feature implementations needed to make dtype objects required here |
Broken off from #17978 This passes more dtype objects to `Column.astype` but does not _ensure_ that a dtype object is passes unlike in #17978 Also it appears we avoid surfacing from warnings from pandas within our tests. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #18008
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this PR has undergone a large number of changes since I approved, so I'm just going to block for the moment. Please let me know when you want another review.
closes #17997 Will help unblock #17978 where we will need to interpret `dtype="interval"` as "interval without a subtype" instead of "interval with float64 subtype" Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #18017
Partially broken off from #17978 Since Column objects are technically private, not marking this as breaking. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #18066
…into cln/dtype/astype
@vyasr this PR is a ready for another round of review when you have the time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was very confused about when ordered could be None for categoricals when reviewing, but I guess that's what you do cudf.dtype("category")
because that's what pandas gives you in that case. Does this really make much sense? I guess any other choice would be equally arbitrary, but it seems worse to me for pandas to allow ordered to be None than to make an arbitrary choice of ordered=True or False when you construct an empty one.
Nevertheless, this PR is fine. It needs conflict resolution though.
Yeah I don't like that
|
/merge |
Description
Continuation of #17918
astype
calls might be fromSeries/Index/etc.
, but IMO it's OK if we are a bit stricter and pass dtype objects when calling those methods tooColumn.as_*_column
since they are called byColumn.astype
Checklist