-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: preserve freq in DTI/TDI.factorize #38120
Conversation
thanks @jbrockmendel I'll look in detail tomorrow and add tests as required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we need to add special casing since extension array interface doesn't support the sort kwarg, xref #33838
this is going to be hard to backport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can get this passing on the backport as well ok to merge, but pls put up a separate patch for that to see if its even possible.
if is_extension_array_dtype(dtype): | ||
values = dtype.construct_array_type()._from_sequence(values) | ||
cls = dtype.construct_array_type() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elsif here (why are using isintance when we should be using is_extension_array_dtype)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the elif part of the comment, I think that Brock maybe keeping the early return separate for readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elsif here
sure
why are using isintance when we should be using is_extension_array_dtype
- perf, is_foo_dtype is slow
- to catch DTA/TDA
(... i think. most of this is a resurrected branch from a ways back)
sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blocking as will need a release note before merge, i'll add if #38137 successful
@@ -1645,6 +1645,24 @@ def _with_freq(self, freq): | |||
arr._freq = freq | |||
return arr | |||
|
|||
# -------------------------------------------------------------- | |||
|
|||
def factorize(self, na_sentinel=-1, sort: bool = False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_values_for_factorize and _from_factorized in DatetimeLikeArrayMixin are now only used by PeriodArray? should these be moved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
values_for_factorize is still used (incorrectly) in core.util.hashing, so for now this is still needed here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
maybe doable, in #38137, the ci has not completed yet but looks like tests are passing (except for np-dev which is unrelated and failing on other PRs) |
yeah ok with just making this work here, can fixup later / cleanup |
@jreback it would be good to get this into master before 1.2.0rc0 if it's ready. not critical but would only need one backport. backporting to 1.1.5 does not block 1.2.0rc0 if more discussion needed, fine. no need to be under any pressure to merge. |
@simonjayhawkins no i think this is fine. I am sure that @jbrockmendel will have a cleanup on top of this, but good for now. thanks. |
This comment has been minimized.
This comment has been minimized.
Co-authored-by: jbrockmendel <jbrockmendel@gmail.com>
cc @simonjayhawkins
closes #35563
this is an old branch I resurrected and rebased, so has some excess cruft that would need to be removed if we want to move forward.
LMK if there are any tests you'd like to see added based on the other issue (or just go ahead and push them if you like)