-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: preserve freq in DTI/TDI.factorize #38120
Changes from 18 commits
6d741b1
e466d71
a553174
23911ef
516d232
0e51930
678251d
abb5913
c96c1ac
7c66389
4442306
c0544be
4a5b4ac
a4a0d08
b8b99cb
3a2dfc7
82db4ad
e5bd5f5
a907781
6eaedea
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1645,6 +1645,24 @@ def _with_freq(self, freq): | |
arr._freq = freq | ||
return arr | ||
|
||
# -------------------------------------------------------------- | ||
|
||
def factorize(self, na_sentinel=-1, sort: bool = False): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. _values_for_factorize and _from_factorized in DatetimeLikeArrayMixin are now only used by PeriodArray? should these be moved? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. values_for_factorize is still used (incorrectly) in core.util.hashing, so for now this is still needed here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok |
||
if self.freq is not None: | ||
# We must be unique, so can short-circuit (and retain freq) | ||
codes = np.arange(len(self), dtype=np.intp) | ||
uniques = self.copy() # TODO: copy or view? | ||
if sort and self.freq.n < 0: | ||
codes = codes[::-1] | ||
# TODO: overload __getitem__, a slice indexer returns same type as self | ||
# error: Incompatible types in assignment (expression has type | ||
# "Union[DatetimeLikeArrayMixin, Union[Any, Any]]", variable | ||
# has type "TimelikeOps") [assignment] | ||
uniques = uniques[::-1] # type: ignore[assignment] | ||
return codes, uniques | ||
# FIXME: shouldn't get here; we are ignoring sort | ||
return super().factorize(na_sentinel=na_sentinel) | ||
|
||
|
||
# ------------------------------------------------------------------- | ||
# Shared Constructor Helpers | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elsif here (why are using isintance when we should be using is_extension_array_dtype)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the elif part of the comment, I think that Brock maybe keeping the early return separate for readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
(... i think. most of this is a resurrected branch from a ways back)