Add support for categorical where reductions #1237
@@ -1793,32 +1818,34 @@ def _build_combine(self, dshape, antialias, cuda, partitioned):
invalid = isminus1 if self.selector.uses_row_index(cuda, partitioned) else isnull

@ngjit
def combine_cpu_2d(aggs, selector_aggs):
ny, nx = aggs[0].shape
def combine_cpu(aggs, selector_aggs):
There is a lot of similar but not quite identical code here that I am planning to refactor in a separate PR.
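For readers unfamiliar with what a combine step like the one in this hunk has to do, here is a minimal pure-Python sketch. It is not the code from this PR: the function name `combine_where_max`, the nested-list layout, and the restriction to a float `max` selector with NaN as the missing marker are all assumptions made for illustration. The diff itself picks `isminus1` or `isnull` as the validity check depending on whether the selector uses row indexes; only the NaN case is shown here.

```python
import math


def combine_where_max(aggs, selector_aggs):
    """Hypothetical sketch: merge per-partition results of a where(max) reduction.

    aggs and selector_aggs are lists with one 2D grid (list of rows) per
    partition. selector_aggs holds the per-pixel maximum seen by each
    partition, with NaN meaning "no data" (an assumption of this sketch).
    aggs holds the value that accompanies that maximum.
    """
    ny, nx = len(aggs[0]), len(aggs[0][0])
    # Copy the first partition so the inputs are left untouched.
    out = [row[:] for row in aggs[0]]
    best = [row[:] for row in selector_aggs[0]]
    for agg, sel in zip(aggs[1:], selector_aggs[1:]):
        for y in range(ny):
            for x in range(nx):
                value = sel[y][x]
                if math.isnan(value):
                    continue  # this partition saw nothing in this pixel
                if math.isnan(best[y][x]) or value > best[y][x]:
                    # A better selector value wins, and its companion
                    # value is carried along with it.
                    best[y][x] = value
                    out[y][x] = agg[y][x]
    return out
```

The point of the sketch is that the selector aggregates decide which partition wins each pixel, while the returned aggregate only follows that decision.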
Codecov Report
@@ Coverage Diff @@
## main #1237 +/- ##
==========================================
- Coverage 83.52% 83.37% -0.15%
==========================================
Files 35 35
Lines 8778 8832 +54
==========================================
+ Hits 7332 7364 +32
- Misses 1446 1468 +22
Looks great, thanks! Is this the end of it? I.e., are there any combinations of by/where/reductions with cpu/gpu/dask that are still unsupported? Or is that entire cross product now covered somewhere?
After the rebase, tests are failing with a bokeh-panel incompatibility when running the examples. That has nothing to do with this PR, so I am merging this and will deal with the examples problem separately.
Fixes #1210.

This adds support for categorical `where` reductions on CPU and GPU, with and without Dask.

An example is a categorical `where` reduction that wraps `max_n` on the `"mass"` column with `n=3`. This returns a 4D `xarray.DataArray` of shape `(ny, nx, ncat, n)` containing, for each pixel and category, the indexes of the 3 rows in the supplied `DataFrame` that have the maximum values of the `"mass"` column.

To return the values from another column instead of row indexes, that column is also passed to the `where` reduction.

We can replace `max_n` in this example with `max`, `min`, `first`, `last`, `min_n`, `first_n`, or `last_n`.

Support is also added for categorical `first` reductions and the `last`, `first_n` and `last_n` equivalents, as these are implemented using `where` under certain circumstances (GPU and/or Dask).
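To make the shape and contents of the result concrete, the following pure-Python reference sketch computes the same kind of output for points that are already binned into pixels. It does not use or reproduce the library's implementation. The function name `categorical_where_max_n`, the `px`/`py`/`cat` keys, and the fill values (`-1` for missing row indexes, NaN for missing column values) are assumptions made for illustration, and the ordering of ties is not claimed to match the library.

```python
from collections import defaultdict


def categorical_where_max_n(rows, nx, ny, categories, n,
                            value_key="mass", other_key=None):
    """Hypothetical reference: per pixel and category, the n rows with the
    largest value_key, reported as row indexes or as values of other_key.

    rows is a list of dicts with integer pixel coordinates "px" and "py",
    a category label "cat", and the numeric columns. The result is nested
    lists indexed as out[y][x][category][k], i.e. shape (ny, nx, ncat, n).
    """
    fill = -1 if other_key is None else float("nan")  # assumed fill values
    out = [[[[fill] * n for _ in categories] for _ in range(nx)]
           for _ in range(ny)]

    # Group row indexes by (pixel row, pixel column, category).
    buckets = defaultdict(list)
    for idx, row in enumerate(rows):
        buckets[(row["py"], row["px"], row["cat"])].append(idx)

    for (py, px, cat), idxs in buckets.items():
        # Keep the n rows with the largest value_key, largest first.
        top = sorted(idxs, key=lambda i: rows[i][value_key], reverse=True)[:n]
        c = categories.index(cat)
        for k, i in enumerate(top):
            out[py][px][c][k] = i if other_key is None else rows[i][other_key]
    return out


rows = [
    {"px": 0, "py": 0, "cat": "a", "mass": 1.0, "other": 10.0},
    {"px": 0, "py": 0, "cat": "a", "mass": 5.0, "other": 20.0},
    {"px": 0, "py": 0, "cat": "b", "mass": 2.0, "other": 30.0},
    {"px": 1, "py": 0, "cat": "a", "mass": 3.0, "other": 40.0},
    {"px": 0, "py": 0, "cat": "a", "mass": 4.0, "other": 50.0},
]
indexes = categorical_where_max_n(rows, nx=2, ny=1, categories=["a", "b"], n=2)
# indexes[0][0][0] is [1, 4]: rows 1 and 4 hold the two largest masses
# for category "a" in pixel (0, 0).
values = categorical_where_max_n(rows, nx=2, ny=1, categories=["a", "b"],
                                 n=2, other_key="other")
# values[0][0][0] is [20.0, 50.0]: the "other" values of those same rows.
```

Slots with fewer than `n` matching rows keep the fill value, which is why a reader of the real 4D array should expect missing-data markers in the trailing positions of sparse pixels.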