Bug fixes for Dataset.reduce() and n-dimensional cumsum/cumprod #2156

Merged (4 commits) on May 18, 2018

@shoyer (Member) commented May 17, 2018

Fixes GH1470, "Dataset.mean drops coordinates"

Fixes a bug where non-scalar data variables that did not include the
aggregated dimension were not reduced at all and kept their original values:

Previously::

    >>> ds = Dataset({'x': ('a', [2, 2]), 'y': 2, 'z': ('b', [2])})
    >>> ds.var('a')
    <xarray.Dataset>
    Dimensions:  (b: 1)
    Dimensions without coordinates: b
    Data variables:
        x        float64 0.0
        y        float64 0.0
        z        (b) int64 2

Now::

    >>> ds.var('a')
    <xarray.Dataset>
    Dimensions:  (b: 1)
    Dimensions without coordinates: b
    Data variables:
        x        int64 0
        y        int64 0
        z        (b) int64 0
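
The rule behind the "Now" output can be sketched without xarray: every variable is passed through the reduction function, with the axis tuple limited to the positions of the aggregated dimension in that variable (possibly empty). This is a minimal illustration, not xarray's implementation; ``reduce_variables`` and the ``name -> (dims, data)`` mapping are hypothetical stand-ins for ``Dataset.reduce`` and its variables.

```python
import numpy as np


def reduce_variables(variables, func, dim):
    """Apply ``func`` to every variable, reducing only over ``dim``.

    ``variables`` maps a name to ``(dims, data)``; this container is a
    hypothetical stand-in for a Dataset, not xarray's data model.
    """
    reduced = {}
    for name, (dims, data) in variables.items():
        # Axes of this variable that correspond to the aggregated dimension.
        # The tuple is empty when the variable does not have that dimension,
        # but ``func`` is still applied, so e.g. a variance becomes 0.
        axes = tuple(i for i, d in enumerate(dims) if d == dim)
        kept = tuple(d for d in dims if d != dim)
        reduced[name] = (kept, func(np.asarray(data), axis=axes))
    return reduced


variables = {
    "x": (("a",), np.array([2, 2])),
    "y": ((), np.array(2)),
    "z": (("b",), np.array([2])),
}
result = reduce_variables(variables, np.var, "a")
```

Here ``z`` keeps its dimension ``b`` but its value goes through ``np.var`` like the others, instead of being passed through unchanged.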

Finally, adds support for n-dimensional cumsum() and cumprod(), reducing
over all dimensions of an array. (This was necessary as part of the above fix.)
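
One way to read "reducing over all dimensions" is to apply the one-dimensional cumulative operation along each axis in turn. The sketch below assumes that interpretation; ``cumulative_over_axes`` is a hypothetical helper, not a function in xarray, and it uses the same 2-D input as the ``test_cumprod_2d`` test in this pull request.

```python
import numpy as np


def cumulative_over_axes(cum_func, values, axes=None):
    """Apply a 1-D cumulative function (e.g. np.cumsum) along several axes.

    Hypothetical helper for illustration; ``axes=None`` means every axis.
    """
    out = np.asarray(values)
    if axes is None:
        axes = range(out.ndim)
    for ax in axes:
        # Each pass accumulates along one axis of the running result.
        out = cum_func(out, axis=ax)
    return out


inputs = np.array([[1, 2], [3, 4]])
cumprod_2d = cumulative_over_axes(np.cumprod, inputs)  # [[1, 2], [3, 24]]
cumsum_2d = cumulative_over_axes(np.cumsum, inputs)    # [[1, 3], [4, 10]]
```

The bottom-right entry of the cumulative product is 2 * 3 * 4 = 24, the value the test expects.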

  • Closes #1470, "Dataset.mean drops coordinates"
  • Tests added
  • Tests passed
  • Fully documented, including whats-new.rst for all changes and api.rst for new API

def test_cumprod_2d():
    inputs = np.array([[1, 2], [3, 4]])

    expected = np.array([[1, 2], [3, 2*3*4]])

E226 missing whitespace around arithmetic operator

keep_attrs=keep_attrs,
allow_lazy=allow_lazy,
**kwargs)
# drop other data_vars

errant comment?

This was intentional, but I don't think it actually adds much value.

@shoyer merged commit c346d3b into pydata:master on May 18, 2018
@shoyer mentioned this pull request on May 25, 2018
@shoyer deleted the dataset-reduce-fixup branch on July 25, 2018 16:40