Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests for datasets with units #3447

Merged
merged 22 commits into from
Nov 9, 2019

Conversation

keewis
Copy link
Collaborator

@keewis keewis commented Oct 25, 2019

As a follow-up to #3238, this adds tests for datasets. Replacing assert_equal_with_units with assert_identical and adding tests for the toplevel functions are not included, these will be new PRs.

  • Tests added
  • Passes black . && mypy . && flake8

As a reference for myself, this is the list of methods from the documentation of Dataset:

  • creation: Dataset()
  • contents: copy, assign, assign_coords, assign_attrs, pipe, merge, rename, rename_vars, rename_dims, swap_dims, expand_dims, drop, drop_dims, set_coords, reset_coords
  • comparisons: equals, broadcast_equals, identical
  • indexing: loc, isel, sel, head, tail, thin, squeeze, interp, interp_like, reindex, reindex_like, set_index, reset_index, reorder_levels
  • missing value handling: isnull, notnull, combine_first, count, dropna, fillna, ffill, bfill, interpolate_na, where, isin
  • computation: apply, reduce, groupby, groupby_bins, rolling, rolling_exp, coarsen, resample, diff, quantile, differentiate, integrate
  • aggregation: all, any, argmax, argmin, max, min, mean, median, prod, sum, std, var
  • ndarray methods: astype, argsort, clip, conj, conjugate, imag, round, real, cumsum, cumprod, rank
  • grouped operations: assign, assign_coords, first, last, fillna, where, quantile
  • reshaping and reorganizing: transpose, stack, unstack, to_stacked_array, shift, roll, sortby

these methods are not covered:

  • creation: decode_cf
  • dictionary interface: __getitem__, __setitem__, __delitem__, update, items, values
  • broadcast_like (since I also forgot this for DataArray, I'll create a new PR for it)

Copy link
Collaborator

@max-sixty max-sixty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 on the thoroughness of these!

xarray/tests/test_units.py Outdated Show resolved Hide resolved
xarray/tests/test_units.py Show resolved Hide resolved
@keewis keewis changed the title WIP: tests for datasets with units tests for datasets with units Oct 30, 2019
@keewis
Copy link
Collaborator Author

keewis commented Oct 30, 2019

I think this is ready for review and merge. I am especially interested in comments on Dataset.__init__ and Dataset.test_merge, since at the moment they seem to test the same code.

There is still a lot to do, and I will also have to go over all tests (DataArray and Dataset) to remove any obvious mistakes (I found a lot from the code in #3238), but if there are any that are not so obvious, they probably won't be found until someone works on making the tests pass.

@max-sixty
Copy link
Collaborator

Looks really good. I spent 20 mins looking through and looks very thorough. I didn't read through every function. Let me know if there are any other ones you want someone to look through

I am especially interested in comments on Dataset.__init__ and Dataset.test_merge, since at the moment they seem to test the same code.

I had a good look through these. (though I didn't fully understand re 'test the same code', they seem to test the relevant init / merge functions?)

@keewis
Copy link
Collaborator Author

keewis commented Oct 30, 2019

I think they at least partially test the same code: if two DataArray objects that get passed to the Dataset constructor as data variables have the same dimensions / coordinates, they get merged. I think that happens even if the values of the dims / coords are equal.

@max-sixty
Copy link
Collaborator

I think they at least partially test the same code: if two DataArray objects that get passed to the Dataset constructor as data variables have the same dimensions / coordinates, they get merged. I think that happens even if the values of the dims / coords are equal.

Ah great, because merge happens in a dataset's init, yes. Let's keep both tests though

@keewis
Copy link
Collaborator Author

keewis commented Nov 8, 2019

Is there anything left to do before this is ready to merge? I'd like to use some of the changes to the helper functions made here in #3493.

@max-sixty
Copy link
Collaborator

Great @keewis . As above, I haven't looked through all (1750!!) LoC, but looks good on the code I have. Let me / us know if there's anything you want us to look into more.

Thank you v much!

@max-sixty max-sixty merged commit ffc3275 into pydata:master Nov 9, 2019
@keewis keewis deleted the tests-for-datasets-with-units branch November 9, 2019 12:38
@keewis
Copy link
Collaborator Author

keewis commented Nov 9, 2019

thanks, @max-sixty

@keewis
Copy link
Collaborator Author

keewis commented Nov 9, 2019

I think we need to review most if not all tests once we try to make them pass, right now there are too many errors (mostly from pint) to know if they actually do what they should.

@keewis keewis mentioned this pull request Nov 10, 2019
5 tasks
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 12, 2019
* upstream/master:
  add missing pint integration tests (pydata#3508)
  DOC: update bottleneck repo url (pydata#3507)
  add drop_sel, drop_vars, map to api.rst (pydata#3506)
  remove syntax warning (pydata#3505)
  Dataset.map, GroupBy.map, Resample.map (pydata#3459)
  tests for datasets with units (pydata#3447)
  fix pandas-dev tests (pydata#3491)
  unpin pseudonetcdf (pydata#3496)
  whatsnew corrections (pydata#3494)
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 12, 2019
* upstream/master:
  add missing pint integration tests (pydata#3508)
  DOC: update bottleneck repo url (pydata#3507)
  add drop_sel, drop_vars, map to api.rst (pydata#3506)
  remove syntax warning (pydata#3505)
  Dataset.map, GroupBy.map, Resample.map (pydata#3459)
  tests for datasets with units (pydata#3447)
  fix pandas-dev tests (pydata#3491)
  unpin pseudonetcdf (pydata#3496)
  whatsnew corrections (pydata#3494)
  drop_vars; deprecate drop for variables (pydata#3475)
  uamiv test using only raw uamiv variables (pydata#3485)
  Optimize dask array equality checks. (pydata#3453)
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 13, 2019
* upstream/master:
  format indexing.rst code with black (pydata#3511)
  add missing pint integration tests (pydata#3508)
  DOC: update bottleneck repo url (pydata#3507)
  add drop_sel, drop_vars, map to api.rst (pydata#3506)
  remove syntax warning (pydata#3505)
  Dataset.map, GroupBy.map, Resample.map (pydata#3459)
  tests for datasets with units (pydata#3447)
  fix pandas-dev tests (pydata#3491)
  unpin pseudonetcdf (pydata#3496)
  whatsnew corrections (pydata#3494)
  drop_vars; deprecate drop for variables (pydata#3475)
  uamiv test using only raw uamiv variables (pydata#3485)
  Optimize dask array equality checks. (pydata#3453)
  Propagate indexes in DataArray binary operations. (pydata#3481)
  python 3.8 tests (pydata#3477)
dcherian added a commit to dcherian/xarray that referenced this pull request Nov 13, 2019
commit d430ae0
Author: dcherian <deepak@cherian.net>
Date:   Wed Nov 13 08:27:04 2019 -0700

    proper fix.

commit 7fd69be
Author: dcherian <deepak@cherian.net>
Date:   Wed Nov 13 08:03:26 2019 -0700

    fix whats-new merge.

commit 4489394
Merge: 279ff1d b74f80c
Author: dcherian <deepak@cherian.net>
Date:   Wed Nov 13 08:03:06 2019 -0700

    Merge remote-tracking branch 'upstream/master' into fix/plot-broadcast

    * upstream/master:
      format indexing.rst code with black (pydata#3511)
      add missing pint integration tests (pydata#3508)
      DOC: update bottleneck repo url (pydata#3507)
      add drop_sel, drop_vars, map to api.rst (pydata#3506)
      remove syntax warning (pydata#3505)
      Dataset.map, GroupBy.map, Resample.map (pydata#3459)
      tests for datasets with units (pydata#3447)
      fix pandas-dev tests (pydata#3491)
      unpin pseudonetcdf (pydata#3496)
      whatsnew corrections (pydata#3494)
      drop_vars; deprecate drop for variables (pydata#3475)
      uamiv test using only raw uamiv variables (pydata#3485)
      Optimize dask array equality checks. (pydata#3453)
      Propagate indexes in DataArray binary operations. (pydata#3481)
      python 3.8 tests (pydata#3477)

commit 279ff1d
Author: dcherian <deepak@cherian.net>
Date:   Wed Nov 13 08:02:44 2019 -0700

    Undo the transpose change and add test to make sure transposition is right.

commit c9cc698
Author: dcherian <deepak@cherian.net>
Date:   Wed Nov 13 08:01:39 2019 -0700

    Test to make sure transpose is right

commit 9b35ecf
Author: dcherian <deepak@cherian.net>
Date:   Sat Nov 2 15:49:08 2019 -0600

    Additional test.

commit 7aed950
Author: dcherian <deepak@cherian.net>
Date:   Sat Nov 2 15:20:07 2019 -0600

    make plotting work with transposed nondim coords.
@keewis keewis mentioned this pull request Dec 8, 2019
18 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants