Skip to content
This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

Avoid setting the .data attribute #156

Merged
merged 5 commits into from
Oct 28, 2019
Merged

Avoid setting the .data attribute #156

merged 5 commits into from
Oct 28, 2019

Conversation

andersy005
Copy link
Contributor

@andersy005 andersy005 commented Oct 24, 2019

Fixes #155
Fixes #154

esmlab/core.py Outdated

if self.time_bound is not None:
ds[self.tb_name].data = self.time_bound.data
self.time_bound[self.time_coord_name].data = groupby_coord.data
ds[self.tb_name] = self.time_bound

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe keep .data on the RHS so that any attributes are not overwritten

@andersy005
Copy link
Contributor Author

@dcherian,

When you get a chance, can you take a look at this? and if everything looks good, please merge it.

@andersy005 andersy005 requested a review from dcherian October 28, 2019 15:01
Copy link

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the code that well at all (!) but this stuff looks sane.

I would add xarray.testing._assert_internal_invariants(dataset_or_dataarray) in your tests to be sure that you're creating valid xarray objects.

@@ -504,9 +493,9 @@ def compute_mon_anomaly(self, slice_mon_clim_time=None):
# reset month to become a variable
computed_dset = computed_dset.reset_coords('month')

computed_dset[self.time_coord_name].data = self.time.data
computed_dset[self.time_coord_name] = self.time

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep .data on RHS? But it seems like you're rewriting attributes later, in which case this might not matter.

@@ -156,9 +156,8 @@ def test_mon_climatology(ds_name, decoded, variables, time_coord_name):
ds = esmlab.datasets.open_dataset(ds_name, decode_times=decoded)
esm = ds.esmlab.set_time(time_coord_name=time_coord_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is esm an xarray object, if so use xarray.testing._assert_internal_invariants(esm) to make sure you aren't breaking any assumptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

esm here is an instance of esmlab.core.EsmlabAccessor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so check esm._ds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so check esm._ds?

Done.

@andersy005
Copy link
Contributor Author

I don't know the code that well at all (!) but this stuff looks sane.

I myself have started having a hard time understanding the existing codebase. It's been a while since I looked at esmlab's code base. So, I spent the last four days looking at the existing code base, and my take-away is that esmlab's codebase is full of hacks. And this makes it difficult to debug or even maintain :(

I am personally in favor of helping with pushing most of the general functionality into xarray and only keeping things that are domain-specific. I just found out that there has been recent progress in pydata/xarray#2922, and I am excited about this. Once pydata/xarray#2922 is merged, we can move most if not all existing functionality from https://github.com/NCAR/esmlab/blob/master/esmlab/statistics.py into xarray.

@dcherian
Copy link

Yeah +1 on moving things upstream.

From a quick look, it seems like esmlab basically adds time-aware weighted groupby/resampling; time bounds consistency; and time units/calendargpropagation (pydata/xarray#1614).

For the last one, it would be good to document how much of the table in pydata/xarray#1614 is implemented and what's left.

@andersy005
Copy link
Contributor Author

Thank you, @dcherian! I am merging this for the time being.

@andersy005 andersy005 merged commit 532e940 into NCAR:master Oct 28, 2019
@klindsay28
Copy link

@andersy005 , f3a548d breaks the code snippet from #155 for me:

import xarray as xr
import esmlab
ds = xr.open_dataset('/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/tseries/FG_CO2_ocn_piControl_00.nc')
ds_ann = esmlab.resample(ds, freq='ann')

I get the following:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/esmlab_dev/esmlab/core.py", line 769, in resample
    weights=weights, method=method
  File "/glade/work/klindsay/miniconda3/envs/CESM2_coup_carb_cycle_JAMES_tst/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/esmlab_dev/esmlab/core.py", line 545, in compute_ann_mean
    computed_dset = dset.apply(weighted_mean_arr, wgts=wgts)
  File "/glade/work/klindsay/miniconda3/envs/CESM2_coup_carb_cycle_JAMES_tst/lib/python3.7/site-packages/xarray/core/dataset.py", line 4140, in apply
    for k, v in self.data_vars.items()
  File "/glade/work/klindsay/miniconda3/envs/CESM2_coup_carb_cycle_JAMES_tst/lib/python3.7/site-packages/xarray/core/dataset.py", line 4140, in <dictcomp>
    for k, v in self.data_vars.items()
  File "/glade/work/klindsay/analysis/CESM2_coup_carb_cycle_JAMES/esmlab_dev/esmlab/core.py", line 536, in weighted_mean_arr
    (darr * wgts).resample({self.time_coord_name: 'A'}).sum(dim=self.time_coord_name)
  File "/glade/work/klindsay/miniconda3/envs/CESM2_coup_carb_cycle_JAMES_tst/lib/python3.7/site-packages/xarray/core/dataarray.py", line 2509, in func
    if not reflexive
  File "/glade/work/klindsay/miniconda3/envs/CESM2_coup_carb_cycle_JAMES_tst/lib/python3.7/site-packages/xarray/core/variable.py", line 1906, in func
    if not reflexive
TypeError: unsupported operand type(s) for *: 'cftime._cftime.DatetimeNoLeap' and 'float'

@klindsay28
Copy link

@andersy005 , if I revert the following change within f3a548d, then the code snippet from #155 runs with no error

--- a/esmlab/core.py
+++ b/esmlab/core.py
@@ -87,7 +87,6 @@ class EsmlabAccessor(object):
         ds[self.time_coord_name] = groupby_coord.data
 
         if self.time_bound is not None:
-            ds[self.tb_name] = self.time_bound
             self.time_bound[self.time_coord_name] = groupby_coord.data
         self.time_bound_diff = self.compute_time_bound_diff(ds)

I don't understand how this codes fits in with the rest of esmlab, so I'm unable to comment more productively.

@andersy005
Copy link
Contributor Author

@klindsay28, sorry for the breaking changes :(. I should have tested this PR against your example prior merging, The existing tests need further improvement to make sure that we are testing the most common use cases. I am going to look into this and will open a PR with a fix for this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

indexes related error from xarray v0.14.0 when calling esmlab.resample(ds, freq='ann') Fix Tests
3 participants