Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed behavior for replacing coordinates on dataset. #3377

Closed
jbusecke opened this issue Oct 7, 2019 · 5 comments · Fixed by #3393
Closed

Changed behavior for replacing coordinates on dataset. #3377

jbusecke opened this issue Oct 7, 2019 · 5 comments · Fixed by #3393
Labels

Comments

@jbusecke
Copy link
Contributor

jbusecke commented Oct 7, 2019

MCVE Code Sample

We noticed a change in behavior in xarray that broke a test in xgcm.

Consider this code example:

import xarray as xr
import numpy as np
x = np.arange(5)
data = xr.DataArray(np.random.rand(5), coords={'x':x
}, dims=['x'])
x_c = np.arange(5) + 0.5
data_c = xr.DataArray(np.random.rand(5), coords={'x_c':x_c
}, dims=['x_c'])
ds = xr.Dataset({'data':data, 'data_c':data_c})
del ds['data_c']

ds['x_c'] = ds['x_c'][:3]
ds

Expected Output

In previous versions of xarray this resulted in

<xarray.Dataset>
Dimensions:  (x: 5, x_c: 3)
Coordinates:
  * x_c      (x_c) float64 0.5 1.5 2.5
  * x        (x) int64 0 1 2 3 4
Data variables:
    data     (x) float64 0.3828 0.6016 0.3603 0.414 0.35

(I tested right now with an older v0.11.3 but this works with v0.13 as well.

In the current master branch instead the coordinate gets padded with nans:

<xarray.Dataset>
Dimensions:  (x: 5, x_c: 5)
Coordinates:
  * x_c      (x_c) float64 0.5 1.5 2.5 nan nan
  * x        (x) int64 0 1 2 3 4
Data variables:
    data     (x) float64 0.8369 0.4197 0.3058 0.7419 0.8126

Problem Description

We fixed the test in a new PR, but @dcherian encouraged me to submit this.

Output of xr.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.7.4 (default, Aug 13 2019, 15:17:50) [Clang 4.0.1 (tags/RELEASE_401/final)] python-bits: 64 OS: Darwin OS-release: 18.6.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: None libnetcdf: None

xarray: 0.13.0+24.g4254b4af
pandas: 0.25.1
numpy: 1.17.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.5.0
distributed: 2.5.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: 5.2.0
IPython: 7.8.0
sphinx: None

@max-sixty
Copy link
Collaborator

Thanks for submitting. Confirmed. I couldn't work out the cause quickly, can have another look later; unless anyone knows immediately.

@dcherian dcherian added the bug label Oct 7, 2019
This was referenced Oct 7, 2019
@jhamman
Copy link
Member

jhamman commented Oct 11, 2019

I looked into this a bit tonight. I think this may be related to @shoyer's recent explicit indexes refactor (#3234). I'm having trouble following some of the new logic but it seems we're not updating the size of the x_c dimension during the __setitem__ step in @jbusecke's example. I have a PR coming with a regression test -- #3392.

@shoyer
Copy link
Member

shoyer commented Oct 11, 2019

Yes, it was definitely my explicit indexes refactor. Apparently this wasn't tested?

Here's a simpler reproduction:

import xarray as xr
ds = xr.Dataset({"x": [0, 1, 2]})
ds["x"] = ds["x"][:2]
assert ds.sizes["x"] == 2, ds

@shoyer
Copy link
Member

shoyer commented Oct 11, 2019

@jbusecke kudos to you for catching this so early, before we even made a release!

#3393 should fix the immediate issue, though it's a pretty awkward fix.

@jbusecke
Copy link
Contributor Author

Glad that this orphaned test (we ended up removing it, because the function was not called anymore) was still useful!

And many thanks to @dcherian for suggesting to test xgxm with the upstream master!

Sent with GitHawk

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants