Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

holoviews 1.10 no longer respects the order of coordinates of an xarray dataset #2577

Closed
marcbernot opened this issue Apr 19, 2018 · 12 comments

Comments

@marcbernot
Copy link

There is a change of behaviour between holoviews 1.9.5 and 1.10 with the way to build kdims from coordinates of an xarray dataset.

I noticed this because I have an images stack with coordinates (x,y,time) ; let's call it ds.
It is handy to visualize it with hv.Dataset(ds).to(hv.Image) because holoviews nicely infers that x and y are the kdims of the image and uses a menu for the time. With 1.10 though, the kdims become (time,x,y) and the inference is no longer what we could expect.

Here is a small exemple to reproduce this :

import numpy as np,holoviews as hv,xarray as xr
from collections import OrderedDict
data = np.zeros((3,4,5))
coords = OrderedDict(b=range(3), c=range(4), a=range(5))
dataset=xr.Dataset(coords=coords)
dataset['value']=([*dataset.coords.keys()], data)

The output is an xarray with dimensions (a,b,c) and coordinates (b,c,a) as we can check with print([*dataset.dims],[*dataset.coords]).

This command hv.Dataset(dataset).kdims gives:

  • output of 1.9.5 : [Dimension('b'), Dimension('c'), Dimension('a')]
  • output of 1.10 : [Dimension('a'), Dimension('b'), Dimension('c')]

Has it something to do with changes in the XArrayInterface?

@marcbernot marcbernot changed the title holoview 1.10 no longer respects the order of coordinates of an xarray dataset holoviews 1.10 no longer respects the order of coordinates of an xarray dataset Apr 19, 2018
@marcbernot
Copy link
Author

I guess line 132 of core/data/xarray.py
kdims = sorted(kdims, key=lambda x: (xrdims.index(x) if x in xrdims else float('inf'), x))
is when the reordering happens so this relates to this commit.

@philippjfr
Copy link
Member

Thanks for reporting this. It does seem to be a significant change in behavior and I think we should return to the old behavior for 1.10.1.

@jlstevens
Copy link
Contributor

Worth fixing for 1.10.1 then.

@philippjfr
Copy link
Member

I'm no longer sure this is something we can do. It seems xarray does not guarantee a fixed coord ordering at the Dataset level even when given an OrderedDict of coords, which is why we switched away from that originally.

@philippjfr
Copy link
Member

Additionally, it seems your example doesn't even seem to work in python 2. When I run this in python 2:

import numpy as np,holoviews as hv,xarray as xr
from collections import OrderedDict
data = np.zeros((3,4,5))
coords = OrderedDict(b=range(3), c=range(4), a=range(5))
dataset = xr.Dataset(coords=coords)
dataset['value']=(list(dataset.coords.keys()), data)
dataset

I get the following error:

ValueError: conflicting sizes for dimension 'a': length 3 on 'value' and length 5 on 'a'

@philippjfr
Copy link
Member

philippjfr commented Apr 19, 2018

Okay ignore my last message, that was simply down to the fact that in Python 2 and earlier versions of Py3 the OrderedDict constructor does not respect the order of keywords.

@philippjfr
Copy link
Member

So the real reason I think supporting this is problematic is that unless you remember to use an OrderedDict to specify the coords then the order of dimensions will be arbitrary and if you want a specific order then you declare that using the kdims.

@philippjfr
Copy link
Member

If there was some way to tell whether the coord order was defined explicitly or was just decided implicitly by the dictionary ordering I'd be happy to make use of that but as it is I'm inclined to stick with the new behavior for consistency.

@jlstevens
Copy link
Contributor

I think that is a very reasonable decision. Perhaps one thing to do is to mention this in the docs?

@marcbernot
Copy link
Author

Ok, I'm not sure to understand what is the new behavior ; is it to infer kdims based on dataset.dims?

Do you mean that we cannot rely on consistent coordinates ordering from xarray.dataset (coords could come from a dictionary) but we can rely on the ordering provided by dimensions (because it is sorted). So if holoviews is to infer kdims it should do this by using dataset.dims rather than dataset.coords.

I would agree if dataset.coords behaved like a dictionary but it acts as an OrderedDict (see dataset.coords.variables). And this is expected I guess since a dataset is supposed to be an in-memory representation of a netCDF file.
So dataset.coords.variables could act as a consistent source for holoviews.

What causes the potential inconsistency is that xarray allows the coordinates to be defined with a dictionary which can lead to incompatible corresponding netCDF files and so different inferences of kdims (with holoviews 1.9.5). But this inconsistency is within xarray, not holoviews.
The way people create xarray.datasets is not the responsibility of holoviews, and I'm not sure it should drive decisions about what holoviews does with datasets.
So this inconsistency should rather be reported and treated by xarray, no?

In my opinion, it is worth exploring the options here because I don't think that defining kdims as the lexicographic order of dimensions is that useful ; it would most of the time need an explicit reindexing.

@philippjfr
Copy link
Member

Ok, I'm not sure to understand what is the new behavior ; is it to infer kdims based on dataset.dims?

That's correct, it currently respects the dims ordering.

But this inconsistency is within xarray, not holoviews. The way people create xarray.datasets is not the responsibility of holoviews, and I'm not sure it should drive decisions about what holoviews does with datasets.

I think that's a good point. I'm also persuaded by the fact that dictionary will be ordered in future and support for Python 2 will be dropped. Also not making use of the specified coordinate ordering would be a shame.

I'll reopen a PR to address this shortly.

Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants