"_center" postfix on axis label resulting from groupby_bins persists after renaming variable #4322

Closed
lamorton opened this issue Aug 7, 2020 · 5 comments · Fixed by #4794

@lamorton

lamorton commented Aug 7, 2020

What happened:

I used groupby_bins + sum to reduce the resolution of my dataset along 'x' dimension. I didn't like the 'x_bins_center' label, so I renamed the x-axis dim/coord to simply 'x.' However, the "_center" postfix is not part of the variable name -- it appears to be some tweaking of the x-axis label when plotting. So now I am stuck with "_center" tagged at the end of the x-axis label, even after the units.

What you expected to happen:

It would make more sense if the '_center' were part of the variable name. That way, the name displayed on the plot is the same one that I need to access the variable in the dataset. Also, when I rename the variable, I will be able to change the way it displays. Furthermore, that will prevent the issue with "_center" getting pasted on after the units.

Minimal Complete Verifiable Example:

import xarray as xr
import numpy as np
data_vars={'y':('x',np.ones((101)),{'units':'kg/m'})}
coords={'x':('x',np.linspace(0,1,101,endpoint=True),{'units':'m'})}
ds = xr.Dataset(data_vars,coords)
dsd = ds.groupby_bins('x',np.linspace(0,1,11,endpoint=True),right=False).sum(dim='x')
dsd.y.plot() #Shows that the x-axis is named "x_bins_center"
try:
    dsd = dsd.rename({'x_bins_center':'x'}) #Fails
except ValueError as err:
    print(err) #cannot rename 'x_bins_center' because it is not a variable or dimension in this dataset
dsd = dsd.rename({'x_bins':'x'}) #Succeeds, b/c the variable is ACTUALLY named 'x_bins'
dsd.x.attrs['units']='m'
dsd.y.plot() #x-axis label is "x [m]_center"  -- there's a sneaky renaming thing that is appending _center to the end of the label

Anything else we need to know?:

Plots: here's the first plot, showing the default x-axis label prior to renaming:

Original_plot

Here's the 2nd plot showing the mangled x-axis label after I renamed the variable & reestablished the units:

Butchered_plot

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.7 (default, Mar 23 2020, 17:31:31)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1

xarray: 0.16.0
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.8.0
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.20.0
distributed: None
matplotlib: 3.1.3
cartopy: None
seaborn: None
numbagg: None
pint: 0.11
setuptools: 49.2.0.post20200714
pip: 20.1.1
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: 3.1.2

@dcherian
Contributor

dcherian commented Aug 7, 2020

This is because x_bins contains interval objects. .plot is calculating the center of the interval and letting you know that it is doing so.

If you want that line plot, please use ax.set_xlabel; otherwise try .plot.step, which will more faithfully represent the intervals with the label x_bins.
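As a rough sketch of the ax.set_xlabel workaround, assuming a recent xarray and matplotlib are installed, the label can be overridden after plotting. The dataset mirrors the example in this issue; the label text "x [m]" and the headless Agg backend are choices made here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Same data as in the issue: 101 points along x, summed into 10 bins.
ds = xr.Dataset(
    {"y": ("x", np.ones(101), {"units": "kg/m"})},
    coords={"x": ("x", np.linspace(0, 1, 101), {"units": "m"})},
)
dsd = ds.groupby_bins("x", np.linspace(0, 1, 11), right=False).sum()

fig, ax = plt.subplots()
dsd.y.plot(ax=ax)      # xarray picks its own x-axis label for the interval coordinate
ax.set_xlabel("x [m]")  # replace that label with the one we actually want
```

Calling dsd.y.plot.step(ax=ax) instead of dsd.y.plot(ax=ax) would draw the bins as steps, as suggested above.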

@lamorton
Author

lamorton commented Aug 7, 2020

@dcherian: OK, thanks, now I understand why it is happening -- there's no unambiguous way to represent the intervals as floats, so one needs to use either the left/right/midpoint & indicate that. For my case, I think I will just replace the array of intervals with the array of midpoints of the intervals.
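A minimal sketch of what computing the interval midpoints could look like, using pandas only. The bin edges mirror the example in this issue; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Ten left-closed bins on [0, 1), matching right=False in the groupby_bins call.
edges = np.linspace(0, 1, 11)
bins = pd.IntervalIndex.from_breaks(edges, closed="left")

# One float per interval: the midpoint between its left and right edge.
midpoints = bins.mid
```

On the binned dataset, something along the lines of dsd.assign_coords(x_bins=[iv.mid for iv in dsd.x_bins.values]) should then swap the interval objects for plain floats, though that line is untested against xarray 0.16.0.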

The "_center" tag still doesn't work with the automatic units labeling though:

import xarray as xr
import numpy as np
data_vars={'y':('x',np.ones((101)),{'units':'kg/m'})}
coords={'x':('x',np.linspace(0,1,101,endpoint=True),{'units':'m'})}
ds = xr.Dataset(data_vars,coords)
dsd = ds.groupby_bins('x',np.linspace(0,1,11,endpoint=True),right=False).sum(dim='x')
dsd.x_bins.attrs = ds.x.attrs #Copy the units over from the original 'x' coordinate
dsd.y.plot() #The x-axis label still looks like "x [m]_center"

The "_center" tag should be applied before the "[m]" one.

@dcherian
Contributor

dcherian commented Aug 7, 2020

Oh sorry I misunderstood that bit. Yes, that looks like a bug. A PR would be welcome. I guess you should add a suffix kwarg to the plot.utils.label_from_attrs function.

@keewis
Collaborator

keewis commented Nov 26, 2020

If I understand correctly, this happens here:

xarray/xarray/plot/plot.py

Lines 295 to 300 in 5883a46

xplt, yplt, hueplt, xlabel, ylabel, hue_label = _infer_line_data(darray, x, y, hue)
# Remove pd.Intervals if contained in xplt.values and/or yplt.values.
xplt_val, yplt_val, xlabel, ylabel, kwargs = _resolve_intervals_1dplot(
    xplt.values, yplt.values, xlabel, ylabel, kwargs
)

where xlabel and ylabel are extracted from xplt and yplt using label_from_attrs. In _resolve_intervals_1dplot, xlabel and ylabel simply get the _center suffix appended; they are not used for anything else there. Wouldn't it be possible to change the calls to _infer_line_data and _resolve_intervals_1dplot to something like this:

    xplt, yplt, hueplt, hue_label = _infer_line_data(darray, x, y, hue) 
  
    # Remove pd.Intervals if contained in xplt.values and/or yplt.values. 
    xplt_val, yplt_val, x_suffix, y_suffix, kwargs = _resolve_intervals_1dplot( 
        xplt.values, yplt.values, kwargs 
    )
    xlabel = label_from_attrs(xplt, extra=x_suffix)
    ylabel = label_from_attrs(yplt, extra=y_suffix)

and then have _resolve_intervals_1dplot return either "_center" or "" as x_suffix and y_suffix.
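To illustrate why passing the suffix into the label function fixes the ordering, here is a simplified, hypothetical stand-in for label_from_attrs. It is not xarray's implementation: the function name, the name/attrs arguments, and the lookup order are assumptions made for this sketch. The only point is that the suffix is inserted before the units are appended.

```python
def label_from_attrs_sketch(name, attrs, extra=""):
    """Hypothetical, simplified label builder (not xarray's real function).

    `extra` is placed between the variable name and the units, so a
    "_center" suffix ends up before "[m]" rather than after it.
    """
    base = attrs.get("long_name") or attrs.get("standard_name") or name
    units = attrs.get("units")
    unit_part = f" [{units}]" if units else ""
    return f"{base}{extra}{unit_part}"


with_suffix = label_from_attrs_sketch("x", {"units": "m"}, extra="_center")
without_suffix = label_from_attrs_sketch("x", {"units": "m"})
```

Here with_suffix is "x_center [m]", whereas appending the suffix to an already finished label is what yields "x [m]_center".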

@dcherian
Contributor

Yes I think something like this (though ugly) will be necessary.
