forked from pydata/xarray

Merge remote-tracking branch 'upstream/master' into fix/groupby-reduce-multiple-dims

* upstream/master:
  minor lint tweaks (pydata#3429)
  Hack around pydata#3440 (pydata#3442)
  Update Terminology page to account for multidimensional coordinates (pydata#3410)
  Use cftime master for upstream-dev build (pydata#3439)
dcherian committed Oct 24, 2019
2 parents 163bb81 + 652dd3c commit 4d14fca
Showing 18 changed files with 56 additions and 67 deletions.
5 changes: 2 additions & 3 deletions ci/azure/install.yml
@@ -16,7 +16,7 @@ steps:
--pre \
--upgrade \
matplotlib \
- pandas \
+ pandas=0.26.0.dev0+628.g03c1a3db2 \ # FIXME https://github.com/pydata/xarray/issues/3440
scipy
# numpy \ # FIXME https://github.com/pydata/xarray/issues/3409
pip install \
@@ -25,8 +25,7 @@ steps:
git+https://github.com/dask/dask \
git+https://github.com/dask/distributed \
git+https://github.com/zarr-developers/zarr \
- git+https://github.com/Unidata/cftime.git@refs/pull/127/merge
- # git+https://github.com/Unidata/cftime # FIXME PR 127 not merged yet
+ git+https://github.com/Unidata/cftime
condition: eq(variables['UPSTREAM_DEV'], 'true')
displayName: Install upstream dev dependencies

4 changes: 2 additions & 2 deletions doc/contributing.rst
@@ -286,12 +286,12 @@ How to build the *xarray* documentation
Requirements
~~~~~~~~~~~~
Make sure to follow the instructions on :ref:`creating a development environment above <contributing.dev_env>`, but
- to build the docs you need to use the environment file ``doc/environment.yml``.
+ to build the docs you need to use the environment file ``ci/requirements/doc.yml``.

.. code-block:: none
# Create and activate the docs environment
- conda env create -f doc/environment.yml
+ conda env create -f ci/requirements/doc.yml
conda activate xarray-docs
# or with older versions of Anaconda:
6 changes: 3 additions & 3 deletions doc/terminology.rst
@@ -27,15 +27,15 @@ Terminology

----

- **Coordinate:** An array that labels a dimension of another ``DataArray``. Loosely, the coordinate array's values can be thought of as tick labels along a dimension. There are two types of coordinate arrays: *dimension coordinates* and *non-dimension coordinates* (see below). A coordinate named ``x`` can be retrieved from ``arr.coords[x]``. A ``DataArray`` can have more coordinates than dimensions because a single dimension can be assigned multiple coordinate arrays. However, only one coordinate array can be a assigned as a particular dimension's dimension coordinate array. As a consequence, ``len(arr.dims) <= len(arr.coords)`` in general.
+ **Coordinate:** An array that labels a dimension or set of dimensions of another ``DataArray``. In the usual one-dimensional case, the coordinate array's values can loosely be thought of as tick labels along a dimension. There are two types of coordinate arrays: *dimension coordinates* and *non-dimension coordinates* (see below). A coordinate named ``x`` can be retrieved from ``arr.coords[x]``. A ``DataArray`` can have more coordinates than dimensions because a single dimension can be labeled by multiple coordinate arrays. However, only one coordinate array can be assigned as a particular dimension's dimension coordinate array. As a consequence, ``len(arr.dims) <= len(arr.coords)`` in general.

----

- **Dimension coordinate:** A coordinate array assigned to ``arr`` with both a name and dimension name in ``arr.dims``. Dimension coordinates are used for label-based indexing and alignment, like the index found on a :py:class:`pandas.DataFrame` or :py:class:`pandas.Series`. In fact, dimension coordinates use :py:class:`pandas.Index` objects under the hood for efficient computation. Dimension coordinates are marked by ``*`` when printing a ``DataArray`` or ``Dataset``.
+ **Dimension coordinate:** A one-dimensional coordinate array assigned to ``arr`` with both a name and dimension name in ``arr.dims``. Dimension coordinates are used for label-based indexing and alignment, like the index found on a :py:class:`pandas.DataFrame` or :py:class:`pandas.Series`. In fact, dimension coordinates use :py:class:`pandas.Index` objects under the hood for efficient computation. Dimension coordinates are marked by ``*`` when printing a ``DataArray`` or ``Dataset``.

----

- **Non-dimension coordinate:** A coordinate array assigned to ``arr`` with a name in ``arr.dims`` but a dimension name *not* in ``arr.dims``. These coordinate arrays are useful for auxiliary labeling. However, non-dimension coordinates are not indexed, and any operation on non-dimension coordinates that leverages indexing will fail. Printing ``arr.coords`` will print all of ``arr``'s coordinate names, with the assigned dimensions in parentheses. For example, ``coord_name (dim_name) 1 2 3 ...``.
+ **Non-dimension coordinate:** A coordinate array assigned to ``arr`` with a name in ``arr.coords`` but *not* in ``arr.dims``. These coordinate arrays can be one-dimensional or multidimensional, and they are useful for auxiliary labeling. As an example, multidimensional coordinates are often used in geoscience datasets when :doc:`the data's physical coordinates (such as latitude and longitude) differ from their logical coordinates <examples/multidimensional-coords>`. However, non-dimension coordinates are not indexed, and any operation on non-dimension coordinates that leverages indexing will fail. Printing ``arr.coords`` will print all of ``arr``'s coordinate names, with the corresponding dimension(s) in parentheses. For example, ``coord_name (dim_name) 1 2 3 ...``.
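
For context, a minimal sketch of the distinction the updated definitions draw (names here are illustrative, not from the diff):

```python
import numpy as np
import xarray as xr

# One dimension coordinate ("x") and one multidimensional,
# non-dimension coordinate ("lat") labeling the same data.
arr = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=("x", "y"),
    coords={
        "x": [10, 20],  # dimension coordinate, marked * in the repr
        "lat": (("x", "y"), np.zeros((2, 3))),  # non-dimension coordinate
    },
)

arr.sel(x=10)  # label-based indexing works via the dimension coordinate
arr.coords["lat"]  # retrievable by name, but not usable for label-based indexing
```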

----

Expand Down
2 changes: 2 additions & 0 deletions doc/whats-new.rst
@@ -56,6 +56,8 @@ Documentation
:py:meth:`Dataset.resample` and explicitly state that a
datetime-like dimension is required. (:pull:`3400`)
By `Justus Magin <https://github.com/keewis>`_.
+ - Update the terminology page to address multidimensional coordinates. (:pull:`3410`)
+   By `Jon Thielen <https://github.com/jthielen>`_.

Internal Changes
~~~~~~~~~~~~~~~~
2 changes: 1 addition & 1 deletion xarray/backends/h5netcdf_.py
@@ -245,7 +245,7 @@ def prepare_variable(
dtype=dtype,
dimensions=variable.dims,
fillvalue=fillvalue,
- **kwargs
+ **kwargs,
)
else:
nc4_var = self.ds[name]
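
Many hunks below repeat this one mechanical change: a trailing comma after a final ``**kwargs``. A small sketch with a hypothetical function — Python 3.6+ accepts a trailing comma after ``**kwargs`` (Python 2 did not), and keeping it means a later change that appends an argument touches only its own line:

```python
# Hypothetical helper, for illustration only.
def open_store(
    path,
    mode="r",
    **kwargs,  # trailing comma after **kwargs is valid syntax on Python 3.6+
):
    return path, mode, kwargs

open_store(
    "data.zarr",
    mode="r",
    **{"consolidated": True},  # call sites may carry the trailing comma too
)
```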
2 changes: 1 addition & 1 deletion xarray/backends/zarr.py
@@ -467,7 +467,7 @@ def open_zarr(
drop_variables=None,
consolidated=False,
overwrite_encoded_chunks=False,
- **kwargs
+ **kwargs,
):
"""Load and decode a dataset from a Zarr store.
4 changes: 3 additions & 1 deletion xarray/core/accessor_dt.py
@@ -178,7 +178,9 @@ def __init__(self, obj):
)
self._obj = obj

- def _tslib_field_accessor(name, docstring=None, dtype=None):
+ def _tslib_field_accessor(  # type: ignore
+     name: str, docstring: str = None, dtype: np.dtype = None
+ ):
def f(self, dtype=dtype):
if dtype is None:
dtype = self._obj.dtype
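
``_tslib_field_accessor`` is a factory evaluated in the class body rather than an ordinary method, so its first parameter is not ``self``; the ``# type: ignore`` silences mypy's complaint about that. A minimal sketch of the same pattern, with illustrative names:

```python
class Point:
    def __init__(self, x: float) -> None:
        self._x = x

    # Runs once while the class body is evaluated; it has no `self`,
    # which is why mypy needs the `# type: ignore`.
    def _make_getter(name: str):  # type: ignore
        def f(self) -> float:
            return getattr(self, "_" + name)
        return f

    x = property(_make_getter("x"))

print(Point(1.5).x)  # 1.5
```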
2 changes: 1 addition & 1 deletion xarray/core/arithmetic.py
@@ -76,7 +76,7 @@ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
dataset_join=dataset_join,
dataset_fill_value=np.nan,
kwargs=kwargs,
- dask="allowed"
+ dask="allowed",
)

# this has no runtime function - these are listed so IDEs know these
2 changes: 1 addition & 1 deletion xarray/core/coordinates.py
@@ -367,7 +367,7 @@ def remap_label_indexers(
indexers: Mapping[Hashable, Any] = None,
method: str = None,
tolerance=None,
- **indexers_kwargs: Any
+ **indexers_kwargs: Any,
) -> Tuple[dict, dict]: # TODO more precise return type after annotations in indexing
"""Remap indexers from obj.coords.
If indexer is an instance of DataArray and it has coordinate, then this coordinate
10 changes: 4 additions & 6 deletions xarray/core/dataset.py
@@ -2967,15 +2967,13 @@ def expand_dims(
for a in axis:
if a < -result_ndim or result_ndim - 1 < a:
raise IndexError(
"Axis {a} is out of bounds of the expanded"
" dimension size {dim}.".format(
a=a, v=k, dim=result_ndim
)
f"Axis {a} of variable {k} is out of bounds of the "
f"expanded dimension size {result_ndim}"
)

axis_pos = [a if a >= 0 else result_ndim + a for a in axis]
if len(axis_pos) != len(set(axis_pos)):
raise ValueError("axis should not contain duplicate" " values.")
raise ValueError("axis should not contain duplicate values")
# We need to sort them to make sure `axis` equals to the
# axis positions of the result array.
zip_axis_dim = sorted(zip(axis_pos, dim.items()))
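
A sketch of how the reworded error surfaces (exact wording per the f-string above; variable names are illustrative):

```python
import xarray as xr

ds = xr.Dataset({"temperature": ("x", [1.0, 2.0, 3.0])})

# Expanding a 1-D variable by one new dimension gives an expanded size
# of 2, so valid axis positions are -2..1. This raises:
# IndexError: Axis 5 of variable temperature is out of bounds of the
# expanded dimension size 2
ds.expand_dims(dim="time", axis=5)
```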
@@ -3131,7 +3129,7 @@ def reorder_levels(
coord = self._variables[dim]
index = self.indexes[dim]
if not isinstance(index, pd.MultiIndex):
raise ValueError("coordinate %r has no MultiIndex" % dim)
raise ValueError(f"coordinate {dim} has no MultiIndex")
new_index = index.reorder_levels(order)
variables[dim] = IndexVariable(coord.dims, new_index)
indexes[dim] = new_index
51 changes: 19 additions & 32 deletions xarray/core/indexing.py
@@ -59,8 +58,7 @@ def _sanitize_slice_element(x):
if isinstance(x, np.ndarray):
if x.ndim != 0:
raise ValueError(
"cannot use non-scalar arrays in a slice for "
"xarray indexing: {}".format(x)
f"cannot use non-scalar arrays in a slice for xarray indexing: {x}"
)
x = x[()]

@@ -128,9 +127,9 @@ def convert_label_indexer(index, label, index_name="", method=None, tolerance=None):
# unlike pandas, in xarray we never want to silently convert a
# slice indexer into an array indexer
raise KeyError(
"cannot represent labeled-based slice indexer for "
"dimension %r with a slice over integer positions; "
"the index is unsorted or non-unique" % index_name
"cannot represent labeled-based slice indexer for dimension "
f"{index_name!r} with a slice over integer positions; the index is "
"unsorted or non-unique"
)

elif is_dict_like(label):
@@ -190,7 +189,7 @@ def convert_label_indexer(index, label, index_name="", method=None, tolerance=None):
)
indexer = get_indexer_nd(index, label, method, tolerance)
if np.any(indexer < 0):
raise KeyError("not all values found in index %r" % index_name)
raise KeyError(f"not all values found in index {index_name!r}")
return indexer, new_index
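
Most of this file's changes are the same ``%``-formatting to f-string conversion; ``%r`` and ``{...!r}`` render identically:

```python
index_name = "x"
old = "not all values found in index %r" % index_name  # printf-style
new = f"not all values found in index {index_name!r}"  # f-string; !r applies repr()
assert old == new == "not all values found in index 'x'"
```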


@@ -208,7 +207,7 @@ def get_dim_indexers(data_obj, indexers):
if k not in data_obj.dims and k not in data_obj._level_coords
]
if invalid:
raise ValueError("dimensions or multi-index levels %r do not exist" % invalid)
raise ValueError(f"dimensions or multi-index levels {invalid!r} do not exist")

level_indexers = defaultdict(dict)
dim_indexers = {}
@@ -223,8 +222,8 @@ def get_dim_indexers(data_obj, indexers):
for dim, level_labels in level_indexers.items():
if dim_indexers.get(dim, False):
raise ValueError(
"cannot combine multi-index level indexers "
"with an indexer for dimension %s" % dim
"cannot combine multi-index level indexers with an indexer for "
f"dimension {dim}"
)
dim_indexers[dim] = level_labels
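
For context, a sketch of when this ``ValueError`` fires — selecting by a MultiIndex level and by its dimension in the same call:

```python
import pandas as pd
import xarray as xr

midx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("letter", "num"))
arr = xr.DataArray([0, 1, 2, 3], coords={"x": midx}, dims="x")

arr.sel(letter="a")  # indexing by a level alone is fine
# ValueError: cannot combine multi-index level indexers with an indexer
# for dimension x
arr.sel(x=("a", 1), letter="a")
```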

@@ -326,7 +325,7 @@ def tuple(self):
return self._key

def __repr__(self):
return "{}({})".format(type(self).__name__, self.tuple)
return f"{type(self).__name__}({self.tuple})"


def as_integer_or_none(value):
@@ -362,9 +361,7 @@ def __init__(self, key):
k = as_integer_slice(k)
else:
raise TypeError(
"unexpected indexer type for {}: {!r}".format(
type(self).__name__, k
)
f"unexpected indexer type for {type(self).__name__}: {k!r}"
)
new_key.append(k)

@@ -395,20 +392,17 @@ def __init__(self, key):
elif isinstance(k, np.ndarray):
if not np.issubdtype(k.dtype, np.integer):
raise TypeError(
"invalid indexer array, does not have "
"integer dtype: {!r}".format(k)
f"invalid indexer array, does not have integer dtype: {k!r}"
)
if k.ndim != 1:
raise TypeError(
"invalid indexer array for {}, must have "
"exactly 1 dimension: ".format(type(self).__name__, k)
f"invalid indexer array for {type(self).__name__}; must have "
f"exactly 1 dimension: {k!r}"
)
k = np.asarray(k, dtype=np.int64)
else:
raise TypeError(
"unexpected indexer type for {}: {!r}".format(
type(self).__name__, k
)
f"unexpected indexer type for {type(self).__name__}: {k!r}"
)
new_key.append(k)

@@ -439,23 +433,20 @@ def __init__(self, key):
elif isinstance(k, np.ndarray):
if not np.issubdtype(k.dtype, np.integer):
raise TypeError(
"invalid indexer array, does not have "
"integer dtype: {!r}".format(k)
f"invalid indexer array, does not have integer dtype: {k!r}"
)
if ndim is None:
ndim = k.ndim
elif ndim != k.ndim:
ndims = [k.ndim for k in key if isinstance(k, np.ndarray)]
raise ValueError(
"invalid indexer key: ndarray arguments "
"have different numbers of dimensions: {}".format(ndims)
f"have different numbers of dimensions: {ndims}"
)
k = np.asarray(k, dtype=np.int64)
else:
raise TypeError(
"unexpected indexer type for {}: {!r}".format(
type(self).__name__, k
)
f"unexpected indexer type for {type(self).__name__}: {k!r}"
)
new_key.append(k)

@@ -574,9 +565,7 @@ def __setitem__(self, key, value):
self.array[full_key] = value

def __repr__(self):
return "{}(array={!r}, key={!r})".format(
type(self).__name__, self.array, self.key
)
return f"{type(self).__name__}(array={self.array!r}, key={self.key!r})"


class LazilyVectorizedIndexedArray(ExplicitlyIndexedNDArrayMixin):
@@ -627,9 +616,7 @@ def __setitem__(self, key, value):
)

def __repr__(self):
return "{}(array={!r}, key={!r})".format(
type(self).__name__, self.array, self.key
)
return f"{type(self).__name__}(array={self.array!r}, key={self.key!r})"


def _wrap_numpy_scalars(array):
8 changes: 4 additions & 4 deletions xarray/core/missing.py
@@ -71,7 +71,7 @@ def __call__(self, x):
self._yi,
left=self._left,
right=self._right,
- **self.call_kwargs
+ **self.call_kwargs,
)


@@ -93,7 +93,7 @@ def __init__(
copy=False,
bounds_error=False,
order=None,
- **kwargs
+ **kwargs,
):
from scipy.interpolate import interp1d

@@ -126,7 +126,7 @@ def __init__(
bounds_error=False,
assume_sorted=assume_sorted,
copy=copy,
- **self.cons_kwargs
+ **self.cons_kwargs,
)


@@ -147,7 +147,7 @@ def __init__(
order=3,
nu=0,
ext=None,
- **kwargs
+ **kwargs,
):
from scipy.interpolate import UnivariateSpline

2 changes: 1 addition & 1 deletion xarray/core/resample.py
@@ -151,7 +151,7 @@ def _interpolate(self, kind="linear"):
assume_sorted=True,
method=kind,
kwargs={"bounds_error": False},
- **{self._dim: self._full_index}
+ **{self._dim: self._full_index},
)


5 changes: 3 additions & 2 deletions xarray/core/rolling.py
@@ -1,4 +1,5 @@
import functools
+ from typing import Callable

import numpy as np

@@ -106,7 +107,7 @@ def __repr__(self):
def __len__(self):
return self.obj.sizes[self.dim]

- def _reduce_method(name):
+ def _reduce_method(name: str) -> Callable:  # type: ignore
array_agg_func = getattr(duck_array_ops, name)
bottleneck_move_func = getattr(bottleneck, "move_" + name, None)

@@ -453,7 +454,7 @@ def _numpy_or_bottleneck_reduce(
array_agg_func=array_agg_func,
bottleneck_move_func=bottleneck_move_func,
),
- **kwargs
+ **kwargs,
)

def construct(self, window_dim, stride=1, fill_value=dtypes.NA):
2 changes: 1 addition & 1 deletion xarray/plot/__init__.py
@@ -1,6 +1,6 @@
+ from .dataset_plot import scatter
from .facetgrid import FacetGrid
from .plot import contour, contourf, hist, imshow, line, pcolormesh, plot, step
- from .dataset_plot import scatter

__all__ = [
"plot",
Expand Down
8 changes: 4 additions & 4 deletions xarray/plot/facetgrid.py
@@ -294,7 +294,7 @@ def map_dataarray_line(
hue=hue,
add_legend=False,
_labels=False,
- **kwargs
+ **kwargs,
)
self._mappables.append(mappable)

@@ -376,7 +376,7 @@ def add_legend(self, **kwargs):
labels=list(self._hue_var.values),
title=self._hue_label,
loc="center right",
- **kwargs
+ **kwargs,
)

self.figlegend = figlegend
@@ -491,7 +491,7 @@ def set_titles(self, template="{coord} = {value}", maxchar=30, size=None, **kwargs):
rotation=270,
ha="left",
va="center",
- **kwargs
+ **kwargs,
)

# The column titles on the top row
@@ -590,7 +590,7 @@ def _easy_facetgrid(
subplot_kws=None,
ax=None,
figsize=None,
- **kwargs
+ **kwargs,
):
"""
Convenience method to call xarray.plot.FacetGrid from 2d plotting methods
