I appreciate xarray's ability to use attribute-style access ds.foo as an alternative to ds["foo"], as it requires fewer characters/keystrokes and has less 'visual clutter'.
A drawback is that it can be much slower: lookup time seems to display O(n) behaviour instead of O(1), with n being the number of variables in the dataset. For example, with n=100 it is approximately 100 times slower than dictionary-style access:
import xarray as xr

# Dataset with many (100) variables
ds = xr.Dataset({f'var{v}': [] for v in range(100)})
%timeit ds['var0']
462 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit ds.var0
47.1 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
dir() and _ipython_key_completions_(), which are used for e.g. tab completion in IPython, are equally slow:
%timeit dir(ds)
47 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit ds._ipython_key_completions_()
46.8 ms ± 210 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
I would like to see xarray having much better performance for attribute style access.
With some modest refactoring in https://github.com/rhkleijn/xarray/tree/faster-attr-access I managed to speed up attribute-style access, dir() and _ipython_key_completions_() (in this case roughly 100-fold). The change uses a lazier approach and, in particular, avoids the eager {d: self[d] for d in self.dims}, which constructs many (mostly unneeded) DataArray objects.
%timeit ds.var0
468 µs ± 1.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit dir(ds)
499 µs ± 1.51 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit ds._ipython_key_completions_()
242 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
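To illustrate why the eager dict comprehension hurts, here is a toy model in plain Python. It is only a sketch and not xarray's actual code: `Expensive`, `EagerContainer`, `LazyView` and `LazyContainer` are hypothetical names, with `Expensive` standing in for a costly-to-build object such as a DataArray. The eager container builds every value on each attribute lookup, while the lazy one only builds the value that was asked for.

```python
from collections.abc import Mapping


class Expensive:
    """Hypothetical stand-in for an object that is costly to construct."""

    built = 0  # counts how many instances have been constructed

    def __init__(self, name):
        Expensive.built += 1
        self.name = name


class EagerContainer:
    """Builds *all* values on every attribute lookup (O(n) per access)."""

    def __init__(self, names):
        self._names = list(names)

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        everything = {n: Expensive(n) for n in self._names}  # eager
        try:
            return everything[name]
        except KeyError:
            raise AttributeError(name) from None


class LazyView(Mapping):
    """Mapping that constructs a value only when that key is requested."""

    def __init__(self, names):
        self._names = dict.fromkeys(names)  # O(1) membership, keeps order

    def __getitem__(self, key):
        if key not in self._names:
            raise KeyError(key)
        return Expensive(key)

    def __iter__(self):
        return iter(self._names)  # iterating keys builds no values

    def __len__(self):
        return len(self._names)


class LazyContainer:
    """Delegates attribute lookup to a lazy mapping (one value per access)."""

    def __init__(self, names):
        self._view = LazyView(names)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._view[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Only the keys are needed for completion, so nothing is constructed.
        return sorted(set(super().__dir__()) | set(self._view))


names = [f"var{i}" for i in range(100)]

Expensive.built = 0
EagerContainer(names).var0
eager_built = Expensive.built

Expensive.built = 0
LazyContainer(names).var0
lazy_built = Expensive.built

print(eager_built, lazy_built)  # prints "100 1"
```

With 100 names the eager variant constructs 100 objects for a single lookup and the lazy variant constructs one, which matches the roughly 100-fold difference in the timings above. Note that `dir()` on the lazy container only iterates over keys, so tab completion does not construct any values either.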