Attribute style access is slow #4741

rhkleijn · 2020-12-30T15:52:07Z

I appreciate xarray's ability to use attribute-style access (ds.foo) as an alternative to ds["foo"], as it requires fewer characters/keystrokes and has less 'visual clutter'.

A drawback is that it can be much slower: lookup time seems to display O(n) behaviour instead of O(1), with n being the number of variables in the dataset. For example, with n=100 it is approximately 100 times slower than dictionary-style access:

import xarray as xr

# Dataset with many (100) variables
ds = xr.Dataset({f'var{v}': [] for v in range(100)})

%timeit ds['var0']
462 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit ds.var0
47.1 ms ± 205 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

dir() and _ipython_key_completions_(), which are used for e.g. tab completion in IPython, are equally slow:

%timeit dir(ds)
47 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit ds._ipython_key_completions_()
46.8 ms ± 210 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
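For readers unfamiliar with these hooks, here is a minimal sketch of how attribute-style access and tab completion are usually wired together in Python. ToyDataset and its _variables attribute are hypothetical stand-ins for illustration, not xarray's actual implementation; only the hook names (__getattr__, __dir__, _ipython_key_completions_) are real protocols.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset; not xarray's implementation."""

    def __init__(self, variables):
        self._variables = dict(variables)

    def __getitem__(self, key):
        # Dictionary-style access: toy["var0"]
        return self._variables[key]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: toy.var0
        if name.startswith("_"):
            # Avoid recursion if private attributes are not set yet.
            raise AttributeError(name)
        try:
            return self._variables[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # dir(toy) calls this; listing variable names enables `toy.<TAB>`.
        return sorted(set(super().__dir__()) | set(self._variables))

    def _ipython_key_completions_(self):
        # IPython calls this to complete `toy["<TAB>`.
        return list(self._variables)


toy = ToyDataset({f"var{v}": [] for v in range(3)})
print(toy.var0 is toy["var0"])
print("var2" in dir(toy))
print(toy._ipython_key_completions_())
```

Because all three hooks draw on the same source of names, any per-call cost in building that source is paid on every attribute lookup and every completion request.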

I would like to see xarray have much better performance for attribute-style access.

rhkleijn · 2020-12-30T16:06:45Z

With some modest refactoring in https://github.com/rhkleijn/xarray/tree/faster-attr-access I managed to speed up attribute-style access, dir() and _ipython_key_completions_() (in this case ~100-fold). It uses a lazier approach and, in particular, avoids the eager {d: self[d] for d in self.dims}, which constructs many (mostly unneeded) DataArray objects.

%timeit ds.var0
468 µs ± 1.96 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit dir(ds)
499 µs ± 1.51 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit ds._ipython_key_completions_()
242 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
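To illustrate the idea of replacing an eagerly built dict with a lazy lookup, here is a toy sketch. It is not the code from the branch: Wrapper, LazyView, eager_lookup and lazy_lookup are hypothetical names, and Wrapper merely stands in for an object that is costly to construct (such as a DataArray). It counts constructions rather than timing them, so the result is deterministic.

```python
from collections.abc import Mapping


class Wrapper:
    """Hypothetical stand-in for a costly-to-build object."""

    created = 0  # counts how many wrappers have been constructed

    def __init__(self, value):
        Wrapper.created += 1
        self.value = value


class LazyView(Mapping):
    """Builds a Wrapper only for the key that is actually requested."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        return Wrapper(self._data[key])

    def __iter__(self):
        # Iterating over names does not construct any Wrapper.
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def eager_lookup(data, name):
    # Builds every wrapper up front, then uses only one: O(n) per lookup.
    return {k: Wrapper(v) for k, v in data.items()}[name]


def lazy_lookup(data, name):
    # Builds only the requested wrapper: O(1) per lookup.
    return LazyView(data)[name]


data = {f"var{v}": [] for v in range(100)}

Wrapper.created = 0
eager_lookup(data, "var0")
eager_count = Wrapper.created

Wrapper.created = 0
lazy_lookup(data, "var0")
lazy_count = Wrapper.created

print(eager_count, lazy_count)  # 100 1
```

A Mapping subclass is used so that listing names (as dir() or key completion would) stays cheap, while the expensive object is created only on item access.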

Shall I open a pull request for this?

dcherian · 2020-12-30T16:26:36Z

Yes please! This looks great!

dcherian added the topic-performance label Dec 30, 2020

rhkleijn mentioned this issue Dec 30, 2020: speedup attribute style access and tab completion #4742 (Merged)

dcherian mentioned this issue Dec 30, 2020: Comprehensive benchmarking suite #4648 (Open)

mathause closed this as completed in #4742 Jan 5, 2021
