Needs performance check / improvements in value assignment of DataArray #1771

Open
fujiisoup opened this issue Dec 9, 2017 · 1 comment

Comments

Member

fujiisoup commented Dec 9, 2017

    def __setitem__(self, key, value):
        if isinstance(key, basestring):
            self.coords[key] = value
        else:
            # Coordinates in key, value and self[key] should be consistent.
            # TODO Coordinate consistency in key is checked here, but it
            # causes unnecessary indexing. It should be optimized.
            obj = self[key]

In #1746, we added validation to xr.DataArray.__setitem__ that checks whether the coordinates of the array, the key, and the value are consistent.
The current implementation calls xr.DataArray.__getitem__ to reuse the existing coordinate validation logic, but this performs unnecessary indexing and may slow down __setitem__ when the array is multidimensional.

We may need to optimize the logic here.
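
One rough way to see how much overhead is involved is to time assignment through the DataArray against a direct write into the underlying NumPy buffer. This is only a minimal sketch: the array shape, the slice bounds, and the helper function names are assumptions made for illustration, and the numbers will vary by machine and xarray version.

```python
import timeit

import numpy as np
import xarray as xr

# Hypothetical test array; shape and dimension names are illustrative only.
da = xr.DataArray(np.zeros((200, 300)), dims=("x", "y"))


def assign_via_dataarray():
    # Goes through DataArray.__setitem__, including any coordinate checks.
    da[10:150, 20:250] = 1.0


def assign_via_numpy():
    # Writes straight into the NumPy buffer, bypassing xarray's logic.
    da.values[10:150, 20:250] = 1.0


# Take the best of three runs to reduce noise from other processes.
t_xr = min(timeit.repeat(assign_via_dataarray, number=50, repeat=3))
t_np = min(timeit.repeat(assign_via_numpy, number=50, repeat=3))
print(f"DataArray.__setitem__: {t_xr:.4f} s, raw NumPy: {t_np:.4f} s")
```

The gap between the two timings gives an upper bound on what xarray adds on top of the raw write; it does not isolate the cost of the extra __getitem__ call on its own.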

Is it reasonable to continuously monitor the performance of basic operations, such as Dataset construction, alignment, indexing, and assignment?
(Or are these operations too lightweight to be worth monitoring?)

cc @jhamman @shoyer

@jhamman
Member

jhamman commented Dec 9, 2017

@fujiisoup in #1457, we added a framework (airspeed velocity, asv) for benchmarking xarray operations. It is certainly within the scope of that framework to include indexing performance benchmarks. I only implemented a few IO-related benchmarks, with the expectation that more, covering issues like this one, would be added later on.
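
An assignment benchmark in that framework might look roughly like the following. asv runs `setup` before each benchmark and times every method whose name starts with `time_`. The class name, array sizes, and method names here are hypothetical and are not taken from xarray's existing benchmark suite.

```python
import numpy as np
import xarray as xr


class DataArrayAssignment:
    """asv-style benchmark sketch for DataArray.__setitem__ (hypothetical)."""

    def setup(self):
        # Sizes chosen only for illustration.
        nx, ny = 500, 400
        self.da = xr.DataArray(
            np.random.randn(nx, ny),
            dims=("x", "y"),
            coords={"x": np.arange(nx), "y": np.arange(ny)},
        )
        # Indexer without coordinates, selecting every 5th position along x.
        self.indexer = xr.DataArray(np.arange(0, nx, 5), dims="x")

    def time_setitem_slice(self):
        # Basic (slice) indexing on both dimensions.
        self.da[10:200, 50:300] = 0.0

    def time_setitem_dataarray_key(self):
        # DataArray used as the key, which exercises the coordinate check.
        self.da[dict(x=self.indexer)] = 0.0
```

Keeping one method per indexing style makes it easier to see which code path regresses when the timings change.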
