Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Public API for setting new indexes: add a set_xindex method? #6849

Closed
benbovy opened this issue Jul 29, 2022 · 5 comments · Fixed by #6971
Closed

Public API for setting new indexes: add a set_xindex method? #6849

benbovy opened this issue Jul 29, 2022 · 5 comments · Fixed by #6971

Comments

@benbovy
Copy link
Member

benbovy commented Jul 29, 2022

What is your issue?

xref #6795 (comment) and #6293 (Public API section).

The scipy22 branch contains the addition of a .set_xindex() method to DataArray and Dataset so that participants at the SciPy 2022 Xarray sprint could experiment with custom indexes. After thinking more about it, I'm wondering if it couldn't actually be part of Xarray's public API alongside .set_index() (at least for a while).

  • Having two methods .set_xindex() vs. .set_index() would be quite consistent with the .xindexes vs. .indexes properties that are already there.

  • I actually like the .set_xindex() API proposed in the scipy22, i.e., setting one index at a time from one or more coordinates, possibly with build options. While it could be possible to support both that and .set_index()'s current API (quite specific to pandas multi-indexes) all in one method, it would certainly result in a much more confusing API and internal implementation.

  • In the long term we could progressively get rid of .indexes and .set_index() and/or rename .xindexes to .indexes and .set_xindex() to .set_index().

Thoughts @pydata/xarray?

@benbovy benbovy added the needs triage Issue that has not been reviewed by xarray team member label Jul 29, 2022
@benbovy benbovy added topic-indexing design question and removed needs triage Issue that has not been reviewed by xarray team member labels Jul 29, 2022
@keewis
Copy link
Collaborator

keewis commented Jul 29, 2022

sounds good!

It would probably be good to figure out a way to extend it to accept multiple indexes at once, though, or have another method that allows that (if we think that set_xindex might support setting multiple indexes we should make sure the API can be extended).

@shoyer
Copy link
Member

shoyer commented Jul 29, 2022

This sounds great to me!

I don't think we need support for setting multiple indexes at once in a single method call. You can call set_xindex multiple times for that if needed.

@keewis
Copy link
Collaborator

keewis commented Jul 29, 2022

You can call set_xindex multiple times for that if needed.

I guess that depends on how many indexes you'd want to set at once? If that number is constant and small (like, 1-3) then that would be fine for me, but for anything else a bulk API would be great.

@benbovy
Copy link
Member Author

benbovy commented Jul 29, 2022

Hmm I'd rather expect that in most cases max. 2-3 (meta-)indexes are set, each from possibly a larger number of coordinates (e.g., 2-d staggered grid), but I could be wrong.

One exception could be setting (many) PandasIndex instances, one for each 1-d non-dimension coordinate of a Dataset. Maybe we could adapt .set_index() to support that case? Or later add a convenient method that "extends" default indexes to all 1-d coordinates if it turns out to be useful in many cases?

Another example would be to replace all default (pandas) indexes by Pint indexes. I guess that a convenient method accessible via the pint accessor would do the job well in that case?

@shoyer
Copy link
Member

shoyer commented Jul 29, 2022

I agree, I think only setting a few indexes at a time would be normal. If we eventually need convenience methods for setting multiple indexes we can add those later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants