-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support __getitem__
with slices for columns
#839
feat: support __getitem__
with slices for columns
#839
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a tiny comment on a typo.
I wonder if it could be worth to:
- Add an example in the docstring of the method
- Add a note for the behavior since slicing here is
- If string: both start and stop are included
- If int: typical python behavior, start included and stop excluded
- `df['a']` extracts column `'a'` and returns a `Series`. | ||
- `df[0:2]` extracts the first two rows and returns a `DataFrame`. | ||
- `df[0:2, 'a']` extracts the first two rows from column `'a'` and returns | ||
a `Series`. | ||
- `df[0:2, 0]` extracts the first two rows from the first column and returns | ||
a `Series`. | ||
- `df[[0, 1], [0, 1, 2]]` extracts the first two rows and the first three columns | ||
and returns a `DataFrame` | ||
- `df[0: 2, ['a', 'c']]` extracts the first two rows and columns `'a'` and `'c'` and | ||
returns a `DataFrame` | ||
- `df[:, 0: 2]` extracts all rows from the first two columns and returns a `DataFrame` | ||
- `df[:, 'a': 'c']` extracts all rows and all columns positioned between `'a'` and `'c'` | ||
_inclusive_ and returns a `DataFrame`. For example, if the columns are | ||
`'a', 'b', 'c', 'd'`, then that would extract columns `'a'`, `'b'`, and `'c'`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is amazing ✨🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄 thanks, maybe we should upstream this to Polars itself, there's zero docs on getitem there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, I had never seen df[:, "a":"c"]
in polars code and I was not aware that's even possible 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good points, thanks! I think this is easier to explain with examples |
I kinda wish we didn't have to go overboard with
__getitem__
overloads, but scikit-learn are using this, so I think we shouldMy general thinking on this is: make sure that Narwhals is easily usable by scikit-learn. Then, it's their decision whether or not they adopt - but we should at least be ready
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below.