Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python)!: Update function signature of nth to allow positional input of indices, remove columns parameter #16510

Merged
merged 4 commits into from
Jun 4, 2024

Conversation

stinodego
Copy link
Member

@stinodego stinodego commented May 26, 2024

Closes #16511

Changes

  • Update nth to remove the overloaded capability to select values from given columns at the specified indices.
  • Now positional parameters are parsed as additional column indices to select.

Example

Before:

>>> df = pl.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
>>> df.select(pl.nth(1, "a"))
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
└─────┘

After:

>>> df.select(pl.nth(1, "a"))
...
TypeError: argument 'indices': 'str' object cannot be interpreted as an integer

Use instead:

>>> df.select(pl.col("a").get(1))
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
└─────┘

@stinodego stinodego changed the title feat(python)!: Update function signature of nth feat(python)!: Update function signature of nth to allow positional input of indices May 26, 2024
@github-actions github-actions bot added breaking Change that breaks backwards compatibility enhancement New feature or an improvement of an existing feature python Related to Python Polars labels May 26, 2024
@stinodego stinodego self-assigned this May 26, 2024
@stinodego stinodego modified the milestone: 1.0.0 May 26, 2024
@stinodego stinodego removed their assignment May 26, 2024
Copy link

codecov bot commented May 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.51%. Comparing base (d856b49) to head (81fed6e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16510      +/-   ##
==========================================
- Coverage   81.53%   81.51%   -0.02%     
==========================================
  Files        1410     1410              
  Lines      185061   185059       -2     
  Branches     2982     2982              
==========================================
- Hits       150885   150848      -37     
- Misses      33660    33695      +35     
  Partials      516      516              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alexander-beedie
Copy link
Collaborator

alexander-beedie commented May 27, 2024

The addition of multi-index support for nth made the existing function signature awkward. This fixes the issue.

Completely agree with this one 💯

EDIT: Looking at this more, the columns parameter on this function makes no sense at all. first/last do have a similar signature, but there it makes sense because first/last are Expr methods. I removed it.

This one I'm less sure about; the reason I matched it to first/last is because the functionality is equivalent and we have a naming mismatch with the Expr method (which is get instead of nth). Seems arbitrary that we say "you can have this param for first and last, but not nth, even though we can trivially support it". Your call though, I don't feel that strongly about it if you think aligning by name is more important😉

@stinodego
Copy link
Member Author

We allow pl.first("a") as a syntactic sugar for pl.col("a").first(). This is used a lot in groupby-type contexts.

pl.nth(1, columns=["a"]) isn't really any 'sweeter' than pl.col("a").get(1). In fact, I would say it's a lot worse because I am reading left-to-right and thinking "Okay, selecting columns by index, taking the first index, ... oh nevermind, I am doing something completely different". So including it makes no sense (to me).

@stinodego stinodego changed the title feat(python)!: Update function signature of nth to allow positional input of indices feat(python)!: Update function signature of nth to allow positional input of indices, remove columns parameter May 27, 2024
@alexander-beedie
Copy link
Collaborator

Fine by me; to be honest I don't really like the "columns" param in the other two that much either, so feel free to ditch it for nth ;)

@stinodego
Copy link
Member Author

stinodego commented May 27, 2024

Fine by me; to be honest I don't really like the "columns" param in the other two that much either, so feel free to ditch it for nth ;)

Agree - it makes sense for pl.sum("a") because it's shorter than pl.col("a").sum() and pl.sum() isn't already a thing. But pl.first() is already a thing so now it's overloaded. Not very clean.

Maybe we can make progress on this once we solve #13757 - when selectors are better integrated, we can probably get rid of pl.first() and pl.last(), as those are really just a type of selector. I'm leaving that for 2.0.0 though I think.

@stinodego stinodego merged commit f7f7d07 into main Jun 4, 2024
16 checks passed
@stinodego stinodego deleted the breaking-depr-cols branch June 4, 2024 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Change that breaks backwards compatibility enhancement New feature or an improvement of an existing feature python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update function signature of nth to allow positional input of indices
2 participants