Rename rle() struct fields to len and value #15230
Comments
Side note: Would it make sense for The particular use-case being wanting the original row index after performing a
df = pl.DataFrame({"foo": ["a", "a", "a", "b", "c", "c"]})
df.select(pl.col("foo").rle())
# shape: (3, 1)
# ┌───────────┐
# │ foo │
# │ --- │
# │ struct[2] │
# ╞═══════════╡
# │ {3,"a"} │
# │ {1,"b"} │
# │ {2,"c"} │
# └───────────┘
We can calculate it from the length, but it's a little awkward:
(df.select(pl.col("foo").rle())
.with_columns(
index = pl.col("foo").struct["lengths"].cum_sum().shift().fill_null(0)
)
#.filter(...)
)
# shape: (3, 2)
# ┌───────────┬───────┐
# │ foo ┆ index │
# │ --- ┆ --- │
# │ struct[2] ┆ i32 │
# ╞═══════════╪═══════╡
# │ {3,"a"} ┆ 0 │
# │ {1,"b"} ┆ 3 │
# │ {2,"c"} ┆ 4 │
# └───────────┴───────┘
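The index computation above (cumulative sum of the lengths, shifted by one, with the first entry set to 0) can also be sketched in plain Python. This is only an illustration of the arithmetic, not Polars code; `rle_with_index` is a hypothetical helper name and is not part of any library.

```python
from itertools import groupby


def rle_with_index(values):
    """Return (start_index, length, value) for each run of equal adjacent items.

    Hypothetical helper for illustration only; not a Polars API.
    """
    runs = []
    start = 0
    for value, group in groupby(values):
        length = sum(1 for _ in group)  # number of items in this run
        runs.append((start, length, value))
        start += length  # the next run begins where this one ends
    return runs


runs = rle_with_index(["a", "a", "a", "b", "c", "c"])
print(runs)  # [(0, 3, 'a'), (3, 1, 'b'), (4, 2, 'c')]
```

The start indices 0, 3, 4 match the `index` column in the output above.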
Agreed on the rename. I don't think the index should be part of the RLE method by default. It is not an essential part of the RLE definition. Though possibly an
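To illustrate why the index is not essential to the encoding: the original sequence can be rebuilt from the (length, value) pairs alone. A minimal plain-Python sketch, where `rle_decode` is a hypothetical helper and not a Polars function:

```python
def rle_decode(runs):
    """Rebuild the original sequence from (length, value) pairs.

    Hypothetical helper for illustration only; no start index is needed.
    """
    out = []
    for length, value in runs:
        out.extend([value] * length)  # repeat each value `length` times
    return out


decoded = rle_decode([(3, "a"), (1, "b"), (2, "c")])
print(decoded)  # ['a', 'a', 'a', 'b', 'c', 'c']
```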
@cmdlineluser I am tempted to also change the field order of the struct to
EDIT: Nevermind, it's probably not a good idea as the standard RLE places
Description
Remove the plural from Series.rle() and Expr.rle() field names. (Similar to what was done for value_counts: #11462)
Current:
Desired:
(Choosing len to match up with list.len())