feat: Support per-column nulls_last on sort operations #16639

Merged (1 commit) on Jun 1, 2024

Conversation

alexander-beedie
Collaborator

@alexander-beedie alexander-beedie commented May 31, 2024

Fun one... I started looking at implementing NULLS FIRST and NULLS LAST for the SQL interface, and needed to extend the core sort "nulls_last" parameter so that (like "descending") it can take per-column values.

Examples

import polars as pl

df = pl.DataFrame({"x": [None, 1, None, 3], "y": [3, 2, None, 1]})
# shape: (4, 2)
# ┌──────┬──────┐
# │ x    ┆ y    │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ null ┆ 3    │
# │ 1    ┆ 2    │
# │ null ┆ null │
# │ 3    ┆ 1    │
# └──────┴──────┘

df.sort("x", "y", nulls_last=True)
# shape: (4, 2)
# ┌──────┬──────┐
# │ x    ┆ y    │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ 1    ┆ 2    │
# │ 3    ┆ 1    │
# │ null ┆ 3    │
# │ null ┆ null │
# └──────┴──────┘

df.sort("x", "y", nulls_last=False)
# shape: (4, 2)
# ┌──────┬──────┐
# │ x    ┆ y    │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ null ┆ null │
# │ null ┆ 3    │
# │ 1    ┆ 2    │
# │ 3    ┆ 1    │
# └──────┴──────┘

df.sort("x", "y", nulls_last=[False, True])
# shape: (4, 2)
# ┌──────┬──────┐
# │ x    ┆ y    │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ null ┆ 3    │
# │ null ┆ null │
# │ 1    ┆ 2    │
# │ 3    ┆ 1    │
# └──────┴──────┘

df.sort("x", "y", nulls_last=[True, False])
# shape: (4, 2)
# ┌──────┬──────┐
# │ x    ┆ y    │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ 1    ┆ 2    │
# │ 3    ┆ 1    │
# │ null ┆ null │
# │ null ┆ 3    │
# └──────┴──────┘
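
To make the per-column semantics in the examples above concrete, here is a small pure-Python sketch that reproduces the same row orders without Polars. It is only an illustration of the ordering rules, not how Polars implements the sort: `sort_rows` is a hypothetical helper, rows are plain tuples, every column is sorted ascending, and rejecting a flag list of the wrong length is an assumption of this sketch.

```python
from typing import Any, List, Sequence, Tuple, Union


def sort_rows(
    rows: Sequence[Tuple[Any, ...]],
    nulls_last: Union[bool, Sequence[bool]],
) -> List[Tuple[Any, ...]]:
    """Sort rows by all columns, left to right, ascending.

    Hypothetical helper for illustration only; `nulls_last` is either one
    bool applied to every column or one bool per column.
    """
    n_cols = len(rows[0]) if rows else 0

    # Broadcast a single bool to every column; otherwise use the flags as given.
    if isinstance(nulls_last, bool):
        flags = [nulls_last] * n_cols
    else:
        flags = list(nulls_last)
    if len(flags) != n_cols:
        # Assumption of this sketch: a mismatched flag list is an error.
        raise ValueError(f"expected {n_cols} nulls_last flags, got {len(flags)}")

    def key(row: Tuple[Any, ...]) -> Tuple[Tuple[int, Any], ...]:
        parts = []
        for value, last in zip(row, flags):
            is_null = value is None
            # The rank decides which group comes first in this column:
            # nulls rank after values when `last` is True, before them otherwise.
            rank = int(is_null) if last else int(not is_null)
            # Nulls get a placeholder so None is never compared with a number.
            parts.append((rank, 0 if is_null else value))
        return tuple(parts)

    return sorted(rows, key=key)


# Same data as the DataFrame above, one tuple per row: (x, y).
rows = [(None, 3), (1, 2), (None, None), (3, 1)]

print(sort_rows(rows, nulls_last=[False, True]))
# [(None, 3), (None, None), (1, 2), (3, 1)]
print(sort_rows(rows, nulls_last=[True, False]))
# [(1, 2), (3, 1), (None, None), (None, 3)]
```

Each column contributes a `(rank, value)` pair to the sort key, so null placement is decided independently per column while ties in "x" still fall through to "y".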

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels May 31, 2024
@alexander-beedie alexander-beedie marked this pull request as draft May 31, 2024 21:48
@alexander-beedie alexander-beedie force-pushed the per-column-nulls-last branch from b0893d0 to 3ff01ae Compare May 31, 2024 22:01
@alexander-beedie alexander-beedie marked this pull request as ready for review May 31, 2024 22:03
@alexander-beedie alexander-beedie marked this pull request as draft May 31, 2024 22:20
@alexander-beedie alexander-beedie force-pushed the per-column-nulls-last branch from 3ff01ae to 3fadc78 Compare June 1, 2024 06:48
@alexander-beedie alexander-beedie marked this pull request as ready for review June 1, 2024 07:05

codecov bot commented Jun 1, 2024

Codecov Report

Attention: Patch coverage is 90.90909%, with 12 lines in your changes missing coverage. Please review.

Project coverage is 81.49%. Comparing base (5974ac7) to head (3fadc78).
Report is 1 commit behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| crates/polars-ops/src/series/ops/various.rs | 0.00% | 6 Missing ⚠️ |
| ...es/polars-plan/src/logical_plan/alp/tree_format.rs | 0.00% | 3 Missing ⚠️ |
| crates/polars-expr/src/expressions/sortby.rs | 93.75% | 1 Missing ⚠️ |
| py-polars/src/lazyframe/visitor/expr_nodes.rs | 0.00% | 1 Missing ⚠️ |
| py-polars/src/lazyframe/visitor/nodes.rs | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16639      +/-   ##
==========================================
- Coverage   81.51%   81.49%   -0.02%     
==========================================
  Files        1414     1414              
  Lines      185995   186398     +403     
  Branches     3026     3014      -12     
==========================================
+ Hits       151608   151904     +296     
- Misses      33856    33965     +109     
+ Partials      531      529       -2     


Member

@ritchie46 ritchie46 left a comment

Really nice @alexander-beedie. I wanted this for a long time, but didn't get to it.

@ritchie46 ritchie46 merged commit 8710274 into pola-rs:main Jun 1, 2024
30 checks passed
@alexander-beedie alexander-beedie deleted the per-column-nulls-last branch June 1, 2024 07:52