refactor: Get Column into polars-expr #19660

Merged

@coastalwhite (Collaborator) commented Nov 6, 2024

This starts moving Column into polars-expr. I tried to keep this PR minimal and will do the follow-up work in other PRs.

For example, what this PR allows:

import polars as pl
print(
    pl.select(pl.repeat(pl.lit(1), 10))
        ._to_metadata(stats='repr')
)

shape: (1, 2)
┌─────────────┬────────┐
│ column_name ┆ repr   │
│ ---         ┆ ---    │
│ str         ┆ str    │
╞═════════════╪════════╡
│ repeat      ┆ scalar │
└─────────────┴────────┘

This can now finally produce a Column::Scalar instead of having to materialize to a Column::Series.
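
To illustrate the difference, here is a minimal Rust sketch of a column that is either fully materialized or a scalar that logically repeats. It is a simplified toy model, not the actual polars-core definition: the `Column` enum below, its `i64` payload, the `"series"` label, and the helpers `repr`, `materialize`, and `repeat` are all assumptions made for illustration.

```rust
// Toy model only; the real `Column` type in polars-core is more involved.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    /// Fully materialized values, one per row.
    Series(Vec<i64>),
    /// A single value that logically repeats `len` times.
    Scalar { value: i64, len: usize },
}

impl Column {
    /// Logical number of rows, regardless of representation.
    fn len(&self) -> usize {
        match self {
            Column::Series(values) => values.len(),
            Column::Scalar { len, .. } => *len,
        }
    }

    /// Short name of the representation, loosely mirroring the `repr`
    /// column in the output above ("series" is an assumed label).
    fn repr(&self) -> &'static str {
        match self {
            Column::Series(_) => "series",
            Column::Scalar { .. } => "scalar",
        }
    }

    /// Expand to one value per row; only needed when a consumer requires it.
    fn materialize(&self) -> Vec<i64> {
        match self {
            Column::Series(values) => values.clone(),
            Column::Scalar { value, len } => vec![*value; *len],
        }
    }
}

/// Toy counterpart of `pl.repeat(pl.lit(1), 10)`: no per-row buffer is allocated.
fn repeat(value: i64, len: usize) -> Column {
    Column::Scalar { value, len }
}
```

In this model, `repeat(1, 10)` reports a length of 10 while storing a single value, and `materialize` is called only when a consumer really needs one value per row.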

@github-actions github-actions bot added the internal (An internal refactor or improvement), python (Related to Python Polars), and rust (Related to Rust Polars) labels Nov 6, 2024
let length = output_length(l, r)?;
match (l, r) {
(Column::Series(l), Column::Scalar(r)) => {
let r = r.as_single_value_series();
Collaborator Author

This was not functioning properly for series with length=0, so I replaced it with Column::try_apply_broadcasting_binary_elementwise.
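
As a rough illustration of why the empty case needs care, the following Rust sketch applies a binary operation elementwise with broadcasting over plain slices. It is not the implementation of `Column::try_apply_broadcasting_binary_elementwise`; the broadcasting rule (lengths match, or one side has length 1), the `output_length` body, and the `broadcast_binary_elementwise` helper are assumptions for this example.

```rust
/// Output length under an assumed rule: lengths match, or one side has length 1.
fn output_length(l: usize, r: usize) -> Result<usize, String> {
    match (l, r) {
        (l, r) if l == r => Ok(l),
        (1, r) => Ok(r),
        (l, 1) => Ok(l),
        (l, r) => Err(format!("cannot broadcast lengths {l} and {r}")),
    }
}

/// Hypothetical helper: apply `op` elementwise, broadcasting a length-1 side.
fn broadcast_binary_elementwise(
    lhs: &[i64],
    rhs: &[i64],
    op: impl Fn(i64, i64) -> i64,
) -> Result<Vec<i64>, String> {
    let len = output_length(lhs.len(), rhs.len())?;
    // A length-1 side always reads index 0. When `len` is 0 the loop body
    // never runs, so an empty side is never indexed.
    let pick = |side: &[i64], i: usize| if side.len() == 1 { side[0] } else { side[i] };
    Ok((0..len).map(|i| op(pick(lhs, i), pick(rhs, i))).collect())
}
```

Under this rule, combining an empty series with a length-1 scalar yields an empty result instead of a one-row result, which is the kind of edge case a hand-written `match` over the variants can easily get wrong.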

self.as_materialized_series()
.bitand(rhs.as_materialized_series())
.map(Column::from)
}
pub fn bitor(&self, rhs: &Self) -> PolarsResult<Self> {
Collaborator Author

I don't really understand why we have these and std::ops::BitAnd, etc. on Series.

Member

Me neither. Maybe we can try to remove it in a follow-up.
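
For readers unfamiliar with the duplication being discussed, this Rust sketch shows a type that exposes the same operation both as an inherent `bitand` method and through `std::ops::BitAnd`, with the operator delegating to the method so there is only one implementation. The `BoolSeries` type and its error handling are hypothetical and do not reflect how polars `Series` is actually implemented.

```rust
use std::ops::BitAnd;

/// Hypothetical stand-in for a boolean series.
#[derive(Debug, Clone, PartialEq)]
struct BoolSeries(Vec<bool>);

impl BoolSeries {
    /// Inherent, fallible method: `a.bitand(&b)`.
    fn bitand(&self, rhs: &Self) -> Result<Self, String> {
        if self.0.len() != rhs.0.len() {
            return Err(format!(
                "length mismatch: {} vs {}",
                self.0.len(),
                rhs.0.len()
            ));
        }
        Ok(BoolSeries(
            self.0.iter().zip(&rhs.0).map(|(a, b)| *a & *b).collect(),
        ))
    }
}

/// Operator form: `&a & &b`.
impl BitAnd for &BoolSeries {
    type Output = Result<BoolSeries, String>;

    fn bitand(self, rhs: Self) -> Self::Output {
        // `BoolSeries::bitand` resolves to the inherent method, because the
        // trait is implemented for `&BoolSeries`, not for `BoolSeries`.
        BoolSeries::bitand(self, rhs)
    }
}
```

Both `a.bitand(&b)` and `&a & &b` then produce the same result, which is why keeping both entry points is mostly a question of API surface.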

@@ -15,7 +15,7 @@ impl Series {
}

#[doc(hidden)]
- pub fn agg_valid_count(&self, groups: &GroupsProxy) -> Series {
+ pub unsafe fn agg_valid_count(&self, groups: &GroupsProxy) -> Series {
Collaborator Author

This was previously incorrectly marked as safe.
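
The general reason such a function should be `unsafe` can be sketched as follows: if group indices are read without bounds checks, the caller has to guarantee that they are valid. This Rust example is a toy stand-in, not the polars implementation; `valid_count_per_group`, the `Option<i64>` values, and the `Vec<usize>` group layout are assumptions.

```rust
/// Count the non-null values in each group.
///
/// # Safety
/// Every index in `groups` must be in bounds for `values`.
unsafe fn valid_count_per_group(values: &[Option<i64>], groups: &[Vec<usize>]) -> Vec<usize> {
    groups
        .iter()
        .map(|group| {
            group
                .iter()
                // SAFETY: the caller guarantees that `i` is in bounds.
                .filter(|&&i| unsafe { values.get_unchecked(i) }.is_some())
                .count()
        })
        .collect()
}
```

A caller has to write `unsafe { valid_count_per_group(&values, &groups) }`, which makes the in-bounds requirement visible at the call site instead of hiding it behind a safe signature.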

@@ -139,8 +139,8 @@ impl Executor for JoinExec {

let df = df_left._join_impl(
&df_right,
- left_on_series,
- right_on_series,
+ left_on_series.into_iter().map(|c| c.take_materialized_series()).collect(),
Collaborator Author

I don't really think this is a very large optimization problem, but hopefully this can be resolved when Column is fully in polars-expr.

codecov bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 74.31907% with 132 lines in your changes missing coverage. Please review.

Project coverage is 79.74%. Comparing base (6b0a906) to head (f4c223d).
Report is 30 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/polars-core/src/frame/column/mod.rs | 61.57% | 78 Missing ⚠️ |
| crates/polars-expr/src/expressions/aggregation.rs | 83.52% | 14 Missing ⚠️ |
| crates/polars-core/src/frame/column/partitioned.rs | 0.00% | 12 Missing ⚠️ |
| crates/polars-expr/src/expressions/apply.rs | 28.57% | 5 Missing ⚠️ |
| crates/polars-expr/src/expressions/literal.rs | 85.29% | 5 Missing ⚠️ |
| crates/polars-stream/src/nodes/group_by.rs | 0.00% | 4 Missing ⚠️ |
| crates/polars-expr/src/expressions/cast.rs | 82.35% | 3 Missing ⚠️ |
| crates/polars-expr/src/expressions/column.rs | 72.72% | 3 Missing ⚠️ |
| crates/polars-expr/src/expressions/binary.rs | 95.91% | 2 Missing ⚠️ |
| crates/polars-expr/src/expressions/ternary.rs | 50.00% | 2 Missing ⚠️ |
... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19660      +/-   ##
==========================================
- Coverage   79.86%   79.74%   -0.12%     
==========================================
  Files        1537     1541       +4     
  Lines      211923   212214     +291     
  Branches     2446     2446              
==========================================
- Hits       169249   169229      -20     
- Misses      42120    42431     +311     
  Partials      554      554              


@ritchie46 ritchie46 merged commit d34a3e1 into pola-rs:main Nov 6, 2024
22 checks passed
@ritchie46
Member

Nice!

@coastalwhite coastalwhite deleted the refactor/polars-expr-to-column branch November 6, 2024 21:49
tylerriccio33 pushed a commit to tylerriccio33/polars that referenced this pull request Nov 8, 2024
@c-peters c-peters added the accepted (Ready for implementation) label Nov 11, 2024
Labels: accepted (Ready for implementation), internal (An internal refactor or improvement), python (Related to Python Polars), rust (Related to Rust Polars)