-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(rust): Expose some more information in translated expression IR to python #17209
Conversation
We needed to make a change to the arguments since Quantile has two expressions as arguments, unlike all others. To solve this, change the definition of the exposed Agg struct to take a Vec of nodes.
return Err(PyNotImplementedError::new_err( | ||
"scan with hive partitioning", | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previously we were silently discarding this information.
let options = serde_json::to_string(options) | ||
.map_err(|err| PyValueError::new_err(format!("{err:?}")))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only enabled when the json
feature is active (which it is in release builds I believe). I can gate and return Err
in the case that it is not active (only the case for dev builds when trying to speed up compile cycles?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is always enabled for release builds. I think this is fine. 👍
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #17209 +/- ##
==========================================
- Coverage 80.84% 80.80% -0.04%
==========================================
Files 1465 1466 +1
Lines 192042 192263 +221
Branches 2745 2745
==========================================
+ Hits 155252 155359 +107
- Misses 36285 36401 +116
+ Partials 505 503 -2 ☔ View full report in Codecov by Sentry. |
f203b74
to
8a5098c
Compare
Necessary particularly for CSV where the options select the delimiter and so forth. Rather than hand-coding the options serialization, just use the serde representation for this case.
We were previously silently ignoring this, which is the wrong thing.
8a5098c
to
493e778
Compare
let options = serde_json::to_string(options) | ||
.map_err(|err| PyValueError::new_err(format!("{err:?}")))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is always enabled for release builds. I think this is fine. 👍
Thanks. |
Explicitly handle the options and hive partitioning in scans. I decided to use the serde-based serialization for the type-specific options in
Scan
rather than laboriously writing a translation for the options structs.Question: does this need to be gated behind
#[cfg(all(feature = "json", feature = "serde_json"))]
? I think it's reasonable to require those features for the IR translation to function, but if we don't think so, I do stuff by hand.Additionally, build out a bit more of the expression translation. The major one is that
IRAggExprs
now hold aVec<Node>
rather than a singleNode
as arguments, sinceQuantile
takes two expressions.