Hive Partition Schema #14838
Comments
+1. I had a similar issue open before: #12894. @ritchie46, is this something the core team has plans around, by chance? If not, I'm willing to take a stab at it given some guidance on the desired design.
@baycoder0 c-peters is part of the core team. I think you'd want to get started here: polars/crates/polars-plan/src/logical_plan/hive.rs, lines 71 to 83 in e1a4179.
The CSV datetime parser seems to be over here
Additionally, this is highly related to #13892.
@deanm0000 I added the initial PR here: #14950. I'd like to get the initial checks. Also, I did exhaustive testing myself and would like to add unit tests. However, tests in
What we require first is schema inference on hive partitions. Otherwise, some partition values may be read as strings, or dates may appear in different formats. Something needs to be in place to infer the schema and to communicate the result between the partitions first.
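To make the requirement concrete, here is a minimal pure-Python sketch of inferring one schema across all partition paths. It is not how Polars implements this; the helper names (`parse_hive_path`, `infer_value_type`, `infer_hive_schema`) and the fallback rule (any disagreement between partitions falls back to string) are assumptions made for illustration.

```python
from datetime import date, datetime
from urllib.parse import unquote


def parse_hive_path(path):
    """Return {key: raw_string} for each `key=value` directory segment."""
    parts = {}
    # Skip the last segment, which is the file name rather than a partition.
    for segment in path.split("/")[:-1]:
        if "=" in segment:
            key, _, value = segment.partition("=")
            parts[key] = unquote(value)
    return parts


def infer_value_type(value):
    """Guess a type for one raw partition value, trying narrow types first."""
    for candidate, parser in (
        (int, int),
        (date, date.fromisoformat),
        (datetime, datetime.fromisoformat),
    ):
        try:
            parser(value)
            return candidate
        except ValueError:
            continue
    return str


def infer_hive_schema(paths):
    """Infer one type per partition key by looking at every path.

    Assumption: if partitions disagree on a key's type, fall back to str.
    """
    seen = {}
    for path in paths:
        for key, value in parse_hive_path(path).items():
            seen.setdefault(key, set()).add(infer_value_type(value))
    return {
        key: next(iter(types)) if len(types) == 1 else str
        for key, types in seen.items()
    }


paths = [
    "table/date=2024-01-01/id=1/a.parquet",
    "table/date=2024-01-02/id=x/b.parquet",
]
schema = infer_hive_schema(paths)
```

The point of the sketch is that the type of a key can only be decided after seeing every partition: `id=1` alone looks like an integer, but `id=x` in a sibling directory forces the whole column to string.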
Are you still considering a parameter to pass a hive partitions schema, similar to how you can pass There are some cases where it would be nice to override the regexp-based schema inference. For instance:
Having schema inference as a default is convenient, but an option to override would be nice.
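As a rough illustration of what such an override could look like, the sketch below layers a user-supplied schema on top of an inferred one. It is pure Python, not a Polars API: `CASTERS`, `apply_schema_override`, and `cast_partition_values` are hypothetical names, and the zip-code case is an example chosen here, not the one from the comment above.

```python
from datetime import date

# Hypothetical casters keyed by target type; not part of any Polars API.
CASTERS = {int: int, str: str, date: date.fromisoformat}


def apply_schema_override(inferred, override=None):
    """Return the inferred schema with user-supplied types taking precedence."""
    override = override or {}
    unknown = set(override) - set(inferred)
    if unknown:
        raise KeyError(f"override names unknown partition keys: {sorted(unknown)}")
    return {**inferred, **override}


def cast_partition_values(raw, schema):
    """Cast raw partition strings to the types named in the schema."""
    return {key: CASTERS[schema[key]](value) for key, value in raw.items()}


# Inference would see "01234" as an integer and drop the leading zero.
inferred = {"zip": int, "day": date}
schema = apply_schema_override(inferred, {"zip": str})
row = cast_partition_values({"zip": "01234", "day": "2024-01-01"}, schema)
```

With the override, `row["zip"]` stays the string `"01234"`; with the inferred schema alone it would be cast to the integer `1234`. Keys not named in the override keep their inferred type, so inference remains the default.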
Datetime support has been added by #17256.
Description
When using hive partitioning, it would be great if Polars could support other data types for partition columns (e.g. dates, datetimes), similar to other frameworks (e.g. DuckDB, BigQuery).