Releases · pola-rs/polars

24 Dec 08:51

github-actions

py-1.18.0

93ceacc

Python Polars 1.18.0 Latest


🏆 Highlights

Add new Int128Type (#20232)

🚀 Performance improvements

Order observability optimizations (#20396)
Purge ChunkedArray Metadata (#20371)
Explicit transpose in new-streaming equi-join finalize (#20363)
Cache dtype on ExprIR (#20331)
Lower overhead for BytecodeParser on introspection of incompatible UDFs (#20280)

✨ Enhancements

Always resolve dynamic types in schema (#20406)
Support loading data from multiple Excel/ODS workbooks (#20404)
Add "drop_empty_cols" parameter for read_excel and read_ods (#20430)
Order observability optimizations (#20396)
Add FirstArgLossless supertype (#20394)
Add dt.replace (#19708)
Polars build for Pyodide (#20383)
Add Azure credential provider using DefaultAzureCredential() (#20384)
Add env var to ignore file cache allocate error (#20356)
Enable joins between compatible differing numeric key columns (#20332)
Cache dtype on ExprIR (#20331)
Serialize DataFrame/Series using IPC in serde (#20266)
Improve error message on SchemaError (#20326)
Use better error messages when opening files (#20307)
Add 'skip_lines' for CSV (#20301)
Allow subtraction of time dtype columns (#20300)
Add bin.reinterpret (#20263)
Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
Streamline creation of empty frame from Schema (#20267)
Add cat.len_chars and cat.len_bytes (#20211)
Expose AexprArena (#20230)

🐞 Bug fixes

Fix nullable object in map_elements (#20422)
Properly handle to_physical_repr of nested types (#20413)
Properly raise UDF errors (#20417)
Workaround for mmap crash under Emscripten (#20418)
Fix using new_columns in scan_csv with compressed file (#20412)
Fix return type of Series.dt.add_business_days (#20402)
Fix decimal series dispatch (#20400)
Fix decimal arithmetic schema (#20398)
Raise on categorical search_sorted (#20395)
Fix plotting f-strings and docstrings (#20399)
Don't try to load non-existent List/FSL statistics (#20388)
Propagate nulls for float methods on all numeric types (#20386)
Add env var to ignore file cache allocate error (#20356)
Flip order on right join (#20358)
Correctly parse special float values in from_repr (#20351)
Fix incorrect object store caching for ADLS URI (#20357)
Use the same encoding for nullable as non-nullable arrays (#20323)
Improve error message on SchemaError (#20326)
Boolean optional slice pushdown (#20315)
Properly handle from_physical for List/Array (#20311)
Ignore quotes in csv comments (#20306)
Ensure pl.datetime returns empty column when input columns are empty (#20278)
Ensure output height does not change on lazy projection pushdown with aggregations (#20223)
Fix error writing on Windows to locations outside of C drive (#20245)
Incorrect comparison in some cases with filtered list/array columns (#20243)
Ensure height is maintained in SQL SELECT 1 FROM (#20241)
Properly account for updated Categorical in .unique() kernel (#20235)

📖 Documentation

Improve docstring clarity (#20416)
Update GPU engine installation instructions to remove --extra-index-url from CUDA 12 packages (#20381)
Remove Plugins overview page without information (#20348)
Small fixes/clarifications in user guide (#20335)
Improve docs about NaN (#20310)
Fix substr function param definition (#19054)
Include parquet options in BigQuery I/O write sample (#20292)
Fix typo in fork warning (#20258)

📦 Build system

Add project.dynamic = ["version"] to pyproject.toml (#20345)
Update pyo3 and numpy crates to version 0.23 (#20111)
Build wheels for ARM Windows in Python release workflow (#20247)

🛠️ Other improvements

Enable masked out list, struct and array elements in parametric tests (#20365)
Move hive partitioning/multi-file handling outside of readers (#20203)
Purge ChunkedArray Metadata (#20371)
Correcting misspelled return value and unifying regional spelling (#20375)
Add test for select(len()) (#20343)
Make parametric tests include pl.List and pl.Array by default (#20319)
Use Column in Row Encoding (#20312)
Don't warn on fork hook (#20309)
Don't deconstruct CsvParseOptions (#20302)
Allow decoding of non-Polars arrow dictionaries in Arrow and Parquet (#20248)
Prepare test suite for Python 3.13 support (#20297)
Add FunctionCastOptions and conservative IR-level cast type-checking (#20286)
Add more descriptive error message for failure of vstack/extend (#20299)
Clean up some remnants of Python 3.8 support (#20293)
Add new Int128Type (#20232)
Add test for BytesIO overwritten after scan (#20240)
Expose AexprArena (#20230)

Thank you to all our contributors for making this release possible!
@Jesse-Bakker, @Terrigible, @ZemanOndrej, @alexander-beedie, @balbok0, @beckernick, @bschoenmaeckers, @coastalwhite, @georgestagg, @hamdanal, @haocheng6, @kszlim, @lukemanley, @mcrumiller, @nameexhaustion, @noexecstack, @orlp, @ptiza, @r-brink, @ritchie46, @rodrigogiraoserrao, @stijnherfst, @stinodego, @tswast and @zero-stroke


09 Dec 13:56

github-actions

py-1.17.1

87feed7

Python Polars 1.17.1

🐞 Bug fixes

Fix incorrect lazy select(len()) with some select orderings (#20222)
Fix assertion panic on LazyFrame scratch.is_empty() (#20219)

Thank you to all our contributors for making this release possible!
@nameexhaustion and @ritchie46


08 Dec 11:16

github-actions

rs-0.45.0.1

58a38af

Rust Polars 0.45.0

💥 Breaking changes

Remove dedicated sink_(parquet/ipc)_cloud functions (#20164)
Experimental cloud write support (#20129)

🚀 Performance improvements

Add fast paths for series.arg_sort and dataframe.sort (#19872)
Utilize the RangedUniqueKernel for Enum/Categorical (#20150)
Reduce memory copy when scanning from Python objects (#20142)
Don't instantiate validity mask when unneeded in Parquet (#20149)
Expand more filters (#20022)
Cache the DataFrame schema in get_column_index (#20021)
Reduce the size of row encoding UTF-8 (#19911)
Memoize duplicates in rolling-gb-dyn (#19939)
More efficient row encoding for pl.List (#19907)
Halve the size of Booleans in row encoding (#19927)
Rolling 'iter_lookbehind' breeze through duplicates (#19922)
Initially trim leading and trailing filtered rows (#19850)
Increase default async thread count for low core count systems (#19829)
Move row group decode off async thread for local streaming parquet scan (#19828)
Support use of Duration in to_string, ergonomic/perf improvement, tz-aware Datetime bugfix (#19697)
Improve DataFrame.sort().limit/top_k performance (#19731)
Improve cloud scan performance (#19728)
Fix quadratic 'with_columns' behavior (#19701)
Improve hive partition pruning with datetime predicates from SQL (#19680)
Allow for arbitrary skips in Parquet Dictionary Decoding (#19649)
Reorder conditions in is_leap_year (#19602)
Rechunk in DataFrame.rows if needed (#19628)
Dispatch Parquet Primitive PLAIN decoding to faster kernels when possible (#19611)
Use faster iteration in 'starts_with'/'ends_with' (#19583)
Branchless Parquet Prefiltering (#19190)

✨ Enhancements

Retry with reloaded credentials on cloud error (#20185)
Support reading Enum dtype from csv (#20188)
Allow sorting of lists and arrays (#20169)
Add maintain_order parameter to joins (#20026)
Allow for to_datetime / strftime to automatically parse dates with single-digit hour/minute/second (#20144)
Experimental cloud write support (#20129)
Allow setting and reading custom schema-level IPC metadata (#20066)
Add optimized row encoding for Decimals (#20050)
Add drop_nans method to DataFrame and LazyFrame (#20029)
Catch use of 'polars' in to_string for non-Duration dtypes and raise an informative error (#19977)
Add AhoCorasick backed 'find_many' (#19952)
Speed up starts_with for small prefixes (#19904)
Auto-enable hive partitioning if hive_schema was given (#19902)
Add pl.concat_arr to concatenate columns into an Array column (#19881)
Support both "iso" and "iso:strict" format options for dt.to_string (#19840)
Add rounding for Decimal type (#19760)
Improved array arithmetic support (#19837)
Raise informative error on Unknown unnest (#19830)
Support use of Duration in to_string, ergonomic/perf improvement, tz-aware Datetime bugfix (#19697)
Allow specification of chunk_size on LazyCsvReader.read_options (#19819)
Add an is_literal method to expression meta namespace (#19773)
A different approach to warning users of fork() issues with Polars (#19197)
Add dylib (#19759)
Add IPC source node for new streaming engine (#19454)
Implement max/min methods for dtypes (#19494)
Improve hive partition pruning with datetime predicates from SQL (#19680)
Parallel IPC sink for the new streaming engine (#19622)
Add SQL support for RIGHT JOIN, fix an issue with wildcard aliasing (#19626)
Add show_graph to display a GraphViz plot for expressions (#19365)

🐞 Bug fixes

Don't trigger length check in array construction (#20205)
Allow row encoding for 32-bit architectures (e.g. WASM) (#20186)
Properly project unordered column in parquet prefiltered (#20189)
Stop the CSV SIMD cache if an EOL char is hit (#20199)
Estimated size for object (#20191)
Respect parallel argument in parquet (#20187)
Only validate UTF-8 for selected items when all below len 128 (#20183)
Serialize categories of Enum in arrow metadata (#20181)
Don't use RLE encoding for Parquet Boolean (#20172)
Invalid bitwise_xor for ScalarColumn (#20140)
Add temporal feature gate in is_elementwise_top_level (#20177)
Column name mismatch or not found in Parquet scan with filter (#20178)
Raise if apply returns different types (#20168)
Deal with masked out list elements (#20161)
Fix index out of bounds in uniform_hist_count (#20133)
Implement arg_sort for Null series (#20135)
Handle slice pushdown in PythonUDF GroupBy (#20132)
Check shape for *_horizontal functions (#20130)
Properly coerce types in lists (#20126)
Incorrect aggregation of empty groups after slice (#20127)
DataFrame .get_column after drop_in_place (#20120)
Subtraction with underflow on empty FixedSizeBinaryArray (#20109)
Materialize smallest dyn ints to use feature gate for i8/i16 (#20108)
Return null instead of 0. for rolling_std when window contains a single element and ddof=1 and there are nulls elsewhere in the Series (#20077)
Only slice after sort when slice is smaller than frame length (#20084)
Preserve Series name in __rpow__ operation (#20072)
Allow nested is_in() in when()/then() for full-streaming (#20052)
Fix datetime cast behavior for pre-epoch times (#19949)
Improve hist binning around breakpoints (#20054)
Fix invalid len due to projection pushdown selection of scalar (#20049)
Fix empty scalar agg type (#20051)
Improve binning in Series.hist with bin_count when all values are the same (#20034)
Less intrusive forking warnings (#20032)
Reading nullable sliced / masked Categoricals from Parquet (#20024)
Regression in hist panicking on out of bounds index (#20016)
Fix starts_with out of bounds (#20006)
Fix incorrect column order for parquet scan with hive columns in file (#19996)
Incorrectly gave list.len() for masked-out rows (#19999)
Bug fix in existing fast path for sorted series (#20004)
Incorrect collect_schema() for fill_null() after an aggregation expression in group-by context (#19993)
Fix Decimal type fill_null (#19981)
Fix panic on schema merge for prefiltering (#19972)
Fix lazy frame join expression (#19974)
Fix gather_every for Scalar (#19964)
Toggle 'fast_unique' on new_from_index (#19956)
Raise proper error message when too small interval is passed to datetime_range (#19955)
Fix scalar object (#19940)
Raise InvalidOperationError for invalid float to decimal casts (e.g. Inf, NaN) (#19938)
Fix panic with combination of hive and parquet prefiltering (#19905)
Fix panic when joining with empty frame (debug only) (#19896)
Fix incorrect result from inequality filter after join on LazyFrame (#19898)
Misleading ShapeError error message on dataframe creation (#19901)
Fix panic with empty delta scan, or empty parquet scan with a provided schema (#19884)
Ensure type object of inputs for cached any-value conversion functions are kept alive (#19866)
Fix panic using scan_parquet().with_row_index() with hive partitioning enabled (#19865)
Improve histogram bin logic (#18761)
Raise informative error instead of panicking for list arithmetic on some invalid dtypes (#19841)
Properly handle Zero-Field Structs in row encoding (#19846)
Incorrect explode schema for LazyFrame.explode() (#19860)
Ensure List element truncation ellipses respect ASCII* table formats (#19835)
Validate subnodes in validate IR (#19831)
Raise if merge non-global categoricals in unpivot (#19826)
Type hints for window_size incorrectly included timedelta in some rolling functions (#19827)
Don't panic if column not found (#19824)
Fix gather of Scalar null + idx w/ validity (#19823)
Fix object chunked gather (#19811)
Fix inconsistency between code and comment (#19810)
Fix filter scalar nulls (#19786)
Altair tooltip was being incorrectly applied to plots which did not accept it (#19789)
Fix scanning google cloud with service account credentials file (#19782)
Fix incorrect filter after right-join on LazyFrame (#19775)
Fix incorrect lazy schema for explode on array columns (#19776)
Fix incorrect lazy schema for aggregations (#19753)
Fix validation for inner and left join when join_nulls unflagged (#19698)
SQL ELSE clause should be implicitly NULL when omitted (#19714)
In group_by_dynamic, period and every were getting applied in reverse order for the window upper boundary (#19706)
Only allow list.to_struct to be elementwise when width is fixed (#19688)
Make Array arithmetic ops fully elementwise (#19682)
Update line-splitting logic in batched CSV reader (#19508)
Fix incorrect lazy schema for explode() in agg() (#19629)
Fix filter incorrectly pushed past struct unnest when unnested column name matches upper column name (#19638)
Ensure mean_horizontal raises on non-numeric input (#19648)
Reorder conditions in is_leap_year (#19602)
Copy height in .vstack() for empty dataframes (#19641) (#19642)
Run join type coercion with correct schemas active (#19625)
Correct wildcard and input expansion for some more functions (#19588)
Allow .struct.with_fields inside list.eval (#19617)
Sortedness was incorrectly being preserved in dt.offset_by when offsetting by non-constant durations in the timezone-naive case (#19616)
Fix incorrect scan_parquet().with_row_index() with non-zero slice or with streaming collect (#19609)
Fix mask and validity confusion in Parquet String decoding (#19614)
Parquet decoding of nested dictionary values (#19605)
Do not attempt to load default credentials when credential_provider is given (#19589)
Fix gather len in group-by state (#19586)
Added input validation for explode operation in the array namespace (#19163)
Improve error message (#19546)
Fix predica...

Contributors

janpipek, orlp, and 40 other contributors


08 Dec 10:26

github-actions

py-1.17.0

5f6bc77

Python Polars 1.17.0

🚀 Performance improvements

Add fast paths for series.arg_sort and dataframe.sort (#19872)
Much faster Series construction from subclasses of standard Python types (#20166)
Utilize the RangedUniqueKernel for Enum/Categorical (#20150)
Reduce memory copy when scanning from Python objects (#20142)
Construct Series for bytes/binary data 10x faster when dtype not explicitly set (#20157)
Don't instantiate validity mask when unneeded in Parquet (#20149)

✨ Enhancements

Retry with reloaded credentials on cloud error (#20185)
Support reading Enum dtype from csv (#20188)
Improve dtype inference and load for DataFrame cols constructed from Python Enum values (#20180)
Allow sorting of lists and arrays (#20169)
Add maintain_order parameter to joins (#20026)
Allow for to_datetime / strftime to automatically parse dates with single-digit hour/minute/second (#20144)
Issue warning when using to_struct() without a list of field names (#20158)
Experimental cloud write support (#20129)
Add lazy support for pl.select (#20091)
Enable view arrow export in write_delta (#20092)

🐞 Bug fixes

Don't trigger length check in array construction (#20205)
Allow row encoding for 32-bit architectures (e.g. WASM) (#20186)
Properly project unordered column in parquet prefiltered (#20189)
Stop the CSV SIMD cache if an EOL char is hit (#20199)
Estimated size for object (#20191)
Respect parallel argument in parquet (#20187)
Only validate UTF-8 for selected items when all below len 128 (#20183)
Serialize categories of Enum in arrow metadata (#20181)
Don't use RLE encoding for Parquet Boolean (#20172)
Invalid bitwise_xor for ScalarColumn (#20140)
Series construct with large nested u64 (#20167)
Add temporal feature gate in is_elementwise_top_level (#20177)
Column name mismatch or not found in Parquet scan with filter (#20178)
Raise if apply returns different types (#20168)
Deal with masked out list elements (#20161)
Fix index out of bounds in uniform_hist_count (#20133)
Implement arg_sort for Null series (#20135)
Handle slice pushdown in PythonUDF GroupBy (#20132)
Check shape for *_horizontal functions (#20130)
Properly coerce types in lists (#20126)
Incorrect aggregation of empty groups after slice (#20127)
DataFrame .get_column after drop_in_place (#20120)
Subtraction with underflow on empty FixedSizeBinaryArray (#20109)
Materialize smallest dyn ints to use feature gate for i8/i16 (#20108)
Return null instead of 0. for rolling_std when window contains a single element and ddof=1 and there are nulls elsewhere in the Series (#20077)
Only slice after sort when slice is smaller than frame length (#20084)
Preserve Series name in __rpow__ operation (#20072)
Allow nested is_in() in when()/then() for full-streaming (#20052)

📖 Documentation

Add more Rust examples to User Guide (#20194)
Expand plotting docs (#19719)
Fix Rust examples in user guide (#20075)
Update by param description for rolling_*_by functions (#19715)
Correct supported compression formats (#20085)
Specify strictness in cast (#20067)

📦 Build system

Upgrade sqlparser-rs from version 0.49 to 0.52 (#20110)
Bump memmap2 to version 0.9 (#20105)
Bump object_store to version 0.11 (#20102)
Bump fs4 to version 0.12 (#20101)
Bump thiserror to version 2 (#20097)
Bump atoi_simd to version 0.16 (#20098)
Bump chrono-tz to 0.10 (#20094)
Update Rust dependency ndarray to 0.16 (#20093)
Bump Rust toolchain to nightly-2024-11-28 (#20064)

🛠️ Other improvements

Deprecate ddof parameter for correlation coefficient (#20197)
Move Bitwise aggregations to FunctionExpr (#20193)
Add ragged lines test (#20182)
Set delta version check higher (#20153)
Fix typo in assertion in datatype copy test (#20121)
Move horizontal methods to polars-ops (#20134)
Remove useless SeriesTrait::get implementations (#20136)
Add a bunch more automated row encoding sortedness tests (#20056)

Thank you to all our contributors for making this release possible!
@DzenanJupic, @MarcoGorelli, @YichiZhang0613, @alexander-beedie, @coastalwhite, @dependabot, @dependabot[bot], @flowlight0, @henryharbeck, @iharthi, @ion-elgreco, @jqnatividad, @lukapeschke, @lukemanley, @mcrumiller, @nameexhaustion, @ptiza, @ritchie46, @siddharth-vi, @stijnherfst, @stinodego and @wsyxbcl


29 Nov 11:22

github-actions

py-1.16.0

44ddbc2

Python Polars 1.16.0

🚀 Performance improvements

Expand more filters (#20022)
Cache the DataFrame schema in get_column_index (#20021)

✨ Enhancements

Enable creation of independently reusable Config instances (#20053)
Improved error message on invalid Python Enum init (#20060)
Improve Polars Enum dtype init from standard Python enums (#19997)
Add optimized row encoding for Decimals (#20050)
Add drop_nans method to DataFrame and LazyFrame (#20029)

🐞 Bug fixes

Improve hist binning around breakpoints (#20054)
Fix invalid len due to projection pushdown selection of scalar (#20049)
Fix empty scalar agg type (#20051)
Improve binning in Series.hist with bin_count when all values are the same (#20034)
Less intrusive forking warnings (#20032)
Reading nullable sliced / masked Categoricals from Parquet (#20024)
Regression in hist panicking on out of bounds index (#20016)
Fix starts_with out of bounds (#20006)
Fix incorrect column order for parquet scan with hive columns in file (#19996)
Incorrectly gave list.len() for masked-out rows (#19999)
Bug fix in existing fast path for sorted series (#20004)
Incorrect collect_schema() for fill_null() after an aggregation expression in group-by context (#19993)
Fix row_by_key typing (#19888)

📖 Documentation

Remove note about guaranteed left join order (#20048)
Fix broken links to user guide (#19989)

📦 Build system

Pin maturin (#20063)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @coastalwhite, @gab23r, @lukemanley, @mcrumiller, @nameexhaustion, @ritchie46, @siddharth-vi, @stijnherfst and @stinodego


25 Nov 21:55

github-actions

py-1.15.0

f0d087d

Python Polars 1.15.0

🚀 Performance improvements

Reduce the size of row encoding UTF-8 (#19911)
Memoize duplicates in rolling-gb-dyn (#19939)
More efficient row encoding for pl.List (#19907)
Halve the size of Booleans in row encoding (#19927)
Rolling 'iter_lookbehind' breeze through duplicates (#19922)
Initially trim leading and trailing filtered rows (#19850)

✨ Enhancements

Catch use of 'polars' in to_string for non-Duration dtypes and raise an informative error (#19977)
Add AhoCorasick backed 'find_many' (#19952)
Allow Python Enums as dtype inputs (#19926)
Speed up starts_with for small prefixes (#19904)
Auto-enable hive partitioning if hive_schema was given (#19902)
Add pl.concat_arr to concatenate columns into an Array column (#19881)
Support both "iso" and "iso:strict" format options for dt.to_string (#19840)
Add rounding for Decimal type (#19760)
Improved array arithmetic support (#19837)

🐞 Bug fixes

Fix Decimal type fill_null (#19981)
Fix panic on schema merge for prefiltering (#19972)
Fix lazy frame join expression (#19974)
Fix gather_every for Scalar (#19964)
Toggle 'fast_unique' on new_from_index (#19956)
Parse uppercase config keys (#19852)
Raise proper error message when too small interval is passed to datetime_range (#19955)
Fix scalar object (#19940)
Raise InvalidOperationError for invalid float to decimal casts (e.g. Inf, NaN) (#19938)
Address indexing edge-case with numpy arrays (#19895)
Fix panic with combination of hive and parquet prefiltering (#19905)
Fix panic when joining with empty frame (debug only) (#19896)
Fix incorrect result from inequality filter after join on LazyFrame (#19898)
Misleading ShapeError error message on dataframe creation (#19901)
Fix panic with empty delta scan, or empty parquet scan with a provided schema (#19884)
Ensure type object of inputs for cached any-value conversion functions are kept alive (#19866)
Improve export from 2D Array dtype columns to PyTorch Tensors (to_torch) and Jax Arrays (to_jax) (#19862)
Fix panic using scan_parquet().with_row_index() with hive partitioning enabled (#19865)
Improve histogram bin logic (#18761)
Raise informative error instead of panicking for list arithmetic on some invalid dtypes (#19841)
Properly handle Zero-Field Structs in row encoding (#19846)
Incorrect explode schema for LazyFrame.explode() (#19860)
DataFrame rows_by_key returning key tuples with elements in wrong order (#19486)
Ensure List element truncation ellipses respect ASCII* table formats (#19835)

📖 Documentation

Remove duplicate sentence in Series.bottom_k docstring (#19947)
Complete parameters description and add an example for clip() (#19875)
Fix some warnings during docs build (#19848)

📦 Build system

Use public windows runners in python release (#19982)
Add windows-aarch64 to python binaries (#19966)

🛠️ Other improvements

Minor non-breaking space (&nbsp;) tweak for HTML rendering (#19864)
Implement nested row encoding / decoding (#19874)
Switch back to PyO3 0.22 (#19851)
Adjust flaky with_columns test (#19844)
Add proper tests for row encoding (#19843)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @barak1412, @coastalwhite, @etiennebacher, @ion-elgreco, @itamarst, @lukemanley, @mcrumiller, @mhogervo, @nameexhaustion, @orlp, @ritchie46, @stijnherfst and @stinodego


17 Nov 18:50

github-actions

py-1.14.0

34ee4ee

Python Polars 1.14.0

🚀 Performance improvements

Increase default async thread count for low core count systems (#19829)
Move row group decode off async thread for local streaming parquet scan (#19828)
Support use of Duration in to_string, ergonomic/perf improvement, tz-aware Datetime bugfix (#19697)

✨ Enhancements

Raise informative error on Unknown unnest (#19830)
Support DataFrame init from raw SQLAlchemy rows (#19820)
Support use of Duration in to_string, ergonomic/perf improvement, tz-aware Datetime bugfix (#19697)
Add an is_literal method to expression meta namespace (#19773)
A different approach to warning users of fork() issues with Polars (#19197)

🐞 Bug fixes

Fix read_database(…,iter_batches=True) type annotations (#19832)
Validate subnodes in validate IR (#19831)
Raise if merge non-global categoricals in unpivot (#19826)
Type hints for window_size incorrectly included timedelta in some rolling functions (#19827)
Don't panic if column not found (#19824)
Fix gather of Scalar null + idx w/ validity (#19823)
Replace _kwargs in collect method (#19618)
Fix object chunked gather (#19811)
Fix filter scalar nulls (#19786)
Replace spaces with &nbsp; to support showing multiple spaces in HTML repr (#19783)
Altair tooltip was being incorrectly applied to plots which did not accept it (#19789)
Respect schema_overrides in batched csv reader (#19755)
Fix scanning google cloud with service account credentials file (#19782)
Release the GIL in Python APIs, part 2 of 2 (#19762)
Fix incorrect filter after right-join on LazyFrame (#19775)
Fix incorrect lazy schema for explode on array columns (#19776)
Fixed typo in file lazy.py (#19769)

📖 Documentation

Update bokeh to use cdn to avoid Bokeh Error (#19788)
Change dprint config (#19747)
Mention row_by_keys in the to_dict documentation (#19767)
Fix link to Graphviz download (#19791)

🛠️ Other improvements

Add ToField context for common args (#19833)
Use polars parquet reader for delta scan (#19103)
Migrate polars-expr AggregationContext to use Column (#19736)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @TNieuwdorp, @YichiZhang0613, @alexander-beedie, @braaannigan, @coastalwhite, @engylemure, @gab23r, @iliya-malecki, @ion-elgreco, @itamarst, @jackxxu, @nameexhaustion, @orlp, @ritchie46, @rodrigogiraoserrao and @sn0rkmaiden


13 Nov 21:02

github-actions

py-1.13.1

9f79100

Python Polars 1.13.1

✨ Enhancements

Add IPC source node for new streaming engine (#19454)

🐞 Bug fixes

Release GIL in Python APIs, part 1 (#19705)
Fix incorrect lazy schema for aggregations (#19753)
Address incorrect selector & col expansion (#19742)

📖 Documentation

Fix formatting of nested list (#19746)
Add meta.is_column to API docs (#19744)
Fix join API reference links (#19745)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @coastalwhite, @etiennebacher, @itamarst, @nameexhaustion, @orlp, @ritchie46 and @rodrigogiraoserrao


12 Nov 12:19

github-actions

py-1.13.0

7f0b3e0

Python Polars 1.13.0

🚀 Performance improvements

Improve DataFrame.sort().limit/top_k performance (#19731)
Improve cloud scan performance (#19728)
Fix quadratic 'with_columns' behavior (#19701)
Improve hive partition pruning with datetime predicates from SQL (#19680)
Allow for arbitrary skips in Parquet Dictionary Decoding (#19649)
Reorder conditions in is_leap_year (#19602)
Rechunk in DataFrame.rows if needed (#19628)
Dispatch Parquet Primitive PLAIN decoding to faster kernels when possible (#19611)
Use faster iteration in 'starts_with'/'ends_with' (#19583)
Branchless Parquet Prefiltering (#19190)
Reduce size of IdxVec from 24 -> 16 bytes (#19550)

✨ Enhancements

Try to support native SAP HANA driver via read_database (#19733)
Implement max/min methods for dtypes (#19494)
Improve n_chunks typing (#19727)
Improve hive partition pruning with datetime predicates from SQL (#19680)
Identify inefficient use of Python string removeprefix, removesuffix, and zfill in map_elements (#19672)
Automatically use boto3 / google-auth if installed when scanning cloud (#19677)
Identify inefficient use of Python string replace in map_elements (#19668)
Parallel IPC sink for the new streaming engine (#19622)
Add SQL support for RIGHT JOIN, fix an issue with wildcard aliasing (#19626)
Add show_graph to display a GraphViz plot for expressions (#19365)
Streamline use of predicates connected by & with IEJoin (join_where) (#19552)
Support use of is_between range predicate with IEJoin operations (join_where) (#19547)

🐞 Bug fixes

Use cls for to_python (#19726)
Fix validation for inner and left join when join_nulls unflagged (#19698)
SQL ELSE clause should be implicitly NULL when omitted (#19714)
Improve n_chunks typing (#19727)
Ensure NoDataError raised consistently between engines for Excel reads (#19712)
In group_by_dynamic, period and every were getting applied in reverse order for the window upper boundary (#19706)
Only allow list.to_struct to be elementwise when width is fixed (#19688)
Make Array arithmetic ops fully elementwise (#19682)
Address inconsistency with use of Python types in frame-level cast (#19657)
Update line-splitting logic in batched CSV reader (#19508)
Fix incorrect lazy schema for explode() in agg() (#19629)
Fix fill null types (#19656)
Fix filter incorrectly pushed past struct unnest when unnested column name matches upper column name (#19638)
Fix typing for SchemaDefinition (#19647)
Ensure mean_horizontal raises on non-numeric input (#19648)
Reorder conditions in is_leap_year (#19602)
Copy height in .vstack() for empty dataframes (#19641) (#19642)
Correct wildcard and input expansion for some more functions (#19588)
Allow .struct.with_fields inside list.eval (#19617)
Sortedness was incorrectly being preserved in dt.offset_by when offsetting by non-constant durations in the timezone-naive case (#19616)
Fix incorrect scan_parquet().with_row_index() with non-zero slice or with streaming collect (#19609)
Fix mask and validity confusion in Parquet String decoding (#19614)
Parquet decoding of nested dictionary values (#19605)
Do not attempt to load default credentials when credential_provider is given (#19589)
Fix gather len in group-by state (#19586)
Added input validation for explode operation in the array namespace (#19163)
Improve error message (#19546)
Fix predicate pushdown into inequality joins (#19582)
Correct categorical namespace error message (#19558)
Fix performance regression for sort/gather on list/array columns (#19564)
Ignore quoted newlines when skipping lines in CSV (#19543)
Incorrect gather for FixedSizeList with outer validity but no inner validities (#19489)
Make Duration parsing fallible and not panic (#19490)

📖 Documentation

Revise and rework user-guide/expressions (#19360)
Update Excel page of user guide to refer to fastexcel as the default engine (#19691)
Alter examples for round_sig_figs to make behaviour clearer (#19667)
Assorted fixes to Rust API docs (#19664)
Improve replace and replace_all docstring explanation of the "$" character with reference to capture groups (vs use as a literal) (#19529)
Add credential provider section and examples to user guide (#19487)
Fix various instances of repeated words in docs and comments (#19516)

📦 Build system

Bump Rust toolchain to nightly-2024-10-28 (#19492)

🛠️ Other improvements

Remove unused Excel code (#19710)
Use Column for the {try,}_apply_columns{_par,} functions on DataFrame (#19683)
Remove more @scalar-opt (#19666)
Move Series bitops to std::ops::Bit... (#19673)
Mark test_parquet.py test_dict_slices as slow (#19675)
Get Column into polars-expr (#19660)
Streamline internal SQL join condition processing (#19658)
Factor out logic for re-use by new streaming CSV source (#19637)
Configure grouped Dependabot updates (#19604)
Fix PyO3 error in CI (#19545)
Update nightly compiler version (#19590)
Added input validation for explode operation in the array namespace (#19163)
Fix lint (#19584)
Add a Column::Partitioned variant (#19557)
Move to fast-float2 (#19578)
Only run remote bench on rust changes (#19581)
Remove unsafe *_release functions (#19554)
Fix test_rolling_by_integer not using parameterized dtype (#19555)
Add mindebug-dev rust profile (#19524)
Add CI step to process benchmark results (#19530)
Add CI benchmark on merge (#19518)
Skip client check with env var (#19517)
Improve makefile build commands (#19498)

Thank you to all our contributors for making this release possible!
@3tilley, @HansBambel, @MarcoGorelli, @alexander-beedie, @barak1412, @braaannigan, @cmdlineluser, @coastalwhite, @corwinjoy, @dependabot, @dependabot[bot], @eitsupi, @janpipek, @jqnatividad, @letkemann, @max-muoto, @nameexhaustion, @orlp, @ritchie46, @rodrigogiraoserrao, @siddharth-vi, @stinodego and @wence-


01 Nov 09:07

github-actions

rs-0.44.2

2dce3d3

Rust Polars 0.44.2

🚀 Performance improvements

Reduce size of IdxVec from 24 -> 16 bytes (#19550)

✨ Enhancements

Streamline use of predicates connected by & with IEJoin (join_where) (#19552)
Support use of is_between range predicate with IEJoin operations (join_where) (#19547)

🐞 Bug fixes

Correct categorical namespace error message (#19558)
Fix performance regression for sort/gather on list/array columns (#19564)
Ignore quoted newlines when skipping lines in CSV (#19543)

🛠️ Other improvements

Remove ad-hoc buffer pool (#19553)
Remove SyncCounter (#19556)
Removed unnecessary flatten function (#19551)
Remove unsafe *_release functions (#19554)
Improve new-streaming groupby performance for high cardinality (#19537)
Add mindebug-dev rust profile (#19524)
Add CI step to process benchmark results (#19530)

Thank you to all our contributors for making this release possible!
@HansBambel, @alexander-beedie, @barak1412, @coastalwhite, @nameexhaustion, @orlp and @ritchie46

