GH-41748: [Python][Parquet] Update BYTE_STREAM_SPLIT description in write_table() docstring (#41759)

### Rationale for this change

In PR #40094 (issue GH-39978), we forgot to update the `write_table` docstring with an accurate description of the supported data types for BYTE_STREAM_SPLIT.

### Are these changes tested?

No (only a doc change).

### Are there any user-facing changes?

No.

* GitHub Issue: #41748

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
apache · May 22, 2024 · 065a6da
1 parent 37e5240
Showing 1 changed file with 3 additions and 2 deletions.
diff --git a/python/pyarrow/parquet/core.py b/python/pyarrow/parquet/core.py
@@ -797,8 +797,9 @@ def _sanitize_table(table, new_schema, flavor):
     Specify if the byte_stream_split encoding should be used in general or
     only for some columns. If both dictionary and byte_stream_stream are
     enabled, then dictionary is preferred.
-    The byte_stream_split encoding is valid only for floating-point data types
-    and should be combined with a compression codec.
+    The byte_stream_split encoding is valid for integer, floating-point
+    and fixed-size binary data types (including decimals); it should be
+    combined with a compression codec so as to achieve size reduction.
 column_encoding : string or dict, default None
     Specify the encoding scheme on a per column basis.
     Can only be used when ``use_dictionary`` is set to False, and
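
The updated docstring says the encoding should be combined with a compression codec to achieve size reduction. That follows from how the encoding works: it only regroups bytes and does not shrink the data by itself. Below is a minimal standard-library sketch of that idea, not pyarrow's implementation. The helper names `byte_stream_split` and `byte_stream_merge`, the sample values, and the use of `zlib` as a stand-in codec are all assumptions made for illustration.

```python
import struct
import zlib


def byte_stream_split(raw: bytes, width: int) -> bytes:
    """Regroup fixed-width values so byte k of every value is contiguous."""
    if len(raw) % width != 0:
        raise ValueError("buffer length must be a multiple of the value width")
    # Stream k holds byte k of each value; the streams are concatenated.
    return b"".join(raw[k::width] for k in range(width))


def byte_stream_merge(encoded: bytes, width: int) -> bytes:
    """Inverse of byte_stream_split() (hypothetical helper)."""
    if len(encoded) % width != 0:
        raise ValueError("buffer length must be a multiple of the value width")
    count = len(encoded) // width
    out = bytearray(len(encoded))
    for k in range(width):
        # Scatter stream k back to byte position k of each value.
        out[k::width] = encoded[k * count:(k + 1) * count]
    return bytes(out)


# Illustrative sample data (an assumption): slowly varying float32 values.
values = [20.0 + 0.001 * i for i in range(10_000)]
raw = struct.pack(f"<{len(values)}f", *values)
split = byte_stream_split(raw, width=4)

# The transform alone changes the byte order, not the size.
assert len(split) == len(raw)
assert byte_stream_merge(split, 4) == raw

# Compare how well a general-purpose codec does on each layout.
print("compressed, plain layout:", len(zlib.compress(raw)))
print("compressed, split layout:", len(zlib.compress(split)))
```

The same regrouping applies to any fixed-width value, which is consistent with the docstring now listing integer and fixed-size binary types alongside floating-point. In pyarrow itself the corresponding `write_table` options are `use_byte_stream_split` and `compression`.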