[SPARK-45781][BUILD] Upgrade Arrow to 14.0.0
### What changes were proposed in this pull request?
This PR upgrades Apache Arrow from 13.0.0 to 14.0.0.

### Why are the changes needed?
The Apache Arrow 14.0.0 release brings a number of enhancements and bug fixes.

In terms of bug fixes, the release addresses issues that caused failures in the integration jobs with Spark ([GH-36332](apache/arrow#36332)) and problems with importing empty data arrays ([GH-37056](apache/arrow#37056)). It also optimizes appending variable-length vectors ([GH-37829](apache/arrow#37829)) and includes the C++ libraries for macOS AArch64 in the Java JARs ([GH-38076](apache/arrow#38076)).

The new features and improvements focus on the handling and manipulation of data. They include the introduction of DefaultVectorComparators for large types ([GH-25659](apache/arrow#25659)), support for extended expressions in ScannerBuilder ([GH-34252](apache/arrow#34252)), and the exposure of the VectorAppender class ([GH-37246](apache/arrow#37246)).

The release also improves the development and testing process: the CI environment now uses JDK 21 ([GH-36994](apache/arrow#36994)). In addition, the release introduces vector validation consistent with C++, which keeps validation behavior aligned across language implementations ([GH-37702](apache/arrow#37702)).

Furthermore, the usability of VarChar writers and binary writers has been improved with the addition of extra input methods ([GH-37705](apache/arrow#37705)), and VarCharWriter now supports writing from `Text` and `String` ([GH-37706](apache/arrow#37706)). The release also adds typed getters for StructVector, which makes accessing data easier ([GH-37863](apache/arrow#37863)).

The full release notes are as follows:
- https://arrow.apache.org/release/14.0.0.html

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43650 from LuciferYang/arrow-14.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
LuciferYang authored and dongjoon-hyun committed Nov 4, 2023
1 parent 3363c2a commit 749c79e
Showing 2 changed files with 5 additions and 5 deletions.
8 changes: 4 additions & 4 deletions dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -16,10 +16,10 @@ antlr4-runtime/4.13.1//antlr4-runtime-4.13.1.jar
aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
arpack/3.0.3//arpack-3.0.3.jar
arpack_combined_all/0.1//arpack_combined_all-0.1.jar
-arrow-format/13.0.0//arrow-format-13.0.0.jar
-arrow-memory-core/13.0.0//arrow-memory-core-13.0.0.jar
-arrow-memory-netty/13.0.0//arrow-memory-netty-13.0.0.jar
-arrow-vector/13.0.0//arrow-vector-13.0.0.jar
+arrow-format/14.0.0//arrow-format-14.0.0.jar
+arrow-memory-core/14.0.0//arrow-memory-core-14.0.0.jar
+arrow-memory-netty/14.0.0//arrow-memory-netty-14.0.0.jar
+arrow-vector/14.0.0//arrow-vector-14.0.0.jar
audience-annotations/0.5.0//audience-annotations-0.5.0.jar
avro-ipc/1.11.3//avro-ipc-1.11.3.jar
avro-mapred/1.11.3//avro-mapred-1.11.3.jar
2 changes: 1 addition & 1 deletion pom.xml
@@ -228,7 +228,7 @@
If you are changing Arrow version specification, please check
./python/pyspark/sql/pandas/utils.py, and ./python/setup.py too.
-->
-<arrow.version>13.0.0</arrow.version>
+<arrow.version>14.0.0</arrow.version>
<ammonite.version>2.5.11</ammonite.version>

<!-- org.fusesource.leveldbjni will be used except on arm64 platform. -->
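
A version bump like this one has to touch every `arrow-*` entry in the dependency manifest as well as the `arrow.version` property. The following is a minimal sketch of how that consistency could be checked locally; it is not part of the Spark build. The function names (`parse_manifest`, `versions_for_prefix`, `stale_artifacts`) are hypothetical, and the `name/version/<field>/jar` layout is an assumption inferred from the manifest lines in the diff, where the third field is always empty.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the lines in the diff: name/version/<field>/jar.
# The third field is empty in every line shown, so it is treated as optional.
LINE_RE = re.compile(r"^([^/]+)/([^/]+)/([^/]*)/([^/]+\.jar)$")

# A few post-upgrade manifest lines copied from the diff.
SAMPLE = """\
arpack/3.0.3//arpack-3.0.3.jar
arrow-format/14.0.0//arrow-format-14.0.0.jar
arrow-memory-core/14.0.0//arrow-memory-core-14.0.0.jar
arrow-memory-netty/14.0.0//arrow-memory-netty-14.0.0.jar
arrow-vector/14.0.0//arrow-vector-14.0.0.jar
audience-annotations/0.5.0//audience-annotations-0.5.0.jar
"""


def parse_manifest(text):
    """Return (name, version, jar) tuples for every non-blank manifest line."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized manifest line: {line!r}")
        name, version, _field, jar = match.groups()
        entries.append((name, version, jar))
    return entries


def versions_for_prefix(entries, prefix):
    """Map each version to the sorted artifact names that start with prefix."""
    found = defaultdict(list)
    for name, version, _jar in entries:
        if name.startswith(prefix):
            found[version].append(name)
    return {version: sorted(names) for version, names in found.items()}


def stale_artifacts(entries, prefix, expected_version):
    """List artifacts with the prefix whose version differs from the expected one."""
    return sorted(
        name
        for name, version, _jar in entries
        if name.startswith(prefix) and version != expected_version
    )


if __name__ == "__main__":
    entries = parse_manifest(SAMPLE)
    print(versions_for_prefix(entries, "arrow-"))
    print(stale_artifacts(entries, "arrow-", "14.0.0"))
```

If `versions_for_prefix` returns more than one key for the `arrow-` prefix, or `stale_artifacts` returns a non-empty list, some entry was missed during the upgrade.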