From b9dfe6af6be7ccec2393a09555ab3159422a770e Mon Sep 17 00:00:00 2001
From: Dongjoon Hyun
Date: Wed, 9 Aug 2023 09:23:29 +0900
Subject: [PATCH] [SPARK-44723][BUILD] Upgrade `gcs-connector` to 2.2.16

### What changes were proposed in this pull request?

This PR aims to upgrade `gcs-connector` to 2.2.16.

### Why are the changes needed?

- https://github.com/GoogleCloudDataproc/hadoop-connectors/releases/tag/v2.2.16 (2023-06-30)
- https://github.com/GoogleCloudDataproc/hadoop-connectors/releases/tag/v2.2.15 (2023-06-02)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs and do the manual tests.

**BUILD**
```
dev/make-distribution.sh -Phadoop-cloud
```

**TEST**
```
$ export KEYFILE=your-credential-file.json
$ export EMAIL=$(jq -r '.client_email' < $KEYFILE)
$ export PRIVATE_KEY_ID=$(jq -r '.private_key_id' < $KEYFILE)
$ export PRIVATE_KEY="$(jq -r '.private_key' < $KEYFILE)"
$ bin/spark-shell \
    -c spark.hadoop.fs.gs.auth.service.account.email=$EMAIL \
    -c spark.hadoop.fs.gs.auth.service.account.private.key.id=$PRIVATE_KEY_ID \
    -c spark.hadoop.fs.gs.auth.service.account.private.key="$PRIVATE_KEY"
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/08/08 10:43:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1691516610108).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-SNAPSHOT
      /_/

Using Scala version 2.12.18 (OpenJDK 64-Bit Server VM, Java 1.8.0_312)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.read.text("gs://apache-spark-bucket/README.md").count()
23/08/08 10:43:46 WARN GhfsStorageStatistics: Detected potential high latency for operation op_get_file_status. latencyMs=823; previousMaxLatencyMs=0; operationCount=1; context=gs://apache-spark-bucket/README.md
res0: Long = 124

scala> spark.read.orc("examples/src/main/resources/users.orc").write.mode("overwrite").orc("gs://apache-spark-bucket/users.orc")
23/08/08 10:43:59 WARN GhfsStorageStatistics: Detected potential high latency for operation op_delete. latencyMs=549; previousMaxLatencyMs=0; operationCount=1; context=gs://apache-spark-bucket/users.orc
23/08/08 10:43:59 WARN GhfsStorageStatistics: Detected potential high latency for operation op_mkdirs. latencyMs=440; previousMaxLatencyMs=0; operationCount=1; context=gs://apache-spark-bucket/users.orc/_temporary/0
23/08/08 10:44:04 WARN GhfsStorageStatistics: Detected potential high latency for operation op_delete. latencyMs=631; previousMaxLatencyMs=549; operationCount=2; context=gs://apache-spark-bucket/users.orc/_temporary
23/08/08 10:44:05 WARN GhfsStorageStatistics: Detected potential high latency for operation stream_write_close_operations. latencyMs=572; previousMaxLatencyMs=393; operationCount=2; context=gs://apache-spark-bucket/users.orc/_SUCCESS

scala>

scala> spark.read.orc("gs://apache-spark-bucket/users.orc").show()
+------+--------------+----------------+
|  name|favorite_color|favorite_numbers|
+------+--------------+----------------+
|Alyssa|          NULL|  [3, 9, 15, 20]|
|   Ben|           red|              []|
+------+--------------+----------------+
```

Closes #42401 from dongjoon-hyun/SPARK-44723.

Authored-by: Dongjoon Hyun
Signed-off-by: Hyukjin Kwon
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml                               | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 1f8c079a9bc8c..416753ab2010c 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -62,7 +62,7 @@ datasketches-memory/2.1.0//datasketches-memory-2.1.0.jar
 derby/10.14.2.0//derby-10.14.2.0.jar
 dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar
 flatbuffers-java/1.12.0//flatbuffers-java-1.12.0.jar
-gcs-connector/hadoop3-2.2.14/shaded/gcs-connector-hadoop3-2.2.14-shaded.jar
+gcs-connector/hadoop3-2.2.16/shaded/gcs-connector-hadoop3-2.2.16-shaded.jar
 gmetric4j/1.0.10//gmetric4j-1.0.10.jar
 gson/2.2.4//gson-2.2.4.jar
 guava/14.0.1//guava-14.0.1.jar
diff --git a/pom.xml b/pom.xml
index 76e3596edd430..624df0c314a0e 100644
--- a/pom.xml
+++ b/pom.xml
@@ -160,7 +160,7 @@
  1.11.655
  0.12.8
- hadoop3-2.2.14
+ hadoop3-2.2.16
  4.5.14
  4.4.16