WIP: arrow 14 #1170

Closed
wants to merge 14 commits
4 changes: 2 additions & 2 deletions .ci_support/osx_64_.yaml
@@ -9,7 +9,7 @@ bzip2:
c_compiler:
- clang
c_compiler_version:
- '15'
- '14'
channel_sources:
- conda-forge
channel_targets:
@@ -19,7 +19,7 @@ cuda_compiler_version:
cxx_compiler:
- clangxx
cxx_compiler_version:
- '15'
- '14'
gflags:
- '2.2'
glog:
4 changes: 2 additions & 2 deletions .ci_support/osx_arm64_.yaml
@@ -9,7 +9,7 @@ bzip2:
c_compiler:
- clang
c_compiler_version:
- '15'
- '14'
channel_sources:
- conda-forge
channel_targets:
@@ -19,7 +19,7 @@ cuda_compiler_version:
cxx_compiler:
- clangxx
cxx_compiler_version:
- '15'
- '14'
gflags:
- '2.2'
glog:
8 changes: 3 additions & 5 deletions README.md

(generated file; diff not rendered)

2 changes: 1 addition & 1 deletion recipe/build-arrow.sh
@@ -30,7 +30,7 @@ fi
# Enable CUDA support
if [[ ! -z "${cuda_compiler_version+x}" && "${cuda_compiler_version}" != "None" ]]
then
EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS} -DARROW_CUDA=ON -DCUDA_TOOLKIT_ROOT_DIR=${CUDA_HOME} -DCMAKE_LIBRARY_PATH=${CONDA_BUILD_SYSROOT}/lib"
EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS} -DARROW_CUDA=ON -DCUDAToolkit_ROOT=${CUDA_HOME} -DCMAKE_LIBRARY_PATH=${CONDA_BUILD_SYSROOT}/lib"
else
EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS} -DARROW_CUDA=OFF"
fi
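Context for the flag rename above (not part of the diff): `CUDA_TOOLKIT_ROOT_DIR` belongs to CMake's legacy FindCUDA module, while `CUDAToolkit_ROOT` is the search hint understood by the newer FindCUDAToolkit module (CMake 3.17+), which is what the switch here suggests Arrow now queries. A minimal sketch of how the assembled arguments would reach a configure line, assuming `CUDA_HOME` and `CONDA_BUILD_SYSROOT` are provided by the build environment; the `cmake` call site itself sits outside this hunk and is only illustrative:

```bash
#!/usr/bin/env bash
# Sketch only: assemble the CUDA-related configure arguments as the script above does.
: "${CUDA_HOME:=/usr/local/cuda}"                  # assumption: exported by the CUDA compiler package
: "${CONDA_BUILD_SYSROOT:=${PREFIX:-/opt/conda}}"  # assumption: exported by conda-build
EXTRA_CMAKE_ARGS=" ${EXTRA_CMAKE_ARGS:-} -DARROW_CUDA=ON -DCUDAToolkit_ROOT=${CUDA_HOME} -DCMAKE_LIBRARY_PATH=${CONDA_BUILD_SYSROOT}/lib"

# Hypothetical configure invocation; FindCUDAToolkit picks up CUDAToolkit_ROOT as a hint.
cmake ${EXTRA_CMAKE_ARGS} -GNinja ../cpp
```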
4 changes: 4 additions & 0 deletions recipe/build-pyarrow.sh
@@ -24,6 +24,10 @@ BUILD_EXT_FLAGS=""
# Enable CUDA support
if [[ ! -z "${cuda_compiler_version+x}" && "${cuda_compiler_version}" != "None" ]]; then
export PYARROW_WITH_CUDA=1
if [[ "${build_platform}" != "${target_platform}" ]]; then
export CUDAToolkit_ROOT=${CUDA_HOME}
export CMAKE_LIBRARY_PATH=${CONDA_BUILD_SYSROOT}/lib
fi
else
export PYARROW_WITH_CUDA=0
fi
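Context for the new branch above (not part of the diff): the extra exports only matter when cross-compiling, i.e. when conda-build's `build_platform` differs from `target_platform`, because the CUDA toolkit then has to be located via explicit hints rather than the host prefix; CMake reads both `CUDAToolkit_ROOT` and `CMAKE_LIBRARY_PATH` from the environment. A hedged sketch, with the platform values given only as examples:

```bash
#!/usr/bin/env bash
# Sketch only: mirror the cross-compilation check added above.
build_platform="${build_platform:-linux-64}"          # assumption: set by conda-build
target_platform="${target_platform:-linux-aarch64}"   # assumption: set by conda-build

if [[ "${build_platform}" != "${target_platform}" ]]; then
    export CUDAToolkit_ROOT="${CUDA_HOME}"            # environment hint read by CMake's FindCUDAToolkit
    export CMAKE_LIBRARY_PATH="${CONDA_BUILD_SYSROOT}/lib"
fi

# Hypothetical smoke test (only meaningful once pyarrow is built with CUDA enabled):
python -c "import pyarrow.cuda" && echo "pyarrow.cuda importable"
```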
4 changes: 2 additions & 2 deletions recipe/conda_build_config.yaml
@@ -1,6 +1,6 @@
# keep this in sync with llvm_version in meta.yaml;
# don't rely on global pinning even if it matches
c_compiler_version: # [osx]
- 15 # [osx]
- 14 # [osx]
cxx_compiler_version: # [osx]
- 15 # [osx]
- 14 # [osx]
83 changes: 36 additions & 47 deletions recipe/meta.yaml
@@ -1,9 +1,9 @@
{% set version = "13.0.0" %}
{% set version = "14.0.0" %}
{% set cuda_enabled = cuda_compiler_version != "None" %}
{% set build_ext_version = "4.0.0" %}
{% set build_ext_version = "5.0.0" %}
{% set build_ext = "cuda" if cuda_enabled else "cpu" %}
{% set proc_build_number = "0" %}
{% set llvm_version = "15" %}
{% set llvm_version = "14" %}

# see https://github.com/apache/arrow/blob/apache-arrow-10.0.1/cpp/CMakeLists.txt#L88-L90
{% set so_version = (version.split(".")[0] | int * 100 + version.split(".")[1] | int) ~ "." ~ version.split(".")[2] ~ ".0" %}
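A worked example of the `so_version` expression above (not part of the diff): the SO major is `major * 100 + minor`, the patch number is carried over, and a trailing `.0` is appended, so for this release the shared-library version becomes 1400.0.0.

```bash
# Sketch of the Jinja arithmetic in plain bash, assuming version="14.0.0".
version="14.0.0"
IFS=. read -r major minor patch <<< "${version}"
echo "$((major * 100 + minor)).${patch}.0"   # prints 1400.0.0
```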
@@ -16,15 +16,16 @@ source:
# arrow has the unfortunate habit of changing tags of X.0.0 in the
# lead-up until release -> don't use github sources on main
# - url: https://github.com/apache/arrow/archive/refs/tags/apache-arrow-{{ version }}.tar.gz
- url: https://dist.apache.org/repos/dist/release/arrow/arrow-{{ version }}/apache-arrow-{{ version }}.tar.gz
sha256: 35dfda191262a756be934eef8afee8d09762cad25021daa626eb249e251ac9e6
# - url: https://dist.apache.org/repos/dist/release/arrow/arrow-{{ version }}/apache-arrow-{{ version }}.tar.gz
# sha256: 35dfda191262a756be934eef8afee8d09762cad25021daa626eb249e251ac9e6
- git_url: https://github.com/apache/arrow.git
git_rev: 1940cbbb3a5224243e429f9577604bb7b276cf4f
patches:
- patches/0001-GH-15017-Python-Harden-test_memory.py-for-use-with-A.patch
- patches/0002-GH-37480-Python-Bump-pandas-version-that-contains-re.patch
- patches/0003-fixture-teardown-should-not-fail-test.patch
# testing-submodule not part of release tarball
- git_url: https://github.com/apache/arrow-testing.git
git_rev: 47f7b56b25683202c1fd957668e13f2abafc0f12
folder: testing
# - git_url: https://github.com/apache/arrow-testing.git
# git_rev: 47f7b56b25683202c1fd957668e13f2abafc0f12
# folder: testing

build:
number: 4
@@ -35,12 +36,21 @@ build:
run_exports:
- {{ pin_subpackage("libarrow", max_pin="x") }}

requirements:
build:
# needed to clone arrow-testing
- git-lfs

outputs:
- name: apache-arrow-proc
version: {{ build_ext_version }}
build:
number: {{ proc_build_number }}
string: {{ build_ext }}
requirements:
run_constrained:
# avoid installation with old naming of proc package
- arrow-cpp-proc <0.0a0
test:
commands:
- exit 0
@@ -51,19 +61,6 @@ outputs:
- LICENSE.txt
summary: A meta-package to select Arrow build variant

# compat output for old mutex-package naming
- name: arrow-cpp-proc
version: {{ build_ext_version }}
build:
number: {{ proc_build_number }}
string: {{ build_ext }}
requirements:
run:
- apache-arrow-proc ={{ build_ext_version }}={{ build_ext }}
test:
commands:
- exit 0

- name: libarrow
script: build-arrow.sh # [unix]
script: build-arrow.bat # [win]
@@ -124,6 +121,8 @@ outputs:
- libgrpc
- libprotobuf
- libutf8proc
# gandiva requires shared libllvm
- llvm # [unix]
- lz4-c
- nlohmann_json
# gandiva depends on openssl
@@ -145,8 +144,8 @@
- libcurl # [win]
run_constrained:
- apache-arrow-proc =*={{ build_ext }}
# make sure we don't co-install with old version of old package name
- arrow-cpp ={{ version }}
# avoid installation with old naming of lib package
- arrow-cpp <0.0a0
# old parquet lib output, now part of this feedstock
- parquet-cpp <0.0a0

@@ -196,22 +195,6 @@ outputs:
- LICENSE.txt
summary: C++ libraries for Apache Arrow

# compat output for old naming scheme; switched for 10.0.0; keep for a few versions
- name: arrow-cpp
version: {{ version }}
build:
string: h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}_{{ build_ext }}
run_exports:
- {{ pin_subpackage("libarrow", max_pin="x.x.x") }}
requirements:
host:
- {{ pin_subpackage('libarrow', exact=True) }}
run:
- {{ pin_subpackage('libarrow', exact=True) }}
test:
commands:
- exit 0

- name: pyarrow
script: build-pyarrow.sh # [unix]
script: build-pyarrow.bat # [win]
@@ -357,8 +340,9 @@ outputs:
- hypothesis
- minio-server
- pandas
- s3fs
- s3fs >=2023
- scipy
- sparse >=0.14
# these are generally (far) behind on migrating abseil/grpc/protobuf,
# and using them as test dependencies blocks the migrator unnecessarily
# - pytorch
@@ -367,8 +351,6 @@
# - jpype1
# doesn't get picked up correctly
# - libhdfs3
# causes segfaults
# - sparse
source_files:
- testing/data
commands:
@@ -389,13 +371,20 @@
# skip tests that cannot succeed in emulation
{% set tests_to_skip = tests_to_skip + " or test_debug_memory_pool_disabled" %} # [aarch64 or ppc64le]
{% set tests_to_skip = tests_to_skip + " or test_env_var_io_thread_count" %} # [aarch64 or ppc64le]
# XMinioInvalidObjectName on win: "Object name contains unsupported characters"
{% set tests_to_skip = tests_to_skip + " or test_write_to_dataset_with_partitions_s3fs" %} # [win]
# XMinioInvalidObjectName on osx/win: "Object name contains unsupported characters"
{% set tests_to_skip = tests_to_skip + " or test_write_to_dataset_with_partitions_s3fs" %} # [osx or win]
# vvvvvvv TESTS THAT SHOULDN'T HAVE TO BE SKIPPED vvvvvvv
# currently broken
{% set tests_to_skip = tests_to_skip + " or test_fastparquet_cross_compatibility" %}
# new fsspec changed behaviour, see https://github.com/apache/arrow/issues/37555
{% set tests_to_skip = tests_to_skip + " or test_get_file_info_with_selector" %}
# problems with minio
{% set tests_to_skip = tests_to_skip + " or (test_delete_dir and S3FileSystem)" %}
{% set tests_to_skip = tests_to_skip + " or (test_delete_dir_contents and S3FileSystem)" %}
{% set tests_to_skip = tests_to_skip + " or (test_get_file_info and S3FileSystem)" %}
{% set tests_to_skip = tests_to_skip + " or (test_move_directory and S3FileSystem)" %}
# gandiva tests are segfaulting on ppc
{% set tests_to_skip = tests_to_skip + " or test_gandiva" %} # [ppc64le]
{% set tests_to_skip = tests_to_skip + " or test_gandiva" %} # [ppc64le]
# test failures on ppc (both failing with: Float value was truncated converting to int32)
{% set tests_to_skip = tests_to_skip + " or test_safe_cast_from_float_with_nans_to_int" %} # [ppc64le]
{% set tests_to_skip = tests_to_skip + " or test_float_with_null_as_integer" %} # [ppc64le]
recipe/patches/0001-GH-15017-Python-Harden-test_memory.py-for-use-with-A.patch
@@ -1,7 +1,7 @@
From 0ff739a0d3c8c4df4b46625f0bb0bc87c6b0d29d Mon Sep 17 00:00:00 2001
From: h-vetinari <h.vetinari@gmx.com>
Date: Fri, 28 Jul 2023 13:31:01 +1100
Subject: [PATCH 1/2] GH-15017: [Python] Harden test_memory.py for use with
Subject: [PATCH 1/3] GH-15017: [Python] Harden test_memory.py for use with
ARROW_USE_GLOG=ON (#36901)

Accept output pattern for ARROW_USE_GLOG=ON too.
recipe/patches/0002-GH-37480-Python-Bump-pandas-version-that-contains-re.patch
@@ -1,7 +1,7 @@
From 2caeef5e4ad232013e769e36fb957cf4f0c5992f Mon Sep 17 00:00:00 2001
From: Dane Pitkin <48041712+danepitkin@users.noreply.github.com>
Date: Thu, 31 Aug 2023 00:32:26 -0400
Subject: [PATCH 2/2] GH-37480: [Python] Bump pandas version that contains
Subject: [PATCH 2/3] GH-37480: [Python] Bump pandas version that contains
regression for pandas issue 50127 (#37481)

### Rationale for this change
37 changes: 37 additions & 0 deletions recipe/patches/0003-fixture-teardown-should-not-fail-test.patch
@@ -0,0 +1,37 @@
From 93389575d6e06cd561357a46d6eb5c888f3e18bb Mon Sep 17 00:00:00 2001
From: "H. Vetinari" <h.vetinari@gmx.com>
Date: Wed, 13 Sep 2023 21:34:29 +1100
Subject: [PATCH 3/3] fixture teardown should not fail test

---
python/pyarrow/tests/test_fs.py | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/python/pyarrow/tests/test_fs.py b/python/pyarrow/tests/test_fs.py
index b64680c87..d81213b64 100644
--- a/python/pyarrow/tests/test_fs.py
+++ b/python/pyarrow/tests/test_fs.py
@@ -256,7 +256,10 @@ def s3fs(request, s3_server):
allow_move_dir=False,
allow_append_to_file=False,
)
- fs.delete_dir(bucket)
+ try:
+ fs.delete_dir(bucket)
+ except OSError:
+ pass


@pytest.fixture
@@ -358,7 +361,10 @@ def py_fsspec_s3fs(request, s3_server):
allow_move_dir=False,
allow_append_to_file=True,
)
- fs.delete_dir(bucket)
+ try:
+ fs.delete_dir(bucket)
+ except OSError:
+ pass


@pytest.fixture(params=[