[VL][CI] Use cached Velox build binary to accelerate GHA workflow #7474

Open, wants to merge 4 commits into base: main
37 changes: 28 additions & 9 deletions .github/workflows/velox_backend.yml
@@ -56,23 +56,33 @@ jobs:
build-native-lib-centos-7:
runs-on: ubuntu-20.04
container: apache/gluten:vcpkg-centos-7
env:
VELOX_BUILD_PATH: "./ep/build-velox/build/velox_ep/_build/release/"
steps:
- uses: actions/checkout@v2
- name: Generate cache key
run: |
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './cpp/*', './.github/workflows/*') }} > cache-key
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './.github/workflows/velox_backend.yml') }} > cache-key
@zhztheplayer (Member) commented on Oct 11, 2024:

./ep/build-velox/src/**

I think we could switch our OAP Velox dependency from branches to tags when landing this PR. A branch can be updated without the cache key changing, so the new cache would not be invalidated.

Edit: A similar issue already exists with the current cache policy. It has been fine so far, so this does not need to be a high priority.

Contributor Author replied:

@zhztheplayer, yes, this is a potential issue. However, until a Velox rebase branch is finalized, we may still need oap/velox to host it as a branch that can be updated (e.g., when fixes are needed after a Gluten CI failure). So perhaps we should simply adopt an internal convention that forbids updating an oap/velox branch once Gluten code already references it.
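
The concern in this thread can be made concrete with a small shell sketch. The workflow key is derived only from files inside the Gluten checkout, so a moved oap/velox branch leaves the key unchanged. One possible mitigation, which is not part of this PR, is to fold the commit that the branch currently resolves to into the key. `velox_cache_key` and `hash_stream` are hypothetical helper names, and the commit is passed in as an argument; in CI it could presumably be looked up with `git ls-remote`.

```shell
# Hypothetical helpers, not part of the PR: build a cache key from the
# resolved upstream commit plus the contents of the build input files.

# Print the SHA-256 hex digest of stdin (falls back to shasum where
# sha256sum is unavailable).
hash_stream() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# $1: key prefix, $2: resolved upstream commit, remaining args: input files.
velox_cache_key() {
  # Require at least one input file so that cat never reads from stdin.
  [ "$#" -ge 3 ] || return 1
  key_prefix=$1
  upstream_commit=$2
  shift 2
  key_digest=$( { printf '%s\n' "$upstream_commit"; cat "$@"; } | hash_stream )
  printf '%s-%s\n' "$key_prefix" "$key_digest"
}
```

With this shape, the key changes whenever the branch moves to a new commit, while a tag or a pinned commit keeps the key stable for as long as the input files are unchanged.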

- name: Cache
id: cache
uses: actions/cache/restore@v3
with:
path: |
./cpp/build/releases/
${VELOX_BUILD_PATH}/lib/libvelox.a
key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}
- name: Build Gluten native libraries
- name: Build Gluten with cached lib Velox
if: ${{ steps.cache.outputs.cache-hit == 'true' }}
run: |
df -a
mkdir -p cache_dir && mv ${VELOX_BUILD_PATH}/* ./cache_dir/
bash dev/ci-velox-buildstatic-centos-7.sh prepare_build
mkdir -p ${VELOX_BUILD_PATH}/ && mv ./cache_dir/* ${VELOX_BUILD_PATH}/
bash dev/ci-velox-buildstatic-centos-7.sh build_gluten_cpp
- name: Build Velox and Gluten CPP
if: ${{ steps.cache.outputs.cache-hit != 'true' }}
run: |
df -a
cd $GITHUB_WORKSPACE/
rm -rf ./ep/build-velox/build/velox_ep
bash dev/ci-velox-buildstatic-centos-7.sh
- uses: actions/upload-artifact@v3
with:
@@ -1060,19 +1070,19 @@ jobs:
run-cpp-test-udf-test:
runs-on: ubuntu-20.04
container: ghcr.io/facebookincubator/velox-dev:centos8
env:
VELOX_BUILD_PATH: "./ep/build-velox/build/velox_ep/_build/release/"
steps:
- uses: actions/checkout@v2
- name: Generate cache key
run: |
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './cpp/*', './.github/workflows/*') }} > cache-key
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './.github/workflows/velox_backend.yml') }} > cache-key
- name: Cache
id: cache
uses: actions/cache/restore@v3
with:
path: |
./cpp/build/releases/
./cpp/build/velox/udf/examples/
./cpp/build/velox/benchmarks/
${VELOX_BUILD_PATH}
/root/.m2/repository/org/apache/arrow/
key: cache-velox-build-centos-8-${{ hashFiles('./cache-key') }}
- name: Setup java and maven
@@ -1081,10 +1091,19 @@ jobs:
sed -i -e "s|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g" /etc/yum.repos.d/CentOS-* || true
yum install sudo patch java-1.8.0-openjdk-devel wget -y
$SETUP install_maven
- name: Build Gluten native libraries
- name: Build Gluten with cached Velox build binary
if: steps.cache.outputs.cache-hit == 'true'
run: |
df -a
mkdir -p cache_dir && mv ${VELOX_BUILD_PATH}/* ./cache_dir/
bash dev/ci-velox-buildshared-centos-8.sh prepare_build
mkdir -p ${VELOX_BUILD_PATH}/ && mv ./cache_dir/* ${VELOX_BUILD_PATH}/
bash dev/ci-velox-buildshared-centos-8.sh build_gluten_cpp
- name: Build Velox and Gluten CPP
if: steps.cache.outputs.cache-hit != 'true'
run: |
df -a
rm -rf ./ep/build-velox/build/velox_ep
bash dev/ci-velox-buildshared-centos-8.sh
- name: Run CPP unit test
run: |
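
Both cache-hit steps in this file follow the same pattern: move the restored Velox build output aside, run `prepare_build`, move the output back, then run `build_gluten_cpp`. A minimal sketch of that stash-and-restore pattern follows. `with_stashed_dir` is a hypothetical helper, and the assumption that `prepare_build` would otherwise disturb the restored directory is inferred from the workflow rather than stated in the PR.

```shell
# Hypothetical helper, not part of the PR: keep the contents of a directory
# safe while a command that may wipe or recreate it runs.
# $1: directory to protect, remaining args: command to run.
with_stashed_dir() {
  keep_dir=$1
  shift
  stash=$(mktemp -d) || return 1

  # Move the non-hidden entries aside (same scope as `mv dir/*`).
  for entry in "$keep_dir"/*; do
    [ -e "$entry" ] || continue
    mv "$entry" "$stash"/ || return 1
  done

  # Run the command and remember its exit status.
  "$@"
  status=$?

  # Put the stashed entries back, recreating the directory if needed.
  mkdir -p "$keep_dir" || return 1
  for entry in "$stash"/*; do
    [ -e "$entry" ] || continue
    mv "$entry" "$keep_dir"/ || return 1
  done
  rmdir "$stash"
  return "$status"
}
```

A call such as `with_stashed_dir "$VELOX_BUILD_PATH" bash dev/ci-velox-buildstatic-centos-7.sh prepare_build` would correspond to the first three commands of the cache-hit step. Like the workflow, the sketch fails if the command recreates an entry with the same name as a stashed one.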
17 changes: 7 additions & 10 deletions .github/workflows/velox_backend_cache.yml
@@ -22,6 +22,7 @@ on:

env:
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
VELOX_BUILD_PATH: "./ep/build-velox/build/velox_ep/_build/release/"

concurrency:
group: ${{ github.repository }}-${{ github.workflow }}
@@ -35,14 +36,14 @@ jobs:
- uses: actions/checkout@v2
- name: Generate cache key
run: |
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './cpp/*', './.github/workflows/*') }} > cache-key
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './.github/workflows/velox_backend.yml') }} > cache-key
- name: Check existing caches
id: check-cache
uses: actions/cache/restore@v3
with:
lookup-only: true
path: |
./cpp/build/releases/
${VELOX_BUILD_PATH}/lib/libvelox.a
key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}
- name: Build Gluten native libraries
if: steps.check-cache.outputs.cache-hit != 'true'
@@ -55,7 +56,7 @@ jobs:
uses: actions/cache/save@v3
with:
path: |
./cpp/build/releases/
${VELOX_BUILD_PATH}/lib/libvelox.a
key: cache-velox-build-centos-7-${{ hashFiles('./cache-key') }}

cache-native-lib-centos-8:
@@ -65,16 +66,14 @@ jobs:
- uses: actions/checkout@v2
- name: Generate cache key
run: |
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './cpp/*', './.github/workflows/*') }} > cache-key
echo ${{ hashFiles('./ep/build-velox/src/**', './dev/**', './.github/workflows/velox_backend.yml') }} > cache-key
- name: Check existing caches
id: check-cache
uses: actions/cache/restore@v3
with:
lookup-only: true
path: |
./cpp/build/releases/
./cpp/build/velox/udf/examples/
./cpp/build/velox/benchmarks/
${VELOX_BUILD_PATH}
/root/.m2/repository/org/apache/arrow/
key: cache-velox-build-centos-8-${{ hashFiles('./cache-key') }}
- name: Setup java and maven
@@ -95,9 +94,7 @@ jobs:
uses: actions/cache/save@v3
with:
path: |
./cpp/build/releases/
./cpp/build/velox/udf/examples/
./cpp/build/velox/benchmarks/
${VELOX_BUILD_PATH}
/root/.m2/repository/org/apache/arrow/
key: cache-velox-build-centos-8-${{ hashFiles('./cache-key') }}

2 changes: 1 addition & 1 deletion cpp/velox/CMakeLists.txt
@@ -76,7 +76,7 @@ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SCRIPT_CXX_FLAGS}")

message("Velox module final CMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS}")

# User can specify VELOX_BUILD_PATH, if Velox are built elsewhere.
# User can specify VELOX_BUILD_PATH, if Velox is built elsewhere.
if(NOT DEFINED VELOX_BUILD_PATH)
if(${CMAKE_BUILD_TYPE} STREQUAL "Debug")
set(VELOX_BUILD_PATH
109 changes: 57 additions & 52 deletions dev/builddeps-veloxbe.sh
@@ -187,6 +187,62 @@ fi

concat_velox_param

if [ "$VELOX_HOME" == "" ]; then
VELOX_HOME="$GLUTEN_DIR/ep/build-velox/build/velox_ep"
fi

source $GLUTEN_DIR/dev/build_helper_functions.sh

function prepare_build {
(
cd $GLUTEN_DIR/ep/build-velox/src
./get_velox.sh $VELOX_PARAMETER
)

OS=`uname -s`
ARCH=`uname -m`
DEPENDENCY_DIR=${DEPENDENCY_DIR:-$CURRENT_DIR/../ep/_ep}
mkdir -p ${DEPENDENCY_DIR}

if [ -z "${GLUTEN_VCPKG_ENABLED:-}" ] && [ $RUN_SETUP_SCRIPT == "ON" ]; then
echo "Start to install dependencies"
pushd $VELOX_HOME
if [ $OS == 'Linux' ]; then
setup_linux
elif [ $OS == 'Darwin' ]; then
setup_macos
else
echo "Unsupported kernel: $OS"
exit 1
fi
if [ $ENABLE_S3 == "ON" ]; then
if [ $OS == 'Darwin' ]; then
echo "S3 is not supported on MacOS."
exit 1
fi
${VELOX_HOME}/scripts/setup-adapters.sh aws
fi
if [ $ENABLE_HDFS == "ON" ]; then
if [ $OS == 'Darwin' ]; then
echo "HDFS is not supported on MacOS."
exit 1
fi
pushd $VELOX_HOME
install_libhdfs3
popd
fi
if [ $ENABLE_GCS == "ON" ]; then
${VELOX_HOME}/scripts/setup-adapters.sh gcs
fi
if [ $ENABLE_ABFS == "ON" ]; then
export AZURE_SDK_DISABLE_AUTO_VCPKG=ON
${VELOX_HOME}/scripts/setup-adapters.sh abfs
fi
popd
fi
echo "Finished build preparation."
}

function build_arrow {
cd $GLUTEN_DIR/dev
./build_arrow.sh
@@ -216,65 +272,14 @@ function build_gluten_cpp {
}

function build_velox_backend {
prepare_build
if [ $BUILD_ARROW == "ON" ]; then
build_arrow
fi
build_velox
build_gluten_cpp
}

(
cd $GLUTEN_DIR/ep/build-velox/src
./get_velox.sh $VELOX_PARAMETER
)

if [ "$VELOX_HOME" == "" ]; then
VELOX_HOME="$GLUTEN_DIR/ep/build-velox/build/velox_ep"
fi

OS=`uname -s`
ARCH=`uname -m`
DEPENDENCY_DIR=${DEPENDENCY_DIR:-$CURRENT_DIR/../ep/_ep}
mkdir -p ${DEPENDENCY_DIR}

source $GLUTEN_DIR/dev/build_helper_functions.sh
if [ -z "${GLUTEN_VCPKG_ENABLED:-}" ] && [ $RUN_SETUP_SCRIPT == "ON" ]; then
echo "Start to install dependencies"
pushd $VELOX_HOME
if [ $OS == 'Linux' ]; then
setup_linux
elif [ $OS == 'Darwin' ]; then
setup_macos
else
echo "Unsupported kernel: $OS"
exit 1
fi
if [ $ENABLE_S3 == "ON" ]; then
if [ $OS == 'Darwin' ]; then
echo "S3 is not supported on MacOS."
exit 1
fi
${VELOX_HOME}/scripts/setup-adapters.sh aws
fi
if [ $ENABLE_HDFS == "ON" ]; then
if [ $OS == 'Darwin' ]; then
echo "HDFS is not supported on MacOS."
exit 1
fi
pushd $VELOX_HOME
install_libhdfs3
popd
fi
if [ $ENABLE_GCS == "ON" ]; then
${VELOX_HOME}/scripts/setup-adapters.sh gcs
fi
if [ $ENABLE_ABFS == "ON" ]; then
export AZURE_SDK_DISABLE_AUTO_VCPKG=ON
${VELOX_HOME}/scripts/setup-adapters.sh abfs
fi
popd
fi

commands_to_run=${OTHER_ARGUMENTS:-}
(
if [[ "x$commands_to_run" == "x" ]]; then
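
With the setup logic wrapped in `prepare_build`, the script can run individual steps by name, which is what the CI workflow relies on when it passes `prepare_build` or `build_gluten_cpp`. The dispatch code itself is cut off in this view, so the following is only a sketch of how such a dispatcher might look. `run_commands` and its allow-list are assumptions, and the step functions are stubs that only print their names.

```shell
# Stub steps standing in for the real functions in builddeps-veloxbe.sh.
prepare_build() { echo "prepare_build"; }
build_velox() { echo "build_velox"; }
build_gluten_cpp() { echo "build_gluten_cpp"; }

# Simplified full build: the real function also builds Arrow when enabled.
build_velox_backend() {
  prepare_build
  build_velox
  build_gluten_cpp
}

# Hypothetical dispatcher: run the full build when no step is named,
# otherwise run each named step in order and stop at the first failure.
run_commands() {
  if [ "$#" -eq 0 ]; then
    build_velox_backend
    return
  fi
  for cmd in "$@"; do
    case "$cmd" in
      prepare_build|build_velox|build_gluten_cpp|build_velox_backend)
        "$cmd" || return $?
        ;;
      *)
        echo "Unknown command: $cmd" >&2
        return 1
        ;;
    esac
  done
}
```

Restricting the names to an explicit list is a design choice of the sketch; it keeps a mistyped step name from being executed as an arbitrary command.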
2 changes: 1 addition & 1 deletion dev/ci-velox-buildshared-centos-8.sh
@@ -4,4 +4,4 @@ set -e

source /opt/rh/gcc-toolset-9/enable
./dev/builddeps-veloxbe.sh --run_setup_script=OFF --enable_ep_cache=OFF --build_tests=ON \
--build_examples=ON --build_benchmarks=ON --build_protobuf=ON
--build_examples=ON --build_benchmarks=ON --build_protobuf=ON $@
2 changes: 1 addition & 1 deletion dev/ci-velox-buildstatic-centos-7.sh
@@ -5,4 +5,4 @@ set -e
source /opt/rh/devtoolset-9/enable
export NUM_THREADS=4
./dev/builddeps-veloxbe.sh --enable_vcpkg=ON --build_arrow=OFF --build_tests=OFF --build_benchmarks=OFF \
--build_examples=OFF --enable_s3=ON --enable_gcs=ON --enable_hdfs=ON --enable_abfs=ON
--build_examples=OFF --enable_s3=ON --enable_gcs=ON --enable_hdfs=ON --enable_abfs=ON $@
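
Both CI scripts now append `$@` so that a step name such as `prepare_build` is forwarded to `builddeps-veloxbe.sh`. For single-word arguments the unquoted form behaves the same as `"$@"`, but the two differ once an argument contains whitespace. The sketch below uses hypothetical helper names to show the difference.

```shell
# Hypothetical stand-in for the real build script: report how many
# arguments it received.
count_args() {
  echo "$#"
}

# Unquoted forwarding: each argument is split again on whitespace.
forward_unquoted() {
  count_args --build_protobuf=ON $@
}

# Quoted forwarding: each argument is passed through unchanged.
forward_quoted() {
  count_args --build_protobuf=ON "$@"
}
```

Under the default `IFS`, `forward_quoted "a b"` reports 2 arguments while `forward_unquoted "a b"` reports 3. For the step names used in the workflow both forms give the same result.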