[CI] Migration CCI->GHA #536

Merged: 25 commits, Oct 9, 2023
22 changes: 17 additions & 5 deletions .circleci/unittest/linux/scripts/install.sh
@@ -6,6 +6,7 @@ unset PYTORCH_VERSION
# In fact, keeping PYTORCH_VERSION forces us to hardcode PyTorch version in config.

set -e
+set -v

eval "$(./conda/bin/conda shell.bash hook)"
conda activate ./env
@@ -25,17 +26,28 @@ fi
git submodule sync && git submodule update --init --recursive

printf "Installing PyTorch with %s\n" "${CU_VERSION}"
-if [ "${CU_VERSION:-}" == cpu ] ; then
-pip3 install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu
+if [[ "$TORCH_VERSION" == "nightly" ]]; then
+if [ "${CU_VERSION:-}" == cpu ] ; then
+python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
+else
+python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/$CU_VERSION
+fi
+elif [[ "$TORCH_VERSION" == "stable" ]]; then
+if [ "${CU_VERSION:-}" == cpu ] ; then
+python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
+else
+python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/$CU_VERSION
+fi
else
-pip3 install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cu113
+printf "Failed to install pytorch"
+exit 1
fi

printf "* Installing tensordict\n"
-pip3 install -e .
+python setup.py develop

# install torchsnapshot nightly
-pip3 install git+https://github.com/pytorch/torchsnapshot
+python -m pip install git+https://github.com/pytorch/torchsnapshot --no-build-isolation

# smoke test
python -c "import functorch;import torchsnapshot"
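
Reviewer note: in the new install logic of this script, the nightly and stable branches differ only in the `--pre` flag and in the index URL, and the `cpu` and CUDA cases differ only in the URL suffix. A minimal sketch of that URL selection as a single `case`, assuming a hypothetical helper named `torch_index_url` that is not part of this PR. It only prints a URL and installs nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): maps a TORCH_VERSION value
# ($1: nightly or stable) and a CU_VERSION value ($2: cpu, cu121, ...)
# to the wheel index URL used by the install script.
torch_index_url() {
  case "$1" in
    nightly) printf 'https://download.pytorch.org/whl/nightly/%s\n' "$2" ;;
    stable)  printf 'https://download.pytorch.org/whl/%s\n' "$2" ;;
    *)
      # Unknown channel: report on stderr and fail, as the script's else branch does.
      printf 'Unsupported TORCH_VERSION: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

With such a helper, the stable install could read `python -m pip install torch torchvision torchaudio --index-url "$(torch_index_url stable "$CU_VERSION")"`; the nightly call would still need `--pre`.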
9 changes: 8 additions & 1 deletion .circleci/unittest/linux/scripts/setup_env.sh
@@ -6,6 +6,8 @@
# Do not install PyTorch and torchvision here, otherwise they also get cached.

set -e
+set -v
+

this_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
# Avoid error: "fatal: unsafe repository"
@@ -24,7 +26,7 @@ esac
# 1. Install conda at ./conda
if [ ! -d "${conda_dir}" ]; then
printf "* Installing conda\n"
-wget -O miniconda.sh "http://repo.continuum.io/miniconda/Miniconda3-latest-${os}-x86_64.sh"
+wget -O miniconda.sh "http://repo.continuum.io/miniconda/Miniconda3-latest-${os}-${ARCH}.sh"
bash ./miniconda.sh -b -f -p "${conda_dir}"
fi
eval "$(${conda_dir}/bin/conda shell.bash hook)"
@@ -45,3 +47,8 @@ cat "${this_dir}/environment.yml"
pip install pip --upgrade

conda env update --file "${this_dir}/environment.yml" --prune

+#if [[ $OSTYPE == 'darwin'* ]]; then
+# printf "* Installing C++ for OSX\n"
+# conda install -c conda-forge cxx-compiler -y
+#fi
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -59,5 +59,5 @@ If you know or suspect the reason for this bug, paste the code lines and suggest
## Checklist

- [ ] I have checked that there is no similar issue in the repo (**required**)
-- [ ] I have read the [documentation](https://github.com/pytorch/rl/tree/main/docs/) (**required**)
+- [ ] I have read the [documentation](https://github.com/pytorch/tensordict/tree/main/docs/) (**required**)
- [ ] I have provided a minimal working example to reproduce the bug (**required**)
4 changes: 2 additions & 2 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -8,7 +8,7 @@ Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax `close #15213` if this solves the issue #15213

-- [ ] I have raised an issue to propose this change ([required](https://github.com/pytorch/rl/issues) for new features and bug fixes)
+- [ ] I have raised an issue to propose this change ([required](https://github.com/pytorch/tensordict/issues) for new features and bug fixes)

## Types of changes

@@ -25,7 +25,7 @@ What types of changes does your code introduce? Remove all that do not apply:
Go over all the following points, and put an `x` in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

-- [ ] I have read the [CONTRIBUTION](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md) guide (**required**)
+- [ ] I have read the [CONTRIBUTION](https://github.com/pytorch/tensordict/blob/main/CONTRIBUTING.md) guide (**required**)
- [ ] My change requires a change to the documentation.
- [ ] I have updated the tests accordingly (*required for a bug fix or a new feature*).
- [ ] I have updated the documentation accordingly.
75 changes: 75 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,75 @@
name: Lint

on:
pull_request:
push:
branches:
- nightly
- main
- release/*
workflow_dispatch:

concurrency:
# Documentation suggests ${{ github.head_ref }}, but that's only available on pull_request/pull_request_target triggers, so using ${{ github.ref }}.
# On master, we want all builds to complete even if merging happens faster to make it easier to discover at which point something broke.
group: ${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && format('ci-master-{0}', github.sha) || format('ci-{0}', github.ref) }}
cancel-in-progress: true

jobs:
python-source-and-configs:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
repository: pytorch/tensordict
script: |
set -euo pipefail

echo '::group::Setup environment'
CONDA_PATH=$(which conda)
eval "$(${CONDA_PATH} shell.bash hook)"
conda create --name ci --quiet --yes python=3.8 pip
conda activate ci
echo '::endgroup::'

echo '::group::Install lint tools'
pip install --progress-bar=off pre-commit
echo '::endgroup::'

echo '::group::Lint Python source and configs'
set +e
pre-commit run --all-files

if [ $? -ne 0 ]; then
git --no-pager diff
exit 1
fi
echo '::endgroup::'

c-source:
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
repository: pytorch/tensordict
script: |
set -euo pipefail

echo '::group::Setup environment'
CONDA_PATH=$(which conda)
eval "$(${CONDA_PATH} shell.bash hook)"
conda create --name ci --quiet --yes -c conda-forge python=3.8 ncurses=5 libgcc
conda activate ci
export LD_LIBRARY_PATH="${CONDA_PREFIX}/lib:${LD_LIBRARY_PATH}"
echo '::endgroup::'

echo '::group::Install lint tools'
curl https://oss-clang-format.s3.us-east-2.amazonaws.com/linux64/clang-format-linux64 -o ./clang-format
chmod +x ./clang-format
echo '::endgroup::'

echo '::group::Lint C source'
set +e
./.circleci/unittest/linux/scripts/run-clang-format.py -r torchrl/csrc --clang-format-executable ./clang-format

if [ $? -ne 0 ]; then
git --no-pager diff
exit 1
fi
echo '::endgroup::'
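
Reviewer note: both lint jobs start with `set -euo pipefail` and then switch to `set +e` so that a failing linter does not abort the script before `git --no-pager diff` prints what the hooks changed. A minimal sketch of the same behavior without toggling `-e`, relying on the fact that a command tested by `if` does not trigger errexit. The function name `run_lint` is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the PR): runs the lint command given as
# arguments; on failure it prints the working-tree diff and returns 1.
run_lint() {
  if ! "$@"; then
    echo 'Lint failed; changes made by the lint tools:'
    # The diff is informational only, so its own exit status is ignored.
    git --no-pager diff || true
    return 1
  fi
}
```

Under that assumption the lint step would reduce to `run_lint pre-commit run --all-files`, and the script could keep `set -e` active throughout.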
137 changes: 137 additions & 0 deletions .github/workflows/test-linux.yml
@@ -0,0 +1,137 @@
name: Unit-tests on Linux

on:
pull_request:
push:
branches:
- nightly
- main
- release/*
workflow_dispatch:

env:
CHANNEL: "nightly"

concurrency:
# Documentation suggests ${{ github.head_ref }}, but that's only available on pull_request/pull_request_target triggers, so using ${{ github.ref }}.
# On master, we want all builds to complete even if merging happens faster to make it easier to discover at which point something broke.
group: ${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && format('ci-master-{0}', github.sha) || format('ci-{0}', github.ref) }}
cancel-in-progress: true

jobs:
test-gpu:
strategy:
matrix:
python_version: ["3.8"]
cuda_arch_version: ["12.1"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.g5.4xlarge.nvidia.gpu
repository: pytorch/tensordict
gpu-arch-type: cuda
gpu-arch-version: ${{ matrix.cuda_arch_version }}
script: |
# Set env vars from matrix
export PYTHON_VERSION=${{ matrix.python_version }}
# Commenting these out for now because the GPU tests are not working inside docker
export CUDA_ARCH_VERSION=${{ matrix.cuda_arch_version }}
export CU_VERSION="cu${CUDA_ARCH_VERSION:0:2}${CUDA_ARCH_VERSION:3:1}"
export TORCH_VERSION=nightly
# Remove the following line when the GPU tests are working inside docker, and uncomment the above lines
#export CU_VERSION="cpu"
export ARCH=x86_64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh

test-cpu:
strategy:
matrix:
python_version: ["3.8", "3.9", "3.10", "3.11"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.12xlarge
repository: pytorch/tensordict
timeout: 90
script: |
# Set env vars from matrix
export PYTHON_VERSION=${{ matrix.python_version }}
export CU_VERSION="cpu"
export TORCH_VERSION=nightly
export ARCH=x86_64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh

test-stable-gpu:
strategy:
matrix:
python_version: ["3.8"]
cuda_arch_version: ["12.1"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.g5.4xlarge.nvidia.gpu
repository: pytorch/tensordict
gpu-arch-type: cuda
gpu-arch-version: ${{ matrix.cuda_arch_version }}
timeout: 90
script: |
# Set env vars from matrix
export PYTHON_VERSION=${{ matrix.python_version }}
# Commenting these out for now because the GPU tests are not working inside docker
export CUDA_ARCH_VERSION=${{ matrix.cuda_arch_version }}
export CU_VERSION="cu${CUDA_ARCH_VERSION:0:2}${CUDA_ARCH_VERSION:3:1}"
export TORCH_VERSION=stable
# Remove the following line when the GPU tests are working inside docker, and uncomment the above lines
#export CU_VERSION="cpu"
export ARCH=x86_64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh

test-stable-cpu:
strategy:
matrix:
python_version: ["3.8", "3.11"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
with:
runner: linux.12xlarge
repository: pytorch/tensordict
timeout: 90
script: |
# Set env vars from matrix
export PYTHON_VERSION=${{ matrix.python_version }}
export CU_VERSION="cpu"
export TORCH_VERSION=stable
export ARCH=x86_64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh
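
Reviewer note: the GPU jobs build `CU_VERSION` with fixed-offset substring expansion, `cu${CUDA_ARCH_VERSION:0:2}${CUDA_ARCH_VERSION:3:1}`, which turns `12.1` into `cu121`. The fixed offsets assume a two-digit major and a one-digit minor version; a value such as `9.2` would yield `cu9.`. A minimal sketch that deletes the dot instead, assuming a hypothetical helper named `cu_version_from_arch` that is not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): builds the CU_VERSION tag by
# removing every dot from a CUDA version string and prefixing "cu".
cu_version_from_arch() {
  printf 'cu%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Example calls with the matrix value used in this workflow and one other.
cu_version_from_arch 12.1
cu_version_from_arch 11.8
```

In the workflow this could be used as `export CU_VERSION="$(cu_version_from_arch "$CUDA_ARCH_VERSION")"`. Whether the extra helper is worth it depends on whether the matrix will ever contain versions outside the `NN.N` shape.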
77 changes: 77 additions & 0 deletions .github/workflows/test-macos.yml
@@ -0,0 +1,77 @@
name: Unit-tests on MacOS

on:
pull_request:
push:
branches:
- nightly
- main
- release/*
workflow_dispatch:

env:
CHANNEL: "nightly"

concurrency:
# Documentation suggests ${{ github.head_ref }}, but that's only available on pull_request/pull_request_target triggers, so using ${{ github.ref }}.
# On master, we want all builds to complete even if merging happens faster to make it easier to discover at which point something broke.
group: ${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && format('ci-master-{0}', github.sha) || format('ci-{0}', github.ref) }}
cancel-in-progress: true

jobs:
tests-intel:
strategy:
matrix:
python_version: ["3.8", "3.11"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
with:
repository: pytorch/tensordict
timeout: 120
script: |
# Set env vars from matrix
set -e
set -v
export PYTHON_VERSION=${{ matrix.python_version }}
export CU_VERSION="cpu"
export SYSTEM_VERSION_COMPAT=0
export TORCH_VERSION=nightly
export ARCH=x86_64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh

tests-silicon:
strategy:
matrix:
python_version: ["3.9"]
fail-fast: false
uses: pytorch/test-infra/.github/workflows/macos_job.yml@main
with:
runner: macos-m1-12
repository: pytorch/tensordict
timeout: 120
script: |
# Set env vars from matrix
set -e
set -v
export PYTHON_VERSION=${{ matrix.python_version }}
export CU_VERSION="cpu"
export SYSTEM_VERSION_COMPAT=0
export TORCH_VERSION=nightly
export ARCH=arm64

echo "PYTHON_VERSION: $PYTHON_VERSION"
echo "CU_VERSION: $CU_VERSION"

## setup_env.sh
bash .circleci/unittest/linux/scripts/setup_env.sh
bash .circleci/unittest/linux/scripts/install.sh
bash .circleci/unittest/linux/scripts/run_test.sh
bash .circleci/unittest/linux/scripts/post_process.sh