fix: Upgrade to PyTorch 2.0.1 Release Candidate + Other improvements #1857

Merged 1 commit on May 2, 2023
22 changes: 11 additions & 11 deletions .circleci/config.yml
@@ -102,7 +102,7 @@ commands:
sudo apt-get --purge remove "*nvidia*"

install-cudnn:
description: "Install CUDNN 8.5.0"
description: "Install CUDNN 8.8.0"
parameters:
os:
type: string
@@ -112,10 +112,10 @@ commands:
default: "x86_64"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
cuda-version:
type: string
default: "cuda11.7"
default: "cuda11.8"
steps:
- run:
name: Install CUDNN
@@ -200,7 +200,7 @@ commands:
default: "cuda11.8"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
trt-version-short:
type: string
default: "8.6.0"
@@ -252,7 +252,7 @@ commands:
default: "8.6.0"
cudnn-version-long:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
steps:
- run:
name: Set up python environment
@@ -269,10 +269,10 @@ commands:
parameters:
torch-build:
type: string
default: "2.0.0"
default: "2.0.1"
torch-build-index:
type: string
default: "https://download.pytorch.org/whl/cu118"
default: "https://download.pytorch.org/whl/test/cu118"
steps:
- run:
name: Install Torch
@@ -474,6 +474,7 @@ commands:
- run: mkdir -p /tmp/artifacts
- run:
name: Run core / C++ tests
no_output_timeout: 15m
environment:
LD_LIBRARY_PATH: "/opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt.libs:/home/circleci/project/bazel-project/external/libtorch_pre_cxx11_abi/lib/:/home/circleci/project/bazel-project/external/tensorrt/lib/:/usr/local/cuda-11.8/lib64/:$LD_LIBRARY_PATH"
command: |
@@ -1205,10 +1206,10 @@ parameters:
# Nightly platform config
torch-build:
type: string
default: "2.0.0"
default: "2.0.1"
torch-build-index:
type: string
default: "https://download.pytorch.org/whl/cu118"
default: "https://download.pytorch.org/whl/test/cu118"
torch-build-legacy:
type: string
default: "1.13.1+cu117"
@@ -1217,7 +1218,7 @@ parameters:
default: "https://download.pytorch.org/whl/cu117"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
trt-version-short:
type: string
default: "8.6.0"
@@ -1412,4 +1413,3 @@ workflows:
trt-version-short: << pipeline.parameters.trt-version-short >>
cudnn-version: << pipeline.parameters.cudnn-version >>
python-version: << pipeline.parameters.python-version >>
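
Note on the `torch-build` / `torch-build-index` parameters: the sketch below shows one plausible way the two values could be combined into a pip command. The actual "Install Torch" step is not visible in this diff, so the function name `torch_install_command` and the use of `--extra-index-url` are assumptions for illustration, not the pipeline's real command.

```python
import shlex


def torch_install_command(torch_build: str, torch_build_index: str) -> str:
    """Return a pip command that pins torch and adds an extra wheel index.

    Hypothetical helper: the real CircleCI step may pass different flags.
    """
    args = [
        "pip3",
        "install",
        f"torch=={torch_build}",
        "--extra-index-url",
        torch_build_index,
    ]
    # shlex.join quotes each argument only when the shell would need it.
    return shlex.join(args)


if __name__ == "__main__":
    # Defaults introduced by this change: 2.0.1 from the test (RC) index.
    print(torch_install_command("2.0.1", "https://download.pytorch.org/whl/test/cu118"))
```

Switching the index from `whl/cu118` to `whl/test/cu118` is what lets a release-candidate build resolve before the final release is published to the stable index.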

26 changes: 14 additions & 12 deletions README.md
@@ -31,12 +31,7 @@ In the case of building on top of a custom base container, you first must determ
version of the PyTorch C++ ABI. If your source of PyTorch is pytorch.org, likely this is the pre-cxx11-abi in which case you must modify `//docker/dist-build.sh` to not build the
C++11 ABI version of Torch-TensorRT.

You can then build the container using:


```bash
docker build --build-arg BASE_IMG=<IMAGE> -f docker/Dockerfile -t torch_tensorrt:latest .
```
You can then build the container using the build command in the [docker README](docker/README.md#instructions)

If you would like to build outside a docker container, please follow the section [Compiling Torch-TensorRT](#compiling-torch-tensorrt)

@@ -73,6 +68,7 @@ import torch_tensorrt
...

trt_ts_module = torch_tensorrt.compile(torch_script_module,
# If the inputs to the module are plain Tensors, specify them via the `inputs` argument:
inputs = [example_tensor, # Provide example tensor for input shape or...
torch_tensorrt.Input( # Specify input object with shape and dtype
min_shape=[1, 3, 224, 224],
@@ -81,6 +77,12 @@ trt_ts_module = torch_tensorrt.compile(torch_script_module,
# For static size shape=[1, 3, 224, 224]
dtype=torch.half) # Datatype of input tensor. Allowed options torch.(float|half|int8|int32|bool)
],

# For inputs containing tuples or lists of tensors, use the `input_signature` argument:
# Below, we have an input consisting of a Tuple of two Tensors (Tuple[Tensor, Tensor])
# input_signature = ( (torch_tensorrt.Input(shape=[1, 3, 224, 224], dtype=torch.half),
# torch_tensorrt.Input(shape=[1, 3, 224, 224], dtype=torch.half)), ),

enabled_precisions = {torch.half}, # Run with FP16
)

@@ -114,17 +116,17 @@ torch.jit.save(trt_ts_module, "trt_torchscript_module.ts") # save the TRT embedd
These are the following dependencies used to verify the testcases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

- Bazel 5.2.0
- Libtorch 2.0.0.dev20230103 (built with CUDA 11.7)
- CUDA 11.7
- cuDNN 8.5.0
- TensorRT 8.5.1.7
- Libtorch 2.0.1 (built with CUDA 11.8)
- CUDA 11.8
- cuDNN 8.8.0
- TensorRT 8.6.0

## Prebuilt Binaries and Wheel files

Releases: https://github.com/pytorch/TensorRT/releases

```
pip install torch-tensorrt==1.2.0 --find-links https://github.com/pytorch/TensorRT/releases/expanded_assets/v1.2.0
pip install torch-tensorrt
```

## Compiling Torch-TensorRT
@@ -245,7 +247,7 @@ A tarball with the include files and library can then be found in bazel-bin
### Running Torch-TensorRT on a JIT Graph

> Make sure to add LibTorch to your LD_LIBRARY_PATH <br>
> `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/bazel-Torch-TensorRT/external/libtorch/lib`
> `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/bazel-TensorRT/external/libtorch/lib`

``` shell
bazel run //cpp/bin/torchtrtc -- $(realpath <PATH TO GRAPH>) out.ts <input-size>
14 changes: 7 additions & 7 deletions WORKSPACE
@@ -51,17 +51,17 @@ new_local_repository(
http_archive(
name = "libtorch",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "292b3f81e7c857fc102be93e2e44c40cdb4d8ef03d98121bc6af434c66e8490b",
sha256 = "c5174f18c0866421a5738d389aaea0c02f32a1a5be5f0747dc8dd0d96034c9b0",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.0%2Bcu118.zip"],
urls = ["https://download.pytorch.org/libtorch/test/cu118/libtorch-cxx11-abi-shared-with-deps-latest.zip"],
)

http_archive(
name = "libtorch_pre_cxx11_abi",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "f3cbd7e9593f0c64b8671d02a21d562c98b60ef1abf5898c0ee9acfbc5a6b5d2",
sha256 = "cc19b398cf435e0e34f347ef90fc11c2a42703998330a9c4a9fb0d2291737df7",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/cu118/libtorch-shared-with-deps-2.0.0%2Bcu118.zip"],
urls = ["https://download.pytorch.org/libtorch/test/cu118/libtorch-shared-with-deps-latest.zip"],
)

# Download these tarballs manually from the NVIDIA website
@@ -71,10 +71,10 @@ http_archive(
http_archive(
name = "cudnn",
build_file = "@//third_party/cudnn/archive:BUILD",
sha256 = "5454a6fd94f008728caae9adad993c4e85ef36302e26bce43bea7d458a5e7b6d",
strip_prefix = "cudnn-linux-x86_64-8.5.0.96_cuda11-archive",
sha256 = "36fff137153ef73e6ee10bfb07f4381240a86fb9fb78ce372414b528cbab2293",
strip_prefix = "cudnn-linux-x86_64-8.8.0.121_cuda11-archive",
urls = [
"https://developer.nvidia.com/compute/cudnn/secure/8.5.0/local_installers/11.7/cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz",
"https://developer.download.nvidia.com/compute/cudnn/secure/8.8.0/local_installers/11.8/cudnn-linux-x86_64-8.8.0.121_cuda11-archive.tar.xz",
],
)

89 changes: 54 additions & 35 deletions docker/Dockerfile
@@ -1,40 +1,54 @@
# Base image starts with CUDA
ARG BASE_IMG=nvidia/cuda:11.7.1-devel-ubuntu20.04
ARG BASE_IMG=nvidia/cuda:11.8.0-devel-ubuntu22.04
FROM ${BASE_IMG} as base

ARG TENSORRT_VERSION
RUN test -n "$TENSORRT_VERSION" || (echo "No tensorrt version specified, please use --build-arg TENSORRT_VERSION=x.y.z to specify a version." && exit 1)
ARG CUDNN_VERSION
RUN test -n "$CUDNN_VERSION" || (echo "No cudnn version specified, please use --build-arg CUDNN_VERSION=x.y.z to specify a version." && exit 1)

ARG PYTHON_VERSION=3.10
ENV PYTHON_VERSION=${PYTHON_VERSION}

ARG USE_CXX11_ABI
ENV USE_CXX11=${USE_CXX11_ABI}
ENV DEBIAN_FRONTEND=noninteractive

# Install basic dependencies
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt install -y build-essential manpages-dev wget zlib1g software-properties-common git
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y python3.8 python3.8-distutils python3.8-dev
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN ln -s /usr/bin/python3.8 /usr/bin/python
RUN python get-pip.py
RUN pip3 install wheel

# Install Pytorch
RUN pip3 install torch==2.0.0.dev20230103+cu117 torchvision==0.15.0.dev20230103+cu117 --extra-index-url https://download.pytorch.org/whl/nightly/cu117

# Install CUDNN + TensorRT
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
RUN mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
RUN apt install -y build-essential manpages-dev wget zlib1g software-properties-common git libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget ca-certificates curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev mecab-ipadic-utf8

# Install PyEnv and desired Python version
ENV HOME="/root"
ENV PYENV_DIR="$HOME/.pyenv"
ENV PATH="$PYENV_DIR/shims:$PYENV_DIR/bin:$PATH"
RUN wget -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer &&\
chmod 755 pyenv-installer &&\
bash pyenv-installer &&\
eval "$(pyenv init -)"

RUN pyenv install -v ${PYTHON_VERSION}
RUN pyenv global ${PYTHON_VERSION}

# Install CUDNN + TensorRT + dependencies
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
RUN mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/7fa2af80.pub
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 536F8F1DE80F6A35
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update
RUN apt-get install -y libcudnn8=8.5.0* libcudnn8-dev=8.5.0*
RUN apt-get install -y libcudnn8=${CUDNN_VERSION}* libcudnn8-dev=${CUDNN_VERSION}*

RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update

RUN apt-get install -y libnvinfer8=8.5.1* libnvinfer-plugin8=8.5.1* libnvinfer-dev=8.5.1* libnvinfer-plugin-dev=8.5.1* libnvonnxparsers8=8.5.1-1* libnvonnxparsers-dev=8.5.1-1* libnvparsers8=8.5.1-1* libnvparsers-dev=8.5.1-1*
RUN apt-get install -y libnvinfer8=${TENSORRT_VERSION}* libnvinfer-plugin8=${TENSORRT_VERSION}* libnvinfer-dev=${TENSORRT_VERSION}* libnvinfer-plugin-dev=${TENSORRT_VERSION}* libnvonnxparsers8=${TENSORRT_VERSION}-1* libnvonnxparsers-dev=${TENSORRT_VERSION}-1* libnvparsers8=${TENSORRT_VERSION}-1* libnvparsers-dev=${TENSORRT_VERSION}-1*

# Setup Bazel
ARG BAZEL_VERSION=5.2.0
RUN wget -q https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-linux-x86_64 -O /usr/bin/bazel \
&& chmod a+x /usr/bin/bazel
# Setup Bazel via Bazelisk
RUN wget -q https://github.com/bazelbuild/bazelisk/releases/download/v1.16.0/bazelisk-linux-amd64 -O /usr/bin/bazel &&\
chmod a+x /usr/bin/bazel

# Build Torch-TensorRT in an auxiliary container
FROM base as torch-tensorrt-builder-base
@@ -43,19 +57,24 @@ ARG ARCH="x86_64"
ARG TARGETARCH="amd64"

RUN apt-get install -y python3-setuptools
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
RUN apt-get update
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub

RUN apt-get update && apt-get install -y --no-install-recommends locales ninja-build && rm -rf /var/lib/apt/lists/* && locale-gen en_US.UTF-8
RUN apt-get update &&\
apt-get install -y --no-install-recommends locales ninja-build &&\
rm -rf /var/lib/apt/lists/* &&\
locale-gen en_US.UTF-8

FROM torch-tensorrt-builder-base as torch-tensorrt-builder

COPY . /workspace/torch_tensorrt/src
WORKDIR /workspace/torch_tensorrt/src
RUN cp ./docker/WORKSPACE.docker WORKSPACE

# Symlink the path pyenv is using for python with the /opt directory for package sourcing
RUN ln -s "`pyenv which python | xargs dirname | xargs dirname`/lib/python$PYTHON_VERSION/site-packages" "/opt/python3"

# This script builds both libtorchtrt bin/lib/include tarball and the Python wheel, in dist/
RUN ./docker/dist-build.sh
RUN bash ./docker/dist-build.sh

# Copy and install Torch-TRT into the main container
FROM base as torch-tensorrt
@@ -64,13 +83,13 @@ COPY . /opt/torch_tensorrt
COPY --from=torch-tensorrt-builder /workspace/torch_tensorrt/src/py/dist/ .

RUN cp /opt/torch_tensorrt/docker/WORKSPACE.docker /opt/torch_tensorrt/WORKSPACE
RUN pip3 install *.whl && rm -fr /workspace/torch_tensorrt/py/dist/* *.whl

# Install native tensorrt python package required by torch_tensorrt whl file
RUN pip install tensorrt==8.5.1.7
RUN pip install -r /opt/torch_tensorrt/py/requirements.txt
RUN pip install tensorrt==${TENSORRT_VERSION}.*
RUN pip install *.whl && rm -fr /workspace/torch_tensorrt/py/dist/* *.whl

WORKDIR /opt/torch_tensorrt
ENV LD_LIBRARY_PATH /usr/local/lib/python3.8/dist-packages/torch/lib:/usr/local/lib/python3.8/dist-packages/torch_tensorrt/lib:/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}
ENV PATH /usr/local/lib/python3.8/dist-packages/torch_tensorrt/bin:${PATH}

ENV LD_LIBRARY_PATH /opt/python3/site-packages/torch/lib:/opt/python3/site-packages/torch_tensorrt/lib:/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}
ENV PATH /opt/python3/site-packages/torch_tensorrt/bin:${PATH}

CMD /bin/bash
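
Note on the pyenv symlink step: the Dockerfile derives the site-packages directory by applying `dirname` twice to the interpreter path and appending `lib/python$PYTHON_VERSION/site-packages`. A minimal Python sketch of the same path arithmetic follows. The function name and the example interpreter path are hypothetical, and the sketch only computes the path; it does not create the symlink.

```python
from pathlib import PurePosixPath


def site_packages_for(python_executable: str, python_version: str) -> PurePosixPath:
    """Mirror `which python | xargs dirname | xargs dirname` plus the lib suffix."""
    # <prefix>/bin/python -> <prefix>
    prefix = PurePosixPath(python_executable).parent.parent
    return prefix / "lib" / f"python{python_version}" / "site-packages"


if __name__ == "__main__":
    # Illustrative path only; the real one depends on what pyenv installs.
    print(site_packages_for("/root/.pyenv/versions/3.10.0/bin/python", "3.10"))
```

CPython names this directory with the major.minor version (for example `python3.10`), so the derived path would presumably only exist when `PYTHON_VERSION` is given in that form, as in the default `3.10`.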
15 changes: 11 additions & 4 deletions docker/README.md
@@ -2,22 +2,29 @@

* Use `Dockerfile` to build a container which provides the exact development environment that our master branch is usually tested against.

* `Dockerfile` currently uses the exact library versions (Torch, CUDA, CUDNN, TensorRT) listed in <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a> to build Torch-TensorRT.
* The `Dockerfile` currently uses <a href="https://github.com/bazelbuild/bazelisk">Bazelisk</a> to select the Bazel version, and uses the exact library versions of Torch and CUDA listed in <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a>.
* The desired versions of CUDNN and TensorRT must be specified as build-args, with major, minor, and patch versions as in: `--build-arg TENSORRT_VERSION=a.b.c --build-arg CUDNN_VERSION=x.y.z`
* [**Optional**] The base image can be changed by setting it explicitly, as in `--build-arg BASE_IMG=nvidia/cuda:11.8.0-devel-ubuntu22.04`
* [**Optional**] The Python version can likewise be changed by setting it explicitly, as in `--build-arg PYTHON_VERSION=3.10`

* This `Dockerfile` installs `pre-cxx11-abi` versions of Pytorch and builds Torch-TRT using `pre-cxx11-abi` libtorch as well.
Note: To install `cxx11_abi` version of Torch-TensorRT, enable `USE_CXX11=1` flag so that `dist-build.sh` can build it accordingly.

Note: By default the container uses the `pre-cxx11-abi` version of Torch + Torch-TRT. If you are using a workflow that requires a build of PyTorch on the CXX11 ABI (e.g. using the PyTorch NGC containers as a base image), add the Docker build argument: `--build-arg USE_CXX11_ABI=1`

### Dependencies

* Install nvidia-docker by following https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

### Instructions

- The example below uses CUDNN 8.8.0 and TensorRT 8.6.0
- See <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a> for a list of current default dependencies.

> From root of Torch-TensorRT repo

Build:
```
DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile -t torch_tensorrt:latest .
DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=8.6.0 --build-arg CUDNN_VERSION=8.8.0 -f docker/Dockerfile -t torch_tensorrt:latest .
```

Run:
@@ -38,4 +45,4 @@ bazel test //tests/core/conversion/converters:test_activation --compilation_mode

### Pytorch NGC containers

We also ship Torch-TensorRT in <a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch">Pytorch NGC containers </a>. Release notes for these containers can be found <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">here</a>. Check out `release/ngc/23.XX` branch of Torch-TensorRT for source code that gets shipped with `23.XX` version of Pytorch NGC container.
We also ship Torch-TensorRT in <a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch">Pytorch NGC containers </a>. Release notes for these containers can be found <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">here</a>. Check out `release/ngc/23.XX` branch of Torch-TensorRT for source code that gets shipped with `23.XX` version of Pytorch NGC container.
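
Note on the build arguments described in this README: `TENSORRT_VERSION` and `CUDNN_VERSION` are required and take major, minor, and patch components, while `BASE_IMG`, `PYTHON_VERSION`, and `USE_CXX11_ABI` are optional. A minimal sketch of assembling the documented `docker build` invocation follows. The helper `docker_build_command` and its `x.y.z` validation are assumptions for illustration and not part of the repository.

```python
import re
import shlex
from typing import Optional

# Required versions are documented as major.minor.patch, e.g. 8.6.0.
_VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def docker_build_command(
    tensorrt_version: str,
    cudnn_version: str,
    base_img: Optional[str] = None,
    python_version: Optional[str] = None,
    use_cxx11_abi: bool = False,
    tag: str = "torch_tensorrt:latest",
) -> str:
    """Build the `docker build` command line from the documented build-args."""
    build_args = {
        "TENSORRT_VERSION": tensorrt_version,
        "CUDNN_VERSION": cudnn_version,
    }
    for name, value in build_args.items():
        if not _VERSION_RE.match(value):
            raise ValueError(f"{name} must look like x.y.z, got {value!r}")

    # Optional arguments are only passed when explicitly requested.
    if base_img:
        build_args["BASE_IMG"] = base_img
    if python_version:
        build_args["PYTHON_VERSION"] = python_version
    if use_cxx11_abi:
        build_args["USE_CXX11_ABI"] = "1"

    cmd = ["docker", "build"]
    for name, value in build_args.items():
        cmd += ["--build-arg", f"{name}={value}"]
    cmd += ["-f", "docker/Dockerfile", "-t", tag, "."]
    return shlex.join(cmd)


if __name__ == "__main__":
    print(docker_build_command("8.6.0", "8.8.0"))
```

The sketch leaves out the `DOCKER_BUILDKIT=1` environment prefix shown in the README. It can be set in the calling shell, and the command should be run from the root of the Torch-TensorRT repo.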
16 changes: 6 additions & 10 deletions docker/WORKSPACE.docker
@@ -48,20 +48,16 @@ new_local_repository(
# Tarballs and fetched dependencies (default - use in cases when building from precompiled bin and tarballs)
#############################################################################################################

http_archive(
new_local_repository(
name = "libtorch",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "59b8b5e1954a86d50b79c13f06398d385b200da13e37a08ecf31d3c62e5ca127",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/nightly/cu117/libtorch-cxx11-abi-shared-with-deps-2.0.0.dev20230103%2Bcu117.zip"],
path = "/opt/python3/site-packages/torch/",
build_file = "third_party/libtorch/BUILD"
)

http_archive(
new_local_repository(
name = "libtorch_pre_cxx11_abi",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "e260fc7476be89d1650953e8643e9f7363845f5a52de4bab87ac0e619c1f6ad4",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/nightly/cu117/libtorch-shared-with-deps-2.0.0.dev20230103%2Bcu117.zip"],
path = "/opt/python3/site-packages/torch/",
build_file = "third_party/libtorch/BUILD"
)

####################################################################################