Skip to content

Commit

Permalink
fix: Upgrade to PyTorch 2.0.1 Release Candidate + Other improvements (#…
Browse files Browse the repository at this point in the history
  • Loading branch information
gs-olive authored May 2, 2023
1 parent e2184bc commit 2b70a2f
Show file tree
Hide file tree
Showing 16 changed files with 141 additions and 113 deletions.
22 changes: 11 additions & 11 deletions .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,7 @@ commands:
sudo apt-get --purge remove "*nvidia*"
install-cudnn:
description: "Install CUDNN 8.5.0"
description: "Install CUDNN 8.8.0"
parameters:
os:
type: string
Expand All @@ -112,10 +112,10 @@ commands:
default: "x86_64"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
cuda-version:
type: string
default: "cuda11.7"
default: "cuda11.8"
steps:
- run:
name: Install CUDNN
Expand Down Expand Up @@ -200,7 +200,7 @@ commands:
default: "cuda11.8"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
trt-version-short:
type: string
default: "8.6.0"
Expand Down Expand Up @@ -252,7 +252,7 @@ commands:
default: "8.6.0"
cudnn-version-long:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
steps:
- run:
name: Set up python environment
Expand All @@ -269,10 +269,10 @@ commands:
parameters:
torch-build:
type: string
default: "2.0.0"
default: "2.0.1"
torch-build-index:
type: string
default: "https://download.pytorch.org/whl/cu118"
default: "https://download.pytorch.org/whl/test/cu118"
steps:
- run:
name: Install Torch
Expand Down Expand Up @@ -474,6 +474,7 @@ commands:
- run: mkdir -p /tmp/artifacts
- run:
name: Run core / C++ tests
no_output_timeout: 15m
environment:
LD_LIBRARY_PATH: "/opt/circleci/.pyenv/versions/3.9.4/lib/python3.9/site-packages/torch_tensorrt.libs:/home/circleci/project/bazel-project/external/libtorch_pre_cxx11_abi/lib/:/home/circleci/project/bazel-project/external/tensorrt/lib/:/usr/local/cuda-11.8/lib64/:$LD_LIBRARY_PATH"
command: |
Expand Down Expand Up @@ -1205,10 +1206,10 @@ parameters:
# Nightly platform config
torch-build:
type: string
default: "2.0.0"
default: "2.0.1"
torch-build-index:
type: string
default: "https://download.pytorch.org/whl/cu118"
default: "https://download.pytorch.org/whl/test/cu118"
torch-build-legacy:
type: string
default: "1.13.1+cu117"
Expand All @@ -1217,7 +1218,7 @@ parameters:
default: "https://download.pytorch.org/whl/cu117"
cudnn-version:
type: string
default: "8.5.0.96"
default: "8.8.0.121"
trt-version-short:
type: string
default: "8.6.0"
Expand Down Expand Up @@ -1412,4 +1413,3 @@ workflows:
trt-version-short: << pipeline.parameters.trt-version-short >>
cudnn-version: << pipeline.parameters.cudnn-version >>
python-version: << pipeline.parameters.python-version >>

26 changes: 14 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,12 +31,7 @@ In the case of building on top of a custom base container, you first must determ
version of the PyTorch C++ ABI. If your source of PyTorch is pytorch.org, likely this is the pre-cxx11-abi in which case you must modify `//docker/dist-build.sh` to not build the
C++11 ABI version of Torch-TensorRT.

You can then build the container using:


```bash
docker build --build-arg BASE_IMG=<IMAGE> -f docker/Dockerfile -t torch_tensorrt:latest .
```
You can then build the container using the build command in the [docker README](docker/README.md#instructions)

If you would like to build outside a docker container, please follow the section [Compiling Torch-TensorRT](#compiling-torch-tensorrt)

Expand Down Expand Up @@ -73,6 +68,7 @@ import torch_tensorrt
...
trt_ts_module = torch_tensorrt.compile(torch_script_module,
# If the inputs to the module are plain Tensors, specify them via the `inputs` argument:
inputs = [example_tensor, # Provide example tensor for input shape or...
torch_tensorrt.Input( # Specify input object with shape and dtype
min_shape=[1, 3, 224, 224],
Expand All @@ -81,6 +77,12 @@ trt_ts_module = torch_tensorrt.compile(torch_script_module,
# For static size shape=[1, 3, 224, 224]
dtype=torch.half) # Datatype of input tensor. Allowed options torch.(float|half|int8|int32|bool)
],
# For inputs containing tuples or lists of tensors, use the `input_signature` argument:
# Below, we have an input consisting of a Tuple of two Tensors (Tuple[Tensor, Tensor])
# input_signature = ( (torch_tensorrt.Input(shape=[1, 3, 224, 224], dtype=torch.half),
# torch_tensorrt.Input(shape=[1, 3, 224, 224], dtype=torch.half)), ),
enabled_precisions = {torch.half}, # Run with FP16
)
Expand Down Expand Up @@ -114,17 +116,17 @@ torch.jit.save(trt_ts_module, "trt_torchscript_module.ts") # save the TRT embedd
These are the following dependencies used to verify the testcases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.

- Bazel 5.2.0
- Libtorch 2.0.0.dev20230103 (built with CUDA 11.7)
- CUDA 11.7
- cuDNN 8.5.0
- TensorRT 8.5.1.7
- Libtorch 2.0.1 (built with CUDA 11.8)
- CUDA 11.8
- cuDNN 8.8.0
- TensorRT 8.6.0

## Prebuilt Binaries and Wheel files

Releases: https://github.com/pytorch/TensorRT/releases

```
pip install torch-tensorrt==1.2.0 --find-links https://github.com/pytorch/TensorRT/releases/expanded_assets/v1.2.0
pip install torch-tensorrt
```

## Compiling Torch-TensorRT
Expand Down Expand Up @@ -245,7 +247,7 @@ A tarball with the include files and library can then be found in bazel-bin
### Running Torch-TensorRT on a JIT Graph

> Make sure to add LibTorch to your LD_LIBRARY_PATH <br>
> `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/bazel-Torch-TensorRT/external/libtorch/lib`
> `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(pwd)/bazel-TensorRT/external/libtorch/lib`
``` shell
bazel run //cpp/bin/torchtrtc -- $(realpath <PATH TO GRAPH>) out.ts <input-size>
Expand Down
14 changes: 7 additions & 7 deletions WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -51,17 +51,17 @@ new_local_repository(
http_archive(
name = "libtorch",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "292b3f81e7c857fc102be93e2e44c40cdb4d8ef03d98121bc6af434c66e8490b",
sha256 = "c5174f18c0866421a5738d389aaea0c02f32a1a5be5f0747dc8dd0d96034c9b0",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.0%2Bcu118.zip"],
urls = ["https://download.pytorch.org/libtorch/test/cu118/libtorch-cxx11-abi-shared-with-deps-latest.zip"],
)

http_archive(
name = "libtorch_pre_cxx11_abi",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "f3cbd7e9593f0c64b8671d02a21d562c98b60ef1abf5898c0ee9acfbc5a6b5d2",
sha256 = "cc19b398cf435e0e34f347ef90fc11c2a42703998330a9c4a9fb0d2291737df7",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/cu118/libtorch-shared-with-deps-2.0.0%2Bcu118.zip"],
urls = ["https://download.pytorch.org/libtorch/test/cu118/libtorch-shared-with-deps-latest.zip"],
)

# Download these tarballs manually from the NVIDIA website
Expand All @@ -71,10 +71,10 @@ http_archive(
http_archive(
name = "cudnn",
build_file = "@//third_party/cudnn/archive:BUILD",
sha256 = "5454a6fd94f008728caae9adad993c4e85ef36302e26bce43bea7d458a5e7b6d",
strip_prefix = "cudnn-linux-x86_64-8.5.0.96_cuda11-archive",
sha256 = "36fff137153ef73e6ee10bfb07f4381240a86fb9fb78ce372414b528cbab2293",
strip_prefix = "cudnn-linux-x86_64-8.8.0.121_cuda11-archive",
urls = [
"https://developer.nvidia.com/compute/cudnn/secure/8.5.0/local_installers/11.7/cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz",
"https://developer.download.nvidia.com/compute/cudnn/secure/8.8.0/local_installers/11.8/cudnn-linux-x86_64-8.8.0.121_cuda11-archive.tar.xz",
],
)

Expand Down
89 changes: 54 additions & 35 deletions docker/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,40 +1,54 @@
# Base image starts with CUDA
ARG BASE_IMG=nvidia/cuda:11.7.1-devel-ubuntu20.04
ARG BASE_IMG=nvidia/cuda:11.8.0-devel-ubuntu22.04
FROM ${BASE_IMG} as base

ARG TENSORRT_VERSION
RUN test -n "$TENSORRT_VERSION" || (echo "No tensorrt version specified, please use --build-arg TENSORRT_VERSION=x.y.z to specify a version." && exit 1)
ARG CUDNN_VERSION
RUN test -n "$CUDNN_VERSION" || (echo "No cudnn version specified, please use --build-arg CUDNN_VERSION=x.y.z to specify a version." && exit 1)

ARG PYTHON_VERSION=3.10
ENV PYTHON_VERSION=${PYTHON_VERSION}

ARG USE_CXX11_ABI
ENV USE_CXX11=${USE_CXX11_ABI}
ENV DEBIAN_FRONTEND=noninteractive

# Install basic dependencies
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt install -y build-essential manpages-dev wget zlib1g software-properties-common git
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y python3.8 python3.8-distutils python3.8-dev
RUN wget https://bootstrap.pypa.io/get-pip.py
RUN ln -s /usr/bin/python3.8 /usr/bin/python
RUN python get-pip.py
RUN pip3 install wheel

# Install Pytorch
RUN pip3 install torch==2.0.0.dev20230103+cu117 torchvision==0.15.0.dev20230103+cu117 --extra-index-url https://download.pytorch.org/whl/nightly/cu117

# Install CUDNN + TensorRT
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
RUN mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
RUN apt install -y build-essential manpages-dev wget zlib1g software-properties-common git libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget ca-certificates curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev mecab-ipadic-utf8

# Install PyEnv and desired Python version
ENV HOME="/root"
ENV PYENV_DIR="$HOME/.pyenv"
ENV PATH="$PYENV_DIR/shims:$PYENV_DIR/bin:$PATH"
RUN wget -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer &&\
chmod 755 pyenv-installer &&\
bash pyenv-installer &&\
eval "$(pyenv init -)"

RUN pyenv install -v ${PYTHON_VERSION}
RUN pyenv global ${PYTHON_VERSION}

# Install CUDNN + TensorRT + dependencies
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
RUN mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/7fa2af80.pub
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 536F8F1DE80F6A35
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys A4B469963BF863CC
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update
RUN apt-get install -y libcudnn8=8.5.0* libcudnn8-dev=8.5.0*
RUN apt-get install -y libcudnn8=${CUDNN_VERSION}* libcudnn8-dev=${CUDNN_VERSION}*

RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
RUN add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
RUN apt-get update

RUN apt-get install -y libnvinfer8=8.5.1* libnvinfer-plugin8=8.5.1* libnvinfer-dev=8.5.1* libnvinfer-plugin-dev=8.5.1* libnvonnxparsers8=8.5.1-1* libnvonnxparsers-dev=8.5.1-1* libnvparsers8=8.5.1-1* libnvparsers-dev=8.5.1-1*
RUN apt-get install -y libnvinfer8=${TENSORRT_VERSION}* libnvinfer-plugin8=${TENSORRT_VERSION}* libnvinfer-dev=${TENSORRT_VERSION}* libnvinfer-plugin-dev=${TENSORRT_VERSION}* libnvonnxparsers8=${TENSORRT_VERSION}-1* libnvonnxparsers-dev=${TENSORRT_VERSION}-1* libnvparsers8=${TENSORRT_VERSION}-1* libnvparsers-dev=${TENSORRT_VERSION}-1*

# Setup Bazel
ARG BAZEL_VERSION=5.2.0
RUN wget -q https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-linux-x86_64 -O /usr/bin/bazel \
&& chmod a+x /usr/bin/bazel
# Setup Bazel via Bazelisk
RUN wget -q https://github.com/bazelbuild/bazelisk/releases/download/v1.16.0/bazelisk-linux-amd64 -O /usr/bin/bazel &&\
chmod a+x /usr/bin/bazel

# Build Torch-TensorRT in an auxillary container
FROM base as torch-tensorrt-builder-base
Expand All @@ -43,19 +57,24 @@ ARG ARCH="x86_64"
ARG TARGETARCH="amd64"

RUN apt-get install -y python3-setuptools
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub
RUN apt-get update
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub

RUN apt-get update && apt-get install -y --no-install-recommends locales ninja-build && rm -rf /var/lib/apt/lists/* && locale-gen en_US.UTF-8
RUN apt-get update &&\
apt-get install -y --no-install-recommends locales ninja-build &&\
rm -rf /var/lib/apt/lists/* &&\
locale-gen en_US.UTF-8

FROM torch-tensorrt-builder-base as torch-tensorrt-builder

COPY . /workspace/torch_tensorrt/src
WORKDIR /workspace/torch_tensorrt/src
RUN cp ./docker/WORKSPACE.docker WORKSPACE

# Symlink the path pyenv is using for python with the /opt directory for package sourcing
RUN ln -s "`pyenv which python | xargs dirname | xargs dirname`/lib/python$PYTHON_VERSION/site-packages" "/opt/python3"

# This script builds both libtorchtrt bin/lib/include tarball and the Python wheel, in dist/
RUN ./docker/dist-build.sh
RUN bash ./docker/dist-build.sh

# Copy and install Torch-TRT into the main container
FROM base as torch-tensorrt
Expand All @@ -64,13 +83,13 @@ COPY . /opt/torch_tensorrt
COPY --from=torch-tensorrt-builder /workspace/torch_tensorrt/src/py/dist/ .

RUN cp /opt/torch_tensorrt/docker/WORKSPACE.docker /opt/torch_tensorrt/WORKSPACE
RUN pip3 install *.whl && rm -fr /workspace/torch_tensorrt/py/dist/* *.whl

# Install native tensorrt python package required by torch_tensorrt whl file
RUN pip install tensorrt==8.5.1.7
RUN pip install -r /opt/torch_tensorrt/py/requirements.txt
RUN pip install tensorrt==${TENSORRT_VERSION}.*
RUN pip install *.whl && rm -fr /workspace/torch_tensorrt/py/dist/* *.whl

WORKDIR /opt/torch_tensorrt
ENV LD_LIBRARY_PATH /usr/local/lib/python3.8/dist-packages/torch/lib:/usr/local/lib/python3.8/dist-packages/torch_tensorrt/lib:/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}
ENV PATH /usr/local/lib/python3.8/dist-packages/torch_tensorrt/bin:${PATH}

ENV LD_LIBRARY_PATH /opt/python3/site-packages/torch/lib:/opt/python3/site-packages/torch_tensorrt/lib:/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}
ENV PATH /opt/python3/site-packages/torch_tensorrt/bin:${PATH}

CMD /bin/bash
15 changes: 11 additions & 4 deletions docker/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,22 +2,29 @@

* Use `Dockerfile` to build a container which provides the exact development environment that our master branch is usually tested against.

* `Dockerfile` currently uses the exact library versions (Torch, CUDA, CUDNN, TensorRT) listed in <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a> to build Torch-TensorRT.
* The `Dockerfile` currently uses <a href="https://github.com/bazelbuild/bazelisk">Bazelisk</a> to select the Bazel version, and uses the exact library versions of Torch and CUDA listed in <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a>.
* The desired versions of CUDNN and TensorRT must be specified as build-args, with major, minor, and patch versions as in: `--build-arg TENSORRT_VERSION=a.b.c --build-arg CUDNN_VERSION=x.y.z`
* [**Optional**] The desired base image be changed by explicitly setting a base image, as in `--build-arg BASE_IMG=nvidia/cuda:11.8.0-devel-ubuntu22.04`, though this is optional
* [**Optional**] Additionally, the desired Python version can be changed by explicitly setting a version, as in `--build-arg PYTHON_VERSION=3.10`, though this is optional as well.

* This `Dockerfile` installs `pre-cxx11-abi` versions of Pytorch and builds Torch-TRT using `pre-cxx11-abi` libtorch as well.
Note: To install `cxx11_abi` version of Torch-TensorRT, enable `USE_CXX11=1` flag so that `dist-build.sh` can build it accordingly.

Note: By default the container uses the `pre-cxx11-abi` version of Torch + Torch-TRT. If you are using a workflow that requires a build of PyTorch on the CXX11 ABI (e.g. using the PyTorch NGC containers as a base image), add the Docker build argument: `--build-arg USE_CXX11_ABI=1`

### Dependencies

* Install nvidia-docker by following https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

### Instructions

- The example below uses CUDNN 8.8.0 and TensorRT 8.6.0
- See <a href="https://github.com/pytorch/TensorRT#dependencies">dependencies</a> for a list of current default dependencies.

> From root of Torch-TensorRT repo
Build:
```
DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile -t torch_tensorrt:latest .
DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=8.6.0 --build-arg CUDNN_VERSION=8.8.0 -f docker/Dockerfile -t torch_tensorrt:latest .
```

Run:
Expand All @@ -38,4 +45,4 @@ bazel test //tests/core/conversion/converters:test_activation --compilation_mode

### Pytorch NGC containers

We also ship Torch-TensorRT in <a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch">Pytorch NGC containers </a>. Release notes for these containers can be found <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">here</a>. Check out `release/ngc/23.XX` branch of Torch-TensorRT for source code that gets shipped with `23.XX` version of Pytorch NGC container.
We also ship Torch-TensorRT in <a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch">Pytorch NGC containers </a>. Release notes for these containers can be found <a href="https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html">here</a>. Check out `release/ngc/23.XX` branch of Torch-TensorRT for source code that gets shipped with `23.XX` version of Pytorch NGC container.
16 changes: 6 additions & 10 deletions docker/WORKSPACE.docker
Original file line number Diff line number Diff line change
Expand Up @@ -48,20 +48,16 @@ new_local_repository(
# Tarballs and fetched dependencies (default - use in cases when building from precompiled bin and tarballs)
#############################################################################################################

http_archive(
new_local_repository(
name = "libtorch",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "59b8b5e1954a86d50b79c13f06398d385b200da13e37a08ecf31d3c62e5ca127",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/nightly/cu117/libtorch-cxx11-abi-shared-with-deps-2.0.0.dev20230103%2Bcu117.zip"],
path = "/opt/python3/site-packages/torch/",
build_file = "third_party/libtorch/BUILD"
)

http_archive(
new_local_repository(
name = "libtorch_pre_cxx11_abi",
build_file = "@//third_party/libtorch:BUILD",
sha256 = "e260fc7476be89d1650953e8643e9f7363845f5a52de4bab87ac0e619c1f6ad4",
strip_prefix = "libtorch",
urls = ["https://download.pytorch.org/libtorch/nightly/cu117/libtorch-shared-with-deps-2.0.0.dev20230103%2Bcu117.zip"],
path = "/opt/python3/site-packages/torch/",
build_file = "third_party/libtorch/BUILD"
)

####################################################################################
Expand Down
Loading

0 comments on commit 2b70a2f

Please sign in to comment.