TensorRT 9.3 updates (#3661)
* TensorRT 9.3 updates (no submodule updates)

Signed-off-by: Michal Guzek <mguzek@nvidia.com>

* Update to ONNX-TensorRT 9.3

Signed-off-by: Michal Guzek <mguzek@nvidia.com>

---------

Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Co-authored-by: Michal Guzek <mguzek@nvidia.com>
moraxu authored Feb 9, 2024
1 parent 93b6044 commit 6d1397e
Showing 111 changed files with 5,491 additions and 680 deletions.
11 changes: 10 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,15 @@
# TensorRT OSS Release Changelog

## 9.2.0 GA - 2023-12-04
## 9.3.0 GA - 2024-02-09

Key Features and Updates:

- Demo changes
- Faster Text-to-image using SDXL & INT8 quantization using AMMO
- Updated tooling
- Polygraphy v0.49.7

## 9.2.0 GA - 2023-11-27

Key Features and Updates:

10 changes: 5 additions & 5 deletions README.md
@@ -26,7 +26,7 @@ You can skip the **Build** section to enjoy TensorRT with Python.
To build the TensorRT-OSS components, you will first need the following software packages.

**TensorRT GA build**
* TensorRT v9.2.0.5
* TensorRT v9.3.0.1
* Available from direct download links listed below

**System Packages**
@@ -73,16 +73,16 @@ To build the TensorRT-OSS components, you will first need the following software
If using the TensorRT OSS build container, TensorRT libraries are preinstalled under `/usr/lib/x86_64-linux-gnu` and you may skip this step.

Else download and extract the TensorRT GA build from [NVIDIA Developer Zone](https://developer.nvidia.com) with the direct links below:
- [TensorRT 9.2.0.5 for CUDA 11.8, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/9.2.0/tensorrt-9.2.0.5.linux.x86_64-gnu.cuda-11.8.tar.gz)
- [TensorRT 9.2.0.5 for CUDA 12.2, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/9.2.0/tensorrt-9.2.0.5.linux.x86_64-gnu.cuda-12.2.tar.gz)
- [TensorRT 9.3.0.1 for CUDA 11.8, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/9.3.0/tensorrt-9.3.0.1.linux.x86_64-gnu.cuda-11.8.tar.gz)
- [TensorRT 9.3.0.1 for CUDA 12.2, Linux x86_64](https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/9.3.0/tensorrt-9.3.0.1.linux.x86_64-gnu.cuda-12.2.tar.gz)


**Example: Ubuntu 20.04 on x86-64 with cuda-12.2**

```bash
cd ~/Downloads
tar -xvzf tensorrt-9.2.0.5.linux.x86_64-gnu.cuda-12.2.tar.gz
export TRT_LIBPATH=`pwd`/TensorRT-9.2.0.5
tar -xvzf tensorrt-9.3.0.1.linux.x86_64-gnu.cuda-12.2.tar.gz
export TRT_LIBPATH=`pwd`/TensorRT-9.3.0.1
```
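
After the export, it can help to confirm that `TRT_LIBPATH` really points at the extracted directory before starting a build. The following is a minimal sketch, not part of the repository: the helper name `trt_libpath_ok` is hypothetical, and it only checks that the path is non-empty and is a directory.

```shell
# Hypothetical helper (not part of the TensorRT repository): succeeds when
# the path in $1 is non-empty and refers to an existing directory.
trt_libpath_ok() {
  [ -n "$1" ] && [ -d "$1" ]
}

# Example usage after the export above:
#   trt_libpath_ok "$TRT_LIBPATH" || echo "TRT_LIBPATH is not a directory: $TRT_LIBPATH" >&2
```

A failing check usually means the tarball was extracted somewhere other than the current directory, or the version in the directory name does not match the one in the export.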

## Setting Up The Build Environment
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
9.2.0.5
9.3.0.1
18 changes: 13 additions & 5 deletions demo/Diffusion/README.md
@@ -7,7 +7,7 @@ This demo application ("demoDiffusion") showcases the acceleration of Stable Dif
### Clone the TensorRT OSS repository

```bash
git clone git@github.com:NVIDIA/TensorRT.git -b release/9.2 --single-branch
git clone git@github.com:NVIDIA/TensorRT.git -b release/9.3 --single-branch
cd TensorRT
```

@@ -16,7 +16,7 @@ cd TensorRT
Install nvidia-docker using [these instructions](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker).

```bash
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.07-py3 /bin/bash
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.12-py3 /bin/bash
```

### Install latest TensorRT release
@@ -26,7 +26,7 @@ python3 -m pip install --upgrade pip
python3 -m pip install --pre --upgrade --extra-index-url https://pypi.nvidia.com tensorrt
```

> NOTE: TensorRT 9.0 is only available as a pre-release
> NOTE: TensorRT 9.x is only available as a pre-release
Check your installed version using:
`python3 -c 'import tensorrt;print(tensorrt.__version__)'`
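
The version check above can be turned into a guard that fails loudly when the installed wheel is not from the expected release line. This is a minimal sketch under stated assumptions: the helper name `trt_version_has_prefix` is hypothetical, and it matches on a prefix rather than the full string because the version reported by a pre-release wheel may carry an extra suffix.

```shell
# Hypothetical helper (not part of the TensorRT repository): succeeds when
# the version string in $1 starts with the prefix in $2.
trt_version_has_prefix() {
  case "$1" in
    "$2"*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Example usage (requires the tensorrt Python package to be installed):
#   installed="$(python3 -c 'import tensorrt; print(tensorrt.__version__)')"
#   trt_version_has_prefix "$installed" "9.3.0" || echo "unexpected TensorRT version: $installed" >&2
```

The prefix `9.3.0` is taken from the `VERSION` file updated in this commit; adjust it when targeting a different release branch.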
@@ -48,8 +48,8 @@ diffusers 0.23.1
onnx 1.14.0
onnx-graphsurgeon 0.3.26
onnxruntime 1.15.1
polygraphy 0.49.1
tensorrt 9.2.0.5
polygraphy 0.49.7
tensorrt 9.3.0.1
tokenizers 0.13.2
torch 2.1.0
transformers 4.31.0
@@ -137,6 +137,14 @@ It is also possible to combine multiple LoRAs.
python3 demo_txt2img_xl.py "Picture of a rustic Italian village with Olive trees and mountains" --version=xl-1.0 --lora-path "ostris/crayon_style_lora_sdxl" "ostris/watercolor_style_lora_sdxl" --lora-scale 0.3 0.7 --onnx-dir onnx-sdxl-lora --engine-dir engine-sdxl-lora --build-enable-refit
```

### Faster Text-to-image using SDXL & INT8 quantization using AMMO

```bash
python3 demo_txt2img_xl.py "a photo of an astronaut riding a horse on mars" --version xl-1.0 --onnx-dir onnx-sdxl --engine-dir engine-sdxl --int8 --quantization-level 3
```

Note that the calibration process can be quite time-consuming, and will be repeated if `--quantization-level`, `--denoising-steps`, or `--onnx-dir` is changed.

### Faster Text-to-Image using SDXL + LCM (Latent Consistency Model) LoRA weights
[LCM-LoRA](https://arxiv.org/abs/2311.05556) produces good quality images in 4 to 8 denoising steps instead of the 30+ needed by the base model. Note that we use the LCM scheduler and disable classifier-free guidance by setting `--guidance-scale` to 0.
LoRA weights are fused into the ONNX and finalized TensorRT plan files in this example.