Torch-TensorRT v1.0.0 #693
Replies: 4 comments 2 replies
-
Can I install this on a Jetson Nano running the latest JetPack (4.6)?
-
Please follow these instructions: https://nvidia.github.io/Torch-TensorRT/tutorials/installation.html#building-natively-on-aarch64-jetson
On Tuesday, December 21, 2021 2:07:29 PM, Nikolay Ulmasov wrote:
Can I install this on Jetson Nano running latest JetPack (4.6)? `pip3 install torch-tensorrt -f https://github.com/NVIDIA/Torch-TensorRT/releases` installs `torch_tensorrt-0.0.0-py3-none-any.whl`, but that results in a "module 'torch_tensorrt' has no attribute 'compile'" error. Do I need to go via Bazel? Can I install that on Jetson Nano?
-
Thank you for your reply @narendasan. I spent another hour trying to make sense of it but failed.
-
Would it be possible for you to release compiled Windows binaries, or at least a step-by-step tutorial on how to build them? I am not comfortable enough with the process yet to do it myself.
-
New Name!, Support for PyTorch 1.10, CUDA 11.3, New Packaging and Distribution Options, Stabilized APIs, Stabilized Partial Compilation, Adjusted Default Behavior, Usability Improvements, New Converters, Bug Fixes
This is the first stable release of Torch-TensorRT, targeting PyTorch 1.10, CUDA 11.3 (on x86_64; CUDA 10.2 on aarch64), cuDNN 8.2 and TensorRT 8.0, with backwards-compatible source for TensorRT 7.1. On aarch64, Torch-TensorRT primarily targets JetPack 4.6, with backwards-compatible source for JetPack 4.5. This version also removes deprecated APIs such as `InputRange` and `op_precision`.
New Name
TRTorch is now Torch-TensorRT! TRTorch started out almost two years ago as a small experimental project compiling TorchScript to TensorRT. Now that we are hitting v1.0.0, with APIs and major features stabilizing, we felt that the name of the project should reflect the ecosystem of tools it is joining with this release, namely TF-TRT (https://blog.tensorflow.org/2021/01/leveraging-tensorflow-tensorrt-integration.html) and MXNet-TensorRT (https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/performance/backend/tensorrt/tensorrt). Since we were already significantly changing APIs with this release to reflect what we learned over the last two years of using TRTorch, we felt this was the right time to change the name as well.
The overall process to port forward from TRTorch is as follows:
Python
The package has been renamed from `trtorch` to `torch_tensorrt`.
Components that used to live under the `trtorch` namespace have now been separated. IR-agnostic components (`torch_tensorrt.Input`, `torch_tensorrt.Device`, `torch_tensorrt.ptq`, `torch_tensorrt.logging`) will continue to live under the top-level namespace. IR-specific components like `torch_tensorrt.ts.compile`, `torch_tensorrt.ts.convert_method_to_trt_engine` and `torch_tensorrt.ts.TensorRTCompileSpec` will live in a TorchScript-specific namespace. This gives us space to explore other IRs that might be relevant to the project in the future. In place of the old top-level `compile` and `convert_method_to_trt_engine` are new ones that call the IR-specific versions based on what is provided to them. This also means that you can now provide a raw `torch.nn.Module` to `torch_tensorrt.compile` and Torch-TensorRT will handle the TorchScripting step for you. For the most part, the sole change needed to switch namespaces is to replace `trtorch` with `torch_tensorrt`.
C++
The namespace has been renamed from `trtorch` to `torch_tensorrt`, and components specific to the IR like `compile`, `convert_method_to_trt_engine` and `CompileSpec` are in a `torchscript` namespace, while agnostic components are at the top level. Namespace aliases for `torch_tensorrt` -> `torchtrt` and `torchscript` -> `ts` are included. Again, the port forward process for namespaces should be a find and replace. Finally, the libraries `libtrtorch.so`, `libtrtorchrt.so` and `libtrtorch_plugins.so` have been renamed to `libtorchtrt.so`, `libtorchtrt_runtime.so` and `libtorchtrt_plugins.so` respectively.
CLI:
`trtorchc` has been renamed to `torchtrtc`.
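As a rough illustration of the find-and-replace port for Python sources, the sketch below rewrites the standalone identifier `trtorch` to `torch_tensorrt`. The helper name `port_namespace` and the sample source string are hypothetical, and the word-boundary pattern is only an assumption about what a typical codebase needs. Library file names such as `libtrtorch.so` are deliberately left alone, since they follow the separate renaming listed above.

```python
import re

# Assumption: matching `trtorch` as a whole word is enough for typical sources.
# The word boundary keeps names like `libtrtorch.so` from being rewritten.
_TRTORCH = re.compile(r"\btrtorch\b")


def port_namespace(source: str) -> str:
    """Replace the standalone identifier `trtorch` with `torch_tensorrt`."""
    return _TRTORCH.sub("torch_tensorrt", source)


# Hypothetical snippet of pre-1.0 user code.
old = "import trtorch\nx = trtorch.Input((1, 3, 224, 224))  # links libtrtorch.so\n"
new = port_namespace(old)
print(new)
```

A textual rename like this only covers the namespace change; calls that passed a settings dict still need the kwargs change described under Stabilized APIs.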
Stabilized APIs
Python
Many of the APIs have changed slightly in this release to be more self-consistent and more usable. These changes begin with the Python API, for which `compile`, `convert_method_to_trt_engine` and `TensorRTCompileSpec` now use kwargs instead of dictionaries. As many features came out of beta and experimental stability, the need for multiple levels of nesting in settings has decreased, so kwargs make much more sense. You can port forward to the new APIs by unpacking your existing `compile_spec` dict in the arguments to `compile` or similar functions.
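A minimal sketch of the unpacking step, runnable without Torch-TensorRT installed. `compile_stub` is a hypothetical stand-in for `torch_tensorrt.compile` that only echoes what it receives, and the values in `compile_spec` are placeholders rather than real `Input` objects or torch dtypes. The old and new call shapes in the comments are assumptions based on the description above.

```python
def compile_stub(module, **kwargs):
    """Hypothetical stand-in for torch_tensorrt.compile; returns what it was given."""
    return {"module": module, "settings": kwargs}


# Existing settings dict, as used with the pre-1.0 dictionary-based API.
# The values are placeholders for illustration only.
compile_spec = {
    "inputs": [(1, 3, 224, 224)],
    "enabled_precisions": {"fp16"},
}

# Old style (assumed shape of the call): the dict is passed as one argument
#   trt_mod = trtorch.compile(module, compile_spec)
# New style: the same dict is unpacked into keyword arguments
#   trt_mod = torch_tensorrt.compile(module, **compile_spec)
result = compile_stub("my_module", **compile_spec)
print(result["settings"])
```

Because `**compile_spec` expands each top-level key into a keyword argument, a flat settings dict can be reused unchanged.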
This release also introduces support for providing tensors as examples to Torch-TensorRT. In place of a `torch_tensorrt.Input` in the list of inputs, you can pass a Tensor. This can only be used to set a static input size. There are also some things to be aware of, which are discussed later in the release notes.
Now that Torch-TensorRT separates components specific to particular IRs into their own namespaces, there is a replacement for the old `compile` and `convert_method_to_trt_engine` functions at the top level. These functions take any PyTorch-generated format, including `torch.nn.Module`s, and decide the best way to compile it down to TensorRT. In v1.0.0 this means going through TorchScript and returning a `torch.jit.ScriptModule`. You can specify the IR to try using the `ir` arg for these functions.
Due to partial compilation becoming stable in v1.0.0, there are now four new fields which replace the old `torch_fallback` struct.
C++
The changes to the C++ API, other than the reorganization and renaming of the namespaces, mostly serve to make Torch-TensorRT consistent between Python and C++, namely by renaming `trtorch::CompileGraph` to `torch_tensorrt::ts::compile` and `trtorch::ConvertGraphToTRTEngine` to `torch_tensorrt::ts::convert_method_to_trt_engine`. Beyond that, similar to Python, the partial compilation struct `TorchFallback` has been removed and replaced by four fields in `torch_tensorrt::ts::CompileSpec`.
CLI
Similarly, these partial compilation fields have been renamed in `torchtrtc`.
Going forward, breaking changes to the API of the magnitude seen in this release will be accompanied by a major version bump.
Stabilized Partial Compilation
Partial compilation should be considered stable for static input shape and is now enabled by default. In the case of dynamic shape, set `require_full_compilation` to `True`.
Adjusted Defaults
Input Types
The default behavior of Torch-TensorRT has shifted slightly. The most important change is to the inferred input type. In prior versions, the expected input type for a Tensor, unless set explicitly, was based on `op_precision`. With that field removed in this release and replaced by `enabled_precisions` (introduced in v0.4.0), this behavior no longer makes sense. Torch-TensorRT now follows these rules to determine the Input type for a Tensor:
1. If no dtype is specified for an Input, Torch-TensorRT will determine the input type by inspecting the uses of this Input. It will trace the lifetime of this tensor to the first tensor operation using weights stored in the provided module. The type of the weights is the inferred type of the Input, using the rule that PyTorch requires like types for Tensor operations. The goal of this behavior is to maintain the concept that Torch-TensorRT modules should feel no different than normal PyTorch modules. Therefore you can expect the inferred input type to match the type of those weights.
2. Users can override this behavior and set the Input type to whatever they wish using the `dtype` field of `torch_tensorrt.Input`. Torch-TensorRT will always respect the user setting but may emit a warning stating that the model provided expects a different input type. This is mainly to notify you that dropping the compiled module in place of the raw `torch.nn.Module` might throw errors, and that casting before inference might be necessary.
3. When an example tensor is provided, it is treated as the equivalent Input, for example `Input(shape=(1, 3, 32, 32), dtype=dtype.half, format=TensorFormat.contiguous)`. This is subject to the behavior in 2.
Workspace Size
Now by default the workspace size is set to 1GB for all GPUs Pascal based and newer (SM capability 6 or above). Maxwell and older cards including Jetson Nano have a workspace of 256MB by default. This value is user settable.
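The default described above can be sketched as a small rule keyed on the SM major version. This is an illustration only, not the library's implementation: the function name `default_workspace_size` is hypothetical, and reading "1GB" and "256MB" as binary units (2^30 and 2^20 bytes) is an assumption.

```python
MIB = 1 << 20  # assumption: "MB" in the notes is read as 2**20 bytes
GIB = 1 << 30  # assumption: "GB" in the notes is read as 2**30 bytes


def default_workspace_size(sm_major: int) -> int:
    """Default workspace size in bytes for a GPU with the given SM major version."""
    # Pascal and newer (SM 6 or above) get 1GB; Maxwell and older get 256MB.
    return GIB if sm_major >= 6 else 256 * MIB


# Jetson Nano has a Maxwell GPU (SM 5.3), so it falls in the 256MB bucket.
nano_default = default_workspace_size(5)
pascal_default = default_workspace_size(6)
print(nano_default, pascal_default)
```

With PyTorch available, `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple whose first element could be passed to a helper like this.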
Dependencies
1.0.0 (2021-11-09)
Bug Fixes
Features
Add functionality for tests to use precompiled libraries (b5c324a)
Add QAT patch which modifies scale factor dtype to INT32 (4a10673)
Add TF32 override flag in bazelrc for CI-Testing (7a0c9a5)
Add VGG QAT sample notebook which demonstrates end-end workflow for QAT models (8bf6dd6)
Augment python package to include bin, lib, include directories (ddc0685)
handle scalar type of size [] in shape_analysis (fca53ce)
support aten::and.bool evaluator (6d73e43)
support aten::conv1d and aten::conv_transpose1d (c8dc6e9)
support aten::eq.str evaluator (5643972)
support setting input types of subgraph in fallback, handle Tensor type in evaluated_value_map branch in MarkOutputs (4778b2b)
support truncate_long_and_double in fallback subgraph input type (0bc3c05)
Update documentation with new library name Torch-TensorRT (e5f96d9)
Updating the pre_built to prebuilt (51412c7)
//:libtrtorch: Ship a WORKSPACE file and BUILD file with the (7ac6f1c)
//core/partitioning: Improved logging and code org for the (8927e77)
//cpp: Adding example tensors as a way to set input spec (70a7bb3)
//py: Add the git revision to non release builds (4a0a918)
//py: Allow example tensors from torch to set shape (01d525d)
feat!: Changing the default behavior for selecting the input type (a234335)
refactor!: Removing deprecated InputRange, op_precision and input_shapes (621bc67)
feat(//py)!: Porting forward the API to use kwargs (17e0e8a)
refactor(//py)!: Kwargs updates and support for shifting internal apis (2a0d1c8)
refactor!(//cpp): Inlining partial compilation settings since the (19ecc64)
refactor! : Update default workspace size based on platforms. (391a4c0)
feat!: Turning on partial compilation by default (52e2f05)
refactor!: API level rename (483ef59)
refactor!: Changing the C++ api to be snake case (f34e230)
refactor! : Update Pytorch version to 1.10 (cc7d0b7)
refactor!: Updating bazel version for py build container (06533fe)
BREAKING CHANGES
input shape fields which were deprecated in TRTorch v0.4.0
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
to build Torch-TensorRT to 4.2.1.
This was done since the only version of bazel available
in our build container for python apis is 4.2.1
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
from a dictionary of settings to a set of kwargs for the various
compilation functions. This will break existing code. However
there is simple guidance to port forward your code:
Given a dict of valid TRTorch CompileSpec settings
You can use this same dict with the new APIs by changing your code from:
to:
which will unpack the dictionary as arguments to the function
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
arguments to a set of kwargs. You can port forward using
Also in preparation for partial compilation to be enabled by default
settings related to torch fallback have been moved to the top level
instead of
now there are new settings
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
to inline settings regarding partial compilation in preparation
for it to be turned on by default
Now in the compile spec, instead of a `torch_fallback` field with its associated struct, there are four new fields in the compile spec
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
Signed-off-by: Dheeraj Peri peri.dheeraj@gmail.com
by default. Unsupported modules will attempt to be
run partially in PyTorch and partially in TensorRT
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
TRTorch/Torch-TensorRT APIs. Now torchscript specific functions
are segregated into their own torch_tensorrt::torchscript /
torch_tensorrt.ts namespaces. Generic utils will remain in the
torch_tensorrt namespace. Guidance on how to port forward will follow in
the next commits
APIs to be snake case and for CompileModules to
become just compile
Signed-off-by: Naren Dasan narens@nvidia.com
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Dheeraj Peri peri.dheeraj@gmail.com
the compiler where if the user does not specify an input data
type explicitly instead of using the enabled precision, now
the compiler will inspect the model provided to infer the
data type for the input that will not cause an error if
the model was run in torch. In practice this means
then default input type is FP32
then default input type is FP16
If the data type cannot be determined the compiler will
default to FP32.
This calculation is done per input tensor so if one input
is inferred to use FP32 and another INT32 then the expected
types will be the same (FP32, INT32)
As was the same before if the user defines the data type
explicitly or provides an example tensor the data type
specified there will be respected
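The precedence described in this change can be sketched as a three-step decision, applied once per input tensor. This is not the compiler's implementation: `resolve_input_dtype` is a hypothetical helper, dtypes are plain strings for illustration, and it assumes the inferred type comes from the weights of the first operation that consumes the input, as the Input Types section describes.

```python
from typing import Optional


def resolve_input_dtype(user_dtype: Optional[str],
                        first_weight_dtype: Optional[str]) -> str:
    """Pick the expected dtype for one input tensor."""
    # 1. An explicit dtype (or an example tensor's dtype) is always respected.
    if user_dtype is not None:
        return user_dtype
    # 2. Otherwise use the dtype inferred from the weights of the first
    #    operation that consumes this input.
    if first_weight_dtype is not None:
        return first_weight_dtype
    # 3. If nothing can be determined, fall back to FP32.
    return "fp32"


# Resolved per input, so a model with two inputs can get two different types.
expected = [
    resolve_input_dtype(None, "fp16"),   # inferred from FP16 weights
    resolve_input_dtype("int32", None),  # set explicitly by the user
    resolve_input_dtype(None, None),     # undetermined, defaults to FP32
]
print(expected)
```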
Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com
Operators Supported
Operators Currently Supported Through Converters
Operators Currently Supported Through Evaluators
Device? device=None, bool? pin_memory=None) -> (Tensor)
Layout? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor)
Layout? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor)
This discussion was created from the release Torch-TensorRT v1.0.0.