From b0937fb094fdd4781a5717b5dcc427982bd82353 Mon Sep 17 00:00:00 2001 From: Naren Dasan Date: Mon, 13 May 2024 16:45:10 -0700 Subject: [PATCH] tool: Opset coverage notebook Signed-off-by: Naren Dasan Signed-off-by: Naren Dasan --- .gitignore | 5 + .../classtorch__tensorrt_1_1DataType.html | 52 +- ...rch__tensorrt_1_1Device_1_1DeviceType.html | 36 +- .../classtorch__tensorrt_1_1TensorFormat.html | 56 +- ...ensorrt_1_1ptq_1_1Int8CacheCalibrator.html | 46 +- ...ch__tensorrt_1_1ptq_1_1Int8Calibrator.html | 59 +- ...8h_1a18d295a837ac71add5578860b55e5502.html | 16 +- ...8h_1a282fd3c0b1c3a215148ae372070e1268.html | 12 +- ...8h_1a31398a6d4d27e28817afb0f0139e909e.html | 12 +- ...8h_1a35703561b26b1a9d2738ad7d58b27827.html | 12 +- ...8h_1abd1465eb38256d3f22cc1426b23d516b.html | 12 +- ...8h_1abe87b341f562fd1cf40b7672e4d759da.html | 12 +- ...8h_1ad19939408f7be171a74a89928b36eb59.html | 12 +- ...8h_1adad592a7b1b7eed529cdf6acd584c883.html | 12 +- docs/_cpp_api/dir_cpp.html | 12 +- docs/_cpp_api/dir_cpp_include.html | 12 +- .../dir_cpp_include_torch_tensorrt.html | 12 +- ...8h_1a130f65408ad8cbaee060f05e8db69558.html | 240 +++--- ...8h_1a3fbe5d72e4fc624dbd038853079620eb.html | 240 +++--- ..._cpp_include_torch_tensorrt_logging.h.html | 40 +- ...e_cpp_include_torch_tensorrt_macros.h.html | 18 +- ...file_cpp_include_torch_tensorrt_ptq.h.html | 27 +- ...clude_torch_tensorrt_torch_tensorrt.h.html | 40 +- ...8h_1a0593f776f469c20469e2f729fc7861a3.html | 240 +++--- ...8h_1a0c012cb374addd90eb1f42eaec570650.html | 240 +++--- ...8h_1a56e110feaaba2c3fd44bd201fd21a76a.html | 240 +++--- ...8h_1a7cb50492421ea9de4e3db895819df6f2.html | 240 +++--- ...8h_1ac46ac0901cb97e3ae6e93b45f24e90b8.html | 240 +++--- ...8h_1ad2efd47b6c3689e58ccc595680579ae5.html | 240 +++--- ...8h_1af8f3443813315af7901903d25dd495cc.html | 240 +++--- ...8h_1a226e3c83379d1012cde8578c1c86b16c.html | 240 +++--- ...8h_1a6186e305f47c1d94b6130ef6c7f7e178.html | 240 +++--- ...8h_1a5b405fd3bf3c8fc2e2a54cbbab979797.html | 240 +++--- ...8h_1a6e19490a08fb1553c9dd347a5ae79db9.html | 240 +++--- ...8h_1a81f9783517335dda877d8cfcf38987c9.html | 240 +++--- ...8h_1ac4ab8313ae72c2c899ea31548b528528.html | 240 +++--- ...8h_1ad1acd06eaeaffbbcf6e7ebf426891384.html | 240 +++--- ...8h_1ad6a4ee8ca6c8f6e5519eb1128ec7f4a1.html | 240 +++--- ...8h_1ae8d56472106eeef37fbe51ff7f40c9b2.html | 240 +++--- docs/_cpp_api/namespace_torch.html | 238 +++--- docs/_cpp_api/namespace_torch_tensorrt.html | 30 +- .../namespace_torch_tensorrt__logging.html | 30 +- .../namespace_torch_tensorrt__ptq.html | 18 +- ...namespace_torch_tensorrt__torchscript.html | 22 +- ..._cpp_include_torch_tensorrt_logging.h.html | 88 +- ...e_cpp_include_torch_tensorrt_macros.h.html | 86 +- ...file_cpp_include_torch_tensorrt_ptq.h.html | 361 ++++---- ...clude_torch_tensorrt_torch_tensorrt.h.html | 620 +++++++------- .../structtorch__tensorrt_1_1Device.html | 38 +- .../structtorch__tensorrt_1_1GraphInputs.html | 14 +- .../structtorch__tensorrt_1_1Input.html | 52 +- ...ensorrt_1_1torchscript_1_1CompileSpec.html | 18 +- docs/_cpp_api/torch_tensort_cpp.html | 109 +-- docs/_cpp_api/unabridged_orphan.html | 19 +- .../torch_compile_resnet_example.ipynb | 2 +- .../torch_compile_advanced_usage.py | 7 +- .../torch_compile_stable_diffusion.py | 5 +- .../_rendered_examples_jupyter.zip | Bin 18728 -> 47589 bytes .../_rendered_examples_python.zip | Bin 10762 -> 36261 bytes .../torch_compile_advanced_usage.ipynb | 6 +- .../torch_compile_stable_diffusion.ipynb | 4 +- .../torch_compile_transformers_example.ipynb | 4 +- 
.../torch_compile_transformers_example.py | 1 + docs/_modules/index.html | 17 +- docs/_modules/torch_tensorrt/_Device.html | 167 ++-- docs/_modules/torch_tensorrt/_Input.html | 125 +-- docs/_modules/torch_tensorrt/_compile.html | 228 ++++- .../torch_tensorrt/dynamo/_SourceIR.html | 6 +- .../torch_tensorrt/dynamo/_compiler.html | 368 +++++--- .../torch_tensorrt/dynamo/_exporter.html | 245 ++++-- .../torch_tensorrt/dynamo/_settings.html | 27 +- .../torch_tensorrt/dynamo/_tracer.html | 6 +- docs/_modules/torch_tensorrt/fx/fx2trt.html | 20 +- .../torch_tensorrt/fx/input_tensor_spec.html | 6 +- docs/_modules/torch_tensorrt/fx/lower.html | 29 +- .../torch_tensorrt/fx/trt_module.html | 24 +- docs/_modules/torch_tensorrt/logging.html | 278 ++++--- .../_sources/_cpp_api/dir_cpp_include.rst.txt | 1 + .../dir_cpp_include_torch_tensorrt.rst.txt | 1 + ...p_include_torch_tensorrt_logging.h.rst.txt | 27 + ...pp_include_torch_tensorrt_macros.h.rst.txt | 1 + ...e_cpp_include_torch_tensorrt_ptq.h.rst.txt | 10 + ...de_torch_tensorrt_torch_tensorrt.h.rst.txt | 27 + .../_cpp_api/namespace_torch_tensorrt.rst.txt | 8 +- .../namespace_torch_tensorrt__logging.rst.txt | 16 +- .../namespace_torch_tensorrt__ptq.rst.txt | 4 +- ...espace_torch_tensorrt__torchscript.rst.txt | 8 +- ...p_include_torch_tensorrt_logging.h.rst.txt | 2 +- ...pp_include_torch_tensorrt_macros.h.rst.txt | 4 +- ...e_cpp_include_torch_tensorrt_ptq.h.rst.txt | 7 +- ...de_torch_tensorrt_torch_tensorrt.h.rst.txt | 2 +- .../structtorch__tensorrt_1_1Input.rst.txt | 2 +- .../getting_started/installation.rst.txt | 25 +- docs/_sources/index.rst.txt | 1 + docs/_sources/py_api/dynamo.rst.txt | 2 + docs/_sources/py_api/torch_tensorrt.rst.txt | 6 +- .../_rendered_examples/dynamo/index.rst.txt | 19 + .../torch_compile_advanced_usage.rst.txt | 15 +- .../torch_compile_stable_diffusion.rst.txt | 11 +- ...torch_compile_transformers_example.rst.txt | 17 +- .../_rendered_examples/index.rst.txt | 18 + .../_sources/user_guide/saving_models.rst.txt | 62 +- docs/_static/basic.css | 40 +- docs/_static/collapsible-lists/LICENSE.md | 11 +- docs/_static/css/theme.css | 5 +- docs/_static/doctools.js | 480 +++++------ docs/_static/documentation_options.js | 6 +- docs/_static/jquery.js | 4 +- docs/_static/language_data.js | 100 +-- docs/_static/searchtools.js | 784 +++++++++--------- docs/cli/torchtrtc.html | 10 +- docs/contributors/conversion.html | 14 +- docs/contributors/dynamo_converters.html | 26 +- docs/contributors/lowering.html | 50 +- docs/contributors/partitioning.html | 22 +- docs/contributors/phases.html | 18 +- docs/contributors/runtime.html | 20 +- docs/contributors/system_overview.html | 20 +- docs/contributors/ts_converters.html | 28 +- docs/contributors/useful_links.html | 22 +- .../writing_dynamo_aten_lowering_passes.html | 18 +- docs/dynamo/dynamo_export.html | 16 +- docs/dynamo/torch_compile.html | 36 +- docs/fx/getting_started_with_fx_path.html | 18 +- docs/genindex.html | 210 ++--- .../getting_started_with_windows.html | 18 +- docs/getting_started/installation.html | 74 +- docs/index.html | 36 +- docs/indices/supported_ops.html | 14 +- docs/objects.inv | Bin 28336 -> 28070 bytes docs/py-modindex.html | 18 +- docs/py_api/dynamo.html | 85 +- docs/py_api/fx.html | 14 +- docs/py_api/logging.html | 138 +-- docs/py_api/ptq.html | 122 +-- docs/py_api/torch_tensorrt.html | 133 +-- docs/py_api/ts.html | 280 +------ docs/search.html | 8 +- docs/searchindex.js | 2 +- .../pytorch-sphinx-theme/docs/changelog.html | 10 +- .../docs/configuring.html | 24 +- 
.../pytorch-sphinx-theme/docs/demo/api.html | 16 +- .../pytorch-sphinx-theme/docs/demo/demo.html | 70 +- .../docs/demo/lists_tables.html | 38 +- .../pytorch-sphinx-theme/docs/demo/long.html | 84 +- .../docs/demo/structure.html | 24 +- docs/src/pytorch-sphinx-theme/docs/index.html | 8 +- .../pytorch-sphinx-theme/docs/installing.html | 12 +- ...creating_torchscript_module_in_python.html | 14 +- docs/ts/getting_started_with_cpp_api.html | 24 +- docs/ts/getting_started_with_python_api.html | 10 +- .../ts/torchscript_frontend_from_pytorch.html | 10 +- .../_rendered_examples/dynamo/index.html | 14 +- .../dynamo/torch_compile_advanced_usage.html | 27 +- .../dynamo/torch_compile_resnet_example.html | 22 +- .../torch_compile_stable_diffusion.html | 23 +- .../torch_compile_transformers_example.html | 23 +- docs/tutorials/_rendered_examples/index.html | 16 +- docs/tutorials/notebooks.html | 36 +- .../serving_torch_tensorrt_with_triton.html | 16 +- docs/user_guide/dynamic_shapes.html | 20 +- docs/user_guide/ptq.html | 16 +- docs/user_guide/runtime.html | 18 +- docs/user_guide/saving_models.html | 83 +- docs/user_guide/using_dla.html | 10 +- .../dynamo/tools/opset_coverage.py | 9 + tools/opset_coverage.ipynb | 602 ++++++++++++++ 167 files changed, 7212 insertions(+), 5416 deletions(-) create mode 100644 tools/opset_coverage.ipynb diff --git a/.gitignore b/.gitignore index 918b69b27c..c8dd4c5308 100644 --- a/.gitignore +++ b/.gitignore @@ -69,3 +69,8 @@ bazel-tensorrt bazel-project build/ wheelhouse/ +*_status.json +tests/py/dynamo/models/*.ts +tests/py/dynamo/models/*.ep +*.deb +*.tar.xz \ No newline at end of file diff --git a/docs/_cpp_api/classtorch__tensorrt_1_1DataType.html b/docs/_cpp_api/classtorch__tensorrt_1_1DataType.html index c6a0d0c0bc..b34b75d496 100644 --- a/docs/_cpp_api/classtorch__tensorrt_1_1DataType.html +++ b/docs/_cpp_api/classtorch__tensorrt_1_1DataType.html @@ -10,7 +10,7 @@ - Class DataType — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Class DataType — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -237,7 +237,7 @@
- v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
@@ -304,6 +304,9 @@
  • Compiling a Transformer using torch.compile and TensorRT
  • Torch Compile Advanced Usage
  • Torch Compile Stable Diffusion
  • +
  • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
  • +
  • Wrapping Custom Kernels to use in TensorRT
  • +
  • Using Torch-TensorRT to Insert the Kernel
• Python API Documentation

    - - + + + + +
    - +
    - +
    -

Enum EngineCapability

Enum Documentation
    enum class torch_tensorrt::EngineCapability : int8_t
    @@ -401,25 +452,25 @@

    Enum Documentation - + - - + + - +

    + - - +
    - +

    @@ -427,11 +478,11 @@

    Enum DocumentationSphinx using a theme provided by Read the Docs.

    - + @@ -454,23 +505,24 @@

    Enum Documentation + + - - + @@ -481,7 +533,7 @@

    Enum DocumentationTwitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -569,9 +621,9 @@

      Resources

    @@ -608,9 +660,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -670,7 +722,7 @@

    Resources

  • Resources
  • - +

Python API Documentation

    -

    Definition (cpp/include/torch_tensorrt/logging.h)

    +

    Definition (cpp/include/torch_tensorrt/logging.h)

    -

    Includes

    +

    Includes

    -

    Included By

    +

    Included By

    -

    Namespaces

    +

    Namespaces

    +
    +

    Enums

    + +
    +
    +

    Functions

    + +
    @@ -492,6 +515,8 @@

    NamespacesIncludes
  • Included By
  • Namespaces
  • +
  • Enums
  • +
  • Functions
  • @@ -512,6 +537,7 @@

    Namespaces + diff --git a/docs/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.html b/docs/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.html index fc44e9f539..5085008489 100644 --- a/docs/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.html +++ b/docs/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.html @@ -10,7 +10,7 @@ - File macros.h — Torch-TensorRT v2.3.0.dev0+85971ff documentation + File macros.h — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
    - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
    @@ -302,6 +302,9 @@
  • Compiling a Transformer using torch.compile and TensorRT
  • Torch Compile Advanced Usage
  • Torch Compile Stable Diffusion
  • +
  • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
  • +
  • Wrapping Custom Kernels to use in TensorRT
  • +
  • Using Torch-TensorRT to Insert the Kernel
• Python API Documentation

      @@ -412,7 +415,7 @@
      -

      File macros.h

      +

      File macros.h

      Parent directory (cpp/include/torch_tensorrt)

      Contents

      @@ -424,7 +427,7 @@

    -

    Definition (cpp/include/torch_tensorrt/macros.h)

    +

    Definition (cpp/include/torch_tensorrt/macros.h)

    -

    Included By

    +

    Included By

    -

    Defines

    +

    Defines

Python API Documentation

    -

    Definition (cpp/include/torch_tensorrt/ptq.h)

    +

    Definition (cpp/include/torch_tensorrt/ptq.h)

    -

    Includes

    +

    Includes

    -

    Classes

    +

    Classes

    +
    +

    Functions

    + +
    @@ -502,6 +513,7 @@

    ClassesIncludes
  • Namespaces
  • Classes
  • +
  • Functions
  • @@ -522,6 +534,7 @@

    Classes + diff --git a/docs/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.html b/docs/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.html index a4be8fad24..e8b37aeb69 100644 --- a/docs/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.html +++ b/docs/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.html @@ -10,7 +10,7 @@ - File torch_tensorrt.h — Torch-TensorRT v2.3.0.dev0+85971ff documentation + File torch_tensorrt.h — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
    - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
    @@ -302,6 +302,9 @@
  • Compiling a Transformer using torch.compile and TensorRT
  • Torch Compile Advanced Usage
  • Torch Compile Stable Diffusion
  • +
  • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
  • +
  • Wrapping Custom Kernels to use in TensorRT
  • +
  • Using Torch-TensorRT to Insert the Kernel
• Python API Documentation

    -

    Definition (cpp/include/torch_tensorrt/torch_tensorrt.h)

    +

    Definition (cpp/include/torch_tensorrt/torch_tensorrt.h)

    -

    Includes

    +

    Includes

    -

    Classes

    +

    Classes

    +
    +

    Enums

    + +
    +
    +

    Functions

    + +
    @@ -504,6 +527,8 @@

    ClassesIncludes
  • Namespaces
  • Classes
  • +
  • Enums
  • +
  • Functions
  • @@ -524,6 +549,7 @@

    Classes + diff --git a/docs/_cpp_api/function_logging_8h_1a0593f776f469c20469e2f729fc7861a3.html b/docs/_cpp_api/function_logging_8h_1a0593f776f469c20469e2f729fc7861a3.html index c3e3443a38..e6d49b265a 100644 --- a/docs/_cpp_api/function_logging_8h_1a0593f776f469c20469e2f729fc7861a3.html +++ b/docs/_cpp_api/function_logging_8h_1a0593f776f469c20469e2f729fc7861a3.html @@ -9,38 +9,48 @@ + + Function torch_tensorrt::logging::get_logging_prefix — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation + - Function torch_tensorrt::logging::get_logging_prefix — Torch-TensorRT v1.4.0+7d1d80773 documentation - - - - - - - - - - - + + + + + + + + + + + + + + + - - - - - - + + + + + + @@ -81,7 +91,19 @@
  • - Mobile +
  • @@ -196,9 +218,9 @@ + - - +
  • Operators Supported
  • - - + + @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::get_logging_prefix

    +

    Function torch_tensorrt::logging::get_logging_prefix

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API std::string torch_tensorrt::logging::get_logging_prefix()
    @@ -384,25 +435,25 @@

    Function Documentation - +

    @@ -437,23 +488,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -464,7 +516,7 @@

    Function Documentation + @@ -528,7 +580,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -552,9 +604,9 @@

      Resources

    @@ -591,9 +643,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -653,7 +705,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::get_reportable_log_level

    +

    Function torch_tensorrt::logging::get_reportable_log_level

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API Level torch_tensorrt::logging::get_reportable_log_level()
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::get_is_colored_output_on

    +

    Function torch_tensorrt::logging::get_is_colored_output_on

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API bool torch_tensorrt::logging::get_is_colored_output_on()
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::set_reportable_log_level

    +

    Function torch_tensorrt::logging::set_reportable_log_level

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::logging::set_reportable_log_level(Level lvl)
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::log

    +

    Function torch_tensorrt::logging::log

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::logging::log(Level lvl, std::string msg)
    @@ -393,25 +444,25 @@

    Function Documentation - +

    @@ -446,23 +497,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -473,7 +525,7 @@

    Function Documentation + @@ -537,7 +589,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -561,9 +613,9 @@

      Resources

    @@ -600,9 +652,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -662,7 +714,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::set_is_colored_output_on

    +

    Function torch_tensorrt::logging::set_is_colored_output_on

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::logging::set_is_colored_output_on(bool colored_output_on)
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::logging::set_logging_prefix

    +

    Function torch_tensorrt::logging::set_logging_prefix

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::logging::set_logging_prefix(std::string prefix)
    @@ -384,25 +435,25 @@

    Function Documentation - +

    @@ -437,23 +488,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -464,7 +516,7 @@

    Function Documentation + @@ -528,7 +580,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -552,9 +604,9 @@

      Resources

    @@ -591,9 +643,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -653,7 +705,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Template Function torch_tensorrt::ptq::make_int8_cache_calibrator

    +

    Template Function torch_tensorrt::ptq::make_int8_cache_calibrator

    -

    Function Documentation

    +

    Function Documentation

    template<typename Algorithm = nvinfer1::IInt8EntropyCalibrator2>
    inline Int8CacheCalibrator<Algorithm> torch_tensorrt::ptq::make_int8_cache_calibrator(const std::string &cache_file_path)
    @@ -399,25 +450,25 @@

    Function Documentation - +

    @@ -452,23 +503,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -479,7 +531,7 @@

    Function Documentation + @@ -543,7 +595,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -567,9 +619,9 @@

      Resources

    @@ -606,9 +658,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -668,7 +720,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Template Function torch_tensorrt::ptq::make_int8_calibrator

    +

    Template Function torch_tensorrt::ptq::make_int8_calibrator

    -

    Function Documentation

    +

    Function Documentation

    template<typename Algorithm = nvinfer1::IInt8EntropyCalibrator2, typename DataLoader>
    inline Int8Calibrator<Algorithm, DataLoader> torch_tensorrt::ptq::make_int8_calibrator(DataLoader dataloader, const std::string &cache_file_path, bool use_cache)
    @@ -405,25 +456,25 @@

    Function Documentation - +

    @@ -458,23 +509,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -485,7 +537,7 @@

    Function Documentation + @@ -549,7 +601,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -573,9 +625,9 @@

      Resources

    @@ -612,9 +664,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -674,7 +726,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::torchscript::check_method_operator_support

    +

    Function torch_tensorrt::torchscript::check_method_operator_support

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API bool torch_tensorrt::torchscript::check_method_operator_support(const torch::jit::Module &module, std::string method_name)
    @@ -399,25 +450,25 @@

    Function Documentation - +

    @@ -452,23 +503,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -479,7 +531,7 @@

    Function Documentation + @@ -543,7 +595,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -567,9 +619,9 @@

      Resources

    @@ -606,9 +658,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -668,7 +720,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::torchscript::compile

    +

    Function torch_tensorrt::torchscript::compile

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API torch::jit::Module torch_tensorrt::torchscript::compile(const torch::jit::Module &module, CompileSpec info)
    @@ -399,25 +450,25 @@

    Function Documentation - +

    @@ -452,23 +503,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -479,7 +531,7 @@

    Function Documentation + @@ -543,7 +595,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -567,9 +619,9 @@

      Resources

    @@ -606,9 +658,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -668,7 +720,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::torchscript::embed_engine_in_new_module

    +

    Function torch_tensorrt::torchscript::embed_engine_in_new_module

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API torch::jit::Module torch_tensorrt::torchscript::embed_engine_in_new_module(const std::string &engine, Device device, const std::vector<std::string> &input_binding_names = std::vector<std::string>(), const std::vector<std::string> &output_binding_names = std::vector<std::string>())
    @@ -405,25 +456,25 @@

    Function Documentation - +

    @@ -458,23 +509,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -485,7 +537,7 @@

    Function Documentation + @@ -549,7 +601,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -573,9 +625,9 @@

      Resources

    @@ -612,9 +664,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -674,7 +726,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::get_build_info

    +

    Function torch_tensorrt::get_build_info

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API std::string torch_tensorrt::get_build_info()
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::set_device

    +

    Function torch_tensorrt::set_device

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::set_device(const int gpu_id)
    @@ -390,25 +441,25 @@

    Function Documentation - +

    @@ -443,23 +494,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -470,7 +522,7 @@

    Function Documentation + @@ -534,7 +586,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -558,9 +610,9 @@

      Resources

    @@ -597,9 +649,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -659,7 +711,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::dump_build_info

    +

    Function torch_tensorrt::dump_build_info

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API void torch_tensorrt::dump_build_info()
    @@ -385,25 +436,25 @@

    Function Documentation - +

    @@ -438,23 +489,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -465,7 +517,7 @@

    Function Documentation + @@ -529,7 +581,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -553,9 +605,9 @@

      Resources

    @@ -592,9 +644,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -654,7 +706,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,20 +407,24 @@
    - - + + + + +
    - +
    - +
    -

    Function torch_tensorrt::torchscript::convert_method_to_trt_engine

    +

    Function torch_tensorrt::torchscript::convert_method_to_trt_engine

    -

    Function Documentation

    +

    Function Documentation

    TORCHTRT_API std::string torch_tensorrt::torchscript::convert_method_to_trt_engine(const torch::jit::Module &module, std::string method_name, CompileSpec info)
    @@ -399,25 +450,25 @@

    Function Documentation - +

    @@ -452,23 +503,24 @@

    Function Documentation

    + + - - - + + + - - + @@ -479,7 +531,7 @@

    Function Documentation + @@ -543,7 +595,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +

    - +
      @@ -567,9 +619,9 @@

      Resources

    @@ -606,9 +658,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -668,7 +720,7 @@

    Resources

  • Resources
  • - + - - + +
    @@ -304,7 +351,7 @@
    - + @@ -323,32 +370,32 @@
    - +
    @@ -360,38 +407,42 @@
    - - + + + + +
    - +
    - +
    -

    Namespace torch

    +

    Namespace torch

    - +
    @@ -423,23 +474,24 @@
    + + - - - + + + - - + @@ -450,7 +502,7 @@ jQuery(function () { SphinxRtdTheme.Navigation.enable(true); }); - + @@ -514,7 +566,7 @@

    Resources

  • Twitter
  • YouTube
  • LinkedIn
  • - +
    - +
      @@ -538,9 +590,9 @@

      Resources

    @@ -577,9 +629,9 @@

    Resources

  • Ecosystem
  • - +
  • - Mobile + Mobile
  • @@ -639,7 +691,7 @@

    Resources

  • Resources
  • - +
    • diff --git a/docs/_cpp_api/namespace_torch_tensorrt.html b/docs/_cpp_api/namespace_torch_tensorrt.html index 6ae0c90880..6ca314035d 100644 --- a/docs/_cpp_api/namespace_torch_tensorrt.html +++ b/docs/_cpp_api/namespace_torch_tensorrt.html @@ -10,7 +10,7 @@ - Namespace torch_tensorrt — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Namespace torch_tensorrt — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -40,7 +40,7 @@ - + + diff --git a/docs/_cpp_api/namespace_torch_tensorrt__logging.html b/docs/_cpp_api/namespace_torch_tensorrt__logging.html index cb545a7e95..8ef1676e8f 100644 --- a/docs/_cpp_api/namespace_torch_tensorrt__logging.html +++ b/docs/_cpp_api/namespace_torch_tensorrt__logging.html @@ -10,7 +10,7 @@ - Namespace torch_tensorrt::logging — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Namespace torch_tensorrt::logging — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -237,7 +237,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -304,6 +304,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -412,47 +415,47 @@
      -

      Program Listing for File logging.h

      +

      Program Listing for File logging.h

      Return to documentation for file (cpp/include/torch_tensorrt/logging.h)

      -
      /*
      - * Copyright (c) NVIDIA Corporation.
      - * All rights reserved.
      - *
      - * This library is licensed under the BSD-style license found in the
      - * LICENSE file in the root directory of this source tree.
      - */
      -#pragma once
      -
      -#include <string>
      -#include "torch_tensorrt/macros.h"
      -
      -namespace torch_tensorrt {
      -namespace logging {
      -enum Level {
      -  kINTERNAL_ERROR,
      -  kERROR,
      -  kWARNING,
      -  kINFO,
      -  kDEBUG,
      -  kGRAPH,
      -};
      -
      -// Are these ones necessary for the user?
      -TORCHTRT_API std::string get_logging_prefix();
      -TORCHTRT_API void set_logging_prefix(std::string prefix);
      -
      -TORCHTRT_API void set_reportable_log_level(Level lvl);
      -
      -TORCHTRT_API void set_is_colored_output_on(bool colored_output_on);
      -
      -TORCHTRT_API Level get_reportable_log_level();
      -
      -TORCHTRT_API bool get_is_colored_output_on();
      -
      -// Dont know if we want this?
      -TORCHTRT_API void log(Level lvl, std::string msg);
      -} // namespace logging
      -} // namespace torch_tensorrt
      +
      /*
      + * Copyright (c) NVIDIA Corporation.
      + * All rights reserved.
      + *
      + * This library is licensed under the BSD-style license found in the
      + * LICENSE file in the root directory of this source tree.
      + */
      +#pragma once
      +
      +#include <string>
      +#include "torch_tensorrt/macros.h"
      +
      +namespace torch_tensorrt {
      +namespace logging {
      +enum Level {
      +  kINTERNAL_ERROR,
      +  kERROR,
      +  kWARNING,
      +  kINFO,
      +  kDEBUG,
      +  kGRAPH,
      +};
      +
      +// Are these ones necessary for the user?
      +TORCHTRT_API std::string get_logging_prefix();
      +TORCHTRT_API void set_logging_prefix(std::string prefix);
      +
      +TORCHTRT_API void set_reportable_log_level(Level lvl);
      +
      +TORCHTRT_API void set_is_colored_output_on(bool colored_output_on);
      +
      +TORCHTRT_API Level get_reportable_log_level();
      +
      +TORCHTRT_API bool get_is_colored_output_on();
      +
+// Don't know if we want this?
      +TORCHTRT_API void log(Level lvl, std::string msg);
      +} // namespace logging
      +} // namespace torch_tensorrt
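For orientation, a minimal usage sketch of the API above (a sketch against this header only; the prefix string and message shown are illustrative):

    #include "torch_tensorrt/logging.h"

    int main() {
      namespace trt_log = torch_tensorrt::logging;
      // Report everything at INFO verbosity and above.
      trt_log::set_reportable_log_level(trt_log::Level::kINFO);
      // Tag Torch-TensorRT output so it is easy to grep in application logs.
      trt_log::set_logging_prefix("[trt] ");
      // Route a message through the same logger the compiler uses.
      trt_log::log(trt_log::Level::kINFO, "starting engine compilation");
      return 0;
    }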
       
      @@ -510,6 +513,7 @@ + diff --git a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.html b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.html index c733e0bb7a..47f77b4c0e 100644 --- a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.html +++ b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.html @@ -10,7 +10,7 @@ - Program Listing for File macros.h — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Program Listing for File macros.h — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -302,6 +302,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -412,46 +415,46 @@
      -

      Program Listing for File macros.h

      +

      Program Listing for File macros.h

      Return to documentation for file (cpp/include/torch_tensorrt/macros.h)

      -
      /*
      - * Copyright (c) NVIDIA Corporation.
      - * All rights reserved.
      - *
      - * This library is licensed under the BSD-style license found in the
      - * LICENSE file in the root directory of this source tree.
      - */
      -#pragma once
      -
      -#if defined(USE_CMAKE_GENERATED_EXPORT_HEADER)
      -#include <torch_tensorrt_export.h>
      -#else
      -#if defined(__GNUC__)
      -#define TORCHTRT_API __attribute__((__visibility__("default")))
      -#define TORCHTRT_HIDDEN __attribute__((__visibility__("hidden")))
      -#else
      -#define TORCHTRT_API
      -#define TORCHTRT_HIDDEN
      -#endif // defined(__GNUC__)
      -#endif // defined(USE_CMAKE_GENERATED_EXPORT_HEADER)
      -
      -// Does this need to be gaurded or something?
      -#define XSTR(x) #x
      -#define STR(x) XSTR(x)
      -
      -#define TORCH_TENSORRT_MAJOR_VERSION 2
      -#define TORCH_TENSORRT_MINOR_VERSION 3
      -#define TORCH_TENSORRT_PATCH_VERSION 0
      -#define TORCH_TENSORRT_VERSION      \
      -  STR(TORCH_TENSORRT_MAJOR_VERSION) \
      -  "." STR(TORCH_TENSORRT_MINOR_VERSION) "." STR(TORCH_TENSORRT_PATCH_VERSION)
      -
      -// Setup namespace aliases for ease of use
      -namespace torch_tensorrt {
      -namespace torchscript {}
      -namespace ts = torchscript;
      -} // namespace torch_tensorrt
      -namespace torchtrt = torch_tensorrt;
      +
      /*
      + * Copyright (c) NVIDIA Corporation.
      + * All rights reserved.
      + *
      + * This library is licensed under the BSD-style license found in the
      + * LICENSE file in the root directory of this source tree.
      + */
      +#pragma once
      +
      +#if defined(USE_CMAKE_GENERATED_EXPORT_HEADER)
      +#include <torch_tensorrt_export.h>
      +#else
      +#if defined(__GNUC__)
      +#define TORCHTRT_API __attribute__((__visibility__("default")))
      +#define TORCHTRT_HIDDEN __attribute__((__visibility__("hidden")))
      +#else
      +#define TORCHTRT_API
      +#define TORCHTRT_HIDDEN
      +#endif // defined(__GNUC__)
      +#endif // defined(USE_CMAKE_GENERATED_EXPORT_HEADER)
      +
+// Does this need to be guarded or something?
      +#define XSTR(x) #x
      +#define STR(x) XSTR(x)
      +
      +#define TORCH_TENSORRT_MAJOR_VERSION 2
      +#define TORCH_TENSORRT_MINOR_VERSION 4
      +#define TORCH_TENSORRT_PATCH_VERSION 0
      +#define TORCH_TENSORRT_VERSION      \
      +  STR(TORCH_TENSORRT_MAJOR_VERSION) \
      +  "." STR(TORCH_TENSORRT_MINOR_VERSION) "." STR(TORCH_TENSORRT_PATCH_VERSION)
      +
      +// Setup namespace aliases for ease of use
      +namespace torch_tensorrt {
      +namespace torchscript {}
      +namespace ts = torchscript;
      +} // namespace torch_tensorrt
      +namespace torchtrt = torch_tensorrt;
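A short note on the version macros above: the XSTR/STR pair is the standard two-step stringification idiom, so macro arguments are expanded before # is applied. A minimal sketch, assuming only this header:

    #include <iostream>
    #include "torch_tensorrt/macros.h"

    int main() {
      // STR(TORCH_TENSORRT_MAJOR_VERSION) expands the macro first, then stringizes,
      // so TORCH_TENSORRT_VERSION concatenates to the literal "2.4.0".
      std::cout << "Torch-TensorRT " << TORCH_TENSORRT_VERSION << std::endl;
      return 0;
    }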
       
      @@ -509,6 +512,7 @@ + diff --git a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.html b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.html index 5277714042..223204dc60 100644 --- a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.html +++ b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.html @@ -10,7 +10,7 @@ - Program Listing for File ptq.h — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Program Listing for File ptq.h — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -302,6 +302,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -412,186 +415,181 @@
      -

      Program Listing for File ptq.h

      +

      Program Listing for File ptq.h

      Return to documentation for file (cpp/include/torch_tensorrt/ptq.h)

      -
      /*
      - * Copyright (c) NVIDIA Corporation.
      - * All rights reserved.
      - *
      - * This library is licensed under the BSD-style license found in the
      - * LICENSE file in the root directory of this source tree.
      - */
      -#pragma once
      -
      -#include <fstream>
      -#include <iostream>
      -#include <iterator>
      -#include <memory>
      -#include <sstream>
      -#include <string>
      -#include <vector>
      -
      -#include "NvInfer.h"
      -#include "torch/torch.h"
      -#include "torch_tensorrt/logging.h"
      -#include "torch_tensorrt/macros.h"
      -
      -#ifndef DOXYGEN_SHOULD_SKIP_THIS
      -namespace nvinfer1 {
      -class IInt8Calibrator;
      -class IInt8EntropyCalibrator2;
      -} // namespace nvinfer1
      -
      -namespace torch_tensorrt {
      -namespace ptq {
      -TORCHTRT_API bool get_batch_impl(void* bindings[], const char* names[], int nbBindings, torch::Tensor& data);
      -}
      -} // namespace torch_tensorrt
      -#endif // DOXYGEN_SHOULD_SKIP_THIS
      -
      -namespace torch_tensorrt {
      -namespace ptq {
      -
      -template <typename Algorithm, typename DataLoaderUniquePtr>
      -class Int8Calibrator : Algorithm {
      -  using DataLoader = typename DataLoaderUniquePtr::element_type;
      -  using Batch = typename DataLoader::super::BatchType;
      -
      - public:
      -  Int8Calibrator(DataLoaderUniquePtr dataloader, const std::string& cache_file_path, bool use_cache)
      -      : dataloader_(dataloader.get()), cache_file_path_(cache_file_path), use_cache_(use_cache) {
      -    for (auto batch : *dataloader_) {
      -      batched_data_.push_back(batch.data);
      -    }
      -    it_ = batched_data_.begin();
      -  }
      -
      -  int getBatchSize() const noexcept override {
      -    // HACK: Torch-TensorRT only uses explict batch sizing, INT8 Calibrator does not
      -    // work when reporting the batch size here and having explicity batching.
      -    // So we just report batch size 1 (warnings will still be printed out).
      -    return 1;
      -    // return static_cast<int>(dataloader_->options().batch_size);
      -  }
      -
      -  bool getBatch(void* bindings[], const char* names[], int nbBindings) noexcept override {
      -    if (it_ != batched_data_.end()) {
      -      auto status = get_batch_impl(bindings, names, nbBindings, *it_);
      -      it_ = ++it_;
      -      return status;
      -    } else {
      -      // Reset iterator if incase calibrator is going to be used again
      -      it_ = batched_data_.begin();
      -      return false;
      -    }
      -  }
      -
      -  const void* readCalibrationCache(size_t& length) noexcept override {
      -    if (use_cache_) {
      -      std::stringstream ss;
      -      ss << "Reading Calibration Cache from " << cache_file_path_;
      -      logging::log(logging::Level::kINFO, ss.str());
      -
      -      cache_.clear();
      -      std::ifstream input(cache_file_path_, std::ios::binary);
      -      input >> std::noskipws;
      -      if (input.good()) {
      -        std::copy(std::istream_iterator<char>(input), std::istream_iterator<char>(), std::back_inserter(cache_));
      -        logging::log(logging::Level::kDEBUG, "Cache read");
      -      }
      -      length = cache_.size();
      -      return length ? cache_.data() : nullptr;
      -    }
      -    return nullptr;
      -  }
      -
      -  void writeCalibrationCache(const void* cache, size_t length) noexcept override {
      -    std::ofstream cache_file(cache_file_path_, std::ios::binary);
      -    cache_file.write(reinterpret_cast<const char*>(cache), length);
      -    std::stringstream ss;
      -    ss << "Saved Calibration Cache to " << cache_file_path_;
      -    logging::log(logging::Level::kINFO, ss.str());
      -  }
      -
      -  operator nvinfer1::IInt8Calibrator*() {
      -    return reinterpret_cast<nvinfer1::IInt8Calibrator*>(this);
      -  }
      -
      - private:
      -  DataLoader* dataloader_;
      -  const std::string& cache_file_path_;
      -  size_t cache_size_ = 0;
      -  bool use_cache_;
      -  std::vector<char> cache_;
      -  std::vector<torch::Tensor> batched_data_;
      -  std::vector<torch::Tensor>::iterator it_;
      -};
      -
      -template <typename Algorithm>
      -class Int8CacheCalibrator : Algorithm {
      - public:
      -  Int8CacheCalibrator(const std::string& cache_file_path) : cache_file_path_(cache_file_path) {}
      -
      -  int getBatchSize() const noexcept override {
      -    // HACK: Torch-TensorRT only uses explict batch sizing, INT8 Calibrator does not
      -    // work when reporting the batch size here and having explicity batching.
      -    // So we just report batch size 1 (warnings will still be printed out).
      -    return 1;
      -  }
      -
      -  bool getBatch(void* bindings[], const char* names[], int nbBindings) noexcept override {
      -    return false;
      -  }
      -
      -  const void* readCalibrationCache(size_t& length) noexcept override {
      -    std::stringstream ss;
      -    ss << "Reading Calibration Cache from " << cache_file_path_;
      -    logging::log(logging::Level::kINFO, ss.str());
      -
      -    cache_.clear();
      -    std::ifstream input(cache_file_path_, std::ios::binary);
      -    input >> std::noskipws;
      -    if (input.good()) {
      -      std::copy(std::istream_iterator<char>(input), std::istream_iterator<char>(), std::back_inserter(cache_));
      -      logging::log(logging::Level::kDEBUG, "Cache read");
      -    }
      -    length = cache_.size();
      -    return length ? cache_.data() : nullptr;
      -  }
      -
      -  void writeCalibrationCache(const void* cache, size_t length) noexcept override {
      -    std::ofstream cache_file(cache_file_path_, std::ios::binary);
      -    cache_file.write(reinterpret_cast<const char*>(cache), length);
      -    std::stringstream ss;
      -    ss << "Saved Calibration Cache to " << cache_file_path_;
      -    logging::log(logging::Level::kINFO, ss.str());
      -  }
      -
      -  operator nvinfer1::IInt8Calibrator*() {
      -    return reinterpret_cast<nvinfer1::IInt8Calibrator*>(this);
      -  }
      -
      - private:
      -  const std::string& cache_file_path_;
      -  size_t cache_size_ = 0;
      -  std::vector<char> cache_;
      -};
      -
      -template <typename Algorithm = nvinfer1::IInt8EntropyCalibrator2, typename DataLoader>
      -inline Int8Calibrator<Algorithm, DataLoader> make_int8_calibrator(
      -    DataLoader dataloader,
      -    const std::string& cache_file_path,
      -    bool use_cache) {
      -  return Int8Calibrator<Algorithm, DataLoader>(std::move(dataloader), cache_file_path, use_cache);
      -}
      -
      -template <typename Algorithm = nvinfer1::IInt8EntropyCalibrator2>
      -inline Int8CacheCalibrator<Algorithm> make_int8_cache_calibrator(const std::string& cache_file_path) {
      -  return Int8CacheCalibrator<Algorithm>(cache_file_path);
      -}
      -
      -} // namespace ptq
      -} // namespace torch_tensorrt
      +
      /*
      + * Copyright (c) NVIDIA Corporation.
      + * All rights reserved.
      + *
      + * This library is licensed under the BSD-style license found in the
      + * LICENSE file in the root directory of this source tree.
      + */
      +#pragma once
      +
      +#include <fstream>
      +#include <iostream>
      +#include <iterator>
      +#include <memory>
      +#include <sstream>
      +#include <string>
      +#include <vector>
      +
      +#include "NvInfer.h"
      +#include "torch/torch.h"
      +#include "torch_tensorrt/logging.h"
      +#include "torch_tensorrt/macros.h"
      +
      +#ifndef DOXYGEN_SHOULD_SKIP_THIS
      +namespace torch_tensorrt {
      +namespace ptq {
      +TORCHTRT_API bool get_batch_impl(void* bindings[], const char* names[], int nbBindings, torch::Tensor& data);
      +}
      +} // namespace torch_tensorrt
      +#endif // DOXYGEN_SHOULD_SKIP_THIS
      +
      +namespace torch_tensorrt {
      +namespace ptq {
      +
      +template <typename Algorithm, typename DataLoaderUniquePtr>
      +class Int8Calibrator : Algorithm {
      +  using DataLoader = typename DataLoaderUniquePtr::element_type;
      +  using Batch = typename DataLoader::super::BatchType;
      +
      + public:
      +  Int8Calibrator(DataLoaderUniquePtr dataloader, const std::string& cache_file_path, bool use_cache)
      +      : dataloader_(dataloader.get()), cache_file_path_(cache_file_path), use_cache_(use_cache) {
      +    for (auto batch : *dataloader_) {
      +      batched_data_.push_back(batch.data);
      +    }
      +    it_ = batched_data_.begin();
      +  }
      +
      +  int getBatchSize() const noexcept override {
+    // HACK: Torch-TensorRT only uses explicit batch sizing; the INT8 calibrator does
+    // not work when the real batch size is reported here with explicit batching in use.
      +    // So we just report batch size 1 (warnings will still be printed out).
      +    return 1;
      +    // return static_cast<int>(dataloader_->options().batch_size);
      +  }
      +
      +  bool getBatch(void* bindings[], const char* names[], int nbBindings) noexcept override {
      +    if (it_ != batched_data_.end()) {
      +      auto status = get_batch_impl(bindings, names, nbBindings, *it_);
      +      it_ = ++it_;
      +      return status;
      +    } else {
+      // Reset the iterator in case the calibrator is going to be used again
      +      it_ = batched_data_.begin();
      +      return false;
      +    }
      +  }
      +
      +  const void* readCalibrationCache(size_t& length) noexcept override {
      +    if (use_cache_) {
      +      std::stringstream ss;
      +      ss << "Reading Calibration Cache from " << cache_file_path_;
      +      logging::log(logging::Level::kINFO, ss.str());
      +
      +      cache_.clear();
      +      std::ifstream input(cache_file_path_, std::ios::binary);
      +      input >> std::noskipws;
      +      if (input.good()) {
      +        std::copy(std::istream_iterator<char>(input), std::istream_iterator<char>(), std::back_inserter(cache_));
      +        logging::log(logging::Level::kDEBUG, "Cache read");
      +      }
      +      length = cache_.size();
      +      return length ? cache_.data() : nullptr;
      +    }
      +    return nullptr;
      +  }
      +
      +  void writeCalibrationCache(const void* cache, size_t length) noexcept override {
      +    std::ofstream cache_file(cache_file_path_, std::ios::binary);
      +    cache_file.write(reinterpret_cast<const char*>(cache), length);
      +    std::stringstream ss;
      +    ss << "Saved Calibration Cache to " << cache_file_path_;
      +    logging::log(logging::Level::kINFO, ss.str());
      +  }
      +
      +  operator nvinfer1::IInt8Calibrator*() {
      +    return reinterpret_cast<nvinfer1::IInt8Calibrator*>(this);
      +  }
      +
      + private:
      +  DataLoader* dataloader_;
      +  const std::string& cache_file_path_;
      +  size_t cache_size_ = 0;
      +  bool use_cache_;
      +  std::vector<char> cache_;
      +  std::vector<torch::Tensor> batched_data_;
      +  std::vector<torch::Tensor>::iterator it_;
      +};
      +
      +template <typename Algorithm>
      +class Int8CacheCalibrator : Algorithm {
      + public:
      +  Int8CacheCalibrator(const std::string& cache_file_path) : cache_file_path_(cache_file_path) {}
      +
      +  int getBatchSize() const noexcept override {
+    // HACK: Torch-TensorRT only uses explicit batch sizing; the INT8 calibrator does
+    // not work when the real batch size is reported here with explicit batching in use.
      +    // So we just report batch size 1 (warnings will still be printed out).
      +    return 1;
      +  }
      +
      +  bool getBatch(void* bindings[], const char* names[], int nbBindings) noexcept override {
      +    return false;
      +  }
      +
      +  const void* readCalibrationCache(size_t& length) noexcept override {
      +    std::stringstream ss;
      +    ss << "Reading Calibration Cache from " << cache_file_path_;
      +    logging::log(logging::Level::kINFO, ss.str());
      +
      +    cache_.clear();
      +    std::ifstream input(cache_file_path_, std::ios::binary);
      +    input >> std::noskipws;
      +    if (input.good()) {
      +      std::copy(std::istream_iterator<char>(input), std::istream_iterator<char>(), std::back_inserter(cache_));
      +      logging::log(logging::Level::kDEBUG, "Cache read");
      +    }
      +    length = cache_.size();
      +    return length ? cache_.data() : nullptr;
      +  }
      +
      +  void writeCalibrationCache(const void* cache, size_t length) noexcept override {
      +    std::ofstream cache_file(cache_file_path_, std::ios::binary);
      +    cache_file.write(reinterpret_cast<const char*>(cache), length);
      +    std::stringstream ss;
      +    ss << "Saved Calibration Cache to " << cache_file_path_;
      +    logging::log(logging::Level::kINFO, ss.str());
      +  }
      +
      +  operator nvinfer1::IInt8Calibrator*() {
      +    return reinterpret_cast<nvinfer1::IInt8Calibrator*>(this);
      +  }
      +
      + private:
      +  const std::string& cache_file_path_;
      +  size_t cache_size_ = 0;
      +  std::vector<char> cache_;
      +};
      +
      +template <typename Algorithm = nvinfer1::IInt8EntropyCalibrator2, typename DataLoader>
      +inline Int8Calibrator<Algorithm, DataLoader> make_int8_calibrator(
      +    DataLoader dataloader,
      +    const std::string& cache_file_path,
      +    bool use_cache) {
      +  return Int8Calibrator<Algorithm, DataLoader>(std::move(dataloader), cache_file_path, use_cache);
      +}
      +
      +template <typename Algorithm = nvinfer1::IInt8EntropyCalibrator2>
      +inline Int8CacheCalibrator<Algorithm> make_int8_cache_calibrator(const std::string& cache_file_path) {
      +  return Int8CacheCalibrator<Algorithm>(cache_file_path);
      +}
      +
      +} // namespace ptq
      +} // namespace torch_tensorrt
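For context, a hedged sketch of how these factories are typically wired into INT8 compilation. CompileSpec and its enabled_precisions / ptq_calibrator fields are assumed from the TorchScript frontend documentation rather than from this header, and calibration_dataloader / mod are placeholders for a torch::data data loader and a loaded torch::jit::Module:

    auto calibrator = torch_tensorrt::ptq::make_int8_calibrator(
        std::move(calibration_dataloader), "calibration.cache", /*use_cache=*/true);

    torch_tensorrt::torchscript::CompileSpec spec({torch_tensorrt::Input({1, 3, 224, 224})});
    spec.enabled_precisions.insert(torch::kI8);
    // The implicit operator nvinfer1::IInt8Calibrator*() defined above is what
    // allows a calibrator object to be assigned to the raw pointer field here.
    spec.ptq_calibrator = calibrator;
    auto trt_mod = torch_tensorrt::torchscript::compile(mod, spec);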
       
      @@ -649,6 +647,7 @@ + diff --git a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.html b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.html index c769ce9e6f..87f2ab086c 100644 --- a/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.html +++ b/docs/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.html @@ -10,7 +10,7 @@ - Program Listing for File torch_tensorrt.h — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Program Listing for File torch_tensorrt.h — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -302,6 +302,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -412,350 +415,350 @@
      -

      Program Listing for File torch_tensorrt.h

      +

      Program Listing for File torch_tensorrt.h

      Return to documentation for file (cpp/include/torch_tensorrt/torch_tensorrt.h)

      -
      /*
      - * Copyright (c) NVIDIA Corporation.
      - * All rights reserved.
      - *
      - * This library is licensed under the BSD-style license found in the
      - * LICENSE file in the root directory of this source tree.
      - */
      -
      -#pragma once
      -
      -#include <cuda_runtime.h>
      -#include <iostream>
      -#include <memory>
      -#include <set>
      -#include <string>
      -#include <vector>
      -#include "torch/custom_class.h"
      -
      -#include "torch_tensorrt/macros.h"
      -
      -// Just include the .h?
      -#ifndef DOXYGEN_SHOULD_SKIP_THIS
      -namespace torch {
      -namespace jit {
      -struct Graph;
      -struct Module;
      -} // namespace jit
      -} // namespace torch
      -
      -namespace c10 {
      -enum class DeviceType : int8_t;
      -enum class ScalarType : int8_t;
      -template <class>
      -class ArrayRef;
      -} // namespace c10
      -
      -namespace nvinfer1 {
      -class IInt8Calibrator;
      -}
      -#endif // DOXYGEN_SHOULD_SKIP_THIS
      -
      -namespace torch_tensorrt {
      -class DataType {
      - public:
      -  enum Value : int8_t {
      -    kLong,
      -    kDouble,
      -    kFloat,
      -    kHalf,
      -    kChar,
      -    kInt,
      -    kBool,
      -    kUnknown
      -  };
      -
      -  DataType() = default;
      -  constexpr DataType(Value t) : value(t) {}
      -  TORCHTRT_API DataType(c10::ScalarType t);
      -  operator Value() const {
      -    return value;
      -  }
      -  explicit operator bool() = delete;
      -  constexpr bool operator==(DataType other) const {
      -    return value == other.value;
      -  }
      -  constexpr bool operator==(DataType::Value other) const {
      -    return value == other;
      -  }
      -  constexpr bool operator!=(DataType other) const {
      -    return value != other.value;
      -  }
      -  constexpr bool operator!=(DataType::Value other) const {
      -    return value != other;
      -  }
      -
      - private:
      -  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const DataType& dtype);
      -  Value value;
      -};
      -
      -struct Device {
      -  class DeviceType {
      -   public:
      -    enum Value : int8_t {
      -      kGPU,
      -      kDLA,
      -    };
      -
      -    DeviceType() = default;
      -    constexpr DeviceType(Value t) : value(t) {}
      -    DeviceType(c10::DeviceType t);
      -    operator Value() const {
      -      return value;
      -    }
      -    explicit operator bool() = delete;
      -    constexpr bool operator==(DeviceType other) const {
      -      return value == other.value;
      -    }
      -    constexpr bool operator!=(DeviceType other) const {
      -      return value != other.value;
      -    }
      -
      -   private:
      -    Value value;
      -  };
      -
      -  DeviceType device_type;
      -
      -  /*
      -   * Target gpu id
      -   */
      -  int64_t gpu_id;
      -
      -  /*
      -   * When using DLA core on NVIDIA AGX platforms gpu_id should be set as Xavier device
      -   */
      -  int64_t dla_core;
      -
      -  bool allow_gpu_fallback;
      -
      -  Device() : device_type(DeviceType::kGPU), gpu_id(0), dla_core(0), allow_gpu_fallback(false) {}
      -};
      -
      -enum class EngineCapability : int8_t {
      -  kSTANDARD,
      -  kSAFETY,
      -  kDLA_STANDALONE,
      -};
      -
      -class TensorFormat {
      - public:
      -  enum Value : int8_t {
      -    kContiguous,
      -    kChannelsLast,
      -    kUnknown,
      -  };
      -
      -  TensorFormat() = default;
      -  constexpr TensorFormat(Value t) : value(t) {}
      -  TORCHTRT_API TensorFormat(at::MemoryFormat t);
      -  operator Value() const {
      -    return value;
      -  }
      -  explicit operator bool() = delete;
      -  constexpr bool operator==(TensorFormat other) const {
      -    return value == other.value;
      -  }
      -  constexpr bool operator==(TensorFormat::Value other) const {
      -    return value == other;
      -  }
      -  constexpr bool operator!=(TensorFormat other) const {
      -    return value != other.value;
      -  }
      -  constexpr bool operator!=(TensorFormat::Value other) const {
      -    return value != other;
      -  }
      -
      - private:
      -  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const TensorFormat& format);
      -  Value value;
      -};
      -
      -struct Input : torch::CustomClassHolder {
      -  std::vector<int64_t> min_shape;
      -  std::vector<int64_t> opt_shape;
      -  std::vector<int64_t> max_shape;
      -  std::vector<int64_t> shape;
      -  DataType dtype;
      -  TensorFormat format;
      -  std::vector<double> tensor_domain;
      -
      -  Input() {}
      -  TORCHTRT_API Input(std::vector<int64_t> shape, TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> shape,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(std::vector<int64_t> shape, DataType dtype, TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> shape,
      -      DataType dtype,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(c10::ArrayRef<int64_t> shape, TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> shape,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(c10::ArrayRef<int64_t> shape, DataType dtype, TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> shape,
      -      DataType dtype,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> min_shape,
      -      std::vector<int64_t> opt_shape,
      -      std::vector<int64_t> max_shape,
      -      TensorFormat format = TensorFormat::kContiguous);
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> min_shape,
      -      std::vector<int64_t> opt_shape,
      -      std::vector<int64_t> max_shape,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> min_shape,
      -      std::vector<int64_t> opt_shape,
      -      std::vector<int64_t> max_shape,
      -      DataType dtype,
      -      TensorFormat format = TensorFormat::kContiguous);
      -
      -  TORCHTRT_API Input(
      -      std::vector<int64_t> min_shape,
      -      std::vector<int64_t> opt_shape,
      -      std::vector<int64_t> max_shape,
      -      DataType dtype,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
+/*
      + * Copyright (c) NVIDIA Corporation.
      + * All rights reserved.
      + *
      + * This library is licensed under the BSD-style license found in the
      + * LICENSE file in the root directory of this source tree.
      + */
      +
      +#pragma once
      +
      +#include <cuda_runtime.h>
      +#include <iostream>
      +#include <memory>
      +#include <set>
      +#include <string>
      +#include <vector>
      +#include "torch/custom_class.h"
      +
      +#include "torch_tensorrt/macros.h"
      +
      +// Just include the .h?
      +#ifndef DOXYGEN_SHOULD_SKIP_THIS
      +namespace torch {
      +namespace jit {
      +struct Graph;
      +struct Module;
      +} // namespace jit
      +} // namespace torch
      +
      +namespace c10 {
      +enum class DeviceType : int8_t;
      +enum class ScalarType : int8_t;
      +template <class>
      +class ArrayRef;
      +} // namespace c10
      +
      +namespace nvinfer1 {
      +class IInt8Calibrator;
      +}
      +#endif // DOXYGEN_SHOULD_SKIP_THIS
      +
      +namespace torch_tensorrt {
      +class DataType {
      + public:
      +  enum Value : int8_t {
      +    kLong,
      +    kDouble,
      +    kFloat,
      +    kHalf,
      +    kChar,
      +    kInt,
      +    kBool,
      +    kUnknown
      +  };
      +
      +  DataType() = default;
      +  constexpr DataType(Value t) : value(t) {}
      +  TORCHTRT_API DataType(c10::ScalarType t);
      +  operator Value() const {
      +    return value;
      +  }
      +  explicit operator bool() = delete;
      +  constexpr bool operator==(DataType other) const {
      +    return value == other.value;
      +  }
      +  constexpr bool operator==(DataType::Value other) const {
      +    return value == other;
      +  }
      +  constexpr bool operator!=(DataType other) const {
      +    return value != other.value;
      +  }
      +  constexpr bool operator!=(DataType::Value other) const {
      +    return value != other;
      +  }
      +
      + private:
      +  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const DataType& dtype);
      +  Value value;
      +};
      +
      +struct Device {
      +  class DeviceType {
      +   public:
      +    enum Value : int8_t {
      +      kGPU,
      +      kDLA,
      +    };
      +
      +    DeviceType() = default;
      +    constexpr DeviceType(Value t) : value(t) {}
      +    DeviceType(c10::DeviceType t);
      +    operator Value() const {
      +      return value;
      +    }
      +    explicit operator bool() = delete;
      +    constexpr bool operator==(DeviceType other) const {
      +      return value == other.value;
      +    }
      +    constexpr bool operator!=(DeviceType other) const {
      +      return value != other.value;
      +    }
      +
      +   private:
      +    Value value;
      +  };
      +
      +  DeviceType device_type;
      +
      +  /*
      +   * Target gpu id
      +   */
      +  int64_t gpu_id;
      +
      +  /*
      +   * When using DLA core on NVIDIA AGX platforms gpu_id should be set as Xavier device
      +   */
      +  int64_t dla_core;
      +
      +  bool allow_gpu_fallback;
      +
      +  Device() : device_type(DeviceType::kGPU), gpu_id(0), dla_core(0), allow_gpu_fallback(false) {}
      +};
      +
      +enum class EngineCapability : int8_t {
      +  kSTANDARD,
      +  kSAFETY,
      +  kDLA_STANDALONE,
      +};
      +
      +class TensorFormat {
      + public:
      +  enum Value : int8_t {
      +    kContiguous,
      +    kChannelsLast,
      +    kUnknown,
      +  };
      +
      +  TensorFormat() = default;
      +  constexpr TensorFormat(Value t) : value(t) {}
      +  TORCHTRT_API TensorFormat(at::MemoryFormat t);
      +  operator Value() const {
      +    return value;
      +  }
      +  explicit operator bool() = delete;
      +  constexpr bool operator==(TensorFormat other) const {
      +    return value == other.value;
      +  }
      +  constexpr bool operator==(TensorFormat::Value other) const {
      +    return value == other;
      +  }
      +  constexpr bool operator!=(TensorFormat other) const {
      +    return value != other.value;
      +  }
      +  constexpr bool operator!=(TensorFormat::Value other) const {
      +    return value != other;
      +  }
      +
      + private:
      +  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const TensorFormat& format);
      +  Value value;
      +};
      +
      +struct Input : torch::CustomClassHolder {
      +  std::vector<int64_t> min_shape;
      +  std::vector<int64_t> opt_shape;
      +  std::vector<int64_t> max_shape;
      +  std::vector<int64_t> shape;
      +  DataType dtype;
      +  TensorFormat format;
      +  std::vector<double> tensor_domain;
      +
      +  Input() {}
      +  TORCHTRT_API Input(std::vector<int64_t> shape, TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> shape,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(std::vector<int64_t> shape, DataType dtype, TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> shape,
      +      DataType dtype,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(c10::ArrayRef<int64_t> shape, TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> shape,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(c10::ArrayRef<int64_t> shape, DataType dtype, TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> shape,
      +      DataType dtype,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> min_shape,
      +      std::vector<int64_t> opt_shape,
      +      std::vector<int64_t> max_shape,
      +      TensorFormat format = TensorFormat::kContiguous);
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> min_shape,
      +      std::vector<int64_t> opt_shape,
      +      std::vector<int64_t> max_shape,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> min_shape,
      +      std::vector<int64_t> opt_shape,
      +      std::vector<int64_t> max_shape,
      +      DataType dtype,
      +      TensorFormat format = TensorFormat::kContiguous);
      +
      +  TORCHTRT_API Input(
      +      std::vector<int64_t> min_shape,
      +      std::vector<int64_t> opt_shape,
      +      std::vector<int64_t> max_shape,
      +      DataType dtype,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
       
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> min_shape,
      -      c10::ArrayRef<int64_t> opt_shape,
      -      c10::ArrayRef<int64_t> max_shape,
      -      TensorFormat format = TensorFormat::kContiguous);
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> min_shape,
      +      c10::ArrayRef<int64_t> opt_shape,
      +      c10::ArrayRef<int64_t> max_shape,
      +      TensorFormat format = TensorFormat::kContiguous);
       
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> min_shape,
      -      c10::ArrayRef<int64_t> opt_shape,
      -      c10::ArrayRef<int64_t> max_shape,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> min_shape,
      +      c10::ArrayRef<int64_t> opt_shape,
      +      c10::ArrayRef<int64_t> max_shape,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
       
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> min_shape,
      -      c10::ArrayRef<int64_t> opt_shape,
      -      c10::ArrayRef<int64_t> max_shape,
      -      DataType dtype,
      -      TensorFormat format = TensorFormat::kContiguous);
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> min_shape,
      +      c10::ArrayRef<int64_t> opt_shape,
      +      c10::ArrayRef<int64_t> max_shape,
      +      DataType dtype,
      +      TensorFormat format = TensorFormat::kContiguous);
       
      -  TORCHTRT_API Input(
      -      c10::ArrayRef<int64_t> min_shape,
      -      c10::ArrayRef<int64_t> opt_shape,
      -      c10::ArrayRef<int64_t> max_shape,
      -      DataType dtype,
      -      std::vector<double> tensor_domain,
      -      TensorFormat format = TensorFormat::kContiguous);
      +  TORCHTRT_API Input(
      +      c10::ArrayRef<int64_t> min_shape,
      +      c10::ArrayRef<int64_t> opt_shape,
      +      c10::ArrayRef<int64_t> max_shape,
      +      DataType dtype,
      +      std::vector<double> tensor_domain,
      +      TensorFormat format = TensorFormat::kContiguous);
       
      -  TORCHTRT_API Input(at::Tensor tensor);
      +  TORCHTRT_API Input(at::Tensor tensor);
       
      - private:
      -  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const Input& input);
      -  bool input_is_dynamic;
      -};
      + private:
      +  friend TORCHTRT_API std::ostream& operator<<(std::ostream& os, const Input& input);
      +  bool input_is_dynamic;
      +};
       
      -struct GraphInputs {
      -  torch::jit::IValue input_signature; // nested Input, full input spec
      -  std::vector<Input> inputs; // flatten input spec
      -};
      +struct GraphInputs {
      +  torch::jit::IValue input_signature; // nested Input, full input spec
      +  std::vector<Input> inputs; // flatten input spec
      +};
       
      -TORCHTRT_API std::string get_build_info();
      +TORCHTRT_API std::string get_build_info();
       
      -TORCHTRT_API void dump_build_info();
      +TORCHTRT_API void dump_build_info();
       
      -TORCHTRT_API void set_device(const int gpu_id);
      +TORCHTRT_API void set_device(const int gpu_id);
       
      -namespace torchscript {
      -struct CompileSpec {
      -  TORCHTRT_API CompileSpec(std::vector<std::vector<int64_t>> fixed_sizes);
      +namespace torchscript {
      +struct CompileSpec {
      +  TORCHTRT_API CompileSpec(std::vector<std::vector<int64_t>> fixed_sizes);
       
      -  TORCHTRT_API CompileSpec(std::vector<c10::ArrayRef<int64_t>> fixed_sizes);
      +  TORCHTRT_API CompileSpec(std::vector<c10::ArrayRef<int64_t>> fixed_sizes);
       
      -  TORCHTRT_API CompileSpec(std::vector<Input> inputs);
      +  TORCHTRT_API CompileSpec(std::vector<Input> inputs);
       
      -  TORCHTRT_API CompileSpec(torch::jit::IValue input_signature);
      -  // Defaults should reflect TensorRT defaults for BuilderConfig
      +  TORCHTRT_API CompileSpec(torch::jit::IValue input_signature);
      +  // Defaults should reflect TensorRT defaults for BuilderConfig
       
      -  GraphInputs graph_inputs;
      -  std::set<DataType> enabled_precisions = {DataType::kFloat};
      +  GraphInputs graph_inputs;
      +  std::set<DataType> enabled_precisions = {DataType::kFloat};
       
      -  bool disable_tf32 = false;
      +  bool disable_tf32 = false;
       
      -  bool sparse_weights = false;
      +  bool sparse_weights = false;
       
      -  bool refit = false;
      +  bool refit = false;
       
      -  bool debug = false;
      +  bool debug = false;
       
      -  bool truncate_long_and_double = false;
      +  bool truncate_long_and_double = false;
       
      -  bool allow_shape_tensors = false;
      +  bool allow_shape_tensors = false;
       
      -  Device device;
      +  Device device;
       
      -  EngineCapability capability = EngineCapability::kSTANDARD;
      +  EngineCapability capability = EngineCapability::kSTANDARD;
       
      -  uint64_t num_avg_timing_iters = 1;
      +  uint64_t num_avg_timing_iters = 1;
       
      -  uint64_t workspace_size = 0;
      +  uint64_t workspace_size = 0;
       
      -  uint64_t dla_sram_size = 1048576;
      +  uint64_t dla_sram_size = 1048576;
       
      -  uint64_t dla_local_dram_size = 1073741824;
      +  uint64_t dla_local_dram_size = 1073741824;
       
      -  uint64_t dla_global_dram_size = 536870912;
      +  uint64_t dla_global_dram_size = 536870912;
       
      -  nvinfer1::IInt8Calibrator* ptq_calibrator = nullptr;
      +  nvinfer1::IInt8Calibrator* ptq_calibrator = nullptr;
       
      -  bool require_full_compilation = false;
      +  bool require_full_compilation = false;
       
      -  uint64_t min_block_size = 3;
      +  uint64_t min_block_size = 3;
       
      -  std::vector<std::string> torch_executed_ops;
      +  std::vector<std::string> torch_executed_ops;
       
      -  std::vector<std::string> torch_executed_modules;
      -};
      +  std::vector<std::string> torch_executed_modules;
      +};
       
      -TORCHTRT_API bool check_method_operator_support(const torch::jit::Module& module, std::string method_name);
      +TORCHTRT_API bool check_method_operator_support(const torch::jit::Module& module, std::string method_name);
       
      -TORCHTRT_API torch::jit::Module compile(const torch::jit::Module& module, CompileSpec info);
      +TORCHTRT_API torch::jit::Module compile(const torch::jit::Module& module, CompileSpec info);
       
      -TORCHTRT_API std::string convert_method_to_trt_engine(
      -    const torch::jit::Module& module,
      -    std::string method_name,
      -    CompileSpec info);
      +TORCHTRT_API std::string convert_method_to_trt_engine(
      +    const torch::jit::Module& module,
      +    std::string method_name,
      +    CompileSpec info);
       
      -TORCHTRT_API torch::jit::Module embed_engine_in_new_module(
      -    const std::string& engine,
      -    Device device,
      -    const std::vector<std::string>& input_binding_names = std::vector<std::string>(),
      -    const std::vector<std::string>& output_binding_names = std::vector<std::string>());
      -} // namespace torchscript
      -} // namespace torch_tensorrt
      +TORCHTRT_API torch::jit::Module embed_engine_in_new_module(
      +    const std::string& engine,
      +    Device device,
      +    const std::vector<std::string>& input_binding_names = std::vector<std::string>(),
      +    const std::vector<std::string>& output_binding_names = std::vector<std::string>());
      +} // namespace torchscript
      +} // namespace torch_tensorrt
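
For orientation, a minimal Python sketch of the TorchScript compile path declared in this header (the module and shapes are illustrative assumptions, not taken from this patch):

import torch
import torch_tensorrt

# Any scriptable module works; this one is only for illustration.
model = torch.nn.Conv2d(3, 16, kernel_size=3).eval().cuda()
scripted = torch.jit.script(model)

# Mirrors torchscript::CompileSpec above: a dynamic-shape Input
# (min/opt/max) plus a set of enabled precisions.
trt_module = torch_tensorrt.compile(
    scripted,
    ir="torchscript",
    inputs=[
        torch_tensorrt.Input(
            min_shape=(1, 3, 224, 224),
            opt_shape=(8, 3, 224, 224),
            max_shape=(32, 3, 224, 224),
            dtype=torch.float,
        )
    ],
    enabled_precisions={torch.float, torch.half},
)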
       
@@ -813,6 +816,7 @@
+
diff --git a/docs/_cpp_api/structtorch__tensorrt_1_1Device.html b/docs/_cpp_api/structtorch__tensorrt_1_1Device.html
index 1ebec72914..ca8c5dc4d2 100644
--- a/docs/_cpp_api/structtorch__tensorrt_1_1Device.html
+++ b/docs/_cpp_api/structtorch__tensorrt_1_1Device.html
@@ -10,7 +10,7 @@
-    Struct Device — Torch-TensorRT v2.3.0.dev0+85971ff documentation
+    Struct Device — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
@@ -237,7 +237,7 @@
-    v2.3.0.dev0+85971ff
+    v2.4.0.dev0+4dc9acfc9
      @@ -304,6 +304,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
+ • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
+ • Wrapping Custom Kernels to use in TensorRT
+ • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

    @@ -562,6 +570,7 @@

Files
+
diff --git a/docs/_downloads/0daf1d0af656cac7b808856b71e6616f/torch_compile_resnet_example.ipynb b/docs/_downloads/0daf1d0af656cac7b808856b71e6616f/torch_compile_resnet_example.ipynb
index 1c4b8134a9..4ef7182d48 100644
--- a/docs/_downloads/0daf1d0af656cac7b808856b71e6616f/torch_compile_resnet_example.ipynb
+++ b/docs/_downloads/0daf1d0af656cac7b808856b71e6616f/torch_compile_resnet_example.ipynb
@@ -150,7 +150,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.0"
+   "version": "3.11.7"
   }
  },
  "nbformat": 4,
diff --git a/docs/_downloads/0e30a6276601af7e5fc4d5166e2e3d37/torch_compile_advanced_usage.py b/docs/_downloads/0e30a6276601af7e5fc4d5166e2e3d37/torch_compile_advanced_usage.py
index 96146a43d8..8ebedab111 100644
--- a/docs/_downloads/0e30a6276601af7e5fc4d5166e2e3d37/torch_compile_advanced_usage.py
+++ b/docs/_downloads/0e30a6276601af7e5fc4d5166e2e3d37/torch_compile_advanced_usage.py
@@ -43,7 +43,7 @@ def forward(self, x: torch.Tensor, y: torch.Tensor):
 # For the default settings, we can simply call torch.compile
 # with the backend "torch_tensorrt", and run the model on an
 # input to cause compilation, as so:
-optimized_model = torch.compile(model, backend="torch_tensorrt")
+optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False)
 optimized_model(*sample_inputs)
 
 # %%
@@ -81,7 +81,10 @@ def forward(self, x: torch.Tensor, y: torch.Tensor):
 
 # Run the model on an input to cause compilation, as so:
 optimized_model_custom = torch.compile(
-    model_half, backend="torch_tensorrt", options=backend_kwargs
+    model_half,
+    backend="torch_tensorrt",
+    options=backend_kwargs,
+    dynamic=False,
 )
 optimized_model_custom(*sample_inputs_half)
 
diff --git a/docs/_downloads/46b3e6febaab06324aa2715896895544/torch_compile_stable_diffusion.py b/docs/_downloads/46b3e6febaab06324aa2715896895544/torch_compile_stable_diffusion.py
index 0511e5a363..a0b725572b 100644
--- a/docs/_downloads/46b3e6febaab06324aa2715896895544/torch_compile_stable_diffusion.py
+++ b/docs/_downloads/46b3e6febaab06324aa2715896895544/torch_compile_stable_diffusion.py
@@ -18,9 +18,8 @@
 # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 import torch
-from diffusers import DiffusionPipeline
-
 import torch_tensorrt
+from diffusers import DiffusionPipeline
 
 model_id = "CompVis/stable-diffusion-v1-4"
 device = "cuda:0"
@@ -39,7 +38,7 @@
     backend=backend,
     options={
         "truncate_long_and_double": True,
-        "precision": torch.float16,
+        "enabled_precisions": {torch.float32, torch.float16},
     },
     dynamic=False,
 )
diff --git a/docs/_downloads/6a6052d9668b2cb8332d349d328e21c1/_rendered_examples_jupyter.zip b/docs/_downloads/6a6052d9668b2cb8332d349d328e21c1/_rendered_examples_jupyter.zip
index 0bbc029cee765f05ff2f2d0b22f3580e30bd0309..507661ba9d87c52f1fe5fdac96da9514958cbb71 100644
GIT binary patch
literal 47589
[binary patch data omitted]

delta 405
[binary patch data omitted]

diff --git a/docs/_downloads/798cda8f83bd9f5e2cc93f329a04332c/_rendered_examples_python.zip b/docs/_downloads/798cda8f83bd9f5e2cc93f329a04332c/_rendered_examples_python.zip
index 6dbb2d437c2c36b5106a8cd62a53af6556edc120..6301a433d46555adea8fd5be1a74c315ceee3b33 100644
GIT binary patch
literal 36261
[binary patch data omitted]

delta 390
[binary patch data omitted]

diff --git a/docs/_downloads/b35883282793ac3413933fdb22d00d81/torch_compile_advanced_usage.ipynb b/docs/_downloads/b35883282793ac3413933fdb22d00d81/torch_compile_advanced_usage.ipynb
index 57e26cd5ed..343b129a3b 100644
--- a/docs/_downloads/b35883282793ac3413933fdb22d00d81/torch_compile_advanced_usage.ipynb
+++ b/docs/_downloads/b35883282793ac3413933fdb22d00d81/torch_compile_advanced_usage.ipynb
@@ -62,7 +62,7 @@
    },
    "outputs": [],
    "source": [
-    "# Next, we compile the model using torch.compile\n# For the default settings, we can simply call torch.compile\n# with the backend \"torch_tensorrt\", and run the model on an\n# input to cause compilation, as so:\noptimized_model = torch.compile(model, backend=\"torch_tensorrt\")\noptimized_model(*sample_inputs)"
+    "# Next, we compile the model using torch.compile\n# For the default settings, we can simply call torch.compile\n# with the backend \"torch_tensorrt\", and run the model on an\n# input to cause compilation, as so:\noptimized_model = torch.compile(model, backend=\"torch_tensorrt\", dynamic=False)\noptimized_model(*sample_inputs)"
    ]
   },
   {
@@ -91,7 +91,7 @@
    },
    "outputs": [],
    "source": [
-    "# If we want to customize certain options in the backend,\n# but still use the torch.compile call directly, we can provide\n# custom options to the backend via the \"options\" keyword\n# which takes in a dictionary mapping options to values.\n#\n# For accepted backend options, see the CompilationSettings dataclass:\n# py/torch_tensorrt/dynamo/_settings.py\nbackend_kwargs = {\n    \"enabled_precisions\": {torch.half},\n    \"debug\": True,\n    \"min_block_size\": 2,\n    \"torch_executed_ops\": {\"torch.ops.aten.sub.Tensor\"},\n    \"optimization_level\": 4,\n    \"use_python_runtime\": False,\n}\n\n# Run the model on an input to cause compilation, as so:\noptimized_model_custom = torch.compile(\n    model_half, backend=\"torch_tensorrt\", options=backend_kwargs\n)\noptimized_model_custom(*sample_inputs_half)"
+    "# If we want to customize certain options in the backend,\n# but still use the torch.compile call directly, we can provide\n# custom options to the backend via the \"options\" keyword\n# which takes in a dictionary mapping options to values.\n#\n# For accepted backend options, see the CompilationSettings dataclass:\n# py/torch_tensorrt/dynamo/_settings.py\nbackend_kwargs = {\n    \"enabled_precisions\": {torch.half},\n    \"debug\": True,\n    \"min_block_size\": 2,\n    \"torch_executed_ops\": {\"torch.ops.aten.sub.Tensor\"},\n    \"optimization_level\": 4,\n    \"use_python_runtime\": False,\n}\n\n# Run the model on an input to cause compilation, as so:\noptimized_model_custom = torch.compile(\n    model_half,\n    backend=\"torch_tensorrt\",\n    options=backend_kwargs,\n    dynamic=False,\n)\noptimized_model_custom(*sample_inputs_half)"
    ]
   },
   {
@@ -136,7 +136,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.0"
+   "version": "3.11.7"
   }
  },
  "nbformat": 4,
diff --git a/docs/_downloads/b776287bc876f7ce24942b82a66beb05/torch_compile_stable_diffusion.ipynb b/docs/_downloads/b776287bc876f7ce24942b82a66beb05/torch_compile_stable_diffusion.ipynb
index d48320c8b1..24cbbefea1 100644
--- a/docs/_downloads/b776287bc876f7ce24942b82a66beb05/torch_compile_stable_diffusion.ipynb
+++ b/docs/_downloads/b776287bc876f7ce24942b82a66beb05/torch_compile_stable_diffusion.ipynb
@@ -22,7 +22,7 @@
    },
    "outputs": [],
    "source": [
-    "import torch\nfrom diffusers import DiffusionPipeline\n\nimport torch_tensorrt\n\nmodel_id = \"CompVis/stable-diffusion-v1-4\"\ndevice = \"cuda:0\"\n\n# Instantiate Stable Diffusion Pipeline with FP16 weights\npipe = DiffusionPipeline.from_pretrained(\n    model_id, revision=\"fp16\", torch_dtype=torch.float16\n)\npipe = pipe.to(device)\n\nbackend = \"torch_tensorrt\"\n\n# Optimize the UNet portion with Torch-TensorRT\npipe.unet = torch.compile(\n    pipe.unet,\n    backend=backend,\n    options={\n        \"truncate_long_and_double\": True,\n        \"precision\": torch.float16,\n    },\n    dynamic=False,\n)"
+    "import torch\nimport torch_tensorrt\nfrom diffusers import DiffusionPipeline\n\nmodel_id = \"CompVis/stable-diffusion-v1-4\"\ndevice = \"cuda:0\"\n\n# Instantiate Stable Diffusion Pipeline with FP16 weights\npipe = DiffusionPipeline.from_pretrained(\n    model_id, revision=\"fp16\", torch_dtype=torch.float16\n)\npipe = pipe.to(device)\n\nbackend = \"torch_tensorrt\"\n\n# Optimize the UNet portion with Torch-TensorRT\npipe.unet = torch.compile(\n    pipe.unet,\n    backend=backend,\n    options={\n        \"truncate_long_and_double\": True,\n        \"enabled_precisions\": {torch.float32, torch.float16},\n    },\n    dynamic=False,\n)"
    ]
   },
   {
@@ -60,7 +60,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.0"
+   "version": "3.11.7"
   }
  },
  "nbformat": 4,
diff --git a/docs/_downloads/ce102e287ddb5744f0a1364e8c0c7f68/torch_compile_transformers_example.ipynb b/docs/_downloads/ce102e287ddb5744f0a1364e8c0c7f68/torch_compile_transformers_example.ipynb
index 98c872cfb9..9d2c6dcebd 100644
--- a/docs/_downloads/ce102e287ddb5744f0a1364e8c0c7f68/torch_compile_transformers_example.ipynb
+++ b/docs/_downloads/ce102e287ddb5744f0a1364e8c0c7f68/torch_compile_transformers_example.ipynb
@@ -69,7 +69,7 @@
    },
    "outputs": [],
    "source": [
-    "# Define backend compilation keyword arguments\ncompilation_kwargs = {\n    \"enabled_precisions\": enabled_precisions,\n    \"debug\": debug,\n    \"workspace_size\": workspace_size,\n    \"min_block_size\": min_block_size,\n    \"torch_executed_ops\": torch_executed_ops,\n}\n\n# Build and compile the model with torch.compile, using Torch-TensorRT backend\noptimized_model = torch.compile(\n    model,\n    backend=\"torch_tensorrt\",\n    options=compilation_kwargs,\n)\noptimized_model(*inputs)"
+    "# Define backend compilation keyword arguments\ncompilation_kwargs = {\n    \"enabled_precisions\": enabled_precisions,\n    \"debug\": debug,\n    \"workspace_size\": workspace_size,\n    \"min_block_size\": min_block_size,\n    \"torch_executed_ops\": torch_executed_ops,\n}\n\n# Build and compile the model with torch.compile, using Torch-TensorRT backend\noptimized_model = torch.compile(\n    model,\n    backend=\"torch_tensorrt\",\n    dynamic=False,\n    options=compilation_kwargs,\n)\noptimized_model(*inputs)"
    ]
   },
   {
@@ -150,7 +150,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.0"
+   "version": "3.11.7"
   }
  },
  "nbformat": 4,
diff --git a/docs/_downloads/dfa60e8f9850fd7761f3e7da81304d32/torch_compile_transformers_example.py b/docs/_downloads/dfa60e8f9850fd7761f3e7da81304d32/torch_compile_transformers_example.py
index 5422f9cc1d..01d46e96f6 100644
--- a/docs/_downloads/dfa60e8f9850fd7761f3e7da81304d32/torch_compile_transformers_example.py
+++ b/docs/_downloads/dfa60e8f9850fd7761f3e7da81304d32/torch_compile_transformers_example.py
@@ -61,6 +61,7 @@
 optimized_model = torch.compile(
     model,
     backend="torch_tensorrt",
+    dynamic=False,
     options=compilation_kwargs,
 )
 optimized_model(*inputs)
diff --git a/docs/_modules/index.html b/docs/_modules/index.html
index 8a50771b6c..5ced90b5d8 100644
--- a/docs/_modules/index.html
+++ b/docs/_modules/index.html
@@ -9,7 +9,7 @@
-    Overview: module code — Torch-TensorRT v2.3.0.dev0+85971ff documentation
+    Overview: module code — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
@@ -234,7 +234,7 @@
-    v2.3.0.dev0+85971ff
+    v2.4.0.dev0+4dc9acfc9
    @@ -301,6 +301,9 @@
  • Torch Compile Advanced Usage
  • Torch Compile Stable Diffusion
+ • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
+ • Wrapping Custom Kernels to use in TensorRT
+ • Using Torch-TensorRT to Insert the Kernel
• Python API Documentation

@@ -409,7 +410,10 @@ Source code for torch_tensorrt._Device
      -import sys
      +from __future__ import annotations
      +
      +import logging
      +import sys
       from typing import Any, Optional, Tuple
       
       if sys.version_info >= (3, 11):
@@ -417,19 +421,11 @@ Source code for torch_tensorrt._Device
       else:
           from typing_extensions import Self
       
      -import warnings
      -
      -# from torch_tensorrt import _enums
      -import tensorrt as trt
       import torch
      -from torch_tensorrt import logging
      +from torch_tensorrt._enums import DeviceType
      +from torch_tensorrt._features import ENABLED_FEATURES
       
      -try:
      -    from torch_tensorrt import _C
      -except ImportError:
      -    warnings.warn(
      -        "Unable to import torchscript frontend core and torch-tensorrt runtime. Some dependent features may be unavailable."
      -    )
      +import tensorrt as trt
       
       
       
[docs]class Device(object):
@@ -443,12 +439,14 @@ Source code for torch_tensorrt._Device
               allow_gpu_fallback (bool): Whether falling back to GPU if DLA cannot support an op should be allowed
           """
       
      -    device_type: Optional[
      -        trt.DeviceType
      -    ] = None  #: Target device type (GPU or DLA). Set implicitly based on if dla_core is specified.
      +    device_type: DeviceType = (
      +        DeviceType.UNKNOWN
      +    )  #: Target device type (GPU or DLA). Set implicitly based on if dla_core is specified.
           gpu_id: int = -1  #: Device ID for target GPU
           dla_core: int = -1  #: Core ID for target DLA core
      -    allow_gpu_fallback: bool = False  #: Whether falling back to GPU if DLA cannot support an op should be allowed
      +    allow_gpu_fallback: bool = (
      +        False  #: Whether falling back to GPU if DLA cannot support an op should be allowed
      +    )
       
       
[docs]    def __init__(self, *args: Any, **kwargs: Any):
        """__init__ Method for torch_tensorrt.Device
@@ -478,32 +476,31 @@ Source code for torch_tensorrt._Device
                       )
                   else:
                       (self.device_type, id) = Device._parse_device_str(args[0])
      -                if self.device_type == trt.DeviceType.GPU:
      -                    self.gpu_id = id
      -                else:
      +                if self.device_type == DeviceType.DLA:
                           self.dla_core = id
                           self.gpu_id = 0
      -                    logging.log(
      -                        logging.Level.Warning,
      -                        "Setting GPU id to 0 for device because device 0 manages DLA on Xavier",
      +                    logging.warning(
      +                        "Setting GPU id to 0 for device because device 0 manages DLA on AGX Devices",
                           )
      +                else:
      +                    self.gpu_id = id
       
               elif len(args) == 0:
                   if "gpu_id" in kwargs or "dla_core" in kwargs:
                       if "dla_core" in kwargs:
      -                    self.device_type = trt.DeviceType.DLA
                           self.dla_core = kwargs["dla_core"]
      -                    if "gpu_id" in kwargs:
      -                        self.gpu_id = kwargs["gpu_id"]
      -                    else:
      +                if "gpu_id" in kwargs:
      +                    self.gpu_id = kwargs["gpu_id"]
      +
      +                if self.dla_core >= 0:
      +                    self.device_type = DeviceType.DLA
      +                    if self.gpu_id != 0:
                               self.gpu_id = 0
      -                        logging.log(
      -                            logging.Level.Warning,
      -                            "Setting GPU id to 0 for device because device 0 manages DLA on Xavier",
      +                        logging.warning(
      +                            "Setting GPU id to 0 for device because device 0 manages DLA on AGX Platforms",
                               )
                       else:
      -                    self.gpu_id = kwargs["gpu_id"]
      -                    self.device_type = trt.DeviceType.GPU
      +                    self.device_type = DeviceType.GPU
                   else:
                       raise ValueError(
                           "Either gpu_id or dla_core or both must be defined if no string with device specs is provided as an arg"
@@ -511,71 +508,96 @@ Source code for torch_tensorrt._Device
       
               else:
                   raise ValueError(
      -                "Unexpected number of positional arguments for class Device \n    Found {} arguments, expected either zero or a single positional arguments".format(
      -                    len(args)
      -                )
      +                f"Unexpected number of positional arguments for class Device \n    Found {len(args)} arguments, expected either zero or a single positional arguments"
                   )
       
               if "allow_gpu_fallback" in kwargs:
                   if not isinstance(kwargs["allow_gpu_fallback"], bool):
                       raise TypeError("allow_gpu_fallback must be a bool")
      -            self.allow_gpu_fallback = kwargs["allow_gpu_fallback"]
       +            self.allow_gpu_fallback = kwargs["allow_gpu_fallback"]
       +
       +        if "device_type" in kwargs:
       +            if isinstance(kwargs["device_type"], trt.DeviceType):
       +                self.device_type = DeviceType._from(kwargs["device_type"])
        
            def __str__(self) -> str:
       -        return (
       -            "Device(type={}, gpu_id={}".format(self.device_type, self.gpu_id) + ")"
       -            if self.device_type == trt.DeviceType.GPU
       -            else ", dla_core={}, allow_gpu_fallback={}".format(
       -                self.dla_core, self.allow_gpu_fallback
       -            )
       +        suffix = (
       +            ")"
       +            if self.device_type == DeviceType.GPU
       +            else f", dla_core={self.dla_core}, allow_gpu_fallback={self.allow_gpu_fallback})"
                )
       +        dev_str: str = f"Device(type={self.device_type}, gpu_id={self.gpu_id}{suffix}"
       +        return dev_str
        
            def __repr__(self) -> str:
                return self.__str__()
        
       -    def _to_internal(self) -> _C.Device:
       -        internal_dev = _C.Device()
       -        if self.device_type == trt.DeviceType.GPU:
       -            internal_dev.device_type = _C.DeviceType.GPU
       -        elif self.device_type == trt.DeviceType.DLA:
       -            internal_dev.device_type = _C.DeviceType.DLA
       -        else:
       -            raise ValueError(
       -                "Invalid DeviceType detected while parsing the Device class"
       -            )
       +    @classmethod
       +    def _from(cls, d: Optional[Self | torch.device | str]) -> Device:
       +        """Cast a device-type to torch_tensorrt.Device
       +
       +        Returns the corresponding torch_tensorrt.Device
       +        """
       +        if isinstance(d, Device):
       +            return d
        
       -        internal_dev.gpu_id = self.gpu_id
       -        internal_dev.dla_core = self.dla_core
       -        internal_dev.allow_gpu_fallback = self.allow_gpu_fallback
       -        return internal_dev
       +        elif isinstance(d, torch.device):
       +            if d.type != "cuda":
       +                raise ValueError('Torch Device specs must have type "cuda"')
       +            return cls(gpu_id=d.index)
        
       -    def _to_serialized_rt_device(self) -> str:
       -        internal_dev = self._to_internal()
       -        serialized_rt_device: str = internal_dev._to_serialized_rt_device()
       -        return serialized_rt_device
       +        elif d is None:
       +            return cls(gpu_id=torch.cuda.current_device())
       +
       +        else:
       +            return cls(d)
        
            @classmethod
       -    def _from_torch_device(cls, torch_dev: torch.device) -> Self:
       -        if torch_dev.type != "cuda":
       -            raise ValueError('Torch Device specs must have type "cuda"')
       -        gpu_id = torch_dev.index
       -        return cls(gpu_id=gpu_id)
       +    def _from_torch_device(cls, torch_dev: torch.device) -> Device:
       +        return cls._from(torch_dev)
        
            @classmethod
       -    def _current_device(cls) -> Self:
       -        dev = _C._get_current_device()
       -        return cls(gpu_id=dev.gpu_id)
       +    def _current_device(cls) -> Device:
       +        dev_id = torch.cuda.current_device()
       +        return cls(gpu_id=dev_id)
        
            @staticmethod
            def _parse_device_str(s: str) -> Tuple[trt.DeviceType, int]:
                s = s.lower()
                spec = s.split(":")
                if spec[0] == "gpu" or spec[0] == "cuda":
       -            return (trt.DeviceType.GPU, int(spec[1]))
       +            return (DeviceType.GPU, int(spec[1]))
                elif spec[0] == "dla":
       -            return (trt.DeviceType.DLA, int(spec[1]))
       +            return (DeviceType.DLA, int(spec[1]))
                else:
       -            raise ValueError(f"Unknown device type {spec[0]}")
       +            raise ValueError(f"Unknown device type {spec[0]}")
       +
       +    def to(self, t: type) -> torch.device:
       +        if t == torch.device:
       +            if self.gpu_id != -1:
       +                return torch.device(self.gpu_id)
       +            else:
       +                raise ValueError("Invalid GPU ID provided for the CUDA device provided")
       +        else:
       +            raise TypeError("Unsupported target type for device conversion")
       +
       +    def _to_serialized_rt_device(self) -> str:
       +        if not ENABLED_FEATURES.torch_tensorrt_runtime:
       +            raise NotImplementedError("Torch-TensorRT runtime is not available")
       +
       +        delim = torch.ops.tensorrt.SERIALIZED_RT_DEVICE_DELIM()[0]
       +        dev_info = torch.cuda.get_device_properties(self.gpu_id)
       +        rt_info = [
       +            self.gpu_id,
       +            dev_info.major,
       +            dev_info.minor,
       +            int(self.device_type.to(trt.DeviceType)),  # type: ignore[arg-type]
       +            dev_info.name,
       +        ]
       +        rt_info = [str(i) for i in rt_info]
       +        packed_rt_info: str = delim.join(rt_info)
       +        logging.debug(f"Serialized Device Info: {packed_rt_info}")
       +        return packed_rt_info
      diff --git a/docs/_modules/torch_tensorrt/_Input.html b/docs/_modules/torch_tensorrt/_Input.html
      index 9a37922194..6685f7a620 100644
      --- a/docs/_modules/torch_tensorrt/_Input.html
      +++ b/docs/_modules/torch_tensorrt/_Input.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt._Input — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt._Input — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
       - v2.3.0.dev0+85971ff
       + v2.4.0.dev0+4dc9acfc9
      @@ -415,7 +416,7 @@

       from typing import Any, Dict, List, Optional, Sequence, Tuple
       
       import torch
      -from torch_tensorrt import _enums
      +from torch_tensorrt._enums import dtype, memory_format
       
       
       
        class Input(object):
       @@ -439,24 +440,23 @@
               STATIC = 0
               DYNAMIC = 1
       
      -    shape_mode: Optional[
      -        _ShapeMode
      -    ] = None  #: Is input statically or dynamically shaped
      -    shape: Optional[
      -        Tuple[int, ...] | Dict[str, Tuple[int, ...]]
      -    ] = None  #: Either a single Tuple or a dict of tuples defining the input shape. Static shaped inputs will have a single tuple. Dynamic inputs will have a dict of the form ``{ "min_shape": Tuple, "opt_shape": Tuple, "max_shape": Tuple }``
      -    dtype: _enums.dtype = (
      -        _enums.dtype.unknown
      +    shape_mode: Optional[_ShapeMode] = (
      +        None  #: Is input statically or dynamically shaped
      +    )
      +    shape: Optional[Tuple[int, ...] | Dict[str, Tuple[int, ...]]] = (
      +        None  #: Either a single Tuple or a dict of tuples defining the input shape. Static shaped inputs will have a single tuple. Dynamic inputs will have a dict of the form ``{ "min_shape": Tuple, "opt_shape": Tuple, "max_shape": Tuple }``
      +    )
      +    dtype: dtype = (
      +        dtype.unknown
           )  #: The expected data type of the input tensor (default: torch_tensorrt.dtype.float32)
           _explicit_set_dtype: bool = False
      -    format: _enums.TensorFormat = (
      -        _enums.TensorFormat.contiguous
      -    )  #: The expected format of the input tensor (default: torch_tensorrt.TensorFormat.NCHW)
      +    format: memory_format = (
      +        memory_format.linear
      +    )  #: The expected format of the input tensor (default: torch_tensorrt.memory_format.linear)
       
           DOMAIN_OFFSET: float = 2.0
           low_tensor_domain_incl: float = 0.0
           high_tensor_domain_excl: float = low_tensor_domain_incl + DOMAIN_OFFSET
      -    torch_dtype: torch.dtype = torch.float32
           torch_tensor: torch.Tensor = None
           name: str = ""
       
      @@ -562,21 +562,19 @@ 

       
               else:
                   raise ValueError(
      -                "Unexpected number of positional arguments for class Input \n    Found {} arguments, expected either zero or a single positional arguments".format(
      -                    len(args)
      -                )
      +                f"Unexpected number of positional arguments for class Input \n    Found {len(args)} arguments, expected either zero or a single positional arguments"
                   )
       
               if "dtype" in kwargs:
      -            if isinstance(kwargs["dtype"], torch.dtype):
      -                self.torch_dtype = kwargs["dtype"]
      +            self.dtype = dtype._from(kwargs["dtype"])
       
      -            self.dtype = Input._parse_dtype(kwargs["dtype"])
      -            self.torch_dtype = Input._to_torch_dtype(self.dtype)
      +        if self.dtype != dtype.unknown:
                   self._explicit_set_dtype = True
      +        else:
      +            self._explicit_set_dtype = False
       
               if "format" in kwargs:
      -            self.format = Input._parse_format(kwargs["format"])
      +            self.format = memory_format._from(kwargs["format"])
       
               if "tensor_domain" in kwargs:
                   domain = kwargs["tensor_domain"]
      @@ -623,6 +621,9 @@ 

               else:
                   raise RuntimeError("Unknown input shape mode")
       
      +    def __repr__(self) -> str:
      +        return self.__str__()
      +
           @staticmethod
           def _supported_input_size_type(input_size: Any) -> bool:
               if isinstance(input_size, torch.Size):
      @@ -634,77 +635,6 @@ 

               else:
                   return False
       
      -    @staticmethod
      -    def _parse_dtype(dtype: Any) -> _enums.dtype:
      -        if isinstance(dtype, torch.dtype):
      -            if dtype == torch.long:
      -                return _enums.dtype.long
      -            elif dtype == torch.int32:
      -                return _enums.dtype.int32
      -            elif dtype == torch.half:
      -                return _enums.dtype.half
      -            elif dtype == torch.float:
      -                return _enums.dtype.float
      -            elif dtype == torch.float64:
      -                return _enums.dtype.double
      -            elif dtype == torch.bool:
      -                return _enums.dtype.bool
      -            else:
      -                raise TypeError(
      -                    "Provided an unsupported data type as an input data type (support: bool, int32, long, half, float), got: "
      -                    + str(dtype)
      -                )
      -
      -        elif isinstance(dtype, _enums.dtype):
      -            return dtype
      -
      -        else:
      -            raise TypeError(
      -                "Input data type needs to be specified with a torch.dtype or a torch_tensorrt.dtype, got: "
      -                + str(type(dtype))
      -            )
      -
      -    @staticmethod
      -    def _to_torch_dtype(dtype: _enums.dtype) -> torch.dtype:
      -        if dtype == _enums.dtype.long:
      -            return torch.long
      -        elif dtype == _enums.dtype.int32:
      -            return torch.int32
      -        elif dtype == _enums.dtype.half:
      -            return torch.half
      -        elif dtype == _enums.dtype.float:
      -            return torch.float
      -        elif dtype == _enums.dtype.bool:
      -            return torch.bool
      -        elif dtype == _enums.dtype.double:
      -            return torch.float64
      -        else:
      -            # Default torch_dtype used in FX path
      -            return torch.float32
      -
      -    def is_trt_dtype(self) -> bool:
      -        return bool(self.dtype != _enums.dtype.long)
      -
      -    @staticmethod
      -    def _parse_format(format: Any) -> _enums.TensorFormat:
      -        if isinstance(format, torch.memory_format):
      -            if format == torch.contiguous_format:
      -                return _enums.TensorFormat.contiguous
      -            elif format == torch.channels_last:
      -                return _enums.TensorFormat.channels_last
      -            else:
      -                raise ValueError(
      -                    "Provided an unsupported tensor format (support: NCHW/contiguous_format, NHWC/channel_last)"
      -                )
      -
      -        elif isinstance(format, _enums.TensorFormat):
      -            return format
      -
      -        else:
      -            raise TypeError(
      -                "Tensor format needs to be specified with either torch.memory_format or torch_tensorrt.TensorFormat"
      -            )
      -
           @staticmethod
           def _parse_tensor_domain(
               domain: Optional[Tuple[float, float]]
      @@ -826,7 +756,9 @@ 

                       )
                   else:
                       if isinstance(self.shape, tuple):
      -                    return torch.rand(self.shape).to(dtype=self.torch_dtype)
      +                    return torch.rand(self.shape).to(
      +                        dtype=self.dtype.to(torch.dtype, use_default=True)
      +                    )
                       else:
                           RuntimeError(
                               f"Input shape is dynamic but shapes are not provided as sequence (found: {self.shape})"
      @@ -845,7 +777,7 @@ 

       
                       if isinstance(self.shape, dict):
                           return torch.rand(self.shape[optimization_profile_field]).to(
      -                        dtype=self.torch_dtype
      +                        dtype=self.dtype.to(torch.dtype, use_default=True)
                           )
                       else:
                           raise RuntimeError(
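
With the _Input hunks above, dtype and format parsing collapse into the enum casts dtype._from() and memory_format._from(), and the separate torch_dtype field disappears in favor of converting on demand via dtype.to(torch.dtype). A short sketch of the user-visible behavior (hypothetical shapes; assumes the torch_tensorrt._enums API shown in this diff):

    import torch
    import torch_tensorrt

    inp = torch_tensorrt.Input(
        shape=(1, 3, 224, 224),
        dtype=torch.half,                # cast via dtype._from() to the torch_tensorrt enum
        format=torch.contiguous_format,  # cast via memory_format._from() to memory_format.linear
    )

    # example_tensor() now derives the torch dtype on the fly instead of caching torch_dtype
    t = inp.example_tensor()
    assert t.dtype == torch.half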
      diff --git a/docs/_modules/torch_tensorrt/_compile.html b/docs/_modules/torch_tensorrt/_compile.html
      index ed15bcb6b1..c6f58c7c47 100644
      --- a/docs/_modules/torch_tensorrt/_compile.html
      +++ b/docs/_modules/torch_tensorrt/_compile.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt._compile — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt._compile — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
       - v2.3.0.dev0+85971ff
       + v2.4.0.dev0+4dc9acfc9
      @@ -411,36 +412,40 @@

       from __future__ import annotations
       
      +import collections.abc
       import logging
       from enum import Enum
       from typing import Any, Callable, List, Optional, Sequence, Set
       
       import torch
       import torch.fx
      -import torch_tensorrt.ts
       from torch_tensorrt._enums import dtype
      +from torch_tensorrt._features import ENABLED_FEATURES
       from torch_tensorrt._Input import Input
      -from torch_tensorrt._utils import sanitized_torch_version
      +from torch_tensorrt.dynamo import _defaults
       from torch_tensorrt.fx import InputTensorSpec
       from torch_tensorrt.fx.lower import compile as fx_compile
       from torch_tensorrt.fx.utils import LowerPrecision
      -from torch_tensorrt.ts._compiler import compile as torchscript_compile
       from typing_extensions import TypeGuard
       
      -from packaging import version
      -
      -DYNAMO_ENABLED = version.parse(sanitized_torch_version()) >= version.parse("2.1.dev")
      +if ENABLED_FEATURES.torchscript_frontend:
      +    import torch_tensorrt.ts
      +    from torch_tensorrt.ts._compiler import compile as torchscript_compile
      +    from torch_tensorrt.ts._compiler import (
      +        convert_method_to_trt_engine as ts_convert_method_to_trt_engine,
      +    )
       
      -if DYNAMO_ENABLED:
      +if ENABLED_FEATURES.dynamo_frontend:
           from torch._export import ExportedProgram
           from torch_tensorrt.dynamo._compiler import compile as dynamo_compile
      +    from torch_tensorrt.dynamo._compiler import (
      +        convert_module_to_trt_engine as dynamo_convert_module_to_trt_engine,
      +    )
      +    from torch_tensorrt.dynamo._tracer import trace as dynamo_trace
       
       logger = logging.getLogger(__name__)
       
      -__all__ = [
      -    "compile",
      -    "convert_method_to_trt_engine",
      -]
      +__all__ = ["compile", "convert_method_to_trt_engine", "save", "load"]
       
       
       def _non_fx_input_interface(
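Frontend availability is now a set of feature flags checked at import time rather than a single DYNAMO_ENABLED version probe. A sketch of how downstream code can guard on the same flags (assuming the ENABLED_FEATURES object imported above):

    from torch_tensorrt._features import ENABLED_FEATURES

    # Each frontend is gated individually in this build of Torch-TensorRT
    if ENABLED_FEATURES.torchscript_frontend:
        import torch_tensorrt.ts  # safe: the TS frontend is compiled in

    if ENABLED_FEATURES.dynamo_frontend:
        from torch_tensorrt.dynamo import _defaults
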
      @@ -482,7 +487,7 @@ 

               return _ModuleType.ts
           elif isinstance(module, torch.fx.GraphModule):
               return _ModuleType.fx
      -    elif DYNAMO_ENABLED and isinstance(module, ExportedProgram):
      +    elif isinstance(module, ExportedProgram):
               return _ModuleType.ep
           elif isinstance(module, torch.nn.Module):
               return _ModuleType.nn
      @@ -490,7 +495,7 @@ 

               raise RuntimeError("Module is an unknown format")
       
       
      -def _get_target_ir(module_type: _ModuleType, ir: str) -> _IRType:
      +def _get_target_fe(module_type: _ModuleType, ir: str) -> _IRType:
           module_is_tsable = any(module_type == t for t in [_ModuleType.nn, _ModuleType.ts])
           module_is_fxable = any(module_type == t for t in [_ModuleType.nn, _ModuleType.fx])
           module_is_exportable = module_type == _ModuleType.ep
      @@ -501,35 +506,52 @@ 

           ir_targets_torch_compile = ir == "torch_compile"
       
           if module_is_tsable and ir_targets_torchscript:
      -        return _IRType.ts
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            return _IRType.ts
      +        else:
      +            raise ValueError(
      +                "Requested using the TS frontend but the TS frontend is not available in this build of Torch-TensorRT"
      +            )
           elif module_is_fxable and ir_targets_fx:
      -        return _IRType.fx
      -    elif module_is_fxable and ir_targets_dynamo:
      -        return _IRType.dynamo
      +        if ENABLED_FEATURES.fx_frontend:
      +            return _IRType.fx
      +        else:
      +            raise ValueError(
      +                "Requested using the FX frontend but the FX frontend is not available in this build of Torch-TensorRT"
      +            )
      +    elif (module_is_fxable or module_is_exportable) and ir_targets_dynamo:
      +        if ENABLED_FEATURES.dynamo_frontend:
      +            return _IRType.dynamo
      +        else:
      +            raise ValueError(
      +                "Requested using the Dynamo frontend but the Dynamo frontend is not available in this build of Torch-TensorRT"
      +            )
           elif module_is_fxable and ir_targets_torch_compile:
      -        return _IRType.torch_compile
      +        if ENABLED_FEATURES.dynamo_frontend:
      +            return _IRType.torch_compile
      +        else:
      +            raise ValueError(
      +                "Requested using the Torch-TensorRT torch.compile backend but the Torch-TensorRT torch.compile backend is not available in this build of Torch-TensorRT"
      +            )
           else:
               if ir == "default":
                   # Options are listed in order of preference
      -            if DYNAMO_ENABLED and module_is_fxable:
      -                logger.info("ir was set to default, using dynamo as ir")
      +            if ENABLED_FEATURES.dynamo_frontend and module_is_fxable:
      +                logger.info("ir was set to default, using dynamo frontend")
                       return _IRType.dynamo
      -            elif module_is_tsable:
      -                if DYNAMO_ENABLED:
      +            elif ENABLED_FEATURES.torchscript_frontend and module_is_tsable:
      +                if ENABLED_FEATURES.dynamo_frontend:
                           logger.warning(
      -                        "Input graph is a Torchscript module but the ir provided is default (dynamo). Please set ir=torchscript to suppress the warning. Compiling the module with ir=torchscript"
      +                        "Input is a torchscript module but the ir was not specified (default=dynamo), please set ir=torchscript to suppress the warning."
                           )
                       return _IRType.ts
      -            elif module_is_exportable:
      +            elif ENABLED_FEATURES.dynamo_frontend and module_is_exportable:
      +                logger.info("ir was set to default, using dynamo frontend")
      +                return _IRType.dynamo
      +            else:
                       raise ValueError(
      -                    "Input graph is an ExportedProgram which is not currently supported. Please provide torch.nn.Module or torch.fx.GraphModule as input."
      +                    f"Module was provided in an unsupported format\nInstalled frontends:\n\tDynamo - {ENABLED_FEATURES.dynamo_frontend}\n\tTorchScript - {ENABLED_FEATURES.torchscript_frontend}\n\tFX - {ENABLED_FEATURES.fx_frontend})"
                       )
      -            else:
      -                raise ValueError("Module was provided in an unsupported format")
      -        elif ir == "exported_program":
      -            raise ValueError(
      -                "ir=exported_program is not currently supported. Supported ir options : ts|fx|dynamo"
      -            )
               else:
                   raise ValueError("Unknown ir was requested")
       
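Under the renamed _get_target_fe (frontend) dispatch above, ir="default" prefers the Dynamo frontend whenever it is built in, and requesting an absent frontend now raises a ValueError instead of silently falling through. A hedged sketch of the selection from the caller's side (hypothetical model and shapes):

    import torch
    import torch_tensorrt

    model = torch.nn.Linear(8, 8).cuda().eval()
    x = torch.randn(1, 8).cuda()

    # nn.Module + ir="default" resolves to the dynamo frontend when available;
    # ExportedProgram inputs are now also routed to dynamo instead of erroring
    trt_model = torch_tensorrt.compile(model, ir="default", inputs=[x])
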
      @@ -579,12 +601,14 @@ 

               torch.nn.Module: Compiled Module, when run it will execute via TensorRT
           """
           input_list = inputs if inputs is not None else []
      -    enabled_precisions_set = (
      -        enabled_precisions if enabled_precisions is not None else {torch.float}
      +    enabled_precisions_set: Set[dtype | torch.dtype] = (
      +        enabled_precisions
      +        if enabled_precisions is not None
      +        else _defaults.ENABLED_PRECISIONS
           )
       
           module_type = _parse_module_type(module)
      -    target_ir = _get_target_ir(module_type, ir)
      +    target_ir = _get_target_fe(module_type, ir)
           if target_ir == _IRType.ts:
               ts_mod = module
               if module_type == _ModuleType.nn:
      @@ -626,8 +650,6 @@ 

               return compiled_fx_module
           elif target_ir == _IRType.dynamo:
               # Prepare torch and torchtrt inputs
      -        import collections.abc
      -
               from torch_tensorrt.dynamo.utils import prepare_inputs
       
               if not isinstance(input_list, collections.abc.Sequence):
      @@ -635,7 +657,7 @@ 

       
               # Export the module
               torchtrt_inputs = prepare_inputs(input_list)
      -        exp_program = torch_tensorrt.dynamo.trace(module, torchtrt_inputs, **kwargs)
      +        exp_program = dynamo_trace(module, torchtrt_inputs, **kwargs)
               trt_graph_module = dynamo_compile(
                   exp_program,
                   inputs=torchtrt_inputs,
      @@ -710,7 +732,7 @@ 

           )
       
           module_type = _parse_module_type(module)
      -    target_ir = _get_target_ir(module_type, ir)
      +    target_ir = _get_target_fe(module_type, ir)
           if target_ir == _IRType.ts:
               ts_mod = module
               if module_type == _ModuleType.nn:
      @@ -718,20 +740,34 @@ 

                       "Module was provided as a torch.nn.Module, trying to script the module with torch.jit.script. In the event of a failure please preconvert your module to TorchScript"
                   )
                   ts_mod = torch.jit.script(module)
      -        return torch_tensorrt.ts.convert_method_to_trt_engine(  # type: ignore[no-any-return]
      +        serialized_engine: bytes = ts_convert_method_to_trt_engine(
                   ts_mod,
                   inputs=inputs,
                   method_name=method_name,
                   enabled_precisions=enabled_precisions_set,
                   **kwargs,
               )
      +        return serialized_engine
           elif target_ir == _IRType.fx:
               raise RuntimeError(
                   "convert_method_to_trt_engine call is not supported for ir=fx"
               )
           elif target_ir == _IRType.dynamo:
      -        raise RuntimeError(
      -            "convert_method_to_trt_engine call is not supported for ir=dynamo."
      +        # Prepare torch and torchtrt inputs
      +        from torch_tensorrt.dynamo.utils import prepare_inputs
      +
      +        if not isinstance(inputs, collections.abc.Sequence):
      +            inputs = [inputs]
      +
      +        # Export the module
      +        torchtrt_inputs = prepare_inputs(inputs)
      +        exp_program = torch_tensorrt.dynamo.trace(module, torchtrt_inputs, **kwargs)
      +
      +        return dynamo_convert_module_to_trt_engine(  # type: ignore[no-any-return]
      +            exp_program,
      +            inputs=inputs,
      +            enabled_precisions=enabled_precisions_set,
      +            **kwargs,
               )
           elif target_ir == _IRType.torch_compile:
               raise RuntimeError(
      @@ -739,6 +775,111 @@ 

               )
           else:
               raise RuntimeError("Module is an unknown format or the ir requested is unknown")
       +
       +
       +def load(file_path: str = "") -> Any:
       +    """
       +    Load either a Torchscript model or ExportedProgram. Autodetect the type using
       +    try, except
       +    """
       +    try:
       +        logger.debug(f"Loading the provided file {file_path} using torch.jit.load()")
       +        ts_module = torch.jit.load(file_path)
       +        return ts_module
       +    except Exception:
       +        logger.info(
       +            f"Loading the provided file {file_path} via torch.jit.load() failed with the following error",
       +            exc_info=True,
       +        )
       +        pass
       +
       +    try:
       +        logger.debug(f"Loading the provided file {file_path} using torch.export.load()")
       +        exp_program = torch.export.load(file_path)
       +        return exp_program
       +    except Exception:
       +        logger.info(
       +            f"Loading the provided file {file_path} via torch.export.load() failed with the following error",
       +            exc_info=True,
       +        )
       +        raise ValueError(
       +            f"The file {file_path} doesn't correspond to a valid Torchscript module or ExportedProgram. Please verify the file path."
       +        )
       +
       +
       +def save(
       +    module: Any,
       +    file_path: str = "",
       +    *,
       +    output_format: str = "exported_program",
       +    inputs: Optional[Sequence[torch.Tensor]] = None,
       +    retrace: bool = False,
       +) -> None:
       +    """
       +    Save the model to disk in the specified output format.
       +    Arguments:
       +        module : Compiled Torch-TensorRT module (Options include torch.jit.ScriptModule | torch.export.ExportedProgram | torch.fx.GraphModule)
       +        inputs (torch.Tensor): Torch input tensors
       +        output_format: Format to save the model. Options include exported_program | torchscript.
       +        retrace: When the module type is a fx.GraphModule, this option re-exports the graph using torch.export.export(strict=False) to save it.
       +            This flag is experimental for now.
       +    """
       +    module_type = _parse_module_type(module)
       +    accepted_formats = {"exported_program", "torchscript"}
       +    if inputs is not None and not all(
       +        isinstance(input, torch.Tensor) for input in inputs
       +    ):
       +        raise ValueError(
       +            "Not all inputs provided are torch.tensors. Please provide torch.tensors as inputs"
       +        )
       +    if output_format not in accepted_formats:
       +        raise ValueError(
       +            f"Provided output_format {output_format} is not supported. Supported options are exported_program | torchscript"
       +        )
       +    if not file_path:
       +        raise ValueError("File path cannot be empty. Please provide a valid file path")
       +
       +    if module_type == _ModuleType.nn:
       +        raise ValueError(
       +            "Input model is of type nn.Module. Saving nn.Module directly is not supported. Supported model types torch.jit.ScriptModule | torch.fx.GraphModule | torch.export.ExportedProgram."
       +        )
       +    elif module_type == _ModuleType.ts:
       +        if output_format == "exported_program":
       +            raise ValueError(
       +                "Provided model is a torch.jit.ScriptModule but the output_format specified is exported_program. Please verify the output_format"
       +            )
       +        else:
       +            torch.jit.save(module, file_path)
       +    elif module_type == _ModuleType.ep:
       +        if output_format == "torchscript":
       +            raise ValueError(
       +                "Provided model is a torch.export.ExportedProgram but the output_format specified is torchscript. Please verify the output_format"
       +            )
       +        else:
       +            torch.export.save(module, file_path)
       +    elif module_type == _ModuleType.fx:
       +        if inputs is None:
       +            raise ValueError(
       +                "Provided model is a torch.fx.GraphModule however the inputs are empty. Please provide valid torch.tensors as inputs to trace and save the model"
       +            )
       +        # The module type is torch.fx.GraphModule
       +        if output_format == "torchscript":
       +            module_ts = torch.jit.trace(module, inputs)
       +            torch.jit.save(module_ts, file_path)
       +        else:
       +            if not retrace:
       +                from torch_tensorrt.dynamo._exporter import export
       +
       +                exp_program = export(module, inputs)
       +                torch.export.save(exp_program, file_path)
       +            else:
       +                from torch._higher_order_ops.torchbind import enable_torchbind_tracing
       +
       +                with enable_torchbind_tracing():
       +                    exp_program = torch.export.export(
       +                        module, tuple(inputs), strict=False
       +                    )
       +                    torch.export.save(exp_program, file_path)
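
The new save()/load() pair above gives the _compile module a single entry point for both serialization formats: save() validates the module type against the requested output_format, while load() autodetects by trying torch.jit.load() before torch.export.load(). A round-trip sketch (hypothetical file name; assumes a compiled fx.GraphModule from the dynamo path):

    import torch
    import torch_tensorrt

    model = torch.nn.Linear(8, 8).cuda().eval()
    x = torch.randn(1, 8).cuda()
    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=[x])

    # fx.GraphModule + output_format="exported_program" (the default):
    # the graph is re-exported via dynamo._exporter.export before saving
    torch_tensorrt.save(trt_gm, "trt_model.ep", inputs=[x])

    # torch.jit.load() fails on the ExportedProgram file, so load() falls
    # back to torch.export.load()
    reloaded = torch_tensorrt.load("trt_model.ep")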
      diff --git a/docs/_modules/torch_tensorrt/dynamo/_SourceIR.html b/docs/_modules/torch_tensorrt/dynamo/_SourceIR.html
      index 43efe8b3f0..b1663a8c17 100644
      --- a/docs/_modules/torch_tensorrt/dynamo/_SourceIR.html
      +++ b/docs/_modules/torch_tensorrt/dynamo/_SourceIR.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt.dynamo._SourceIR — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.dynamo._SourceIR — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
       - v2.3.0.dev0+85971ff
       + v2.4.0.dev0+4dc9acfc9
       diff --git a/docs/_modules/torch_tensorrt/dynamo/_compiler.html b/docs/_modules/torch_tensorrt/dynamo/_compiler.html
       --- a/docs/_modules/torch_tensorrt/dynamo/_compiler.html
       +++ b/docs/_modules/torch_tensorrt/dynamo/_compiler.html
       @@ -413,43 +414,16 @@
        import collections.abc
        import logging
       +import warnings
        from typing import Any, Collection, List, Optional, Sequence, Set, Tuple, Union
        
        import torch
        from torch.export import ExportedProgram
        from torch.fx.node import Target
        from torch_tensorrt._Device import Device
       -from torch_tensorrt._enums import (  # TODO: Should probably be the TRT EngineCapability Enum
       -    EngineCapability,
       -)
       +from torch_tensorrt._enums import EngineCapability, dtype
        from torch_tensorrt._Input import Input
       -from torch_tensorrt.dynamo import partitioning
       -from torch_tensorrt.dynamo._defaults import (
       -    DEBUG,
       -    DEVICE,
       -    DISABLE_TF32,
       -    DLA_GLOBAL_DRAM_SIZE,
       -    DLA_LOCAL_DRAM_SIZE,
       -    DLA_SRAM_SIZE,
       -    DRYRUN,
       -    ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
       -    ENGINE_CAPABILITY,
       -    HARDWARE_COMPATIBLE,
       -    MAX_AUX_STREAMS,
       -    MIN_BLOCK_SIZE,
       -    NUM_AVG_TIMING_ITERS,
       -    OPTIMIZATION_LEVEL,
       -    PASS_THROUGH_BUILD_FAILURES,
       -    PRECISION,
       -    REFIT,
       -    REQUIRE_FULL_COMPILATION,
       -    SPARSE_WEIGHTS,
       -    TRUNCATE_LONG_AND_DOUBLE,
       -    USE_FAST_PARTITIONER,
       -    USE_PYTHON_RUNTIME,
       -    VERSION_COMPATIBLE,
       -    WORKSPACE_SIZE,
       -)
       +from torch_tensorrt.dynamo import _defaults, partitioning
        from torch_tensorrt.dynamo._DryRunTracker import (
            DryRunTracker,
            PerSubgraphData,
       @@ -458,8 +432,10 @@
        )
        from torch_tensorrt.dynamo.conversion import (
            CompilationSettings,
       +    UnsupportedOperatorException,
            convert_module,
       -    repair_long_or_double_inputs,
       +    interpret_module_to_result,
       +    repair_double_inputs,
        )
        from torch_tensorrt.dynamo.conversion._ConverterRegistry import (
            DYNAMO_CONVERTERS as CONVERTERS,
       @@ -474,8 +450,6 @@
            to_torch_tensorrt_device,
        )
        
       -import torch_tensorrt
       -
        logger = logging.getLogger(__name__)
       @@ -483,37 +457,37 @@
            exported_program: ExportedProgram,
            inputs: Tuple[Any, ...],
            *,
       -    device: Optional[Union[Device, torch.device, str]] = DEVICE,
       -    disable_tf32: bool = DISABLE_TF32,
       -    sparse_weights: bool = SPARSE_WEIGHTS,
       -    enabled_precisions: Set[torch.dtype] | Tuple[torch.dtype] = (torch.float32,),
       -    engine_capability: EngineCapability = ENGINE_CAPABILITY,
       -    refit: bool = REFIT,
       -    debug: bool = DEBUG,
       -    capability: EngineCapability = EngineCapability.default,
       -    num_avg_timing_iters: int = NUM_AVG_TIMING_ITERS,
       -    workspace_size: int = WORKSPACE_SIZE,
       -    dla_sram_size: int = DLA_SRAM_SIZE,
       -    dla_local_dram_size: int = DLA_LOCAL_DRAM_SIZE,
       -    dla_global_dram_size: int = DLA_GLOBAL_DRAM_SIZE,
       -    calibrator: object = None,
       -    truncate_long_and_double: bool = TRUNCATE_LONG_AND_DOUBLE,
       -    require_full_compilation: bool = REQUIRE_FULL_COMPILATION,
       -    min_block_size: int = MIN_BLOCK_SIZE,
       +    device: Optional[Union[Device, torch.device, str]] = _defaults.DEVICE,
       +    disable_tf32: bool = _defaults.DISABLE_TF32,
       +    sparse_weights: bool = _defaults.SPARSE_WEIGHTS,
       +    enabled_precisions: (
       +        Set[torch.dtype | dtype] | Tuple[torch.dtype | dtype]
       +    ) = _defaults.ENABLED_PRECISIONS,
       +    engine_capability: EngineCapability = _defaults.ENGINE_CAPABILITY,
       +    refit: bool = _defaults.REFIT,
       +    debug: bool = _defaults.DEBUG,
       +    num_avg_timing_iters: int = _defaults.NUM_AVG_TIMING_ITERS,
       +    workspace_size: int = _defaults.WORKSPACE_SIZE,
       +    dla_sram_size: int = _defaults.DLA_SRAM_SIZE,
       +    dla_local_dram_size: int = _defaults.DLA_LOCAL_DRAM_SIZE,
       +    dla_global_dram_size: int = _defaults.DLA_GLOBAL_DRAM_SIZE,
       +    truncate_double: bool = _defaults.TRUNCATE_DOUBLE,
       +    require_full_compilation: bool = _defaults.REQUIRE_FULL_COMPILATION,
       +    min_block_size: int = _defaults.MIN_BLOCK_SIZE,
            torch_executed_ops: Optional[Collection[Target]] = None,
            torch_executed_modules: Optional[List[str]] = None,
       -    pass_through_build_failures: bool = PASS_THROUGH_BUILD_FAILURES,
       -    max_aux_streams: Optional[int] = MAX_AUX_STREAMS,
       -    version_compatible: bool = VERSION_COMPATIBLE,
       -    optimization_level: Optional[int] = OPTIMIZATION_LEVEL,
       -    use_python_runtime: bool = USE_PYTHON_RUNTIME,
       -    use_fast_partitioner: bool = USE_FAST_PARTITIONER,
       -    enable_experimental_decompositions: bool = ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
       -    dryrun: bool = DRYRUN,
       -    hardware_compatible: bool = HARDWARE_COMPATIBLE,
       +    pass_through_build_failures: bool = _defaults.PASS_THROUGH_BUILD_FAILURES,
       +    max_aux_streams: Optional[int] = _defaults.MAX_AUX_STREAMS,
       +    version_compatible: bool = _defaults.VERSION_COMPATIBLE,
       +    optimization_level: Optional[int] = _defaults.OPTIMIZATION_LEVEL,
       +    use_python_runtime: bool = _defaults.USE_PYTHON_RUNTIME,
       +    use_fast_partitioner: bool = _defaults.USE_FAST_PARTITIONER,
       +    enable_experimental_decompositions: bool = _defaults.ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
       +    dryrun: bool = _defaults.DRYRUN,
       +    hardware_compatible: bool = _defaults.HARDWARE_COMPATIBLE,
            **kwargs: Any,
        ) -> torch.fx.GraphModule:
       -    """Compile a TorchScript module for NVIDIA GPUs using TensorRT
       +    """Compile an ExportedProgram module for NVIDIA GPUs using TensorRT
        
            Takes an existing TorchScript module and a set of settings to configure the compiler
            and will convert methods to JIT Graphs which call equivalent TensorRT engines
       @@ -554,7 +528,7 @@
                dla_sram_size (int): Fast software managed RAM used by DLA to communicate within a layer.
                dla_local_dram_size (int): Host RAM used by DLA to share intermediate tensor data across operations
                dla_global_dram_size (int): Host RAM used by DLA to store weights and metadata for execution
       -        truncate_long_and_double (bool): Truncate weights provided in int64 or double (float64) to int32 and float32
       +        truncate_double (bool): Truncate weights provided in double (float64) to float32
                calibrator (Union(torch_tensorrt._C.IInt8Calibrator, tensorrt.IInt8Calibrator)): Calibrator object which will provide data to the PTQ system for INT8 Calibration
                require_full_compilation (bool): Require modules to be compiled end to end or return an error as opposed to returning a hybrid graph where operations that cannot be run in TensorRT are run in PyTorch
                min_block_size (int): The minimum number of contiguous TensorRT convertible operations in order to run a set of operations in TensorRT
       @@ -577,12 +551,34 @@
            if debug:
                set_log_level(logger.parent, logging.DEBUG)
        
       +    if "truncate_long_and_double" in kwargs.keys():
       +        if truncate_double is not _defaults.TRUNCATE_DOUBLE:
       +            raise ValueError(
       +                'Provided configuration for "truncate_double" and deprecated API "truncate_long_and_double", please only use "truncate_double"'
       +            )
       +        else:
       +            truncate_double = kwargs["truncate_long_and_double"]
       +            warnings.warn(
       +                'Compiler option "truncate_long_and_double" is deprecated in favor of "truncate_double" as int64 is now natively supported, this option will be removed in the next version',
       +                DeprecationWarning,
       +                stacklevel=2,
       +            )
       +
       +    engine_capability = EngineCapability._from(engine_capability)
       +
       +    if torch_executed_modules is not None and torch_executed_modules:
       +        logger.warning(
       +            f"Detected torch_executed_modules was non-empty: {torch_executed_modules}"
       +            "\nThis feature is unimplemented in Torch-TRT Dynamo currently."
       +        )
       +
            if not isinstance(inputs, collections.abc.Sequence):
                inputs = [inputs]
        
            # Prepare torch_trt inputs
            inputs = prepare_inputs(inputs)
            device = to_torch_tensorrt_device(device)
       +    enabled_precisions = {dtype._from(p) for p in enabled_precisions}
        
            if not isinstance(exported_program, ExportedProgram):
                raise AssertionError(
       @@ -593,48 +589,31 @@
            )
            gm = exported_program.module()
            logger.debug("Input graph: " + str(gm.graph))
       -
            # Apply lowering on the graph module
            torch_inputs = get_torch_inputs(inputs, device)
            gm = apply_lowering_passes(gm, torch_inputs)
       -    logger.debug("Lowered Input graph: " + str(gm.graph))
       -
       -    enabled_precisions = set(enabled_precisions)
       -    if (
       -        torch.float16 in enabled_precisions
       -        or torch_tensorrt.dtype.half in enabled_precisions
       -    ):
       -        precision = torch.float16
       -    elif (
       -        torch.float32 in enabled_precisions
       -        or torch_tensorrt.dtype.float in enabled_precisions
       -    ):
       -        precision = torch.float32
       -    elif len(enabled_precisions) == 0:
       -        logger.info(f"No precision specified, defaulting to {PRECISION}")
       -        precision = PRECISION
       -    else:
       -        raise ValueError(
       -            f"Precision {enabled_precisions} not supported in the Dynamo Path"
       -        )
       +    logger.debug("Lowered Input graph: " + str(gm.graph))
        
            compilation_options = {
       -        "precision": precision,
       +        "enabled_precisions": (
       +            enabled_precisions if enabled_precisions else _defaults.ENABLED_PRECISIONS
       +        ),
                "debug": debug,
                "device": device,
                "workspace_size": workspace_size,
                "min_block_size": min_block_size,
       -        "torch_executed_ops": torch_executed_ops
       -        if torch_executed_ops is not None
       -        else set(),
       +        "torch_executed_ops": (
       +            torch_executed_ops if torch_executed_ops is not None else set()
       +        ),
                "pass_through_build_failures": pass_through_build_failures,
                "max_aux_streams": max_aux_streams,
                "version_compatible": version_compatible,
                "optimization_level": optimization_level,
                "use_python_runtime": use_python_runtime,
       -        "truncate_long_and_double": truncate_long_and_double,
       +        "truncate_double": truncate_double,
                "use_fast_partitioner": use_fast_partitioner,
       +        "num_avg_timing_iters": num_avg_timing_iters,
                "enable_experimental_decompositions": enable_experimental_decompositions,
                "require_full_compilation": require_full_compilation,
                "disable_tf32": disable_tf32,
       @@ -650,7 +629,8 @@
            settings = CompilationSettings(**compilation_options)
            logger.info("Compilation Settings: %s\n", settings)
       -    return compile_module(gm, inputs, settings)
       +    trt_gm = compile_module(gm, inputs, settings)
       +    return trt_gm
        
        
        def compile_module(
       @@ -685,7 +665,7 @@
                sample_inputs, "shape", lambda x: dict(x) if isinstance(x, dict) else tuple(x)
            )
            dryrun_tracker.graph_input_dtypes = parse_complex_tensor_structs(
       -        sample_inputs, "torch_dtype"
       +        sample_inputs, "dtype", lambda t: t.to(torch.dtype, use_default=True)
            )
            dryrun_tracker.compilation_settings = settings
       @@ -710,6 +690,22 @@
                f"Detected support for {num_supported_ops} operators out of {total_ops} in subgraph."
            )
        
       +    def contains_metadata(gm: torch.fx.GraphModule) -> bool:
       +        for node in gm.graph.nodes:
       +            if node.op != "output" and (not node.meta) and "val" not in node.meta:
       +                logger.warning(
       +                    f"Node {node.name} of op type {node.op} does not have metadata. This could sometimes lead to undefined behavior."
       +                )
       +                return False
       +        return True
       +
       +    # Check if the module has metadata (shape, dtype).
       +    if not contains_metadata(gm):
       +        # TODO: For future, explore when nodes don't have metadata and if fake_tensor_prop can resolve this.
       +        logger.warning(
       +            "Some nodes do not have metadata (shape and dtype information). This could lead to problems sometimes if the graph has PyTorch and TensorRT segments."
       +        )
       +
            # Partition module into components that can be TRT-accelerated
            fast_partitioner_failed = False
       @@ -768,12 +764,7 @@
            )
        
            # Get the submodule inputs for min, opt, max shapes of the graph inputs
       -    submodule_inputs = partitioning.get_submod_inputs(
       -        partitioned_module,
       -        submodule,
       -        sample_inputs,
       -        to_torch_device(settings.device),
       -    )
       +    submodule_inputs = partitioning.construct_submodule_inputs(submodule)
        
            logger.debug(
                "Submodule name: %s\n Input shapes: %s\n %s",
       @@ -784,8 +775,8 @@
            assert submodule_inputs is not None
        
            # Handle long/double inputs if requested by the user
       -    if settings.truncate_long_and_double:
       -        submodule_inputs = repair_long_or_double_inputs(
       +    if settings.truncate_double:
       +        submodule_inputs = repair_double_inputs(
                    partitioned_module,
                    submodule,
                    submodule_inputs,
       @@ -799,7 +790,7 @@
                lambda x: dict(x) if isinstance(x, dict) else tuple(x),
            )
            subgraph_data.subgraph_input_dtypes = parse_complex_tensor_structs(
       -        submodule_inputs, "torch_dtype"
       +        submodule_inputs, "dtype", lambda t: t.to(torch.dtype)
            )
        
            submodule_outputs = submodule(
       @@ -854,6 +845,182 @@
            dryrun_stats_display(dryrun_tracker, settings.dryrun)
        
            return partitioned_module
       +
       +
       +def convert_module_to_trt_engine(
       +    exported_program: ExportedProgram,
       +    inputs: Tuple[Any, ...],
       +    *,
       +    enabled_precisions: (
       +        Set[torch.dtype | dtype] | Tuple[torch.dtype | dtype]
       +    ) = _defaults.ENABLED_PRECISIONS,
       +    debug: bool = _defaults.DEBUG,
       +    workspace_size: int = _defaults.WORKSPACE_SIZE,
       +    min_block_size: int = _defaults.MIN_BLOCK_SIZE,
       +    torch_executed_ops: Optional[Set[str]] = None,
       +    pass_through_build_failures: bool = _defaults.PASS_THROUGH_BUILD_FAILURES,
       +    max_aux_streams: Optional[int] = _defaults.MAX_AUX_STREAMS,
       +    version_compatible: bool = _defaults.VERSION_COMPATIBLE,
       +    optimization_level: Optional[int] = _defaults.OPTIMIZATION_LEVEL,
       +    use_python_runtime: Optional[bool] = _defaults.USE_PYTHON_RUNTIME,
       +    truncate_double: bool = _defaults.TRUNCATE_DOUBLE,
       +    use_fast_partitioner: bool = _defaults.USE_FAST_PARTITIONER,
       +    enable_experimental_decompositions: bool = _defaults.ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
       +    device: Device = Device._current_device(),
       +    require_full_compilation: bool = _defaults.REQUIRE_FULL_COMPILATION,
       +    disable_tf32: bool = _defaults.DISABLE_TF32,
       +    sparse_weights: bool = _defaults.SPARSE_WEIGHTS,
       +    refit: bool = _defaults.REFIT,
       +    engine_capability: EngineCapability = _defaults.ENGINE_CAPABILITY,
       +    num_avg_timing_iters: int = _defaults.NUM_AVG_TIMING_ITERS,
       +    dla_sram_size: int = _defaults.DLA_SRAM_SIZE,
       +    dla_local_dram_size: int = _defaults.DLA_LOCAL_DRAM_SIZE,
       +    dla_global_dram_size: int = _defaults.DLA_GLOBAL_DRAM_SIZE,
       +    calibrator: object = None,
       +    allow_shape_tensors: bool = False,
       +    **kwargs: Any,
       +) -> bytes:
       +    """Convert an ExportedProgram to a serialized TensorRT engine
       +
       +    Converts an ExportedProgram to a serialized TensorRT engine given a dictionary of conversion settings
       +
       +    Arguments:
       +        exported_program (torch.export.ExportedProgram): Source module
       +
       +    Keyword Args:
       +        inputs (Optional[Sequence[torch_tensorrt.Input | torch.Tensor]]): **Required** List of specifications of input shape, dtype and memory layout for inputs to the module. This argument is required. Input Sizes can be specified as torch sizes, tuples or lists. dtypes can be specified using
       +            torch datatypes or torch_tensorrt datatypes and you can use either torch devices or the torch_tensorrt device type enum
       +            to select device type. ::
       +
       +                input=[
       +                    torch_tensorrt.Input((1, 3, 224, 224)), # Static NCHW input shape for input #1
       +                    torch_tensorrt.Input(
       +                        min_shape=(1, 224, 224, 3),
       +                        opt_shape=(1, 512, 512, 3),
       +                        max_shape=(1, 1024, 1024, 3),
       +                        dtype=torch.int32
       +                        format=torch.channel_last
       +                    ), # Dynamic input shape for input #2
       +                    torch.randn((1, 3, 224, 244)) # Use an example tensor and let torch_tensorrt infer settings
       +                ]
       +        enabled_precisions (Optional[Set[torch.dtype | _enums.dtype]]): The set of datatypes that TensorRT can use
       +        debug (bool): Whether to print out verbose debugging information
       +        workspace_size (int): Workspace TRT is allowed to use for the module (0 is default)
       +        min_block_size (int): Minimum number of operators per TRT-Engine Block
       +        torch_executed_ops (Set[str]): Set of operations to run in Torch, regardless of converter coverage
       +        pass_through_build_failures (bool): Whether to fail on TRT engine build errors (True) or not (False)
       +        max_aux_streams (Optional[int]): Maximum number of allowed auxiliary TRT streams for each engine
       +        version_compatible (bool): Provide version forward-compatibility for engine plan files
       +        optimization_level (Optional[int]): Builder optimization 0-5, higher levels imply longer build time,
       +            searching for more optimization options. TRT defaults to 3
       +        use_python_runtime (Optional[bool]): Whether to strictly use Python runtime or C++ runtime. To auto-select a runtime
       +            based on C++ dependency presence (preferentially choosing C++ runtime if available), leave the
       +            argument as None
       +        truncate_double (bool): Whether to truncate float64 TRT engine inputs or weights to float32
       +        use_fast_partitioner (bool): Whether to use the fast or global graph partitioning system
       +        enable_experimental_decompositions (bool): Whether to enable all core aten decompositions
       +            or only a selected subset of them
       +        device (Device): GPU to compile the model on
       +        require_full_compilation (bool): Whether to require the graph is fully compiled in TensorRT.
       +            Only applicable for `ir="dynamo"`; has no effect for `torch.compile` path
       +        disable_tf32 (bool): Whether to disable TF32 computation for TRT layers
       +        sparse_weights (bool): Whether to allow the builder to use sparse weights
       +        refit (bool): Whether to build a refittable engine
       +        engine_capability (trt.EngineCapability): Restrict kernel selection to safe gpu kernels or safe dla kernels
       +        num_avg_timing_iters (int): Number of averaging timing iterations used to select kernels
       +        dla_sram_size (int): Fast software managed RAM used by DLA to communicate within a layer.
       +        dla_local_dram_size (int): Host RAM used by DLA to share intermediate tensor data across operations
       +        dla_global_dram_size (int): Host RAM used by DLA to store weights and metadata for execution
       +        calibrator (Union(torch_tensorrt._C.IInt8Calibrator, tensorrt.IInt8Calibrator)): Calibrator object which will provide data to the PTQ system for INT8 Calibration
       +        allow_shape_tensors: (Experimental) Allow aten::size to output shape tensors using IShapeLayer in TensorRT
       +
       +    Returns:
       +        bytes: Serialized TensorRT engine, can either be saved to a file or deserialized via TensorRT APIs
       +    """
       +    if debug:
       +        set_log_level(logger.parent, logging.DEBUG)
       +
       +    if "truncate_long_and_double" in kwargs.keys():
       +        if truncate_double is not _defaults.TRUNCATE_DOUBLE:
       +            raise ValueError(
       +                'Provided configuration for "truncate_double" and deprecated API "truncate_long_and_double", please only use "truncate_double"'
       +            )
       +        else:
       +            truncate_double = kwargs["truncate_long_and_double"]
       +            warnings.warn(
       +                'Compiler option "truncate_long_and_double" is deprecated in favor of "truncate_double" as int64 is now natively supported, this option will be removed in the next version',
       +                DeprecationWarning,
       +                stacklevel=2,
       +            )
       +
       +    input_list = list(inputs) if inputs is not None else []
       +    torch_executed_ops = torch_executed_ops if torch_executed_ops is not None else set()
       +    # Prepare torch_trt inputs
       +    input_list = prepare_inputs(input_list)
       +    device = to_torch_tensorrt_device(device)
       +
       +    enabled_precisions = {dtype._from(e) for e in enabled_precisions}
       +
       +    compilation_options = {
       +        "enabled_precisions": enabled_precisions,
       +        "debug": debug,
       +        "workspace_size": workspace_size,
       +        "min_block_size": min_block_size,
       +        "torch_executed_ops": torch_executed_ops,
       +        "pass_through_build_failures": pass_through_build_failures,
       +        "max_aux_streams": max_aux_streams,
       +        "version_compatible": version_compatible,
       +        "optimization_level": optimization_level,
       +        "use_python_runtime": use_python_runtime,
       +        "truncate_double": truncate_double,
       +        "use_fast_partitioner": use_fast_partitioner,
       +        "enable_experimental_decompositions": enable_experimental_decompositions,
       +        "device": device,
       +        "require_full_compilation": require_full_compilation,
       +        "disable_tf32": disable_tf32,
       +        "sparse_weights": sparse_weights,
       +        "refit": refit,
       +        "engine_capability": engine_capability,
       +        "num_avg_timing_iters": num_avg_timing_iters,
       +        "dla_sram_size": dla_sram_size,
       +        "dla_local_dram_size": dla_local_dram_size,
       +        "dla_global_dram_size": dla_global_dram_size,
       +    }
       +
       +    # Decompose the exported program
       +    exported_program = exported_program.run_decompositions(
       +        get_decompositions(enable_experimental_decompositions)
       +    )
       +    gm = exported_program.module()
       +    logger.debug("Input graph: " + str(gm.graph))
       +
       +    # Apply lowering on the graph module
       +    torch_inputs = get_torch_inputs(input_list, device)
       +    gm = apply_lowering_passes(gm, torch_inputs)
       +    logger.debug("Lowered Input graph: " + str(gm.graph))
       +
       +    settings = CompilationSettings(**compilation_options)
       +    logger.info("Compilation Settings: %s\n", settings)
       +    try:
       +        interpreter_result = interpret_module_to_result(gm, input_list, settings)
       +    except UnsupportedOperatorException:
       +        logger.error(
       +            f"Conversion of module {gm} not currently fully supported or convertible!",
       +            exc_info=True,
       +        )
       +    except Exception as e:
       +        logger.error(
       +            f"While interpreting the module got an error: {e}",
       +            exc_info=True,
       +        )
       +
       +    import io
       +
       +    with io.BytesIO() as engine_bytes:
       +        engine_bytes.write(interpreter_result.engine)
       +        engine_bytearray = engine_bytes.getvalue()
       +
       +    return engine_bytearray
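
convert_module_to_trt_engine() above is the dynamo counterpart of the old TorchScript engine export: it runs the same lowering and settings plumbing as compile(), but stops after interpret_module_to_result() and hands back the serialized engine bytes. A usage sketch (hypothetical model and file name; assumes the function is reachable as torch_tensorrt.dynamo.convert_module_to_trt_engine in a CUDA build):

    import torch
    import torch_tensorrt

    model = torch.nn.Linear(8, 8).cuda().eval()
    x = torch.randn(1, 8).cuda()
    exp_program = torch.export.export(model, (x,))

    serialized_engine = torch_tensorrt.dynamo.convert_module_to_trt_engine(
        exp_program, inputs=[x], enabled_precisions={torch.float}
    )

    # The bytes can be written out and later deserialized with the TensorRT runtime APIs
    with open("model.engine", "wb") as f:
        f.write(serialized_engine)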
       diff --git a/docs/_modules/torch_tensorrt/dynamo/_exporter.html b/docs/_modules/torch_tensorrt/dynamo/_exporter.html
       index d19fc4413b..3682f54428 100644
       --- a/docs/_modules/torch_tensorrt/dynamo/_exporter.html
       +++ b/docs/_modules/torch_tensorrt/dynamo/_exporter.html
       @@ -9,7 +9,7 @@
       -  torch_tensorrt.dynamo._exporter — Torch-TensorRT v2.3.0.dev0+85971ff documentation
       +  torch_tensorrt.dynamo._exporter — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
       @@ -234,7 +234,7 @@
       - v2.3.0.dev0+85971ff
       + v2.4.0.dev0+4dc9acfc9
      @@ -418,8 +419,11 @@

      Source code for torch_tensorrt.dynamo._exporter

      < from torch._subclasses.fake_tensor import FakeTensor from torch.export import ExportedProgram, ExportGraphSignature from torch.export.exported_program import ( + CustomObjArgument, InputKind, InputSpec, + ModuleCallEntry, + ModuleCallSignature, OutputKind, OutputSpec, TensorArgument, @@ -430,50 +434,36 @@

      Source code for torch_tensorrt.dynamo._exporter

      <
      [docs]def export( gm: torch.fx.GraphModule, inputs: Sequence[torch.Tensor], - *, - ir: str = "torchscript", ) -> ExportedProgram: - """Export a program (``torch.fx.GraphModule``) for serialization with the TensorRT engines embedded. - - > Note: When ExportedProgram becomes stable, this function will get merged into ``torch_tensorrt.dynamo.compile`` + """Export the result of TensorRT compilation into the desired output format. Arguments: - src_gm (torch.fx.GraphModule): Source module, generated by torch.export (The module provided to ``torch_tensorrt.dynamo.compile``) gm (torch.fx.GraphModule): Compiled Torch-TensorRT module, generated by ``torch_tensorrt.dynamo.compile`` - - Keyword Arguments: - inputs (Any): **Required** List of specifications of input shape, dtype and memory layout for inputs to the module. This argument is required. Input Sizes can be specified as torch sizes, tuples or lists. dtypes can be specified using - torch datatypes or torch_tensorrt datatypes and you can use either torch devices or the torch_tensorrt device type enum - to select device type. :: - - input=[ - torch_tensorrt.Input((1, 3, 224, 224)), # Static NCHW input shape for input #1 - torch_tensorrt.Input( - min_shape=(1, 224, 224, 3), - opt_shape=(1, 512, 512, 3), - max_shape=(1, 1024, 1024, 3), - dtype=torch.int32 - format=torch.channel_last - ), # Dynamic input shape for input #2 - torch.randn((1, 3, 224, 244)) # Use an example tensor and let torch_tensorrt infer settings - ir (str): torchscript | exported_program. Based on the provided ir, the output type would be a torchscript or exported program. + inputs (torch.Tensor): Torch input tensors """ - if ir == "torchscript": - return torch.jit.trace(gm, inputs) - elif ir == "exported_program": - patched_module = transform(gm, inputs) - exp_program = create_trt_exp_program(patched_module) - - return exp_program - else: - raise ValueError( - f"Invalid ir : {ir} provided for serialization. Options include torchscript | exported_program" - )
      + patched_module = transform(gm, inputs) + exp_program = create_trt_exp_program(patched_module) + return exp_program
      def transform( gm: torch.fx.GraphModule, inputs: Sequence[torch.Tensor] ) -> torch.fx.GraphModule: + """ + Transforms the graphmodule by inlining Pytorch and TensorRT submodules. + Inlining collapses submodules into nodes which is necessary for torch.export + serialization. + + Arguments: + gm (torch.fx.GraphModule): Compiled Torch-TensorRT module, generated by ``torch_tensorrt.dynamo.compile`` + inputs (torch.Tensor): Torch input tensors + + Returns an inlined torch.fx.GraphModule + """ + # Make a copy the graph since this function transforms the input graph and changes it's attributes. + # This transformed graph is meant to be consumed by `create_trt_exp_program` + gm = copy.deepcopy(gm) + # Run shape analysis _, outputs_map = partitioning.run_shape_analysis(gm, inputs) @@ -483,10 +473,6 @@

           # Inline pytorch submodules
           inline_torch_modules(gm)
       
      -    # Lift constant buffers and parameters in the graph
      -    # torch.export serialization expects them to be lifted
      -    lift_constant_pass(gm)
      -
           # Clean the graph
           gm.delete_all_unused_submodules()
           gm.graph.eliminate_dead_code()
      @@ -495,34 +481,109 @@

           return gm
       
       
      -def lift_constant_pass(trt_gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
      +def lift(
      +    gm: torch.fx.GraphModule, graph_signature: Any
      +) -> Tuple[torch.fx.GraphModule, ExportGraphSignature, Dict[str, Any], Dict[str, Any]]:
      +    """
      +    Given an unlifted fx.GraphModule, lift all parameters and buffers into placeholders.
      +    Arguments:
      +        gm (torch.fx.GraphModule): Unlifted GraphModule which contains parameters and buffers as get_attr nodes.
      +        graph_signature (torch.export.ExportGraphSignature): Instance of ExportGraphSignature class created for the output ExportedProgram.
      +        After lifting, this graph_signature will be modified with the parameters and buffers added appropriately.
      +    Returns:
      +        A lifted fx.GraphModule, modified graph_signature and a new state_dict
      +    """
      +    # Get the state_dict of graph_module. This is different from exported_program.state_dict
      +    # exp_program.state_dict contains parameters and buffers whereas a graph_module's state_dict
      +    # has all parameters registered as torch.tensors.
      +    state_dict = gm.state_dict()
      +    constants = {}
      +
           fake_mode = detect_fake_mode(
      -        tuple(
      -            node.meta["val"] for node in trt_gm.graph.nodes if node.op == "placeholder"
      -        )
      +        tuple(node.meta["val"] for node in gm.graph.nodes if node.op == "placeholder")
           )
      +    assert fake_mode is not None
       
      +    # Locate the user input to insert new placeholders before them
           first_user_input = None
      -    for node in trt_gm.graph.nodes:
      -        if node.op == "placeholder":
      +    for node in gm.graph.nodes:
      +        if node.op == "placeholder" and node.name in graph_signature.user_inputs:
               first_user_input = node
               break
       
      -    for node in trt_gm.graph.nodes:
      +    # At first the user_inputs are only present in the graph_signature.input_specs and hence non_user_input_idx=0
      +    # The input_specs should be of the form [params, buffers, constant_tensors, custom_obj, user_inputs]
      +    non_user_input_idx = 0
      +    for node in gm.graph.nodes:
               if node.op == "get_attr":
      -            constant_tensor = getattr(trt_gm, node.target)
      -            with trt_gm.graph.inserting_before(first_user_input):
      -                const_placeholder_node = trt_gm.graph.placeholder(node.target)
      -                const_placeholder_node.meta = copy.deepcopy(node.meta)
      -                const_placeholder_node.meta["val"] = fake_mode.from_tensor(
      -                    constant_tensor
      +
      +            lift_val = None
      +            input_kind = None
      +
      +            if node.target not in state_dict:
      +                constants[node.target] = getattr(gm, node.target)
      +                input_kind = InputKind.CUSTOM_OBJ
      +                lift_val = constants[node.target]
      +            else:
      +                lift_val = state_dict[node.target]
      +
      +                input_kind = InputKind.CONSTANT_TENSOR
      +
      +                # state_dict has these parameters/buffers as torch.Tensors. We override them as torch.nn.Parameter/torch.Tensors respectively.
      +                for name, _ in gm.named_parameters():
      +                    if node.target == name:
      +                        input_kind = InputKind.PARAMETER
      +                        state_dict[name] = torch.nn.Parameter(state_dict[name])
      +                        break
      +                for name, _ in gm.named_buffers():
      +                    if node.target == name:
      +                        input_kind = InputKind.BUFFER
      +                        break
      +
      +            assert lift_val is not None and input_kind is not None
      +
      +            # Replace get_attr nodes with placeholder nodes and copy metadata.
      +            with gm.graph.inserting_before(first_user_input):
      +                # Ensure name doesn't contain period as it is used for submodules
      +                const_placeholder_node = gm.graph.placeholder(
      +                    node.target.replace(".", "_")
                       )
      +                # Copy the node meta into this new placeholder node
      +                const_placeholder_node.meta = node.meta
      +
      +                if isinstance(lift_val, torch.Tensor):
      +                    const_placeholder_node.meta["val"] = cast(
      +                        FakeTensor,
      +                        torch.empty_strided(
      +                            tuple(lift_val.shape),
      +                            tuple([1] * len(lift_val.shape)),
      +                        ),
      +                    )
       
                   node.replace_all_uses_with(const_placeholder_node)
      -            trt_gm.graph.erase_node(node)
      +            gm.graph.erase_node(node)
      +
      +            # Add these parameters/buffers/constants to the existing graph signature
      +            # before user inputs. These specs are looked up in the state_dict during ExportedProgram creation.
      +            input_spec_arg = TensorArgument(name=const_placeholder_node.name)
      +            if input_kind == InputKind.CUSTOM_OBJ:
      +                input_spec_arg = CustomObjArgument(
      +                    name=const_placeholder_node.name, class_fqn=""
      +                )
      +            graph_signature.input_specs.insert(
      +                non_user_input_idx,
      +                InputSpec(
      +                    kind=input_kind,
      +                    arg=input_spec_arg,
      +                    target=node.target,
      +                ),
      +            )
      +            non_user_input_idx += 1
       
      -    trt_gm.graph.eliminate_dead_code()
      -    trt_gm.graph.lint()
      -    return trt_gm
      +    gm.graph.eliminate_dead_code()
      +    gm.graph.lint()
      +
      +    return gm, graph_signature, state_dict, constants
       
       
       def get_duplicate_nodes(
      @@ -551,7 +612,7 @@
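
What "lifting" buys can be seen directly with stock ``torch.fx`` / ``torch.export`` (a minimal illustration, independent of the Torch-TensorRT code above):

.. code-block:: python

    import torch

    class M(torch.nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.register_buffer("scale", torch.tensor(2.0))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x * self.scale

    # Plain fx tracing leaves the buffer as a get_attr node...
    gm = torch.fx.symbolic_trace(M())
    print([n.op for n in gm.graph.nodes])
    # ['placeholder', 'get_attr', 'call_function', 'output']

    # ...whereas torch.export lifts it into an extra placeholder input,
    # recorded in the graph signature — which is what lift() reproduces here.
    ep = torch.export.export(M(), (torch.randn(2),))
    print([n.op for n in ep.graph.nodes])
    # ['placeholder', 'placeholder', 'call_function', 'output']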

       def inline_torch_modules(gm: torch.fx.GraphModule) -> torch.fx.GraphModule:
           """
           Inline a submodule within the parent graph (gm). All `call_module` nodes
      -    should be replaced by their submodule nodes.
      +    should be replaced by their nodes in the submodule.
           """
           # Clean the graph
           gm.graph.eliminate_dead_code()
      @@ -576,7 +637,6 @@

                   # Copy all nodes in the submodule into gm and
                   # store the output node of this submodule which is now present in gm
      -
                   submodule_output = gm.graph.graph_copy(submodule.graph, val_map)
       
                   # Get their references (since we copied) in the parent graph (gm)
      @@ -608,7 +668,7 @@

                   gm_node.replace_all_uses_with(submodule_output)
       
                   # copy the attributes of the submodule into gm (graph_copy doesn't do this)
      -            copy_submodule_attributes(gm, gm_node.name)
      +            copy_submodule_attributes(gm, submodule, gm_node.name)
       
                   # Erase the pytorch submodule (call_module) node
                   gm.graph.erase_node(gm_node)
      @@ -616,20 +676,24 @@

           return gm
       
       
      -def copy_submodule_attributes(gm: torch.fx.GraphModule, submod_name: str) -> None:
      +def copy_submodule_attributes(
      +    gm: torch.fx.GraphModule, submodule: torch.fx.GraphModule, submodule_name: str
      +) -> None:
           """
      -    Copy the getattr attriibutes from submodule to parent module gm.
      -    The graph_copy call doesn't do this for us unfortunately.
      +    The submodule parameters are available in the parent gm's state_dict, but they have
      +    the submodule name as a prefix in their keys. For eg: gm.state_dict() would have
      +    _run_on_gpu_0.conv.weight etc. Since we graph copied the submodule into gm, we should
      +    also copy its parameters and buffers into gm without the submodule namespace as prefix.
      +    _assign_attr does exactly that. It creates a module for eg: conv, adds an attribute weight
      +    to it and adds this conv module as an attribute to parent gm.
           """
      -    for param in gm.named_parameters():
      -        if param[0].startswith(submod_name + "."):
      -            attr_name = param[0].replace(submod_name + ".", "")
      -            gm.register_parameter(attr_name, param[1])
      +    from torch.export.unflatten import _assign_attr, _AttrKind
       
      -    for buffer in gm.named_buffers():
      -        if buffer[0].startswith(submod_name + "."):
      -            attr_name = buffer[0].replace(submod_name + ".", "")
      -            gm.register_buffer(attr_name, buffer[1])
      +    for key, value in submodule.named_parameters():
      +        _assign_attr(value, gm, key, _AttrKind.PARAMETER)
      +
      +    for key, value in submodule.named_buffers():
      +        _assign_attr(value, gm, key, _AttrKind.BUFFER)
       
       
       def create_trt_exp_program(
      @@ -638,6 +702,7 @@
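
The reason a plain ``register_parameter`` was not enough: state_dict keys such as ``conv.weight`` are dotted paths, so the parameter has to live on a ``conv`` submodule of the parent. A small sketch of the equivalent manual steps (``_assign_attr`` is a private torch.export helper, so its behavior is paraphrased from the diff above; the names here are illustrative):

.. code-block:: python

    import torch
    import torch.fx as fx

    class Identity(torch.nn.Module):
        def forward(self, x):
            return x

    parent = fx.symbolic_trace(Identity())

    # To surface a parameter under the dotted key "conv.weight", create the
    # intermediate `conv` module and register the parameter on it:
    parent.add_submodule("conv", torch.nn.Module())
    parent.conv.register_parameter("weight", torch.nn.Parameter(torch.randn(3, 3)))

    assert "conv.weight" in dict(parent.named_parameters())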

      < """Creates a new Exported Program. This function takes an torch.fx.GraphModule which has TRT engines and constructs an Exported Program object with the new IO node names and state_dict """ + input_nodes = [node for node in gm.graph.nodes if node.op == "placeholder"] output_nodes = [node for node in gm.graph.nodes if node.op == "output"] assert output_nodes @@ -656,8 +721,30 @@

               input_specs=input_specs, output_specs=output_specs
           )
       
      +    module_call_graph = [
      +        ModuleCallEntry(
      +            "",
      +            ModuleCallSignature(
      +                inputs=[],
      +                outputs=[],
      +                in_spec=gm.graph._codegen.pytree_info.in_spec,
      +                out_spec=gm.graph._codegen.pytree_info.out_spec,
      +            ),
      +        )
      +    ]
      +
      +    # Lift parameters/buffers/constants in the graph
      +    # torch.export serialization expects them to be lifted
      +    gm, trt_graph_signature, state_dict, constants = lift(gm, trt_graph_signature)
      +
           trt_exp_program = ExportedProgram(
      -        gm, gm.graph, trt_graph_signature, gm.state_dict(), {}, [], [], []
      +        root=gm,
      +        graph=gm.graph,
      +        graph_signature=trt_graph_signature,
      +        state_dict=state_dict,
      +        range_constraints={},
      +        module_call_graph=module_call_graph,
      +        constants=constants,
           )
       
           return trt_exp_program
      @@ -684,9 +771,13 @@

               num_outputs = len(outputs_map[trt_module_node.name])
               # Insert a call_function node to perform inference on TRT engine
               with gm.graph.inserting_before(trt_module_node):
      +            engine_name = f"{name}_engine"
      +            setattr(gm, engine_name, trt_module.engine)
      +            engine_node = gm.graph.get_attr(engine_name)
      +
                   trt_node = gm.graph.call_function(
                       torch.ops.tensorrt.execute_engine.default,
      -                (trt_module_node.args, trt_module.engine),
      +                (trt_module_node.args, engine_node),
                   )
                   trt_node.meta["val"] = []
                   assert num_outputs > 0
      @@ -702,6 +793,13 @@

                       )
                   )
       
      +            # meta["val"] should be a lighter version of a tensor. For eg: it should be a FakeTensor (with output shape and dtype properties)
      +            # Lighter version of a custom_obj is not defined clearly. meta["val"] does not have any type expectations but
      +            # for custom object nodes, it should be CustomObjArgument
      +            engine_node.meta["val"] = CustomObjArgument(
      +                name=engine_node.name, class_fqn=""
      +            )
      +
                   if num_outputs == 1:
                       # Insert getitem nodes as outputs (for export serialization to work)
                       with gm.graph.inserting_after(trt_node):
      @@ -772,6 +870,7 @@
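
The pattern above — register the engine as a module attribute and reference it through a ``get_attr`` node — is what lets the (non-tensor) engine survive ``lift`` and serialization. The same mechanics on a toy payload (a sketch; ``toy_engine`` stands in for ``trt_module.engine``):

.. code-block:: python

    import torch
    import torch.fx as fx

    gm = fx.symbolic_trace(torch.nn.ReLU())

    # Any object can be attached to the module this way
    setattr(gm, "toy_engine", b"serialized-engine-bytes")

    placeholder = next(iter(gm.graph.nodes))
    with gm.graph.inserting_after(placeholder):
        engine_node = gm.graph.get_attr("toy_engine")

    gm.recompile()  # `toy_engine = self.toy_engine` now appears in the generated code
    print(gm.code)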

      diff --git a/docs/_modules/torch_tensorrt/dynamo/_settings.html b/docs/_modules/torch_tensorrt/dynamo/_settings.html
      index fdaeb0995c..d9a6da18a8 100644
      --- a/docs/_modules/torch_tensorrt/dynamo/_settings.html
      +++ b/docs/_modules/torch_tensorrt/dynamo/_settings.html
      @@ -9,7 +9,7 @@
      -  torch_tensorrt.dynamo._settings — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.dynamo._settings — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
      @@ -234,7 +234,7 @@
      - v2.3.0.dev0+85971ff
      + v2.4.0.dev0+4dc9acfc9
      @@ -301,6 +301,7 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • + Creating a plugin to use a custom kernel inside TensorRT engines

    Python API Documentation

      @@ -410,12 +411,11 @@

      Source code for torch_tensorrt.dynamo._settings

       from dataclasses import dataclass, field
      -from typing import Collection, Optional, Union
      +from typing import Collection, Optional, Set, Union
       
      -import torch
      -from tensorrt import EngineCapability
       from torch.fx.node import Target
       from torch_tensorrt._Device import Device
      +from torch_tensorrt._enums import EngineCapability, dtype
       from torch_tensorrt.dynamo._defaults import (
           DEBUG,
           DISABLE_TF32,
      @@ -424,6 +424,7 @@ 

           DLA_SRAM_SIZE,
           DRYRUN,
           ENABLE_EXPERIMENTAL_DECOMPOSITIONS,
      +    ENABLED_PRECISIONS,
           ENGINE_CAPABILITY,
           HARDWARE_COMPATIBLE,
           MAX_AUX_STREAMS,
      @@ -431,11 +432,10 @@

           NUM_AVG_TIMING_ITERS,
           OPTIMIZATION_LEVEL,
           PASS_THROUGH_BUILD_FAILURES,
      -    PRECISION,
           REFIT,
           REQUIRE_FULL_COMPILATION,
           SPARSE_WEIGHTS,
      -    TRUNCATE_LONG_AND_DOUBLE,
      +    TRUNCATE_DOUBLE,
           USE_FAST_PARTITIONER,
           USE_PYTHON_RUNTIME,
           VERSION_COMPATIBLE,
      @@ -449,7 +449,7 @@

      < """Compilation settings for Torch-TensorRT Dynamo Paths Args: - precision (torch.dtype): Model Layer precision + enabled_precisions (Set[dtype]): Available kernel dtype precisions debug (bool): Whether to print out verbose debugging information workspace_size (int): Workspace TRT is allowed to use for the module (0 is default) min_block_size (int): Minimum number of operators per TRT-Engine Block @@ -462,7 +462,7 @@

               use_python_runtime (Optional[bool]): Whether to strictly use Python runtime or C++ runtime. To auto-select a runtime
                   based on C++ dependency presence (preferentially choosing C++ runtime if available), leave the argument as None
      -        truncate_long_and_double (bool): Whether to truncate int64/float64 TRT engine inputs or weights to int32/float32
      +        truncate_double (bool): Whether to truncate float64 TRT engine inputs or weights to float32
               use_fast_partitioner (bool): Whether to use the fast or global graph partitioning system
               enable_experimental_decompositions (bool): Whether to enable all core aten decompositions
                   or only a selected subset of them
      @@ -483,7 +483,7 @@

               hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
           """
       
      -    precision: torch.dtype = PRECISION
      +    enabled_precisions: Set[dtype] = field(default_factory=lambda: ENABLED_PRECISIONS)
           debug: bool = DEBUG
           workspace_size: int = WORKSPACE_SIZE
           min_block_size: int = MIN_BLOCK_SIZE
      @@ -493,7 +493,7 @@

           version_compatible: bool = VERSION_COMPATIBLE
           optimization_level: Optional[int] = OPTIMIZATION_LEVEL
           use_python_runtime: Optional[bool] = USE_PYTHON_RUNTIME
      -    truncate_long_and_double: bool = TRUNCATE_LONG_AND_DOUBLE
      +    truncate_double: bool = TRUNCATE_DOUBLE
           use_fast_partitioner: bool = USE_FAST_PARTITIONER
           enable_experimental_decompositions: bool = ENABLE_EXPERIMENTAL_DECOMPOSITIONS
           device: Device = field(default_factory=default_device)
      @@ -501,7 +501,9 @@

           disable_tf32: bool = DISABLE_TF32
           sparse_weights: bool = SPARSE_WEIGHTS
           refit: bool = REFIT
      -    engine_capability: EngineCapability = ENGINE_CAPABILITY
      +    engine_capability: EngineCapability = field(
      +        default_factory=lambda: ENGINE_CAPABILITY
      +    )
           num_avg_timing_iters: int = NUM_AVG_TIMING_ITERS
           dla_sram_size: int = DLA_SRAM_SIZE
           dla_local_dram_size: int = DLA_LOCAL_DRAM_SIZE
      @@ -559,6 +561,7 @@
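
At the user-facing API these renames surface as compile-time keyword arguments. A hedged sketch of the equivalents (assuming the dynamo frontend forwards these kwargs into ``CompilationSettings`` as the diff suggests; requires a CUDA build):

.. code-block:: python

    import torch
    import torch_tensorrt

    model = torch.nn.Linear(8, 8).eval().cuda()
    inputs = [torch.randn(1, 8).cuda()]

    trt_model = torch_tensorrt.compile(
        model,
        ir="dynamo",
        inputs=inputs,
        enabled_precisions={torch.float32, torch.float16},  # replaces `precision`
        truncate_double=True,  # replaces `truncate_long_and_double`
    )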

      diff --git a/docs/_modules/torch_tensorrt/dynamo/_tracer.html b/docs/_modules/torch_tensorrt/dynamo/_tracer.html
      index 2983329890..3486e3e6ef 100644
      --- a/docs/_modules/torch_tensorrt/dynamo/_tracer.html
      +++ b/docs/_modules/torch_tensorrt/dynamo/_tracer.html
      @@ -9,7 +9,7 @@
      -  torch_tensorrt.dynamo._tracer — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.dynamo._tracer — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
      @@ -234,7 +234,7 @@
      - v2.3.0.dev0+85971ff
      + v2.4.0.dev0+4dc9acfc9
      @@ -301,6 +301,7 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • + Creating a plugin to use a custom kernel inside TensorRT engines

    Python API Documentation

      @@ -428,13 +429,13 @@

      Source code for torch_tensorrt.fx.fx2trt

       from .converter_registry import CONVERTERS
       from .input_tensor_spec import InputTensorSpec
       from .observer import Observer
      -from .utils import get_dynamic_dims, LowerPrecision, unified_dtype_converter, Frameworks
      +from .utils import Frameworks, LowerPrecision, get_dynamic_dims, unified_dtype_converter
       
       _LOGGER: logging.Logger = logging.getLogger(__name__)
       
      -TRT_INTERPRETER_CALL_PRE_OBSERVER: Observer[
      -    Callable[[torch.fx.GraphModule], None]
      -] = Observer("TRT_INTERPRETER_CALL_PRE_OBSERVER")
      +TRT_INTERPRETER_CALL_PRE_OBSERVER: Observer[Callable[[torch.fx.GraphModule], None]] = (
      +    Observer("TRT_INTERPRETER_CALL_PRE_OBSERVER")
      +)
       
       
       
       class TRTInterpreterResult(NamedTuple):
      @@ -486,9 +487,9 @@

               self._cur_node_name: Optional[str] = None
               self._input_names: List[str] = []
               self._output_names: List[str] = []
      -        self._itensor_to_tensor_meta: Dict[
      -            trt.tensorrt.ITensor, TensorMetadata
      -        ] = dict()
      +        self._itensor_to_tensor_meta: Dict[trt.tensorrt.ITensor, TensorMetadata] = (
      +            dict()
      +        )
       
           def validate_input_specs(self):
               for shape, _, _, shape_ranges, has_batch_dim in self.input_specs:
      @@ -848,6 +849,7 @@ 

                
                
                
      +         
                
                
                
      diff --git a/docs/_modules/torch_tensorrt/fx/input_tensor_spec.html b/docs/_modules/torch_tensorrt/fx/input_tensor_spec.html
      index ef37670a09..1df6852694 100644
      --- a/docs/_modules/torch_tensorrt/fx/input_tensor_spec.html
      +++ b/docs/_modules/torch_tensorrt/fx/input_tensor_spec.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt.fx.input_tensor_spec — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.fx.input_tensor_spec — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
      - v2.3.0.dev0+85971ff
      + v2.4.0.dev0+4dc9acfc9
      @@ -301,6 +301,7 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • + Creating a plugin to use a custom kernel inside TensorRT engines

    Python API Documentation

      @@ -427,7 +428,6 @@

      Source code for torch_tensorrt.fx.lower

       from .passes.pass_utils import PassFunc, validate_inference
       from .tools.timing_cache_utils import TimingCacheManager
       from .tools.trt_splitter import TRTSplitter, TRTSplitterSetting
      -
       from .tracer.acc_tracer import acc_tracer
       from .trt_module import TRTModule
       from .utils import LowerPrecision
      @@ -537,9 +537,11 @@ 

                   input_specs=self.lower_setting.input_specs,
                   explicit_batch_dimension=self.lower_setting.explicit_batch_dimension,
                   explicit_precision=self.lower_setting.explicit_precision,
      -            logger_level=trt.Logger.VERBOSE
      -            if self.lower_setting.verbose_log
      -            else trt.Logger.WARNING,
      +            logger_level=(
      +                trt.Logger.VERBOSE
      +                if self.lower_setting.verbose_log
      +                else trt.Logger.WARNING
      +            ),
               )
       
               interp_result: TRTInterpreterResult = interpreter.run(
      @@ -549,9 +551,11 @@ 

                   strict_type_constraints=self.lower_setting.strict_type_constraints,
                   algorithm_selector=algo_selector,
                   timing_cache=cache_data,
      -            profiling_verbosity=trt.ProfilingVerbosity.DETAILED
      -            if self.lower_setting.verbose_profile
      -            else trt.ProfilingVerbosity.LAYER_NAMES_ONLY,
      +            profiling_verbosity=(
      +                trt.ProfilingVerbosity.DETAILED
      +                if self.lower_setting.verbose_profile
      +                else trt.ProfilingVerbosity.LAYER_NAMES_ONLY
      +            ),
                   tactic_sources=self.lower_setting.tactic_sources,
               )
       
      @@ -708,10 +712,8 @@ 

                       # handle inputs with custom types. By default, just handle
                       # tensors and NoneType.
                       if fp16_conversion_fn is None:
      -                    conversion_fn = (
      -                        lambda x: x.half()
      -                        if x is not None and x.dtype == torch.float32
      -                        else x
      +                    conversion_fn = lambda x: (
      +                        x.half() if x is not None and x.dtype == torch.float32 else x
                           )
                       else:
                           conversion_fn = fp16_conversion_fn
      @@ -780,6 +782,7 @@ 

                
                
                
      +         
                
                
                
      diff --git a/docs/_modules/torch_tensorrt/fx/trt_module.html b/docs/_modules/torch_tensorrt/fx/trt_module.html
      index 7178ef3936..ec4b5e79fc 100644
      --- a/docs/_modules/torch_tensorrt/fx/trt_module.html
      +++ b/docs/_modules/torch_tensorrt/fx/trt_module.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt.fx.trt_module — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.fx.trt_module — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
      - v2.3.0.dev0+85971ff
      + v2.4.0.dev0+4dc9acfc9
      @@ -301,6 +301,7 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • + Creating a plugin to use a custom kernel inside TensorRT engines

    Python API Documentation

      @@ -415,7 +416,7 @@

      Source code for torch_tensorrt.fx.trt_module

      import tensorrt as trt
       import torch
       
      -from .utils import unified_dtype_converter, Frameworks
      +from .utils import Frameworks, unified_dtype_converter
       
       
       
       class TRTModule(torch.nn.Module):
      @@ -480,9 +481,11 @@

      for idx in self.output_binding_indices_in_order
               ]
               self.output_shapes = [
      -            tuple(self.engine.get_binding_shape(idx))
      -            if self.engine.has_implicit_batch_dimension
      -            else tuple()
      +            (
      +                tuple(self.engine.get_binding_shape(idx))
      +                if self.engine.has_implicit_batch_dimension
      +                else tuple()
      +            )
                   for idx in self.output_binding_indices_in_order
               ]
               self.hidden_output_dtypes: Sequence[torch.dtype] = [
      @@ -492,9 +495,11 @@ 

      for idx in self.hidden_output_binding_indices_in_order
               ]
               self.hidden_output_shapes = [
      -            tuple(self.engine.get_binding_shape(idx))
      -            if self.engine.has_implicit_batch_dimension
      -            else tuple()
      +            (
      +                tuple(self.engine.get_binding_shape(idx))
      +                if self.engine.has_implicit_batch_dimension
      +                else tuple()
      +            )
                   for idx in self.hidden_output_binding_indices_in_order
               ]
       
      @@ -705,6 +710,7 @@ 

      
                
                
      +         
                
                
                
      diff --git a/docs/_modules/torch_tensorrt/logging.html b/docs/_modules/torch_tensorrt/logging.html
      index dbabe043dd..cfc294cc84 100644
      --- a/docs/_modules/torch_tensorrt/logging.html
      +++ b/docs/_modules/torch_tensorrt/logging.html
      @@ -9,7 +9,7 @@
         
         
         
      -  torch_tensorrt.logging — Torch-TensorRT v2.3.0.dev0+85971ff documentation
      +  torch_tensorrt.logging — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation
         
       
         
      @@ -234,7 +234,7 @@
                     
                     
                       
      - v2.3.0.dev0+85971ff
      + v2.4.0.dev0+4dc9acfc9
      @@ -301,6 +301,7 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • + Creating a plugin to use a custom kernel inside TensorRT engines

    Python API Documentation

      @@ -409,155 +410,111 @@

      Source code for torch_tensorrt.logging

      -from enum import Enum
      +import logging
       from typing import Any
       
      -from torch_tensorrt._C import (
      -    LogLevel,
      -    _get_is_colored_output_on,
      -    _get_logging_prefix,
      -    _get_reportable_log_level,
      -    _log,
      -    _set_is_colored_output_on,
      -    _set_logging_prefix,
      -    _set_reportable_log_level,
      -)
      -
      -
      -
      -class Level(Enum):
      -    """Enum to set the minimum required logging level to print a message to stdout"""
      -
      -    InternalError = LogLevel.INTERNAL_ERROR
      -    Error = LogLevel.ERROR
      -    Warning = LogLevel.WARNING
      -    Info = LogLevel.INFO
      -    Debug = LogLevel.DEBUG
      -    Graph = LogLevel.GRAPH
      -
      -    @staticmethod
      -    def _to_internal_level(external: "Level") -> LogLevel:
      -        if external == Level.InternalError:
      -            return LogLevel.INTERNAL_ERROR
      -        elif external == Level.Error:
      -            return LogLevel.ERROR
      -        elif external == Level.Warning:
      -            return LogLevel.WARNING
      -        elif external == Level.Info:
      -            return LogLevel.INFO
      -        elif external == Level.Debug:
      -            return LogLevel.DEBUG
      -        elif external == Level.Graph:
      -            return LogLevel.GRAPH
      -        else:
      -            raise ValueError("Unknown log severity")
      -
      -
      -def get_logging_prefix() -> str:
      -    """Get the prefix set for logging messages
      -
      -    Returns:
      -        str: Prefix used for logger
      -    """
      -    return str(_get_logging_prefix())
      +import torch
      +from torch_tensorrt._features import ENABLED_FEATURES
       
      +import tensorrt as trt
       
      -def set_logging_prefix(prefix: str) -> None:
      -    """Set the prefix used when logging messages
      +logging.captureWarnings(True)
      +_LOGGER = logging.getLogger("torch_tensorrt [TensorRT Conversion Context]")
       
      -    Args:
      -        prefix (str): Prefix to use for logging messages
      -    """
      -    _set_logging_prefix(prefix)
       
      +class _TRTLogger(trt.ILogger):  # type: ignore[misc]
       
      -def get_reportable_log_level() -> Level:
      -    """Get the level required for a message to be printed in the log
      +    def __init__(self) -> None:
      +        trt.ILogger.__init__(self)
       
      -    Returns:
      -        torch_tensorrt.logging.Level: The enum representing the level required to print
      -    """
      -    return Level(_get_reportable_log_level())
      +    def log(self, severity: trt.ILogger.Severity, msg: str) -> None:
      +        # TODO: Move to match once py39 reaches EoL
      +        if severity == trt.ILogger.Severity.INTERNAL_ERROR:
      +            _LOGGER.critical(msg)
      +            raise RuntimeError(msg)
      +        elif severity == trt.ILogger.Severity.ERROR:
      +            _LOGGER.error(msg)
      +        elif severity == trt.ILogger.Severity.WARNING:
      +            _LOGGER.warning(msg)
      +        elif severity == trt.ILogger.Severity.INFO:
      +            _LOGGER.info(msg)
      +        elif severity == trt.ILogger.Severity.VERBOSE:
      +            _LOGGER.debug(msg)
       
      -def set_reportable_log_level(level: Level) -> None:
      -    """Set the level required for a message to be printed to the log
       
      -    Args:
      -        level (torch_tensorrt.logging.Level): The enum representing the level required to print
      -    """
      -    _set_reportable_log_level(Level._to_internal_level(level))
      +TRT_LOGGER = _TRTLogger()
       
       
      -def get_is_colored_output_on() -> bool:
      -    """Get if colored output is enabled for logging
      +class internal_errors:
      +    """Context-manager to limit displayed log messages to just internal errors
       
      -    Returns:
      -        bool: If colored output is one
      -    """
      -    return bool(_get_is_colored_output_on())
      +    Example::
       
      +        with torch_tensorrt.logging.internal_errors():
      +            outputs = model_torchtrt(inputs)
      +    """
       
      -def set_is_colored_output_on(colored_output_on: bool) -> None:
      -    """Enable or disable color in the log output
      +    def __enter__(self) -> None:
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.CRITICAL)
       
      -    Args:
      -        colored_output_on (bool): If colored output should be enabled or not
      -    """
      -    _set_is_colored_output_on(colored_output_on)
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
       
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.InternalError)
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(
      +                int(trt.ILogger.Severity.INTERNAL_ERROR)
      +            )
       
      -def log(level: Level, msg: str) -> None:
      -    """Add a new message to the log
      +    def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      +        _LOGGER.setLevel(self.external_lvl)
       
      -    Adds a new message to the log at a specified level. The message
      -    will only get printed out if Level > reportable_log_level
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
       
      -    Args:
      -        level (torch_tensorrt.logging.Level): Severity of the message
      -        msg (str): Actual message text
      -    """
      -    _log(Level._to_internal_level(level), msg)
      +            ts_logging.set_reportable_log_level(self.ts_level)
       
      -    InternalError = LogLevel.INTERNAL_ERROR
      -    Error = LogLevel.ERROR
      -    Warning = LogLevel.WARNING
      -    Info = LogLevel.INFO
      -    Debug = LogLevel.DEBUG
      -    Graph = LogLevel.GRAPH
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
       
       
      -class internal_errors:
      -    """Context-manager to limit displayed log messages to just internal errors
      +class errors:
      +    """Context-manager to limit displayed log messages to just errors and above
       
           Example::
       
      -        with torch_tensorrt.logging.internal_errors():
      +        with torch_tensorrt.logging.errors():
               outputs = model_torchtrt(inputs)
           """
       
           def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.InternalError)
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.ERROR)
       
      -    def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
       
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.Error)
       
      -class errors:
      -    """Context-manager to limit displayed log messages to just errors and above
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(int(trt.ILogger.Severity.ERROR))
       
      -    Example::
      +    def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      +        _LOGGER.setLevel(self.external_lvl)
       
      -        with torch_tensorrt.logging.errors():
      -            outputs = model_torchtrt(inputs)
      -    """
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
       
      -    def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.Error)
      +            ts_logging.set_reportable_log_level(self.ts_level)
       
      -    def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
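
Usage of the rewritten context managers is unchanged from the docstrings above; a self-contained sketch (the ReLU stands in for a compiled module):

.. code-block:: python

    import torch
    import torch_tensorrt

    model_torchtrt = torch.nn.ReLU()  # stand-in for a Torch-TensorRT compiled module
    inputs = torch.randn(4)

    with torch_tensorrt.logging.errors():
        outputs = model_torchtrt(inputs)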
      class warnings:
      @@ -570,11 +527,29 @@

           """
       
           def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.Warning)
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.WARNING)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.Warning)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(int(trt.ILogger.Severity.WARNING))
       
           def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        _LOGGER.setLevel(self.external_lvl)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            ts_logging.set_reportable_log_level(self.ts_level)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
      class info:
      @@ -587,11 +562,29 @@

           """
       
           def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.Info)
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.INFO)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.Info)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(int(trt.ILogger.Severity.INFO))
       
           def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        _LOGGER.setLevel(self.external_lvl)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            ts_logging.set_reportable_log_level(self.ts_level)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
      class debug:
      @@ -604,11 +597,29 @@

           """
       
           def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.Debug)
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.DEBUG)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.Debug)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(int(trt.ILogger.Severity.VERBOSE))
       
           def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        _LOGGER.setLevel(self.external_lvl)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            ts_logging.set_reportable_log_level(self.ts_level)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
      class graphs:
      @@ -622,11 +633,29 @@

           """
       
           def __enter__(self) -> None:
      -        self.external_lvl = get_reportable_log_level()
      -        set_reportable_log_level(Level.Graph)
      +        self.external_lvl = _LOGGER.getEffectiveLevel()
      +        _LOGGER.setLevel(logging.NOTSET)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            self.ts_level = ts_logging.get_reportable_log_level()
      +            ts_logging.set_reportable_log_level(ts_logging.Level.Graph)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            self.rt_level = torch.ops.tensorrt.get_logging_level()
      +            torch.ops.tensorrt.set_logging_level(int(trt.ILogger.Severity.VERBOSE) + 1)
       
           def __exit__(self, exc_type: Any, exc_value: Any, exc_tb: Any) -> None:
      -        set_reportable_log_level(self.external_lvl)
      +        _LOGGER.setLevel(self.external_lvl)
      +
      +        if ENABLED_FEATURES.torchscript_frontend:
      +            from torch_tensorrt.ts import logging as ts_logging
      +
      +            ts_logging.set_reportable_log_level(self.ts_level)
      +
      +        elif ENABLED_FEATURES.torch_tensorrt_runtime:
      +            torch.ops.tensorrt.set_logging_level(self.rt_level)
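
Because messages are now routed through Python's standard ``logging`` module rather than a bespoke C++ logger, ordinary logging configuration applies to Torch-TensorRT output as well (a sketch; the logger name is taken verbatim from the ``getLogger`` call at the top of this file):

.. code-block:: python

    import logging

    logging.basicConfig(format="%(name)s [%(levelname)s] %(message)s")
    logging.getLogger("torch_tensorrt [TensorRT Conversion Context]").setLevel(
        logging.DEBUG
    )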
      @@ -678,6 +707,7 @@

                
                
                
      +         
                
                
                
      diff --git a/docs/_sources/_cpp_api/dir_cpp_include.rst.txt b/docs/_sources/_cpp_api/dir_cpp_include.rst.txt
      index e262b4a9af..999b3507e6 100644
      --- a/docs/_sources/_cpp_api/dir_cpp_include.rst.txt
      +++ b/docs/_sources/_cpp_api/dir_cpp_include.rst.txt
      @@ -9,6 +9,7 @@ Directory include
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       *Directory path:* ``cpp/include``
       
       Subdirectories
      diff --git a/docs/_sources/_cpp_api/dir_cpp_include_torch_tensorrt.rst.txt b/docs/_sources/_cpp_api/dir_cpp_include_torch_tensorrt.rst.txt
      index f1da041e12..e8d393eec1 100644
      --- a/docs/_sources/_cpp_api/dir_cpp_include_torch_tensorrt.rst.txt
      +++ b/docs/_sources/_cpp_api/dir_cpp_include_torch_tensorrt.rst.txt
      @@ -9,6 +9,7 @@ Directory torch_tensorrt
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       *Directory path:* ``cpp/include/torch_tensorrt``
       
       
      diff --git a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_logging.h.rst.txt b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_logging.h.rst.txt
      index ad08c57148..9fd42ab20e 100644
      --- a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_logging.h.rst.txt
      +++ b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_logging.h.rst.txt
      @@ -8,6 +8,7 @@ File logging.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       .. contents:: Contents
          :local:
          :backlinks: none
      @@ -52,3 +53,29 @@ Namespaces
       
       - :ref:`namespace_torch_tensorrt__logging`
       
      +
      +Enums
      +-----
      +
      +
      +- :ref:`exhale_enum_logging_8h_1a130f65408ad8cbaee060f05e8db69558`
      +
      +
      +Functions
      +---------
      +
      +
      +- :ref:`exhale_function_logging_8h_1a56e110feaaba2c3fd44bd201fd21a76a`
      +
      +- :ref:`exhale_function_logging_8h_1a0593f776f469c20469e2f729fc7861a3`
      +
      +- :ref:`exhale_function_logging_8h_1a0c012cb374addd90eb1f42eaec570650`
      +
      +- :ref:`exhale_function_logging_8h_1ac46ac0901cb97e3ae6e93b45f24e90b8`
      +
      +- :ref:`exhale_function_logging_8h_1ad2efd47b6c3689e58ccc595680579ae5`
      +
      +- :ref:`exhale_function_logging_8h_1af8f3443813315af7901903d25dd495cc`
      +
      +- :ref:`exhale_function_logging_8h_1a7cb50492421ea9de4e3db895819df6f2`
      +
      diff --git a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.rst.txt b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.rst.txt
      index 61447e1ada..4614128e90 100644
      --- a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.rst.txt
      +++ b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_macros.h.rst.txt
      @@ -8,6 +8,7 @@ File macros.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       .. contents:: Contents
          :local:
          :backlinks: none
      diff --git a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_ptq.h.rst.txt b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      index c6ef334842..8828de8b28 100644
      --- a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      +++ b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      @@ -8,6 +8,7 @@ File ptq.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       .. contents:: Contents
          :local:
          :backlinks: none
      @@ -73,3 +74,12 @@ Classes
       
       - :ref:`exhale_class_classtorch__tensorrt_1_1ptq_1_1Int8Calibrator`
       
      +
      +Functions
      +---------
      +
      +
      +- :ref:`exhale_function_ptq_8h_1a226e3c83379d1012cde8578c1c86b16c`
      +
      +- :ref:`exhale_function_ptq_8h_1a6186e305f47c1d94b6130ef6c7f7e178`
      +
      diff --git a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      index 8893aa4438..a034ad05ff 100644
      --- a/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      +++ b/docs/_sources/_cpp_api/file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      @@ -8,6 +8,7 @@ File torch_tensorrt.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      +
       .. contents:: Contents
          :local:
          :backlinks: none
      @@ -77,3 +78,29 @@ Classes
       
       - :ref:`exhale_class_classtorch__tensorrt_1_1TensorFormat`
       
      +
      +Enums
      +-----
      +
      +
      +- :ref:`exhale_enum_torch__tensorrt_8h_1a3fbe5d72e4fc624dbd038853079620eb`
      +
      +
      +Functions
      +---------
      +
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1ad6a4ee8ca6c8f6e5519eb1128ec7f4a1`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1ac4ab8313ae72c2c899ea31548b528528`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1ad1acd06eaeaffbbcf6e7ebf426891384`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1a5b405fd3bf3c8fc2e2a54cbbab979797`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1a6e19490a08fb1553c9dd347a5ae79db9`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1ae8d56472106eeef37fbe51ff7f40c9b2`
      +
      +- :ref:`exhale_function_torch__tensorrt_8h_1a81f9783517335dda877d8cfcf38987c9`
      +
      diff --git a/docs/_sources/_cpp_api/namespace_torch_tensorrt.rst.txt b/docs/_sources/_cpp_api/namespace_torch_tensorrt.rst.txt
      index 655ccbc045..43a0afb20d 100644
      --- a/docs/_sources/_cpp_api/namespace_torch_tensorrt.rst.txt
      +++ b/docs/_sources/_cpp_api/namespace_torch_tensorrt.rst.txt
      @@ -45,15 +45,15 @@ Enums
       -----
       
       
      -- :ref:`exhale_enum_namespacetorch__tensorrt_1a3fbe5d72e4fc624dbd038853079620eb`
      +- :ref:`exhale_enum_torch__tensorrt_8h_1a3fbe5d72e4fc624dbd038853079620eb`
       
       
       Functions
       ---------
       
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1ad6a4ee8ca6c8f6e5519eb1128ec7f4a1`
      +- :ref:`exhale_function_torch__tensorrt_8h_1ad6a4ee8ca6c8f6e5519eb1128ec7f4a1`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1ac4ab8313ae72c2c899ea31548b528528`
      +- :ref:`exhale_function_torch__tensorrt_8h_1ac4ab8313ae72c2c899ea31548b528528`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1ad1acd06eaeaffbbcf6e7ebf426891384`
      +- :ref:`exhale_function_torch__tensorrt_8h_1ad1acd06eaeaffbbcf6e7ebf426891384`
      diff --git a/docs/_sources/_cpp_api/namespace_torch_tensorrt__logging.rst.txt b/docs/_sources/_cpp_api/namespace_torch_tensorrt__logging.rst.txt
      index b390ba1bd2..49f946f937 100644
      --- a/docs/_sources/_cpp_api/namespace_torch_tensorrt__logging.rst.txt
      +++ b/docs/_sources/_cpp_api/namespace_torch_tensorrt__logging.rst.txt
      @@ -17,23 +17,23 @@ Enums
       -----
       
       
      -- :ref:`exhale_enum_namespacetorch__tensorrt_1_1logging_1a130f65408ad8cbaee060f05e8db69558`
      +- :ref:`exhale_enum_logging_8h_1a130f65408ad8cbaee060f05e8db69558`
       
       
       Functions
       ---------
       
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1a56e110feaaba2c3fd44bd201fd21a76a`
      +- :ref:`exhale_function_logging_8h_1a56e110feaaba2c3fd44bd201fd21a76a`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1a0593f776f469c20469e2f729fc7861a3`
      +- :ref:`exhale_function_logging_8h_1a0593f776f469c20469e2f729fc7861a3`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1a0c012cb374addd90eb1f42eaec570650`
      +- :ref:`exhale_function_logging_8h_1a0c012cb374addd90eb1f42eaec570650`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1ac46ac0901cb97e3ae6e93b45f24e90b8`
      +- :ref:`exhale_function_logging_8h_1ac46ac0901cb97e3ae6e93b45f24e90b8`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1ad2efd47b6c3689e58ccc595680579ae5`
      +- :ref:`exhale_function_logging_8h_1ad2efd47b6c3689e58ccc595680579ae5`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1af8f3443813315af7901903d25dd495cc`
      +- :ref:`exhale_function_logging_8h_1af8f3443813315af7901903d25dd495cc`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1logging_1a7cb50492421ea9de4e3db895819df6f2`
      +- :ref:`exhale_function_logging_8h_1a7cb50492421ea9de4e3db895819df6f2`
      diff --git a/docs/_sources/_cpp_api/namespace_torch_tensorrt__ptq.rst.txt b/docs/_sources/_cpp_api/namespace_torch_tensorrt__ptq.rst.txt
      index 70e6131d90..bdc39cb326 100644
      --- a/docs/_sources/_cpp_api/namespace_torch_tensorrt__ptq.rst.txt
      +++ b/docs/_sources/_cpp_api/namespace_torch_tensorrt__ptq.rst.txt
      @@ -26,6 +26,6 @@ Functions
       ---------
       
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1ptq_1a226e3c83379d1012cde8578c1c86b16c`
      +- :ref:`exhale_function_ptq_8h_1a226e3c83379d1012cde8578c1c86b16c`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1ptq_1a6186e305f47c1d94b6130ef6c7f7e178`
      +- :ref:`exhale_function_ptq_8h_1a6186e305f47c1d94b6130ef6c7f7e178`
      diff --git a/docs/_sources/_cpp_api/namespace_torch_tensorrt__torchscript.rst.txt b/docs/_sources/_cpp_api/namespace_torch_tensorrt__torchscript.rst.txt
      index 44cdc9fcba..fa21b92a73 100644
      --- a/docs/_sources/_cpp_api/namespace_torch_tensorrt__torchscript.rst.txt
      +++ b/docs/_sources/_cpp_api/namespace_torch_tensorrt__torchscript.rst.txt
      @@ -24,10 +24,10 @@ Functions
       ---------
       
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1torchscript_1a5b405fd3bf3c8fc2e2a54cbbab979797`
      +- :ref:`exhale_function_torch__tensorrt_8h_1a5b405fd3bf3c8fc2e2a54cbbab979797`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1torchscript_1a6e19490a08fb1553c9dd347a5ae79db9`
      +- :ref:`exhale_function_torch__tensorrt_8h_1a6e19490a08fb1553c9dd347a5ae79db9`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1torchscript_1ae8d56472106eeef37fbe51ff7f40c9b2`
      +- :ref:`exhale_function_torch__tensorrt_8h_1ae8d56472106eeef37fbe51ff7f40c9b2`
       
      -- :ref:`exhale_function_namespacetorch__tensorrt_1_1torchscript_1a81f9783517335dda877d8cfcf38987c9`
      +- :ref:`exhale_function_torch__tensorrt_8h_1a81f9783517335dda877d8cfcf38987c9`
      diff --git a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_logging.h.rst.txt b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_logging.h.rst.txt
      index ee6e30c65c..ec413cb1b7 100644
      --- a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_logging.h.rst.txt
      +++ b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_logging.h.rst.txt
      @@ -8,7 +8,7 @@ Program Listing for File logging.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      -.. code-block:: none
      +.. code-block:: cpp
       
          /*
           * Copyright (c) NVIDIA Corporation.
      diff --git a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.rst.txt b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.rst.txt
      index 17809e688f..8a310652a6 100644
      --- a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.rst.txt
      +++ b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_macros.h.rst.txt
      @@ -8,7 +8,7 @@ Program Listing for File macros.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      -.. code-block:: none
      +.. code-block:: cpp
       
          /*
           * Copyright (c) NVIDIA Corporation.
      @@ -36,7 +36,7 @@ Program Listing for File macros.h
          #define STR(x) XSTR(x)
          
          #define TORCH_TENSORRT_MAJOR_VERSION 2
      -   #define TORCH_TENSORRT_MINOR_VERSION 3
      +   #define TORCH_TENSORRT_MINOR_VERSION 4
          #define TORCH_TENSORRT_PATCH_VERSION 0
          #define TORCH_TENSORRT_VERSION      \
            STR(TORCH_TENSORRT_MAJOR_VERSION) \
      diff --git a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.rst.txt b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      index 6b7894e7c4..88411c3ce5 100644
      --- a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      +++ b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_ptq.h.rst.txt
      @@ -8,7 +8,7 @@ Program Listing for File ptq.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      -.. code-block:: none
      +.. code-block:: cpp
       
          /*
           * Copyright (c) NVIDIA Corporation.
      @@ -33,11 +33,6 @@ Program Listing for File ptq.h
          #include "torch_tensorrt/macros.h"
          
          #ifndef DOXYGEN_SHOULD_SKIP_THIS
      -   namespace nvinfer1 {
      -   class IInt8Calibrator;
      -   class IInt8EntropyCalibrator2;
      -   } // namespace nvinfer1
      -   
          namespace torch_tensorrt {
          namespace ptq {
          TORCHTRT_API bool get_batch_impl(void* bindings[], const char* names[], int nbBindings, torch::Tensor& data);
      diff --git a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      index 67848a40a0..9a34bf9d4f 100644
      --- a/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      +++ b/docs/_sources/_cpp_api/program_listing_file_cpp_include_torch_tensorrt_torch_tensorrt.h.rst.txt
      @@ -8,7 +8,7 @@ Program Listing for File torch_tensorrt.h
       
       .. |exhale_lsh| unicode:: U+021B0 .. UPWARDS ARROW WITH TIP LEFTWARDS
       
      -.. code-block:: none
      +.. code-block:: cpp
       
          /*
           * Copyright (c) NVIDIA Corporation.
      diff --git a/docs/_sources/_cpp_api/structtorch__tensorrt_1_1Input.rst.txt b/docs/_sources/_cpp_api/structtorch__tensorrt_1_1Input.rst.txt
      index 74c594cd94..64144c4dd7 100644
      --- a/docs/_sources/_cpp_api/structtorch__tensorrt_1_1Input.rst.txt
      +++ b/docs/_sources/_cpp_api/structtorch__tensorrt_1_1Input.rst.txt
      @@ -12,7 +12,7 @@ Inheritance Relationships
       Base Type
       *********
       
      -- ``public CustomClassHolder``
      +- ``public torch::CustomClassHolder``
       
       
       Struct Documentation
      diff --git a/docs/_sources/getting_started/installation.rst.txt b/docs/_sources/getting_started/installation.rst.txt
      index 9f0088c3b8..fdd44a20cd 100644
      --- a/docs/_sources/getting_started/installation.rst.txt
      +++ b/docs/_sources/getting_started/installation.rst.txt
      @@ -87,7 +87,7 @@ Dependencies for Compilation
           * Specify your CUDA version here if not the version used in the branch being built: https://github.com/pytorch/TensorRT/blob/4e5b0f6e860910eb510fa70a76ee3eb9825e7a4d/WORKSPACE#L46
       
       
      -* The correct **LibTorch** version will be pulled down for you by bazel.
      +* The correct **LibTorch**, **cuDNN** and **TensorRT** versions will be pulled down for you by bazel.
       
           NOTE: By default bazel will pull the latest nightly from pytorch.org. For building main, this is usually sufficient however if there is a specific PyTorch you are targeting,
           edit these locations with updated URLs/paths:
      @@ -95,7 +95,8 @@ Dependencies for Compilation
           * https://github.com/pytorch/TensorRT/blob/4e5b0f6e860910eb510fa70a76ee3eb9825e7a4d/WORKSPACE#L53C1-L53C1
       
       
      -* **cuDNN and TensorRT** are not required to be installed on the system to build Torch-TensorRT, in fact this is preferable to ensure reproducable builds. Download the tarballs for cuDNN and TensorRT from https://developer.nvidia.com and update the paths in the WORKSPACE file here https://github.com/pytorch/TensorRT/blob/4e5b0f6e860910eb510fa70a76ee3eb9825e7a4d/WORKSPACE#L71
      +* **cuDNN and TensorRT** are not required to be installed on the system to build Torch-TensorRT; in fact, this is preferable to ensure reproducible builds. If versions other than the default are needed,
      +  point the WORKSPACE file to the URL of the tarball, or download the tarballs for cuDNN and TensorRT from https://developer.nvidia.com and update the paths in the WORKSPACE file here https://github.com/pytorch/TensorRT/blob/4e5b0f6e860910eb510fa70a76ee3eb9825e7a4d/WORKSPACE#L71
       
           For example:
       
      @@ -104,25 +105,29 @@ Dependencies for Compilation
               http_archive(
                   name = "cudnn",
                   build_file = "@//third_party/cudnn/archive:BUILD",
      -            sha256 = "79d77a769c7e7175abc7b5c2ed5c494148c0618a864138722c887f95c623777c",
      -            strip_prefix = "cudnn-linux-x86_64-8.8.1.3_cuda12-archive",
      +            sha256 = "", # Optional but recommended
      +            strip_prefix = "cudnn-linux-x86_64-_-archive",
                   urls = [
      -                #"https://developer.nvidia.com/downloads/compute/cudnn/secure/8.8.1/local_installers/12.0/cudnn-linux-x86_64-8.8.1.3_cuda12-archive.tar.xz",
      -                "file:////cudnn-linux-x86_64-8.8.1.3_cuda12-archive.tar.xz"
      +                "https://developer.nvidia.com/downloads/compute/cudnn/",
      +                # OR
      +                "file:////cudnn-linux-x86_64-_-archive.tar.xz"
                   ],
               )
       
               http_archive(
                   name = "tensorrt",
                   build_file = "@//third_party/tensorrt/archive:BUILD",
      -            sha256 = "0f8157a5fc5329943b338b893591373350afa90ca81239cdadd7580cd1eba254",
      -            strip_prefix = "TensorRT-8.6.1.6",
      +            sha256 = "", # Optional but recommended
      +            strip_prefix = "TensorRT-",
                   urls = [
      -                #"https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/tars/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-12.0.tar.gz",
      -                "file:////TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-12.0.tar.gz"
      +                "https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/",
      +                # OR
      +                "file:////TensorRT-.Linux.x86_64-gnu.cuda-.tar.gz"
                   ],
               )
       
       +    Remember that at runtime these libraries must be added to your ``LD_LIBRARY_PATH`` explicitly.
      +
       If you have a local version of cuDNN and TensorRT installed, this can be used as well by commenting out the above lines and uncommenting the following lines https://github.com/pytorch/TensorRT/blob/4e5b0f6e860910eb510fa70a76ee3eb9825e7a4d/WORKSPACE#L114C1-L124C3
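        As a rough illustration of that local-install path, the WORKSPACE entries would look something like the sketch below. This is a hedged example, not the verbatim file: ``new_local_repository`` is a standard Bazel built-in, but the ``path`` values are placeholders for wherever cuDNN and TensorRT live on your machine, and the exact ``BUILD`` file labels should be taken from the WORKSPACE linked above. The runtime ``LD_LIBRARY_PATH`` note applies here as well.

        .. code-block:: python

            # Sketch of the "local install" alternative to the http_archive entries above.
            # Paths are placeholders; adjust to your actual cuDNN/TensorRT install prefix.
            new_local_repository(
                name = "cudnn",
                path = "/usr/",
                build_file = "@//third_party/cudnn/local:BUILD",
            )

            new_local_repository(
                name = "tensorrt",
                path = "/usr/",
                build_file = "@//third_party/tensorrt/local:BUILD",
            )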
       
       
      diff --git a/docs/_sources/index.rst.txt b/docs/_sources/index.rst.txt
      index 455aeab8b3..175ab7e8ab 100644
      --- a/docs/_sources/index.rst.txt
      +++ b/docs/_sources/index.rst.txt
      @@ -111,6 +111,7 @@ Tutorials
          tutorials/_rendered_examples/dynamo/torch_compile_transformers_example
          tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage
          tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion
      +   tutorials/_rendered_examples/dynamo/custom_kernel_plugins
       
 Python API Documentation
       ------------------------
      diff --git a/docs/_sources/py_api/dynamo.rst.txt b/docs/_sources/py_api/dynamo.rst.txt
      index fce5372d0e..6b4a527663 100644
      --- a/docs/_sources/py_api/dynamo.rst.txt
      +++ b/docs/_sources/py_api/dynamo.rst.txt
      @@ -22,6 +22,8 @@ Functions
       
       .. autofunction:: export
       
      +.. autofunction:: convert_module_to_trt_engine
      +
       
       
       Classes
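For orientation, a minimal sketch of the newly documented function follows. The exact signature should be taken from the rendered API reference; here it is assumed that ``torch_tensorrt.dynamo.convert_module_to_trt_engine`` accepts an exported program plus example inputs and returns a serialized TensorRT engine.

.. code-block:: python

    import torch
    import torch_tensorrt

    # Placeholder model; any exportable nn.Module works here
    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3)).eval().cuda()
    inputs = [torch.randn((1, 3, 224, 224)).cuda()]

    exp_program = torch.export.export(model, tuple(inputs))

    # Assumed to return the serialized engine as bytes
    engine_bytes = torch_tensorrt.dynamo.convert_module_to_trt_engine(
        exp_program, inputs=inputs
    )
    with open("model.engine", "wb") as f:
        f.write(engine_bytes)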
      diff --git a/docs/_sources/py_api/torch_tensorrt.rst.txt b/docs/_sources/py_api/torch_tensorrt.rst.txt
      index 22fda13ba2..eb8285e103 100644
      --- a/docs/_sources/py_api/torch_tensorrt.rst.txt
      +++ b/docs/_sources/py_api/torch_tensorrt.rst.txt
      @@ -37,10 +37,6 @@ Classes
          :members:
          :special-members: __init__
       
      -.. autoclass:: TRTModuleNext
      -   :members:
      -   :special-members: __init__
      -
       Enums
       -------
       
      @@ -50,7 +46,7 @@ Enums
       
       .. autoclass:: EngineCapability
       
      -.. autoclass:: TensorFormat
      +.. autoclass:: memory_format
       
       Submodules
       ----------
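The hunk above replaces the documented ``TensorFormat`` enum with ``memory_format``. As a hedged illustration of where that enum surfaces (parameter names assumed from the public ``Input`` API, which also accepts the equivalent ``torch`` memory formats):

.. code-block:: python

    import torch
    import torch_tensorrt

    # Input spec sketch; `format` maps onto the memory_format enum documented above
    spec = torch_tensorrt.Input(
        shape=(1, 3, 224, 224),
        dtype=torch.float32,
        format=torch.contiguous_format,
    )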
      diff --git a/docs/_sources/tutorials/_rendered_examples/dynamo/index.rst.txt b/docs/_sources/tutorials/_rendered_examples/dynamo/index.rst.txt
      index d0e4d6630e..6b0d398ad1 100644
      --- a/docs/_sources/tutorials/_rendered_examples/dynamo/index.rst.txt
      +++ b/docs/_sources/tutorials/_rendered_examples/dynamo/index.rst.txt
      @@ -14,6 +14,7 @@ a number of ways you can leverage this backend to accelerate inference.
       * :ref:`torch_compile_transformer`: Compiling a Transformer model using ``torch.compile``
       * :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
       * :ref:`torch_compile_stable_diffusion`: Compiling a Stable Diffusion model using ``torch.compile``
      +* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines
       
       
       
      @@ -90,6 +91,23 @@ a number of ways you can leverage this backend to accelerate inference.
           
       +.. raw:: html
       +
       +    <div class="sphx-glr-thumbcontainer" tooltip="...">
       +
       +.. only:: html
       +
       +  .. image:: /tutorials/_rendered_examples/dynamo/images/thumb/sphx_glr_custom_kernel_plugins_thumb.png
       +    :alt:
       +
       +  :ref:`sphx_glr_tutorials__rendered_examples_dynamo_custom_kernel_plugins.py`
       +
       +.. raw:: html
       +
       +      <div class="sphx-glr-thumbnail-title">Using Custom Kernels within TensorRT Engines with Torch-TensorRT</div>
       +    </div>
       +
        .. raw:: html

       @@ -102,4 +120,5 @@ a number of ways you can leverage this backend to accelerate inference.
           /tutorials/_rendered_examples/dynamo/torch_compile_resnet_example
           /tutorials/_rendered_examples/dynamo/torch_compile_transformers_example
           /tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage
       +   /tutorials/_rendered_examples/dynamo/custom_kernel_plugins
       diff --git a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage.rst.txt b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage.rst.txt
       index 6ddfff6241..9eaa970313 100644
       --- a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage.rst.txt
       +++ b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_advanced_usage.rst.txt
       @@ -83,7 +83,7 @@ Compilation with `torch.compile` Using Default Settings
            # For the default settings, we can simply call torch.compile
            # with the backend "torch_tensorrt", and run the model on an
            # input to cause compilation, as so:
       -    optimized_model = torch.compile(model, backend="torch_tensorrt")
       +    optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False)
            optimized_model(*sample_inputs)

       @@ -109,7 +109,7 @@ Compilation with `torch.compile` Using Custom Settings

            model_half = Model().eval().cuda()

       -.. GENERATED FROM PYTHON SOURCE LINES 65-88
       +.. GENERATED FROM PYTHON SOURCE LINES 65-91

        .. code-block:: python

       @@ -132,17 +132,20 @@ Compilation with `torch.compile` Using Custom Settings
            # Run the model on an input to cause compilation, as so:
            optimized_model_custom = torch.compile(
       -        model_half, backend="torch_tensorrt", options=backend_kwargs
       +        model_half,
       +        backend="torch_tensorrt",
       +        options=backend_kwargs,
       +        dynamic=False,
            )
            optimized_model_custom(*sample_inputs_half)

       -.. GENERATED FROM PYTHON SOURCE LINES 89-91
       +.. GENERATED FROM PYTHON SOURCE LINES 92-94

        Cleanup
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 91-95
       +.. GENERATED FROM PYTHON SOURCE LINES 94-98

        .. code-block:: python

       @@ -151,7 +154,7 @@ Cleanup
            torch._dynamo.reset()

       -.. GENERATED FROM PYTHON SOURCE LINES 96-105
       +.. GENERATED FROM PYTHON SOURCE LINES 99-108

        Cuda Driver Error Note
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       diff --git a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.rst.txt b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.rst.txt
       index 937eb5117e..eb8f53aa4b 100644
       --- a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.rst.txt
       +++ b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_stable_diffusion.rst.txt
       @@ -36,15 +36,14 @@ This interactive script is intended as a sample of the Torch-TensorRT workflow w
        Imports and Model Definition
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 19-47
       +.. GENERATED FROM PYTHON SOURCE LINES 19-46

        .. code-block:: python

            import torch
       -    from diffusers import DiffusionPipeline

       -    import torch_tensorrt
       +    from diffusers import DiffusionPipeline

            model_id = "CompVis/stable-diffusion-v1-4"
            device = "cuda:0"
       @@ -63,18 +62,18 @@ Imports and Model Definition
                backend=backend,
                options={
                    "truncate_long_and_double": True,
       -            "precision": torch.float16,
       +            "enabled_precisions": {torch.float32, torch.float16},
                },
                dynamic=False,
            )

       -.. GENERATED FROM PYTHON SOURCE LINES 48-50
       +.. GENERATED FROM PYTHON SOURCE LINES 47-49

        Inference
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 50-56
       +.. GENERATED FROM PYTHON SOURCE LINES 49-55

        .. code-block:: python
       diff --git a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_transformers_example.rst.txt b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_transformers_example.rst.txt
       index efaae1b622..b362e42447 100644
       --- a/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_transformers_example.rst.txt
       +++ b/docs/_sources/tutorials/_rendered_examples/dynamo/torch_compile_transformers_example.rst.txt
       @@ -86,7 +86,7 @@ Optional Input Arguments to `torch_tensorrt.compile`
        Compilation with `torch.compile`
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 50-68
       +.. GENERATED FROM PYTHON SOURCE LINES 50-69

        .. code-block:: python

       @@ -104,22 +104,23 @@ Compilation with `torch.compile`
            optimized_model = torch.compile(
                model,
                backend="torch_tensorrt",
       +        dynamic=False,
                options=compilation_kwargs,
            )
            optimized_model(*inputs)

       -.. GENERATED FROM PYTHON SOURCE LINES 69-71
       +.. GENERATED FROM PYTHON SOURCE LINES 70-72

        Equivalently, we could have run the above via the convenience frontend, as so:
        `torch_tensorrt.compile(model, ir="torch_compile", inputs=inputs, **compilation_kwargs)`

       -.. GENERATED FROM PYTHON SOURCE LINES 73-75
       +.. GENERATED FROM PYTHON SOURCE LINES 74-76

        Inference
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 75-83
       +.. GENERATED FROM PYTHON SOURCE LINES 76-84

        .. code-block:: python

       @@ -132,7 +133,7 @@ Inference
            new_outputs = optimized_model(*new_inputs)

       -.. GENERATED FROM PYTHON SOURCE LINES 84-92
       +.. GENERATED FROM PYTHON SOURCE LINES 85-93

        .. code-block:: python

       @@ -145,12 +146,12 @@ Inference
            new_outputs = optimized_model(*new_inputs)

       -.. GENERATED FROM PYTHON SOURCE LINES 93-95
       +.. GENERATED FROM PYTHON SOURCE LINES 94-96

        Cleanup
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -.. GENERATED FROM PYTHON SOURCE LINES 95-99
       +.. GENERATED FROM PYTHON SOURCE LINES 96-100

        .. code-block:: python

       @@ -159,7 +160,7 @@ Cleanup
            torch._dynamo.reset()

       -.. GENERATED FROM PYTHON SOURCE LINES 100-109
       +.. GENERATED FROM PYTHON SOURCE LINES 101-110

        Cuda Driver Error Note
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       diff --git a/docs/_sources/tutorials/_rendered_examples/index.rst.txt b/docs/_sources/tutorials/_rendered_examples/index.rst.txt
       index 067d1b3e00..bfc27d0b1e 100644
       --- a/docs/_sources/tutorials/_rendered_examples/index.rst.txt
       +++ b/docs/_sources/tutorials/_rendered_examples/index.rst.txt
       @@ -30,6 +30,7 @@ a number of ways you can leverage this backend to accelerate inference.
        * :ref:`torch_compile_transformer`: Compiling a Transformer model using ``torch.compile``
        * :ref:`torch_compile_advanced_usage`: Advanced usage including making a custom backend to use directly with the ``torch.compile`` API
        * :ref:`torch_compile_stable_diffusion`: Compiling a Stable Diffusion model using ``torch.compile``
       +* :ref:`custom_kernel_plugins`: Creating a plugin to use a custom kernel inside TensorRT engines

       @@ -106,6 +107,23 @@ a number of ways you can leverage this backend to accelerate inference.

       +.. raw:: html
       +
       +    <div class="sphx-glr-thumbcontainer" tooltip="...">
       +
       +.. only:: html
       +
       +  .. image:: /tutorials/_rendered_examples/dynamo/images/thumb/sphx_glr_custom_kernel_plugins_thumb.png
       +    :alt:
       +
       +  :ref:`sphx_glr_tutorials__rendered_examples_dynamo_custom_kernel_plugins.py`
       +
       +.. raw:: html
       +
       +      <div class="sphx-glr-thumbnail-title">Using Custom Kernels within TensorRT Engines with Torch-TensorRT</div>
       +    </div>
       +
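Taken together, the tutorial hunks above make two changes to every ``torch.compile`` call: ``dynamic=False`` is now passed explicitly, and precision is selected via an ``enabled_precisions`` set rather than a single ``precision`` value. A condensed sketch of the resulting call (the model here is a stand-in, and ``torch_tensorrt`` must be installed so the backend name is registered):

.. code-block:: python

    import torch
    import torch_tensorrt  # registers the "torch_tensorrt" compile backend

    # Stand-in half-precision model
    model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3)).half().eval().cuda()

    optimized_model = torch.compile(
        model,
        backend="torch_tensorrt",
        dynamic=False,
        options={
            "truncate_long_and_double": True,
            "enabled_precisions": {torch.float32, torch.float16},
        },
    )
    optimized_model(torch.randn(1, 3, 32, 32).half().cuda())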
        .. raw:: html

       diff --git a/docs/_sources/user_guide/saving_models.rst.txt b/docs/_sources/user_guide/saving_models.rst.txt
       index 6d890d0450..73fee6e23c 100644
       --- a/docs/_sources/user_guide/saving_models.rst.txt
       +++ b/docs/_sources/user_guide/saving_models.rst.txt
       @@ -9,19 +9,22 @@ Saving models compiled with Torch-TensorRT
           :undoc-members:
           :show-inheritance:

       -Saving models compiled with Torch-TensorRT varies slightly with the `ir` that has been used for compilation.
       +Saving models compiled with Torch-TensorRT can be done using the `torch_tensorrt.save` API.

        Dynamo IR
        -------------

       -Starting with 2.1 release of Torch-TensorRT, we are switching the default compilation to be dynamo based.
       -The output of `ir=dynamo` compilation is a `torch.fx.GraphModule` object. There are two ways to save these objects
       +The output type of `ir=dynamo` compilation of Torch-TensorRT is a `torch.fx.GraphModule` object by default.
       +We can save this object in either `TorchScript` (`torch.jit.ScriptModule`) or `ExportedProgram` (`torch.export.ExportedProgram`) format by
       +specifying the `output_format` flag. Here are the options `output_format` will accept:

       -a) Converting to Torchscript
       +* `exported_program` : This is the default. We perform transformations on the graphmodule first and use `torch.export.save` to save the module.
       +* `torchscript` : We trace the graphmodule via `torch.jit.trace` and save it via `torch.jit.save`.
       +
       +a) ExportedProgram
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -`torch.fx.GraphModule` objects cannot be serialized directly. Hence we use `torch.jit.trace` to convert this into a `ScriptModule` object which can be saved to disk.
       -The following code illustrates this approach.
       +Here's an example usage:

        .. code-block:: python

       @@ -30,20 +33,17 @@ The following code illustrates this approach.
            model = MyModel().eval().cuda()
            inputs = [torch.randn((1, 3, 224, 224)).cuda()]
       -    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs) # Output is a torch.fx.GraphModule
       -    trt_traced_model = torch.jit.trace(trt_gm, inputs)
       -    torch.jit.save(trt_traced_model, "trt_model.ts")
       +    # trt_gm is a torch.fx.GraphModule object
       +    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
       +    torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)

            # Later, you can load it and run inference
       -    model = torch.jit.load("trt_model.ts").cuda()
       +    model = torch.export.load("trt.ep").module()
            model(*inputs)

       -b) ExportedProgram
       +b) Torchscript
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

       -`torch.export.ExportedProgram` is a new format introduced in Pytorch 2.1. After we compile a Pytorch module using Torch-TensorRT, the resultant
       -`torch.fx.GraphModule` along with additional metadata can be used to create `ExportedProgram` which can be saved and loaded from disk.
       -
        .. code-block:: python

            import torch

       @@ -51,26 +51,20 @@ b) ExportedProgram
            model = MyModel().eval().cuda()
            inputs = [torch.randn((1, 3, 224, 224)).cuda()]
       -    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs) # Output is a torch.fx.GraphModule
       -    # Transform and create an exported program
       -    trt_exp_program = torch_tensorrt.dynamo.export(trt_gm, inputs)
       -    torch.export.save(trt_exp_program, "trt_model.ep")
       +    # trt_gm is a torch.fx.GraphModule object
       +    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
       +    torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

            # Later, you can load it and run inference
       -    model = torch.export.load("trt_model.ep")
       +    model = torch.jit.load("trt.ts").cuda()
            model(*inputs)

       -`torch_tensorrt.dynamo.export` inlines the submodules within a GraphModule to their corresponding nodes and stiches all the nodes together.
       -This is needed as `torch._export` serialization cannot handle serializing and deserializing of submodules (`call_module` nodes).
       -
       -.. note:: This way of saving the models using `ExportedProgram` is experimental. Here is a known issue : https://github.com/pytorch/TensorRT/issues/2341
       -
        Torchscript IR
        -------------

        In Torch-TensorRT 1.X versions, the primary way to compile and run inference with Torch-TensorRT is using Torchscript IR.
       -This behavior stays the same in 2.X versions as well.
       +For `ir=ts`, this behavior stays the same in 2.X versions as well.

        .. code-block:: python

       @@ -86,3 +80,21 @@ This behavior stays the same in 2.X versions as well.
            model = torch.jit.load("trt_model.ts").cuda()
            model(*inputs)
       +
       +Loading the models
       +--------------------
       +
       +We can load torchscript or exported_program models using the `torch.jit.load` and `torch.export.load` APIs from PyTorch directly.
       +Alternatively, we provide a light wrapper `torch_tensorrt.load(file_path)` which can load either of the above model types.
       +
       +Here's an example usage:
       +
       +.. code-block:: python
       +
       +    import torch
       +    import torch_tensorrt
       +
       +    # file_path can be a trt.ep or trt.ts file obtained by saving the model (refer to the above section)
       +    inputs = [torch.randn((1, 3, 224, 224)).cuda()]
       +    model = torch_tensorrt.load("trt.ep").module()
       +    model(*inputs)
       \ No newline at end of file
       diff --git a/docs/_static/basic.css b/docs/_static/basic.css
       index bf18350b65..9039e027cd 100644
       --- a/docs/_static/basic.css
       +++ b/docs/_static/basic.css
       @@ -222,7 +222,7 @@ table.modindextable td {
        /* -- general body styles --------------------------------------------------- */

        div.body {
       -    min-width: 450px;
       +    min-width: 360px;
            max-width: 800px;
        }
       @@ -335,13 +335,13 @@ p.sidebar-title {
            font-weight: bold;
        }

       -div.admonition, div.topic, blockquote {
       +div.admonition, div.topic, aside.topic, blockquote {
            clear: left;
        }

        /* -- topics ---------------------------------------------------------------- */

       -div.topic {
       +div.topic, aside.topic {
            border: 1px solid #ccc;
            padding: 7px;
            margin: 10px 0 10px 0;
       @@ -380,6 +380,7 @@ div.body p.centered {
        div.sidebar > :last-child,
        aside.sidebar > :last-child,
        div.topic > :last-child,
       +aside.topic > :last-child,
        div.admonition > :last-child {
            margin-bottom: 0;
        }
       @@ -387,6 +388,7 @@ div.admonition > :last-child {
        div.sidebar::after,
        aside.sidebar::after,
        div.topic::after,
       +aside.topic::after,
        div.admonition::after,
        blockquote::after {
            display: block;
       @@ -428,10 +430,6 @@ table.docutils td, table.docutils th {
            border-bottom: 1px solid #aaa;
        }

       -table.footnote td, table.footnote th {
       -    border: 0 !important;
       -}
       -
        th {
            text-align: left;
            padding-right: 5px;
       @@ -615,6 +613,7 @@ ul.simple p {
            margin-bottom: 0;
        }

       +/* Docutils 0.17 and older (footnotes & citations) */
        dl.footnote > dt,
        dl.citation > dt {
            float: left;
       @@ -632,6 +631,33 @@ dl.citation > dd:after {
            clear: both;
        }

       +/* Docutils 0.18+ (footnotes & citations) */
       +aside.footnote > span,
       +div.citation > span {
       +    float: left;
       +}
       +aside.footnote > span:last-of-type,
       +div.citation > span:last-of-type {
       +    padding-right: 0.5em;
       +}
       +aside.footnote > p {
       +    margin-left: 2em;
       +}
       +div.citation > p {
       +    margin-left: 4em;
       +}
       +aside.footnote > p:last-of-type,
       +div.citation > p:last-of-type {
       +    margin-bottom: 0em;
       +}
       +aside.footnote > p:last-of-type:after,
       +div.citation > p:last-of-type:after {
       +    content: "";
       +    clear: both;
       +}
       +
       +/* Footnotes & citations ends */
       +
        dl.field-list {
            display: grid;
            grid-template-columns: fit-content(30%) auto;
       diff --git a/docs/_static/collapsible-lists/LICENSE.md b/docs/_static/collapsible-lists/LICENSE.md
       index ef81a64535..21859bfd55 100644
       --- a/docs/_static/collapsible-lists/LICENSE.md
       +++ b/docs/_static/collapsible-lists/LICENSE.md
       @@ -2,6 +2,15 @@ This code is the fruit of Kate Morley's labor, taken from here:

        - http://code.iamkate.com/javascript/collapsible-lists/

       +which has been updated here:
       +
       +- https://iamkate.com/code/tree-views/
       +
       +and there is a strong desire for this folder to update accordingly, when
       +possible:
       +
       +- https://github.com/svenevs/exhale/issues/180
       +
        She includes a generous CC0 1.0 license for all materials on her site:

       -- http://code.iamkate.com/
       +- https://iamkate.com/code/
       diff --git a/docs/_static/css/theme.css b/docs/_static/css/theme.css
       index 43cf163a61..b8d69c23b0 100644
       --- a/docs/_static/css/theme.css
       +++ b/docs/_static/css/theme.css
       @@ -10440,7 +10440,6 @@ h1 {
          font-size: 2rem;
          letter-spacing: 1.78px;
          line-height: 2.5rem;
       -  text-transform: uppercase;
          margin: 1.375rem 0;
        }
       @@ -10494,7 +10493,9 @@ article.pytorch-article ol ul, article.pytorch-article ol ol {
          margin: 0;
        }

       -article.pytorch-article h1,
       +article.pytorch-article h1 {
       +  font-weight: 600;
       +}
        article.pytorch-article h2,
        article.pytorch-article h3,
        article.pytorch-article h4,
       diff --git a/docs/_static/doctools.js b/docs/_static/doctools.js
       index e1bfd708b7..c3db08d1c3 100644
       --- a/docs/_static/doctools.js
       +++ b/docs/_static/doctools.js
       @@ -2,357 +2,263 @@
         * doctools.js
         * ~~~~~~~~~~~
         *
       - * Sphinx JavaScript utilities for all documentation.
       + * Base JavaScript utilities for all Sphinx HTML documentation.
         *
         * :copyright: Copyright 2007-2022 by the Sphinx team, see AUTHORS.
         * :license: BSD, see LICENSE for details.
         *
         */
       +"use strict";

       [... remainder of the docs/_static/doctools.js hunk elided: the jQuery-based helpers (URL utilities, search-term highlighting, i18n, domain-index toggles, keyboard listeners) are rewritten as equivalent vanilla-JS functions ...]
- */ -jQuery.fn.highlightText = function(text, className) { - function highlight(node, addItems) { - if (node.nodeType === 3) { - var val = node.nodeValue; - var pos = val.toLowerCase().indexOf(text); - if (pos >= 0 && - !jQuery(node.parentNode).hasClass(className) && - !jQuery(node.parentNode).hasClass("nohighlight")) { - var span; - var isInSVG = jQuery(node).closest("body, svg, foreignObject").is("svg"); - if (isInSVG) { - span = document.createElementNS("http://www.w3.org/2000/svg", "tspan"); - } else { - span = document.createElement("span"); - span.className = className; - } - span.appendChild(document.createTextNode(val.substr(pos, text.length))); - node.parentNode.insertBefore(span, node.parentNode.insertBefore( + span.appendChild(document.createTextNode(val.substr(pos, text.length))); + parent.insertBefore( + span, + parent.insertBefore( document.createTextNode(val.substr(pos + text.length)), - node.nextSibling)); - node.nodeValue = val.substr(0, pos); - if (isInSVG) { - var rect = document.createElementNS("http://www.w3.org/2000/svg", "rect"); - var bbox = node.parentElement.getBBox(); - rect.x.baseVal.value = bbox.x; - rect.y.baseVal.value = bbox.y; - rect.width.baseVal.value = bbox.width; - rect.height.baseVal.value = bbox.height; - rect.setAttribute('class', className); - addItems.push({ - "parent": node.parentNode, - "target": rect}); - } + node.nextSibling + ) + ); + node.nodeValue = val.substr(0, pos); + + if (isInSVG) { + const rect = document.createElementNS( + "http://www.w3.org/2000/svg", + "rect" + ); + const bbox = parent.getBBox(); + rect.x.baseVal.value = bbox.x; + rect.y.baseVal.value = bbox.y; + rect.width.baseVal.value = bbox.width; + rect.height.baseVal.value = bbox.height; + rect.setAttribute("class", className); + addItems.push({ parent: parent, target: rect }); } } - else if (!jQuery(node).is("button, select, textarea")) { - jQuery.each(node.childNodes, function() { - highlight(this, addItems); - }); - } + } else if (node.matches && !node.matches("button, select, textarea")) { + node.childNodes.forEach((el) => _highlight(el, addItems, text, className)); } - var addItems = []; - var result = this.each(function() { - highlight(this, addItems); - }); - for (var i = 0; i < addItems.length; ++i) { - jQuery(addItems[i].parent).before(addItems[i].target); - } - return result; }; - -/* - * backward compatibility for jQuery.browser - * This will be supported until firefox bug is fixed. - */ -if (!jQuery.browser) { - jQuery.uaMatch = function(ua) { - ua = ua.toLowerCase(); - - var match = /(chrome)[ \/]([\w.]+)/.exec(ua) || - /(webkit)[ \/]([\w.]+)/.exec(ua) || - /(opera)(?:.*version|)[ \/]([\w.]+)/.exec(ua) || - /(msie) ([\w.]+)/.exec(ua) || - ua.indexOf("compatible") < 0 && /(mozilla)(?:.*? rv:([\w.]+)|)/.exec(ua) || - []; - - return { - browser: match[ 1 ] || "", - version: match[ 2 ] || "0" - }; - }; - jQuery.browser = {}; - jQuery.browser[jQuery.uaMatch(navigator.userAgent).browser] = true; -} +const _highlightText = (thisNode, text, className) => { + let addItems = []; + _highlight(thisNode, addItems, text, className); + addItems.forEach((obj) => + obj.parent.insertAdjacentElement("beforebegin", obj.target) + ); +}; /** * Small JavaScript module for the documentation. 
*/ -var Documentation = { - - init : function() { - this.fixFirefoxAnchorBug(); - this.highlightSearchWords(); - this.initIndexTable(); - this.initOnKeyListeners(); +const Documentation = { + init: () => { + Documentation.highlightSearchWords(); + Documentation.initDomainIndexTable(); + Documentation.initOnKeyListeners(); }, /** * i18n support */ - TRANSLATIONS : {}, - PLURAL_EXPR : function(n) { return n === 1 ? 0 : 1; }, - LOCALE : 'unknown', + TRANSLATIONS: {}, + PLURAL_EXPR: (n) => (n === 1 ? 0 : 1), + LOCALE: "unknown", // gettext and ngettext don't access this so that the functions // can safely bound to a different name (_ = Documentation.gettext) - gettext : function(string) { - var translated = Documentation.TRANSLATIONS[string]; - if (typeof translated === 'undefined') - return string; - return (typeof translated === 'string') ? translated : translated[0]; - }, - - ngettext : function(singular, plural, n) { - var translated = Documentation.TRANSLATIONS[singular]; - if (typeof translated === 'undefined') - return (n == 1) ? singular : plural; - return translated[Documentation.PLURALEXPR(n)]; - }, - - addTranslations : function(catalog) { - for (var key in catalog.messages) - this.TRANSLATIONS[key] = catalog.messages[key]; - this.PLURAL_EXPR = new Function('n', 'return +(' + catalog.plural_expr + ')'); - this.LOCALE = catalog.locale; + gettext: (string) => { + const translated = Documentation.TRANSLATIONS[string]; + switch (typeof translated) { + case "undefined": + return string; // no translation + case "string": + return translated; // translation exists + default: + return translated[0]; // (singular, plural) translation tuple exists + } }, - /** - * add context elements like header anchor links - */ - addContextElements : function() { - $('div[id] > :header:first').each(function() { - $('\u00B6'). - attr('href', '#' + this.id). - attr('title', _('Permalink to this headline')). - appendTo(this); - }); - $('dt[id]').each(function() { - $('\u00B6'). - attr('href', '#' + this.id). - attr('title', _('Permalink to this definition')). - appendTo(this); - }); + ngettext: (singular, plural, n) => { + const translated = Documentation.TRANSLATIONS[singular]; + if (typeof translated !== "undefined") + return translated[Documentation.PLURAL_EXPR(n)]; + return n === 1 ? singular : plural; }, - /** - * workaround a firefox stupidity - * see: https://bugzilla.mozilla.org/show_bug.cgi?id=645075 - */ - fixFirefoxAnchorBug : function() { - if (document.location.hash && $.browser.mozilla) - window.setTimeout(function() { - document.location.href += ''; - }, 10); + addTranslations: (catalog) => { + Object.assign(Documentation.TRANSLATIONS, catalog.messages); + Documentation.PLURAL_EXPR = new Function( + "n", + `return (${catalog.plural_expr})` + ); + Documentation.LOCALE = catalog.locale; }, /** * highlight the search words provided in the url in the text */ - highlightSearchWords : function() { - var params = $.getQueryParameters(); - var terms = (params.highlight) ? 
params.highlight[0].split(/\s+/) : []; - if (terms.length) { - var body = $('div.body'); - if (!body.length) { - body = $('body'); - } - window.setTimeout(function() { - $.each(terms, function() { - body.highlightText(this.toLowerCase(), 'highlighted'); - }); - }, 10); - $('') - .appendTo($('#searchbox')); - } - }, + highlightSearchWords: () => { + const highlight = + new URLSearchParams(window.location.search).get("highlight") || ""; + const terms = highlight.toLowerCase().split(/\s+/).filter(x => x); + if (terms.length === 0) return; // nothing to do - /** - * init the domain index toggle buttons - */ - initIndexTable : function() { - var togglers = $('img.toggler').click(function() { - var src = $(this).attr('src'); - var idnum = $(this).attr('id').substr(7); - $('tr.cg-' + idnum).toggle(); - if (src.substr(-9) === 'minus.png') - $(this).attr('src', src.substr(0, src.length-9) + 'plus.png'); - else - $(this).attr('src', src.substr(0, src.length-8) + 'minus.png'); - }).css('display', ''); - if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) { - togglers.click(); - } + // There should never be more than one element matching "div.body" + const divBody = document.querySelectorAll("div.body"); + const body = divBody.length ? divBody[0] : document.querySelector("body"); + window.setTimeout(() => { + terms.forEach((term) => _highlightText(body, term, "highlighted")); + }, 10); + + const searchBox = document.getElementById("searchbox"); + if (searchBox === null) return; + searchBox.appendChild( + document + .createRange() + .createContextualFragment( + '" + ) + ); }, /** * helper function to hide the search marks again */ - hideSearchWords : function() { - $('#searchbox .highlight-link').fadeOut(300); - $('span.highlighted').removeClass('highlighted'); - var url = new URL(window.location); - url.searchParams.delete('highlight'); - window.history.replaceState({}, '', url); + hideSearchWords: () => { + document + .querySelectorAll("#searchbox .highlight-link") + .forEach((el) => el.remove()); + document + .querySelectorAll("span.highlighted") + .forEach((el) => el.classList.remove("highlighted")); + const url = new URL(window.location); + url.searchParams.delete("highlight"); + window.history.replaceState({}, "", url); }, - /** + /** * helper function to focus on search bar */ - focusSearchBar : function() { - $('input[name=q]').first().focus(); + focusSearchBar: () => { + document.querySelectorAll("input[name=q]")[0]?.focus(); }, /** - * make the url absolute + * Initialise the domain index toggle buttons */ - makeURL : function(relativeURL) { - return DOCUMENTATION_OPTIONS.URL_ROOT + '/' + relativeURL; - }, + initDomainIndexTable: () => { + const toggler = (el) => { + const idNumber = el.id.substr(7); + const toggledRows = document.querySelectorAll(`tr.cg-${idNumber}`); + if (el.src.substr(-9) === "minus.png") { + el.src = `${el.src.substr(0, el.src.length - 9)}plus.png`; + toggledRows.forEach((el) => (el.style.display = "none")); + } else { + el.src = `${el.src.substr(0, el.src.length - 8)}minus.png`; + toggledRows.forEach((el) => (el.style.display = "")); + } + }; - /** - * get the current relative url - */ - getCurrentURL : function() { - var path = document.location.pathname; - var parts = path.split(/\//); - $.each(DOCUMENTATION_OPTIONS.URL_ROOT.split(/\//), function() { - if (this === '..') - parts.pop(); - }); - var url = parts.join('/'); - return path.substring(url.lastIndexOf('/') + 1, path.length - 1); + const togglerElements = document.querySelectorAll("img.toggler"); + 
togglerElements.forEach((el) => + el.addEventListener("click", (event) => toggler(event.currentTarget)) + ); + togglerElements.forEach((el) => (el.style.display = "")); + if (DOCUMENTATION_OPTIONS.COLLAPSE_INDEX) togglerElements.forEach(toggler); }, - initOnKeyListeners: function() { + initOnKeyListeners: () => { // only install a listener if it is really needed - if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS && - !DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) - return; + if ( + !DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS && + !DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS + ) + return; - $(document).keydown(function(event) { - var activeElementType = document.activeElement.tagName; - // don't navigate when in search box, textarea, dropdown or button - if (activeElementType !== 'TEXTAREA' && activeElementType !== 'INPUT' && activeElementType !== 'SELECT' - && activeElementType !== 'BUTTON') { - if (event.altKey || event.ctrlKey || event.metaKey) - return; + const blacklistedElements = new Set([ + "TEXTAREA", + "INPUT", + "SELECT", + "BUTTON", + ]); + document.addEventListener("keydown", (event) => { + if (blacklistedElements.has(document.activeElement.tagName)) return; // bail for input elements + if (event.altKey || event.ctrlKey || event.metaKey) return; // bail with special keys - if (!event.shiftKey) { - switch (event.key) { - case 'ArrowLeft': - if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) - break; - var prevHref = $('link[rel="prev"]').prop('href'); - if (prevHref) { - window.location.href = prevHref; - return false; - } - break; - case 'ArrowRight': - if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) - break; - var nextHref = $('link[rel="next"]').prop('href'); - if (nextHref) { - window.location.href = nextHref; - return false; - } - break; - case 'Escape': - if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) - break; - Documentation.hideSearchWords(); - return false; - } - } - - // some keyboard layouts may need Shift to get / + if (!event.shiftKey) { switch (event.key) { - case '/': - if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) - break; - Documentation.focusSearchBar(); - return false; + case "ArrowLeft": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const prevLink = document.querySelector('link[rel="prev"]'); + if (prevLink && prevLink.href) { + window.location.href = prevLink.href; + event.preventDefault(); + } + break; + case "ArrowRight": + if (!DOCUMENTATION_OPTIONS.NAVIGATION_WITH_KEYS) break; + + const nextLink = document.querySelector('link[rel="next"]'); + if (nextLink && nextLink.href) { + window.location.href = nextLink.href; + event.preventDefault(); + } + break; + case "Escape": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.hideSearchWords(); + event.preventDefault(); } } + + // some keyboard layouts may need Shift to get / + switch (event.key) { + case "/": + if (!DOCUMENTATION_OPTIONS.ENABLE_SEARCH_SHORTCUTS) break; + Documentation.focusSearchBar(); + event.preventDefault(); + } }); - } + }, }; // quick alias for translations -_ = Documentation.gettext; +const _ = Documentation.gettext; -$(document).ready(function() { - Documentation.init(); -}); +_ready(Documentation.init); diff --git a/docs/_static/documentation_options.js b/docs/_static/documentation_options.js index b0da6c2591..0fd2c23021 100644 --- a/docs/_static/documentation_options.js +++ b/docs/_static/documentation_options.js @@ -1,7 +1,7 @@ var DOCUMENTATION_OPTIONS = { URL_ROOT: 
       diff --git a/docs/_static/documentation_options.js b/docs/_static/documentation_options.js
       index b0da6c2591..0fd2c23021 100644
       --- a/docs/_static/documentation_options.js
       +++ b/docs/_static/documentation_options.js
       @@ -1,7 +1,7 @@
        var DOCUMENTATION_OPTIONS = {
            URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
       -    VERSION: 'v2.3.0.dev0+85971ff',
       -    LANGUAGE: 'None',
       +    VERSION: 'v2.4.0.dev0+4dc9acfc9',
       +    LANGUAGE: 'en',
            COLLAPSE_INDEX: false,
            BUILDER: 'html',
            FILE_SUFFIX: '.html',
       @@ -10,5 +10,5 @@ var DOCUMENTATION_OPTIONS = {
            SOURCELINK_SUFFIX: '.txt',
            NAVIGATION_WITH_KEYS: false,
            SHOW_SEARCH_SUMMARY: true,
       -    ENABLE_SEARCH_SHORTCUTS: true,
       +    ENABLE_SEARCH_SHORTCUTS: false,
        };
       \ No newline at end of file
       diff --git a/docs/_static/jquery.js b/docs/_static/jquery.js
       index b0614034ad..c4c6022f29 100644
       --- a/docs/_static/jquery.js
       +++ b/docs/_static/jquery.js
       @@ -1,2 +1,2 @@
       -/*! jQuery v3.5.1 | (c) JS Foundation and other contributors | jquery.org/license */

       [... both minified single-line jQuery payloads elided: the remainder of this diff is garbled and truncated in the source ...]
e){if(!(r="<"===e[0]&&">"===e[e.length-1]&&3<=e.length?[null,e,null]:q.exec(e))||!r[1]&&t)return!t||t.jquery?(t||n).find(e):this.constructor(t).find(e);if(r[1]){if(t=t instanceof S?t[0]:t,S.merge(this,S.parseHTML(r[1],t&&t.nodeType?t.ownerDocument||t:E,!0)),N.test(r[1])&&S.isPlainObject(t))for(r in t)m(this[r])?this[r](t[r]):this.attr(r,t[r]);return this}return(i=E.getElementById(r[2]))&&(this[0]=i,this.length=1),this}return e.nodeType?(this[0]=e,this.length=1,this):m(e)?void 0!==n.ready?n.ready(e):e(S):S.makeArray(e,this)}).prototype=S.fn,D=S(E);var L=/^(?:parents|prev(?:Until|All))/,H={children:!0,contents:!0,next:!0,prev:!0};function O(e,t){while((e=e[t])&&1!==e.nodeType);return e}S.fn.extend({has:function(e){var t=S(e,this),n=t.length;return this.filter(function(){for(var e=0;e\x20\t\r\n\f]*)/i,he=/^$|^module$|\/(?:java|ecma)script/i;ce=E.createDocumentFragment().appendChild(E.createElement("div")),(fe=E.createElement("input")).setAttribute("type","radio"),fe.setAttribute("checked","checked"),fe.setAttribute("name","t"),ce.appendChild(fe),y.checkClone=ce.cloneNode(!0).cloneNode(!0).lastChild.checked,ce.innerHTML="",y.noCloneChecked=!!ce.cloneNode(!0).lastChild.defaultValue,ce.innerHTML="",y.option=!!ce.lastChild;var ge={thead:[1,"","
      "],col:[2,"","
      "],tr:[2,"","
      "],td:[3,"","
      "],_default:[0,"",""]};function ve(e,t){var n;return n="undefined"!=typeof e.getElementsByTagName?e.getElementsByTagName(t||"*"):"undefined"!=typeof e.querySelectorAll?e.querySelectorAll(t||"*"):[],void 0===t||t&&A(e,t)?S.merge([e],n):n}function ye(e,t){for(var n=0,r=e.length;n",""]);var me=/<|&#?\w+;/;function xe(e,t,n,r,i){for(var o,a,s,u,l,c,f=t.createDocumentFragment(),p=[],d=0,h=e.length;d\s*$/g;function je(e,t){return A(e,"table")&&A(11!==t.nodeType?t:t.firstChild,"tr")&&S(e).children("tbody")[0]||e}function De(e){return e.type=(null!==e.getAttribute("type"))+"/"+e.type,e}function qe(e){return"true/"===(e.type||"").slice(0,5)?e.type=e.type.slice(5):e.removeAttribute("type"),e}function Le(e,t){var n,r,i,o,a,s;if(1===t.nodeType){if(Y.hasData(e)&&(s=Y.get(e).events))for(i in Y.remove(t,"handle events"),s)for(n=0,r=s[i].length;n").attr(n.scriptAttrs||{}).prop({charset:n.scriptCharset,src:n.url}).on("load error",i=function(e){r.remove(),i=null,e&&t("error"===e.type?404:200,e.type)}),E.head.appendChild(r[0])},abort:function(){i&&i()}}});var _t,zt=[],Ut=/(=)\?(?=&|$)|\?\?/;S.ajaxSetup({jsonp:"callback",jsonpCallback:function(){var e=zt.pop()||S.expando+"_"+wt.guid++;return this[e]=!0,e}}),S.ajaxPrefilter("json jsonp",function(e,t,n){var r,i,o,a=!1!==e.jsonp&&(Ut.test(e.url)?"url":"string"==typeof e.data&&0===(e.contentType||"").indexOf("application/x-www-form-urlencoded")&&Ut.test(e.data)&&"data");if(a||"jsonp"===e.dataTypes[0])return r=e.jsonpCallback=m(e.jsonpCallback)?e.jsonpCallback():e.jsonpCallback,a?e[a]=e[a].replace(Ut,"$1"+r):!1!==e.jsonp&&(e.url+=(Tt.test(e.url)?"&":"?")+e.jsonp+"="+r),e.converters["script json"]=function(){return o||S.error(r+" was not called"),o[0]},e.dataTypes[0]="json",i=C[r],C[r]=function(){o=arguments},n.always(function(){void 0===i?S(C).removeProp(r):C[r]=i,e[r]&&(e.jsonpCallback=t.jsonpCallback,zt.push(r)),o&&m(i)&&i(o[0]),o=i=void 0}),"script"}),y.createHTMLDocument=((_t=E.implementation.createHTMLDocument("").body).innerHTML="
      ",2===_t.childNodes.length),S.parseHTML=function(e,t,n){return"string"!=typeof e?[]:("boolean"==typeof t&&(n=t,t=!1),t||(y.createHTMLDocument?((r=(t=E.implementation.createHTMLDocument("")).createElement("base")).href=E.location.href,t.head.appendChild(r)):t=E),o=!n&&[],(i=N.exec(e))?[t.createElement(i[1])]:(i=xe([e],t,o),o&&o.length&&S(o).remove(),S.merge([],i.childNodes)));var r,i,o},S.fn.load=function(e,t,n){var r,i,o,a=this,s=e.indexOf(" ");return-1").append(S.parseHTML(e)).find(r):e)}).always(n&&function(e,t){a.each(function(){n.apply(this,o||[e.responseText,t,e])})}),this},S.expr.pseudos.animated=function(t){return S.grep(S.timers,function(e){return t===e.elem}).length},S.offset={setOffset:function(e,t,n){var r,i,o,a,s,u,l=S.css(e,"position"),c=S(e),f={};"static"===l&&(e.style.position="relative"),s=c.offset(),o=S.css(e,"top"),u=S.css(e,"left"),("absolute"===l||"fixed"===l)&&-1<(o+u).indexOf("auto")?(a=(r=c.position()).top,i=r.left):(a=parseFloat(o)||0,i=parseFloat(u)||0),m(t)&&(t=t.call(e,n,S.extend({},s))),null!=t.top&&(f.top=t.top-s.top+a),null!=t.left&&(f.left=t.left-s.left+i),"using"in t?t.using.call(e,f):c.css(f)}},S.fn.extend({offset:function(t){if(arguments.length)return void 0===t?this:this.each(function(e){S.offset.setOffset(this,t,e)});var e,n,r=this[0];return r?r.getClientRects().length?(e=r.getBoundingClientRect(),n=r.ownerDocument.defaultView,{top:e.top+n.pageYOffset,left:e.left+n.pageXOffset}):{top:0,left:0}:void 0},position:function(){if(this[0]){var e,t,n,r=this[0],i={top:0,left:0};if("fixed"===S.css(r,"position"))t=r.getBoundingClientRect();else{t=this.offset(),n=r.ownerDocument,e=r.offsetParent||n.documentElement;while(e&&(e===n.body||e===n.documentElement)&&"static"===S.css(e,"position"))e=e.parentNode;e&&e!==r&&1===e.nodeType&&((i=S(e).offset()).top+=S.css(e,"borderTopWidth",!0),i.left+=S.css(e,"borderLeftWidth",!0))}return{top:t.top-i.top-S.css(r,"marginTop",!0),left:t.left-i.left-S.css(r,"marginLeft",!0)}}},offsetParent:function(){return this.map(function(){var e=this.offsetParent;while(e&&"static"===S.css(e,"position"))e=e.offsetParent;return e||re})}}),S.each({scrollLeft:"pageXOffset",scrollTop:"pageYOffset"},function(t,i){var o="pageYOffset"===i;S.fn[t]=function(e){return $(this,function(e,t,n){var r;if(x(e)?r=e:9===e.nodeType&&(r=e.defaultView),void 0===n)return r?r[i]:e[t];r?r.scrollTo(o?r.pageXOffset:n,o?n:r.pageYOffset):e[t]=n},t,e,arguments.length)}}),S.each(["top","left"],function(e,n){S.cssHooks[n]=Fe(y.pixelPosition,function(e,t){if(t)return t=We(e,n),Pe.test(t)?S(e).position()[n]+"px":t})}),S.each({Height:"height",Width:"width"},function(a,s){S.each({padding:"inner"+a,content:s,"":"outer"+a},function(r,o){S.fn[o]=function(e,t){var n=arguments.length&&(r||"boolean"!=typeof e),i=r||(!0===e||!0===t?"margin":"border");return $(this,function(e,t,n){var r;return x(e)?0===o.indexOf("outer")?e["inner"+a]:e.document.documentElement["client"+a]:9===e.nodeType?(r=e.documentElement,Math.max(e.body["scroll"+a],r["scroll"+a],e.body["offset"+a],r["offset"+a],r["client"+a])):void 0===n?S.css(e,t,i):S.style(e,t,n,i)},s,n?e:void 0,n)}})}),S.each(["ajaxStart","ajaxStop","ajaxComplete","ajaxError","ajaxSuccess","ajaxSend"],function(e,t){S.fn[t]=function(e){return this.on(t,e)}}),S.fn.extend({bind:function(e,t,n){return this.on(e,null,t,n)},unbind:function(e,t){return this.off(e,null,t)},delegate:function(e,t,n,r){return this.on(t,e,n,r)},undelegate:function(e,t,n){return 1===arguments.length?this.off(e,"**"):this.off(t,e||"**",n)},hover:function(e,t){return 
this.mouseenter(e).mouseleave(t||e)}}),S.each("blur focus focusin focusout resize scroll click dblclick mousedown mouseup mousemove mouseover mouseout mouseenter mouseleave change select submit keydown keypress keyup contextmenu".split(" "),function(e,n){S.fn[n]=function(e,t){return 0 { + const [docname, title, anchor, descr, score, filename] = result + return score }, */ @@ -28,9 +30,11 @@ if (!Scorer) { // or matches in the last dotted part of the object name objPartialMatch: 6, // Additive scores depending on the priority of the object - objPrio: {0: 15, // used to be importantResults - 1: 5, // used to be objectResults - 2: -5}, // used to be unimportantResults + objPrio: { + 0: 15, // used to be importantResults + 1: 5, // used to be objectResults + 2: -5, // used to be unimportantResults + }, // Used when the priority is not in the mapping. objPrioDefault: 0, @@ -39,452 +43,455 @@ if (!Scorer) { partialTitle: 7, // query found in terms term: 5, - partialTerm: 2 + partialTerm: 2, }; } -if (!splitQuery) { - function splitQuery(query) { - return query.split(/\s+/); +const _removeChildren = (element) => { + while (element && element.lastChild) element.removeChild(element.lastChild); +}; + +/** + * See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#escaping + */ +const _escapeRegExp = (string) => + string.replace(/[.*+\-?^${}()|[\]\\]/g, "\\$&"); // $& means the whole matched string + +const _displayItem = (item, highlightTerms, searchTerms) => { + const docBuilder = DOCUMENTATION_OPTIONS.BUILDER; + const docUrlRoot = DOCUMENTATION_OPTIONS.URL_ROOT; + const docFileSuffix = DOCUMENTATION_OPTIONS.FILE_SUFFIX; + const docLinkSuffix = DOCUMENTATION_OPTIONS.LINK_SUFFIX; + const showSearchSummary = DOCUMENTATION_OPTIONS.SHOW_SEARCH_SUMMARY; + + const [docName, title, anchor, descr] = item; + + let listItem = document.createElement("li"); + let requestUrl; + let linkUrl; + if (docBuilder === "dirhtml") { + // dirhtml builder + let dirname = docName + "/"; + if (dirname.match(/\/index\/$/)) + dirname = dirname.substring(0, dirname.length - 6); + else if (dirname === "index/") dirname = ""; + requestUrl = docUrlRoot + dirname; + linkUrl = requestUrl; + } else { + // normal html builders + requestUrl = docUrlRoot + docName + docFileSuffix; + linkUrl = docName + docLinkSuffix; + } + const params = new URLSearchParams(); + params.set("highlight", [...highlightTerms].join(" ")); + let linkEl = listItem.appendChild(document.createElement("a")); + linkEl.href = linkUrl + "?" + params.toString() + anchor; + linkEl.innerHTML = title; + if (descr) + listItem.appendChild(document.createElement("span")).innerText = + " (" + descr + ")"; + else if (showSearchSummary) + fetch(requestUrl) + .then((responseData) => responseData.text()) + .then((data) => { + if (data) + listItem.appendChild( + Search.makeSearchSummary(data, searchTerms, highlightTerms) + ); + }); + Search.output.appendChild(listItem); +}; +const _finishSearch = (resultCount) => { + Search.stopPulse(); + Search.title.innerText = _("Search Results"); + if (!resultCount) + Search.status.innerText = Documentation.gettext( + "Your search did not match any documents. Please make sure that all words are spelled correctly and that you've selected enough categories." 
+ ); + else + Search.status.innerText = _( + `Search finished, found ${resultCount} page(s) matching the search query.` + ); +}; +const _displayNextItem = ( + results, + resultCount, + highlightTerms, + searchTerms +) => { + // results left, load the summary and display it + // this is intended to be dynamic (don't sub resultsCount) + if (results.length) { + _displayItem(results.pop(), highlightTerms, searchTerms); + setTimeout( + () => _displayNextItem(results, resultCount, highlightTerms, searchTerms), + 5 + ); } + // search finished, update title and status message + else _finishSearch(resultCount); +}; + +/** + * Default splitQuery function. Can be overridden in ``sphinx.search`` with a + * custom function per language. + * + * The regular expression works by splitting the string on consecutive characters + * that are not Unicode letters, numbers, underscores, or emoji characters. + * This is the same as ``\W+`` in Python, preserving the surrogate pair area. + */ +if (typeof splitQuery === "undefined") { + var splitQuery = (query) => query + .split(/[^\p{Letter}\p{Number}_\p{Emoji_Presentation}]+/gu) + .filter(term => term) // remove remaining empty strings } /** * Search Module */ -var Search = { - - _index : null, - _queued_query : null, - _pulse_status : -1, - - htmlToText : function(htmlString) { - var virtualDocument = document.implementation.createHTMLDocument('virtual'); - var htmlElement = $(htmlString, virtualDocument); - htmlElement.find('.headerlink').remove(); - docContent = htmlElement.find('[role=main]')[0]; - if(docContent === undefined) { - console.warn("Content block not found. Sphinx search tries to obtain it " + - "via '[role=main]'. Could you check your theme or template."); - return ""; - } - return docContent.textContent || docContent.innerText; +const Search = { + _index: null, + _queued_query: null, + _pulse_status: -1, + + htmlToText: (htmlString) => { + const htmlElement = document + .createRange() + .createContextualFragment(htmlString); + _removeChildren(htmlElement.querySelectorAll(".headerlink")); + const docContent = htmlElement.querySelector('[role="main"]'); + if (docContent !== undefined) return docContent.textContent; + console.warn( + "Content block not found. Sphinx search tries to obtain it via '[role=main]'. Could you check your theme or template." 
+ ); + return ""; }, - init : function() { - var params = $.getQueryParameters(); - if (params.q) { - var query = params.q[0]; - $('input[name="q"]')[0].value = query; - this.performSearch(query); - } + init: () => { + const query = new URLSearchParams(window.location.search).get("q"); + document + .querySelectorAll('input[name="q"]') + .forEach((el) => (el.value = query)); + if (query) Search.performSearch(query); }, - loadIndex : function(url) { - $.ajax({type: "GET", url: url, data: null, - dataType: "script", cache: true, - complete: function(jqxhr, textstatus) { - if (textstatus != "success") { - document.getElementById("searchindexloader").src = url; - } - }}); - }, + loadIndex: (url) => + (document.body.appendChild(document.createElement("script")).src = url), - setIndex : function(index) { - var q; - this._index = index; - if ((q = this._queued_query) !== null) { - this._queued_query = null; - Search.query(q); + setIndex: (index) => { + Search._index = index; + if (Search._queued_query !== null) { + const query = Search._queued_query; + Search._queued_query = null; + Search.query(query); } }, - hasIndex : function() { - return this._index !== null; - }, + hasIndex: () => Search._index !== null, - deferQuery : function(query) { - this._queued_query = query; - }, + deferQuery: (query) => (Search._queued_query = query), - stopPulse : function() { - this._pulse_status = 0; - }, + stopPulse: () => (Search._pulse_status = -1), - startPulse : function() { - if (this._pulse_status >= 0) - return; - function pulse() { - var i; + startPulse: () => { + if (Search._pulse_status >= 0) return; + + const pulse = () => { Search._pulse_status = (Search._pulse_status + 1) % 4; - var dotString = ''; - for (i = 0; i < Search._pulse_status; i++) - dotString += '.'; - Search.dots.text(dotString); - if (Search._pulse_status > -1) - window.setTimeout(pulse, 500); - } + Search.dots.innerText = ".".repeat(Search._pulse_status); + if (Search._pulse_status >= 0) window.setTimeout(pulse, 500); + }; pulse(); }, /** * perform a search for something (or wait until index is loaded) */ - performSearch : function(query) { + performSearch: (query) => { // create the required interface elements - this.out = $('#search-results'); - this.title = $('

' + _('Searching') + '</h2>').appendTo(this.out); - this.dots = $('<span></span>').appendTo(this.title); - this.status = $('<p class="search-summary">&nbsp;</p>').appendTo(this.out); - this.output = $('<ul class="search"/>').appendTo(this.out);

Python API Documentation

        @@ -416,7 +419,7 @@
        -

        Conversion Phase

        +

        Conversion Phase

Once the graph has been simplified to a form that's easy to convert, we then set up a conversion context to manage the construction of a TensorRT INetworkDefinition from the block's nodes. The conversion context records the set of converted nodes, block inputs and outputs and other information about the conversion @@ -455,7 +458,7 @@

      -

      Node Evaluation

      +

      Node Evaluation

There are some nodes that contain static data and are resources for operations. These can be evaluated at conversion time so that you can use those values when doing node conversion. In theory any node kind can have a conversion time evaluator as long as it produces a static IValue. This IValue will be stored in the conversion @@ -463,7 +466,7 @@

Node Evaluation: prim::Constant, which emits a constant, and prim::ListConstruct, which makes lists.

      -

      Node Converters

      +

      Node Converters

      Node converters map JIT nodes to layers or subgraphs of layers. They then associate outputs from the JIT graph and the TRT graph together in the conversion context. This allows the conversion stage to assemble the inputs for the next node. There are some cases where a node produces an output that is not a Tensor but a static result @@ -538,6 +541,7 @@

      Node Converters + diff --git a/docs/contributors/dynamo_converters.html b/docs/contributors/dynamo_converters.html index 3ff0b2b9f6..d41c6e7040 100644 --- a/docs/contributors/dynamo_converters.html +++ b/docs/contributors/dynamo_converters.html @@ -10,7 +10,7 @@ - Writing Dynamo Converters — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Writing Dynamo Converters — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -237,7 +237,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -304,6 +304,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -414,12 +417,12 @@
      -

      Writing Dynamo Converters

      +

      Writing Dynamo Converters

      The dynamo converter library in Torch-TensorRT is located in TensorRT/py/torch_tensorrt/dynamo/conversion.

      -

      Converter implementation

      +

      Converter implementation

      -

      Registration

      +

      Registration

A converter is a function decorated with torch_tensorrt.dynamo.dynamo_tensorrt_converter that follows the function signature:

      @torch_tensorrt.dynamo.conversion.dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
       def leaky_relu_converter(
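    # The parameter list is truncated in the rendered page; the names and
    # annotations below are a hedged sketch of the dynamo converter contract
    # (ConversionContext, Target, Argument and TRTTensor assumed imported),
    # not a verbatim copy of the library source.
    ctx: ConversionContext,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # args[0] is the input tensor for aten.leaky_relu; the body would add the
    # corresponding TensorRT layer(s) and return their output ITensor(s).
    ...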
      @@ -457,19 +460,19 @@ 

Registration: tensorrt.ITensor or some collection of tensorrt.ITensor for use in the torch_tensorrt.dynamo.conversion.TRTInterpreter matching the output signature of the operation being converted

      -

      Capability Validation

      +

      Capability Validation

There are some converters which have special cases to be accounted for. In those cases, one should use capability_validators to register the converter using @dynamo_tensorrt_converter. We illustrate this through torch.ops.aten.embedding.default. It has parameters scale_grad_by_freq and sparse which are not currently supported by the implementation. In such cases we can write a validator embedding_param_validator, which reports that the converter is not supported when those parameters are set, and register the converter by
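A minimal sketch of what such a validator and registration can look like (the validator body, the argument indexing, and the converter stub below are illustrative, not the library's exact source):

def embedding_param_validator(embedding_node: torch.fx.Node) -> bool:
    # Positional args of aten.embedding: (weight, indices, padding_idx,
    # scale_grad_by_freq, sparse); the indexing here is an assumption.
    scale_grad_by_freq = embedding_node.args[3] if len(embedding_node.args) > 3 else False
    sparse = embedding_node.args[4] if len(embedding_node.args) > 4 else False
    # Returning False tells partitioning to leave this node running in PyTorch
    return not (scale_grad_by_freq or sparse)

@dynamo_tensorrt_converter(
    torch.ops.aten.embedding.default,
    capability_validator=embedding_param_validator,
)
def embedding_converter(ctx, target, args, kwargs, name):
    ...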

      -

      Type Contract

      +

      Type Contract

The function is expected to follow the type contract established by the signature. This includes accepting the union of valid PyTorch types + numpy arrays for constant tensors and TensorRT ITensors. In the case that only a subset of types is supported in the converter, you can also add the torch_tensorrt.dynamo.conversion.converter_utils.enforce_tensor_types decorator, which allows you to specify a dictionary mapping between input positions and the types that those inputs can take. Where possible, the decorator will convert inputs to match these types, preferring the order provided. int keys in the dictionary refer to positional arguments in args; str keys refer to keyword arguments in kwargs.
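As a hedged sketch, stacking the two decorators might look like this (the exact type tuple is an assumption for illustration):

@dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
@enforce_tensor_types({0: (TRTTensor,)})  # int key 0: the first positional arg must arrive as an ITensor
def leaky_relu_converter(ctx, target, args, kwargs, name):
    ...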

      -

      Example: Convolution

      +

      Example: Convolution

The default convolution converter uses both a capability validator and type enforcement to prevent being run in unsupported situations. The capability validator is run during partitioning to determine if a particular convolution node can be converted to TensorRT or needs to run in PyTorch. Here the validator ensures that the convolution is no greater than 3D. The type enforcer will autocast inputs to the supported type before the converter is called, thereby limiting the number of cases an author must handle.

      @@ -495,7 +498,7 @@

      Example: Convol

      -

      Evaluators

      +

      Evaluators

      Some operations do not produce TensorRT subgraphs as a side-effect. These are termed evaluators.

      Example: operator.getitem
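The example body is elided in the rendered page; a hedged sketch of an evaluator (the decorator and signature are assumed to match the converter registry) is:

import operator

@dynamo_tensorrt_converter(operator.getitem)
def getitem_evaluator(ctx, target, args, kwargs, name):
    # No TensorRT layers are added; the operation is evaluated directly in
    # Python against the already-resolved arguments.
    return args[0][args[1]]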

      @@ -504,11 +507,11 @@

      Evaluators -

      Operator Decomposition

      +

      Operator Decomposition

There are some operations which can be decomposed into suboperations in PyTorch and need not have separate converter registration. Such operations can be implemented via a decomposition

      -

      Example: addmm

      +

      Example: addmm

The decompositions are registered via the register_torch_trt_decomposition decorator. We define addmm_replacement and replace addmm with the equivalent torch ops, which will have their corresponding converters called.

      @torch_tensorrt.dynamo.lowering.register_torch_trt_decomposition(torch.ops.aten.addmm)
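def addmm_replacement(input_, mat1, mat2, *, beta=1, alpha=1):
    # Hedged sketch of the replacement body (the rendered page truncates it):
    # addmm rewritten in terms of mul/matmul/add, which have converters.
    return torch.add(torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha))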
      @@ -605,6 +608,7 @@ 

      Example: addmm< + diff --git a/docs/contributors/lowering.html b/docs/contributors/lowering.html index cb213ca08a..0af47376f0 100644 --- a/docs/contributors/lowering.html +++ b/docs/contributors/lowering.html @@ -10,7 +10,7 @@ - Lowering Phase — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Lowering Phase — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -237,7 +237,7 @@
      - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
      @@ -304,6 +304,9 @@
    • Compiling a Transformer using torch.compile and TensorRT
    • Torch Compile Advanced Usage
    • Torch Compile Stable Diffusion
    • +
    • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
    • +
    • Wrapping Custom Kernels to use in TensorRT
    • +
    • Using Torch-TensorRT to Insert the Kernel

Python API Documentation

      @@ -416,7 +419,7 @@
      -

      Partitioning Phase

      +

      Partitioning Phase

The phase is optional and enabled by the user. It instructs the compiler to separate nodes into ones that should run in PyTorch and ones that should run in TensorRT. Criteria for separation include: lack of a converter, the operator being explicitly set to run in PyTorch by the user, or the node carrying a flag which tells partitioning to run it in PyTorch, set by the module fallback passes.

      @@ -430,28 +433,28 @@

Here is a brief description of the functions in each file:

    -

    PartitonInfo.h/.cpp

    +

    PartitonInfo.h/.cpp

The automatic fallback APIs that are used for partitioning.

    -

    SegmentedBlock.h/.cpp

    +

    SegmentedBlock.h/.cpp

The main data structures that are used to maintain information for each segment after segmentation.

    -

    shape_analysis.h/.cpp

    +

    shape_analysis.h/.cpp

Code implementation to get the shapes for each segment by running it in JIT.

    -

    partitioning.h/.cpp

    +

    partitioning.h/.cpp

    @@ -459,7 +462,7 @@

    partitioning.h/.cpp -

    Automatic Fallback

    +

    Automatic Fallback

To enable the automatic fallback feature, you can set the following attributes in Python:

    import torch
     import torch_tensorrt as torchtrt
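# Hedged continuation of the truncated snippet; the model path, input shape
# and op list are placeholders, while the keyword names follow the public
# torch_tensorrt.compile API for the TorchScript frontend.
model = torch.jit.load("model.ts").eval().cuda()
trt_model = torchtrt.compile(
    model,
    inputs=[torchtrt.Input((1, 3, 224, 224))],
    torch_executed_ops=["aten::max_pool2d"],  # ops forced to fall back to PyTorch
    min_block_size=3,  # minimum contiguous convertible ops per TensorRT segment
)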
    @@ -497,7 +500,7 @@ 

    Automatic Fallback -

    Dependency Aware Partitioning

    +

    Dependency Aware Partitioning

    During segmentation, Torch-TensorRT uses a dependency graph of the input TorchScript nodes to reduce the number of segments created. Consider this example from test Partitioning.SegmentModelWithDependencyAwareness in tests/core/partitioning/test_segmentation.cpp

    graph(%x : Tensor, %y : Tensor):
         %3 : int = prim::Constant[value=0]()
    @@ -705,6 +708,7 @@ 

    Dependency Aware Partitioning + diff --git a/docs/contributors/phases.html b/docs/contributors/phases.html index e1c6bfe4df..8671a32d69 100644 --- a/docs/contributors/phases.html +++ b/docs/contributors/phases.html @@ -10,7 +10,7 @@ - Compiler Phases — Torch-TensorRT v2.3.0.dev0+85971ff documentation + Compiler Phases — Torch-TensorRT v2.4.0.dev0+4dc9acfc9 documentation @@ -235,7 +235,7 @@
    - v2.3.0.dev0+85971ff + v2.4.0.dev0+4dc9acfc9
    @@ -302,6 +302,9 @@
  • Compiling a Transformer using torch.compile and TensorRT
  • Torch Compile Advanced Usage
  • Torch Compile Stable Diffusion
  • +
  • Using Custom Kernels within TensorRT Engines with Torch-TensorRT
  • +
  • Wrapping Custom Kernels to use in TensorRT
  • +
  • Using Torch-TensorRT to Insert the Kernel
• Python API Documentation

Python API Documentation

      @@ -414,7 +417,7 @@
      -

      Compiling Exported Programs with Torch-TensorRT

      +

      Compiling Exported Programs with Torch-TensorRT

PyTorch 2.1 introduced the torch.export APIs, which can export graphs from PyTorch programs into ExportedProgram objects. The Torch-TensorRT dynamo frontend compiles these ExportedProgram objects and optimizes them using TensorRT. Here’s a simple @@ -434,7 +437,7 @@

torch_tensorrt.dynamo.compile is the main API for users to interact with the Torch-TensorRT dynamo frontend. The input model should be an ExportedProgram (ideally the output of torch.export.export or torch_tensorrt.dynamo.trace, discussed in the section below) and the output is a torch.fx.GraphModule object.
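A minimal end-to-end sketch of that flow (the model and input shape are placeholders):

import torch
import torch_tensorrt

model = MyModel().eval().cuda()  # MyModel is a placeholder module
inputs = [torch.randn(1, 3, 224, 224, device="cuda")]

exp_program = torch.export.export(model, tuple(inputs))  # ExportedProgram
trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs=inputs)
out = trt_gm(*inputs)  # trt_gm is a torch.fx.GraphModule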

    -

Customizable Settings

    +

Customizable Settings

There are a lot of options for users to customize their settings for optimizing with TensorRT. Some of the frequently used options are as follows:
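The option list itself is elided in the rendered page; as a hedged illustration, frequently used settings are passed as keyword arguments to compile, reusing exp_program and inputs from the example above (values shown are illustrative):

trt_gm = torch_tensorrt.dynamo.compile(
    exp_program,
    inputs=inputs,
    enabled_precisions={torch.float16},  # precisions TensorRT may select kernels from
    truncate_long_and_double=True,       # demote int64/float64 inputs to int32/float32
    debug=True,                          # verbose build and conversion logging
)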

      @@ -451,7 +454,7 @@

Customizable Settings

    -

    Under the hood

    +

    Under the hood

    Under the hood, torch_tensorrt.dynamo.compile performs the following on the graph.

Python API Documentation

      @@ -414,7 +417,7 @@
      -

      Torch-TensorRT (FX Frontend) User Guide

      +

      Torch-TensorRT (FX Frontend) User Guide

Torch-TensorRT (FX Frontend) is a tool that can convert a PyTorch model through torch.fx to a TensorRT engine optimized for running on NVIDIA GPUs. TensorRT is the inference engine developed by NVIDIA, which is composed of various kinds of optimizations including kernel fusion, @@ -427,7 +430,7 @@

    -

    Converting a PyTorch Model to TensorRT Engine

    +

    Converting a PyTorch Model to TensorRT Engine

In general, users are welcome to use compile() to finish the conversion from a model to a TensorRT engine. It is a wrapper API that consists of the major steps needed to finish this conversion. Please refer to an example usage in the lower_example.py file under examples/fx.

    def compile(
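# The full parameter list of compile() is truncated in the rendered page.
# A hedged usage sketch of the wrapper (the lower_precision argument and the
# input shape are assumptions taken from the FX frontend examples):
from torch_tensorrt.fx import compile as fx_compile
from torch_tensorrt.fx.utils import LowerPrecision

sample_input = torch.randn(1, 3, 224, 224, device="cuda")
trt_model = fx_compile(model.cuda().eval(), [sample_input], lower_precision=LowerPrecision.FP16)
out = trt_model(sample_input)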
    @@ -584,7 +587,7 @@ 

    Converting a PyTorch Model to TensorRT Engine -

    Acc Tracer

    +

    Acc Tracer

Acc tracer is a custom FX symbolic tracer. It does a couple more things compared to the vanilla FX symbolic tracer. We mainly depend on it to convert PyTorch ops or builtin ops to acc ops. There are two main purposes for fx2trt to use acc ops:

1. There are many ops that do similar things in PyTorch ops and builtin ops, such as torch.add, builtin.add and torch.Tensor.add. Using the acc tracer, we normalize these three ops to a single acc_ops.add. This helps reduce the number of converters we need to write.

    2. @@ -592,7 +595,7 @@

      Acc Tracer -

      FX2TRT

      +

      FX2TRT

After symbolic tracing, we have the graph representation of a PyTorch model. fx2trt leverages the power of fx.Interpreter. fx.Interpreter goes through the whole graph node by node and calls the function that node represents. fx2trt overrides the original behavior of calling the function by invoking the corresponding converter for each node. Each converter function adds corresponding TensorRT layer(s).

      Below is an example of a converter function. The decorator is used to register this converter function with the corresponding node. In this example, we register this converter to a fx node whose target is acc_ops.sigmoid.

      @tensorrt_converter(acc_ops.sigmoid)
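def sigmoid_converter(network, target, args, kwargs, name):
    # Hedged sketch of the converter body (the rendered page truncates it):
    # look up the input ITensor and add the matching TensorRT activation
    # layer. Imports of tensorrt as trt and the acc_ops registry are assumed
    # from the surrounding guide.
    input_val = kwargs["input"]
    layer = network.add_activation(input=input_val, type=trt.ActivationType.SIGMOID)
    layer.name = name
    return layer.get_output(0)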
      @@ -614,7 +617,7 @@ 

      FX2TRT -

      How to Add a Missing Op

      +

      How to Add a Missing Op

You can actually add it wherever you want; just remember to import the file so that all acc ops and mappers are registered before tracing with acc_tracer.

Python API Documentation

      -

      A

      - - -
      -

      C

      D

      - +
      -
    3. dump_build_info() (in module torch_tensorrt) -
    4. E