TensorRT 10.6-GA OSS Release #4238

Merged

@kevinch-nv kevinch-nv commented Nov 5, 2024

10.6.0 GA - 2024-11-5

Key Features and Updates:

  • Demo Changes

    • demoBERT: The use of fcPlugin in demoBERT has been removed.
    • demoBERT: All TensorRT plugins used in demoBERT (CustomEmbLayerNormDynamic, CustomSkipLayerNormDynamic, and CustomQKVToContextDynamic) now have versions that inherit from IPluginV3 interface classes. Users can opt in to these V3 plugins by passing --use-v3-plugins to the builder scripts.
      • Opting-in to use V3 plugins does not affect performance, I/O, or plugin attributes.
      • There is a known issue in the V3 version (version 4) of the CustomQKVToContextDynamic plugin in TensorRT 10.6.0: an internal assertion error occurs if either the batch or sequence dimension differs at runtime from the one used to serialize the engine. See the “Known Issues” section of the TensorRT 10.6.0 release notes.
      • For smoother migration, the builder scripts still default to the deprecated IPluginV2DynamicExt-derived plugins when --use-v3-plugins isn't specified. The flag --use-deprecated-plugins was added as an explicit way to enforce the default behavior, and it is mutually exclusive with --use-v3-plugins.
    • demoDiffusion
      • Introduced BF16 and FP8 support for the Flux.1-dev pipeline.
      • Expanded FP8 support on Ada platforms.
      • Enabled LoRA adapter compatibility for SDv1.5, SDv2.1, and SDXL pipelines using Diffusers version 0.30.3.
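The demoBERT flag behavior described above (deprecated plugins by default, V3 as an opt-in, and the two flags mutually exclusive) can be sketched with the standard `argparse` module. This is a minimal illustration, not the actual builder script: `build_parser` and `plugin_interface` are hypothetical names, and only the two flag names come from the notes above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the two plugin-selection flags."""
    parser = argparse.ArgumentParser(description="Sketch of demoBERT plugin selection")
    # argparse rejects any command line that passes both flags of one group.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--use-v3-plugins",
        action="store_true",
        help="Opt in to the IPluginV3-based plugin versions.",
    )
    group.add_argument(
        "--use-deprecated-plugins",
        action="store_true",
        help="Explicitly keep the IPluginV2DynamicExt-derived plugins (the default).",
    )
    return parser


def plugin_interface(argv):
    """Return the plugin family that a given command line selects."""
    args = build_parser().parse_args(argv)
    # With neither flag set, fall back to the deprecated plugins (the default).
    return "IPluginV3" if args.use_v3_plugins else "IPluginV2DynamicExt"
```

With this sketch, an empty command line and `--use-deprecated-plugins` select the same plugin family, while passing both flags makes `argparse` exit with a usage error.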
  • Sample Changes

    • Added the Python sample quickly_deployable_plugins, which demonstrates quickly deployable Python-based plugin definitions (QDPs) in TensorRT. QDPs are a simple and intuitive decorator-based approach to defining TensorRT plugins, requiring drastically less code.
  • Plugin Changes

    • The fcPlugin has been deprecated. Its functionality has been superseded by the IMatrixMultiplyLayer that is natively provided by TensorRT.
    • Migrated the IPluginV2-descendant version 1 of CustomEmbLayerNormDynamic to version 6, which implements IPluginV3.
      • The newer versions preserve the attributes and I/O of the corresponding older plugin version.
      • The older plugin versions are deprecated and will be removed in a future release.
  • Parser Changes

    • Updated ONNX submodule version to 1.17.0.
    • Fixed an issue where conditional layers were incorrectly added.
    • Updated local function metadata to contain more information.
    • Added support for parsing nodes with Quickly Deployable Plugins.
    • Fixed handling of optional outputs.
  • Tool Updates

    • ONNX-GraphSurgeon updated to 0.5.3.
    • Polygraphy updated to 0.49.14.
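To compare a local Python environment against the versions listed above, something like the following could be used. It is a rough sketch under two assumptions: that the PyPI distribution names are `onnx`, `onnx-graphsurgeon`, and `polygraphy`, and that the installed `onnx` package is expected to match the parser's ONNX submodule version, which the notes do not state. The helper names are hypothetical.

```python
from importlib import metadata

# Versions taken from the release notes above; keys are assumed PyPI names.
EXPECTED_VERSIONS = {
    "onnx": "1.17.0",
    "onnx-graphsurgeon": "0.5.3",
    "polygraphy": "0.49.14",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without the packages being installed.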

@venkywonka venkywonka left a comment

Just added the hyperlinks

@kevinch-nv kevinch-nv force-pushed the dev-kevinch-10.6-staging branch from 0442495 to b43d99e Compare November 5, 2024 21:51
@kevinch-nv kevinch-nv merged commit c468d67 into NVIDIA:release/10.6 Nov 5, 2024
1 check passed
Signed-off-by: Kevin Chen <kevinch@nvidia.com>
6 participants