Temporarily disable cuda graph based RNN-T greedy inference for r2.0.0rc1 #9904
Temporarily disable CUDA graph based RNN-T greedy inference for r2.0.0rc1.
For very rare input shapes, PyTorch may use a cooperative kernel for
LSTM operations. Cooperative kernels do not work within a CUDA graph
conditional node until CUDA 12.6.
Unfortunately, CUDA 12.6 is not part of the 24.07 PyTorch container
release, which this release of NeMo is intended for.
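
For illustration only, the version constraint described above could be expressed as a runtime gate instead of a blanket disable. This is a minimal sketch, not the change made in this PR: the function names and the `requested` parameter are hypothetical, and the 12.6 threshold is taken from the description above. The CUDA version is passed in as a string (for example, the value of `torch.version.cuda`), so the sketch does not depend on PyTorch being installed.

```python
from typing import Optional, Tuple

# Threshold taken from the PR description: cooperative kernels inside a
# CUDA graph conditional node are stated to work starting with CUDA 12.6.
MIN_CUDA_VERSION: Tuple[int, int] = (12, 6)


def parse_cuda_version(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Parse a 'major.minor' string such as '12.5' into (12, 5).

    Returns None for a missing or unparsable version (e.g. a CPU-only build).
    """
    if not version:
        return None
    parts = version.split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        return None
    return (major, minor)


def should_use_cuda_graph_decoder(cuda_version: Optional[str], requested: bool = True) -> bool:
    """Hypothetical gate: allow CUDA graph decoding only when the user asked
    for it and the CUDA version is at least MIN_CUDA_VERSION."""
    if not requested:
        return False
    parsed = parse_cuda_version(cuda_version)
    if parsed is None:
        return False
    return parsed >= MIN_CUDA_VERSION
```

A caller might pass `torch.version.cuda` as `cuda_version`. Note that this reports the CUDA version PyTorch was built against, which may differ from the installed driver, so a real check would likely need more care than this sketch shows.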