Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RNN-T and TDT inference: use CUDA graphs by default #8972
RNN-T and TDT inference: use CUDA graphs by default #8972
Changes from 35 commits
7441a2f
4761e68
2b13f31
d391e98
0205101
0cac599
cb68701
b38ff6f
25235bb
f175c8a
1672739
9e18150
94ba6af
7b9d619
2d3b083
7e805bc
43c01ac
030be86
100fd9c
6b5d1d2
07bd665
638823e
464dd51
2b7cd73
e730f91
cc06bf0
cf38241
19ca09d
d98b8fc
05eb103
7c6f7f0
cb6d500
35564df
7192acc
d4a27f6
c0877f2
b880510
82f83dc
4e47010
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two methods should be class methods of WithOptionalCYDAGraphs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, generally, this is not a good idea to introduce a 2-way dependency
WithOptionalCudaGraphs
<->ASRModel
(actually,EncDecRNNTModel
, since decoding is only in this model).I made the method more abstract to separate the logic, separating the path in the model and the lookup logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprsied that you would silently change the behavior on lines 630 to 639 rather than throw an exception in these cases, to be honest.
Meanwhile, the situation where we set the symbols_per_step to 10 if it is None seems okay because it is unlikely to change the results, since 10 is such a large number.
I'm not going to hold up merging this because of this concern, anyway, since it is a code path most people won't see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this fallback behavior to prevent crashes when the user wants to change some parameters since
use_cuda_graph_decoder
isTrue
by default now. Since it's only about speed (not quality), it is acceptable to switch silently between implementations instead of requiring the user to understand all the nuances of the available parameter combinations.LoopLabelsComputer
(s) are designed to handle all situations without explicit errors (e.g., when cuda is unavailable, etc.).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not certain, but it currently looks like we will throw an exception if max_symbols_per_step is None, rather than overriding it to 10 for the frame-loop decoder right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, thanks for catching this. I will address this in a follow-up PR
Check warning
Code scanning / CodeQL
Overwriting attribute in super-class or sub-class Warning
Check warning
Code scanning / CodeQL
Overwriting attribute in super-class or sub-class Warning