Add MoE support for T5 model (w/o expert parallel) #5409
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Oh! And can we please add a CI test for this, preferably with something where
@MaximumEntropy did a quick look through. I think adapters can work out of the box with the MoE implementation. IIUC, MoE swaps out the ParallelMLP with SwitchMLP, so adapters aren't touched. IA3, however, will not work properly.
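For anyone following along, here is a rough sketch of what "swapping the dense MLP for a switch MLP" means in terms of top-1 routing. It is plain Python and only illustrative: `softmax`, `make_expert`, and `switch_mlp` are hypothetical names invented for this example, the experts are toy scaling functions rather than real MLPs, and none of this is taken from the NeMo code in this PR.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for one expert MLP: scales every feature."""
    def expert(token):
        return [scale * x for x in token]
    return expert


def switch_mlp(token, router_weights, experts):
    """Top-1 (switch) routing for a single token.

    router_weights holds one weight row per expert. The token goes to the
    expert with the highest router probability, and that expert's output
    is scaled by the probability (the gate value).
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    expert_out = experts[idx](token)
    return [probs[idx] * y for y in expert_out], idx


# Usage: two toy experts; the router prefers expert 1 for this token.
experts = [make_expert(1.0), make_expert(3.0)]
router_weights = [[0.0, 0.0], [2.0, 0.0]]
output, chosen = switch_mlp([1.0, 0.0], router_weights, experts)
```

The routing only replaces the feed-forward computation, so a module that wraps the block from outside would not see the difference. A method that rescales activations inside the MLP would have to be applied per expert, which may be why IA3 is called out above.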
LGTM!
* clean Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* kwarg ref Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* extra args Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* test Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* rm prints Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* style Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
* fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com>

Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* first commit on eval_diar_with_asr.py Signed-off-by: Taejin Park <tango4j@gmail.com> * Add a standalone diarization-ASR evaluation transcript Signed-off-by: Taejin Park <tango4j@gmail.com> * Fixed examples in docstrings Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed staticmethod error Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added description on eval modes Signed-off-by: Taejin Park <tango4j@gmail.com> * adding diar_infer_general.yaml Signed-off-by: Taejin Park <tango4j@gmail.com> * fix msdd_model in general yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * fixed errors in yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * combine into 1 commit Signed-off-by: Taejin Park <tango4j@gmail.com> * Added description on eval modes Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add MoE support for T5 model (w/o expert parallel) (#5409) * clean Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * kwarg ref Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * extra args Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * rm prints Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * style Signed-off-by: Abhinav 
Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix args (#5410) (#5416) Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Fix for concat map dataset (#5133) * change for concat map dataset * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Exhaust longest dataset * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: 1-800-BAD-CODE <> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Add temporary fix for CUDA issue in Dockerfile (#5421) (#5422) Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: Yu Yao <yuya@nvidia.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> * Fix GPT generation when using sentencepiece tokenizer (#5413) (#5428) * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Yi Dong <yidong@nvidia.com> 
Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (#5339) * Initial refactor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Resolve config before passing to load_from_checkpoint Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fixes for model parallel and nemo restore Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fixes for eval Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert config changes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Refactor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix typo Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove comments Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Minor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix validation reconfiguration Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove old comment Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixes for test_ds Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, 
see https://pre-commit.ci Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Revert "Add temporary fix for CUDA issue in Dockerfile (#5421)" (#5431) (#5432) This reverts commit 0718b17. Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> * [ITN] fix year date graph, cardinals extension for hundreds (#5435) * wip Signed-off-by: ekmb <ebakhturina@nvidia.com> * add lociko's hundreds extension for cardinals Signed-off-by: ekmb <ebakhturina@nvidia.com> * add optional end Signed-off-by: ekmb <ebakhturina@nvidia.com> * restart ci Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> * update doc in terms of get_label for lang id model (#5366) * reflect PR 5278 ion doc Signed-off-by: fayejf <fayejf07@gmail.com> * reflect comment Signed-off-by: fayejf <fayejf07@gmail.com> Signed-off-by: fayejf <fayejf07@gmail.com> * Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (#5420) (#5433) * Revert workers workaround Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix in config Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Fixed bug in notebook (#5382) (#5394) Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> * Fixing bug in Megatron BERT when loss mask is all zeros 
(#5424) * Fixing bug when loss mask is fully zero Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_model.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Use updated API for overlapping grad sync with pipeline parallelism (#5236) Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> * support to disable sequence length + 1 input tokens for each sample in MegatronGPT (#5363) * support to disable sequence length + 1 input tokens for MegatronGPT * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * [TTS] Create script for processing TTS training audio (#5262) * Create script for processing TTS training audio * Update VAD trimming logic * Remove unused import Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] remove useless logic for set_tokenizer. 
(#5430) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Fix setting up of `ReduceLROnPlateau` learning rate scheduler (#5444) * Fix tests Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Add accidentally lost changes Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Create codeql.yml (#5445) Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> * Fix for getting tokenizer in character-based ASR models when using tarred dataset (#5442) Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> * Combine 5 commits adding diar_infer_general.yaml Signed-off-by: Taejin Park <tango4j@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> fix msdd_model in general yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> fixed errors in yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * moved eval_der function and fixed tqdm options Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Changed minor error in docstrings Signed-off-by: Taejin Park <tango4j@gmail.com> * removed score_labels and changed leave=True Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: fayejf <fayejf07@gmail.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Ryan <rlangman@nvidia.com> Signed-off-by: Xuesong Yang 
<1646669+XuesongYang@users.noreply.github.com> Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com> Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: Ryan Langman <rlangman@nvidia.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com>
<sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA#5421)" (NVIDIA#5431) (NVIDIA#5432) This reverts commit 0718b17. Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> * [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA#5435) * wip Signed-off-by: ekmb <ebakhturina@nvidia.com> * add lociko's hundreds extension for cardinals Signed-off-by: ekmb <ebakhturina@nvidia.com> * add optional end Signed-off-by: ekmb <ebakhturina@nvidia.com> * restart ci Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> * update doc in terms of get_label for lang id model (NVIDIA#5366) * reflect PR 5278 ion doc Signed-off-by: fayejf <fayejf07@gmail.com> * reflect comment Signed-off-by: fayejf <fayejf07@gmail.com> Signed-off-by: fayejf <fayejf07@gmail.com> * Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA#5420) (NVIDIA#5433) * Revert workers workaround Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix in config Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Fixed bug in notebook (NVIDIA#5382) (NVIDIA#5394) Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: 
Virginia Adams <vadams@nvidia.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> * Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA#5424) * Fixing bug when loss mask is fully zero Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_model.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA#5236) Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> * support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA#5363) * support to disable sequence length + 1 input tokens for MegatronGPT * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * [TTS] Create script for processing TTS training audio (NVIDIA#5262) * Create script for processing TTS training audio * Update 
VAD trimming logic * Remove unused import Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] remove useless logic for set_tokenizer. (NVIDIA#5430) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA#5444) * Fix tests Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Add accidentally lost changes Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Create codeql.yml (NVIDIA#5445) Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> * Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA#5442) Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> * Combine 5 commits adding diar_infer_general.yaml Signed-off-by: Taejin Park <tango4j@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> fix msdd_model in general yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> fixed errors in yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * moved eval_der function and fixed tqdm options Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Changed minor error in docstrings Signed-off-by: Taejin Park <tango4j@gmail.com> * removed score_labels and changed leave=True Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: fayejf <fayejf07@gmail.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Shanmugam Ramasamy 
<111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Ryan <rlangman@nvidia.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com> Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: Ryan Langman <rlangman@nvidia.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com> Signed-off-by: Hainan Xu <hainanx@nvidia.com>
* clean Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * kwarg ref Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * extra args Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * rm prints Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * style Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
What does this PR do?
Adds Mixture of Experts (MoE) support for the T5 model, without expert parallelism.
Collection: NLP
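The description does not show the layer itself, so here is a rough, framework-free sketch of the general idea behind a switch-style MoE feed-forward block: a router scores each token, and only the single highest-scoring expert MLP processes it. Everything in it (`ToySwitchMLP`, `route`, the sizes, the initialization) is hypothetical and chosen for illustration; it is not the NeMo implementation and it ignores parallelism, load balancing, and capacity limits.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vector]


class ToySwitchMLP:
    """Hypothetical top-1 (switch-style) MoE feed-forward layer; not NeMo code."""

    def __init__(self, hidden, ffn_hidden, num_experts, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Router: produces one logit per expert for each token.
        self.router = init(num_experts, hidden)
        # Each expert is an independent two-layer MLP: hidden -> ffn_hidden -> hidden.
        self.experts = [
            (init(ffn_hidden, hidden), init(hidden, ffn_hidden))
            for _ in range(num_experts)
        ]

    def route(self, token):
        """Return (expert index, gate probability) for one token vector."""
        probs = softmax(matvec(self.router, token))
        best = max(range(len(probs)), key=probs.__getitem__)
        return best, probs[best]

    def forward(self, tokens):
        """Run each token through its one chosen expert, scaled by the gate probability."""
        outputs = []
        for token in tokens:
            idx, gate = self.route(token)
            w_in, w_out = self.experts[idx]
            expert_out = matvec(w_out, relu(matvec(w_in, token)))
            outputs.append([gate * y for y in expert_out])
        return outputs


layer = ToySwitchMLP(hidden=4, ffn_hidden=8, num_experts=2)
tokens = [[0.5, -1.0, 0.25, 2.0], [1.0, 1.0, -1.0, 0.0]]
outputs = layer.forward(tokens)
print(len(outputs), len(outputs[0]))
```

Because each token visits only one expert, the per-token compute stays close to that of a single dense MLP while the total parameter count grows with the number of experts.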
Changelog
Usage
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs in various areas.
Additional Information