merge r1.11 to main #4920

Merged: 34 commits merged into main from xueyang-merge-r1.11-to-main on Sep 13, 2022

XuesongYang (Collaborator)

What does this PR do?

This PR merges the r1.11.0 branch into the main branch. Updates across all domains caused merge conflicts, and I resolved all of them. For some conflicts I kept the changes from main; for the others I decided after reading the detailed changes in both r1.11.0 and main.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

  • [] Make sure you read and followed Contributor guidelines
  • [] Did you write any new necessary tests?
  • [] Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Owners of PRs in every domain, please help to review the changes. Thanks.

Additional Information

  • Related to # (issue)

lgtm-com bot commented Sep 12, 2022

This pull request introduces 4 alerts when merging 11238b0 into 8d609b3 - view on LGTM.com

new alerts:

  • 2 for Syntax error
  • 1 for Unused import
  • 1 for Variable defined multiple times

redoctopus (Collaborator) left a comment

I believe a few files need to be kept as they are on main because #4690 changed the G2P module locations. @AlexGrinch, could you take a look and check which files need to be updated?

A few that I've caught that should be kept as the version from main:

  • nemo/collections/tts/torch/g2ps.py
  • nemo/collections/tts/torch/tts_dataset.yaml
  • nemo/collections/tts/models/fastpitch.py
  • nemo/collections/tts/torch/tts_tokenizers.py
  • tutorials/tts/FastPitch_MixerTTS_Training.ipynb
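
A minimal sketch of how those paths could be restored from main while the merge branch is checked out, using `git checkout <ref> -- <paths>`. The helper names are hypothetical and not part of NeMo, and the sketch assumes a local branch named `main`; by default it only builds the command and does not run it.

```python
import subprocess
from typing import List, Sequence

# Paths the review comment above asks to keep as they are on main.
PATHS_FROM_MAIN = [
    "nemo/collections/tts/torch/g2ps.py",
    "nemo/collections/tts/torch/tts_dataset.yaml",
    "nemo/collections/tts/models/fastpitch.py",
    "nemo/collections/tts/torch/tts_tokenizers.py",
    "tutorials/tts/FastPitch_MixerTTS_Training.ipynb",
]


def build_checkout_command(paths: Sequence[str], ref: str = "main") -> List[str]:
    """Return the git command that overwrites `paths` with their version at `ref`."""
    if not paths:
        raise ValueError("no paths given")
    # `--` separates the ref from the path list so a path is never parsed as a ref.
    return ["git", "checkout", ref, "--", *paths]


def restore_from_ref(paths: Sequence[str], ref: str = "main", dry_run: bool = True) -> List[str]:
    """Build the checkout command and, unless dry_run is set, execute it."""
    cmd = build_checkout_command(paths, ref)
    if not dry_run:
        # check=True raises CalledProcessError if git exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `restore_from_ref(PATHS_FROM_MAIN)` returns the command for inspection; passing `dry_run=False` inside a clone would run it and stage the versions from main.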

ericharper and others added 25 commits September 12, 2022 12:39
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* [TTS] added a German IPA phoneme tokenizer
* [TTS][ASR] enabled customized arguments for trimming the leading and trailing silence.
* [TTS] disabled spline interpolation for beta-binomial distribution. Let it generate align prior and save to disks. Use a new phoneme tokenizer.
* [TTS] use consistent spline interpolation with fastpitch checkpoint when generating mel-spectrograms for hifigan finetune.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
* Update configs to new heteronyms list
* Remove old heteronyms list, add alt 'merchandise' pron to CMUdict
* Update remaining references to old heteronyms list

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>
…treaming. (#4687)

* added support for models trained with full context.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* dropped seq_range

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed indexing in caching methods.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* updated docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* addressed comments.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>
* [ASR] Add pretrained ASR models for Croatian

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Fix style for import

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
* added/fixed export for Megatron models

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed style

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed FusedScaleMaskSoftmax in BioMegatron

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* included comments

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>

Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
* RPE, hidden size and config fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update to reflect new config names

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Sentencepiece fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix finetuning

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add encoder seq len to gpt

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add finetune eval script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix name

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Backward compat

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Split rank for Enc-Dec models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
* g2p docs added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix references

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* address review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>
* Fix providing glue in seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
* Updated inference code and squad scripts

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Reverted GPT & T5 inference files back to use NLPDDPlugin

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Overwrite frozen LM to use fused adam

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added padded vocab size

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed val check interval value

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python format fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Make t5 prompt learning preds write to file

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back dp=1 check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
…c-dec models (#4790)

* Set workers to 0 for validation and test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert pin memory

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Fix nmt and ckpt_to_nemo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: ericharper <complex451@gmail.com>
* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
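
The grad-clipping note above ("Grad norm was computed over all params, not respecting model parallelism") can be illustrated in plain Python. This is not the code from #4487. It is a sketch under the assumption that each model-parallel rank holds a disjoint shard of some gradients plus a full copy of the replicated ones, so a parallelism-aware global norm counts each replicated gradient only once. All names are hypothetical.

```python
import math
from typing import Dict, List

# One entry per model-parallel rank: "sharded" gradients are unique to the rank,
# "replicated" gradients are identical copies held by every rank (an assumption).
Rank = Dict[str, List[float]]


def sq_sum(values: List[float]) -> float:
    """Sum of squares of a flat list of gradient values."""
    return sum(v * v for v in values)


def naive_global_norm(ranks: List[Rank]) -> float:
    """Sums every gradient on every rank, so replicated ones are counted once per rank."""
    total = sum(sq_sum(r["sharded"]) + sq_sum(r["replicated"]) for r in ranks)
    return math.sqrt(total)


def parallel_aware_global_norm(ranks: List[Rank]) -> float:
    """Sums sharded gradients across ranks and replicated gradients from rank 0 only."""
    total = sum(sq_sum(r["sharded"]) for r in ranks)
    total += sq_sum(ranks[0]["replicated"])
    return math.sqrt(total)


def clip_coefficient(norm: float, max_norm: float, eps: float = 1e-6) -> float:
    """Scale factor applied to all gradients so their global norm is at most max_norm."""
    return min(1.0, max_norm / (norm + eps))


ranks = [
    {"sharded": [3.0], "replicated": [4.0]},
    {"sharded": [0.0], "replicated": [4.0]},
]
naive = naive_global_norm(ranks)           # sqrt(9 + 16 + 0 + 16)
aware = parallel_aware_global_norm(ranks)  # sqrt(9 + 0 + 16)
```

With these toy values the naive norm is larger than the parallelism-aware one, so the naive version would clip the gradients more aggressively than intended.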
Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
ericharper mentioned this pull request Sep 12, 2022
8 tasks
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
…huish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
XuesongYang force-pushed the xueyang-merge-r1.11-to-main branch from 47b25a4 to db06e61 on September 12, 2022 21:51
ericharper (Collaborator) previously approved these changes Sep 12, 2022

LGTM. Thanks!

XuesongYang changed the title from "Xueyang merge r1.11 to main" to "merge r1.11 to main" on Sep 12, 2022
XuesongYang dismissed redoctopus’s stale review on September 12, 2022 22:58

reverted to main path.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
XuesongYang force-pushed the xueyang-merge-r1.11-to-main branch from bd3e2e6 to b679599 on September 12, 2022 22:59
vadam5 (Contributor) left a comment

LGTM

XuesongYang (Collaborator, Author) commented Sep 12, 2022

"Blocking this as I'm seeing a bunch of CI tests being skipped while the CI is showing as passed."

Thanks for pointing it out. I rebased onto the latest main. The skipped tests are probably related to the most recent PR, #4907, which removed the computer vision code and tests.

ericharper merged commit cf83c5f into main on Sep 13, 2022
ericharper deleted the xueyang-merge-r1.11-to-main branch on September 13, 2022 00:24
jubick1337 pushed a commit to jubick1337/NeMo that referenced this pull request Sep 28, 2022
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* [TTS] bugfix for missing configs. (NVIDIA#4725)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix pynini install in TTS tutorials (NVIDIA#4729)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* [TTS] updated config with a German IPA phoneme tokenizer (NVIDIA#4756)

* [TTS] added a German IPA phoneme tokenizer
* [TTS][ASR] enabled customized arguments for trimming the leading and trailing silence.
* [TTS] disabled spline interpolation for beta-binomial distribution. Let it generate align prior and save to disks. Use a new phoneme tokenizer.
* [TTS] use consistent spline interpolation with fastpitch checkpoint when generating mel-spectrograms for hifigan finetune.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Update r1.11 to new heteronyms list (NVIDIA#4745)

* Update configs to new heteronyms list
* Remove old heteronyms list, add alt 'merchandise' pron to CMUdict
* Update remaining references to old heteronyms list

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix tutorial formatting (NVIDIA#4778)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* update branch and typos (NVIDIA#4788)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Adding support for models trained with full context for cache-aware streaming. (NVIDIA#4687)

* added support for models trained with full context.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* dropped seq_range

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed indexing in caching methods.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* updated docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* addressed comments.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Update megatron encoder decoder model to support py37 for colab (NVIDIA#4791)

* [ASR] Add pretrained ASR models for Croatian (NVIDIA#4682)

* [ASR] Add pretrained ASR models for Croatian

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Fix style for import

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* added/fixed export for Megatron models (NVIDIA#4712)

* added/fixed export for Megatron models

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed style

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed FusedScaleMaskSoftmax in BioMegatron

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* included comments

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch for qa notebook

Signed-off-by: ericharper <complex451@gmail.com>

* Fix initializing weights from ptl ckpt with exclude (NVIDIA#4807)

Signed-off-by: sam1373 <samuelkriman@gmail.com>

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* Fix index error from addition of voiced_mask and p_voiced (NVIDIA#4811)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* T5 prompt learning fixes (NVIDIA#4771)

* RPE, hidden size and config fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update to reflect new config names

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Sentencepiece fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix finetuning

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add encoder seq len to gpt

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add finetune eval script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix name

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Backward compat

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Split rank for Enc-Dec models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* G2P docs (NVIDIA#4841)

* g2p docs added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix references

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* address review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix providing glue in seq2seq eval (NVIDIA#4843)

* Fix providing glue in seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Updated inference code and squad scripts (NVIDIA#4835)

* Updated inference code and squad scripts

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Reverted GPT & T5 inference files back to use NLPDDPlugin

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Overwrite frozen LM to use fused adam

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added padded vocab size

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed val check interval value

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python format fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Make t5 prompt learning preds write to file

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back dp=1 check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Set the number of workers to 0 for validation and test sets in all enc-dec models (NVIDIA#4790)

* Set workers to 0 for validation and test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert pin memory

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Fix Megatron NMT consumed samples and ckpt_to_nemo split rank (NVIDIA#4884)

* Fix nmt and ckpt_to_nemo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* added utf8 encoding (NVIDIA#4892)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* update readme with apex commit

Signed-off-by: ericharper <complex451@gmail.com>

* Add support for Apex distributed Adam optimizer with GPT-3 (NVIDIA#4487)

* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* fixed styles

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed unused import.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed duplicated func definition.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace 'r1.11.0' with 'main' in Jenkinsfile and all tutorials.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: PRE_RELEASE = 'rc0'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace branch name to main for asr_with_adapters.ipynb.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix Fastpitch mixertts tutorial format to align with main to distinguish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: correct path for tokenizers.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
jubick1337 pushed a commit to jubick1337/NeMo that referenced this pull request Oct 3, 2022

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* update readme with apex commit

Signed-off-by: ericharper <complex451@gmail.com>

* Add support for Apex distributed Adam optimizer with GPT-3 (NVIDIA#4487)

* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* fixed styles

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed unused import.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed duplicated func definition.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace 'r1.11.0' with 'main' in Jenkinsfile and all tutorials.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: PRE_RELEASE = 'rc0'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace branch name to main for asr_with_adapters.ipynb.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix Fastpitch mixertts tutorial format to align with main to distinguish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: correct path for tokenizers.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
jubick1337 pushed a commit to jubick1337/NeMo that referenced this pull request Oct 4, 2022
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* [TTS] bugfix for missing configs. (NVIDIA#4725)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix pynini install in TTS tutorials (NVIDIA#4729)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* [TTS] updated config with a German IPA phoneme tokenizer (NVIDIA#4756)

* [TTS] added a German IPA phoneme tokenizer
* [TTS][ASR] enabled customized arguments for trimming the leading and trailing silence.
* [TTS] disabled spline interpolation for beta-binomial distribution. Let it generate align prior and save to disk. Use a new phoneme tokenizer.
* [TTS] use consistent spline interpolation with fastpitch checkpoint when generating mel-spectrograms for hifigan finetune.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Update r1.11 to new heteronyms list (NVIDIA#4745)

* Update configs to new heteronyms list
* Remove old heteronyms list, add alt 'merchandise' pron to CMUdict
* Update remaining references to old heteronyms list

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix tutorial formatting (NVIDIA#4778)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* update branch and typos (NVIDIA#4788)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Adding support for models trained with full context for cache-aware streaming. (NVIDIA#4687)

* added support for models trained with full context.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* dropped seq_range

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed indexing in caching methods.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* updated docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* addressed comments.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Update megatron encoder decoder model to support py37 for colab (NVIDIA#4791)

* [ASR] Add pretrained ASR models for Croatian (NVIDIA#4682)

* [ASR] Add pretrained ASR models for Croatian

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Fix style for import

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* added/fixed export for Megatron models (NVIDIA#4712)

* added/fixed export for Megatron models

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed style

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed FusedScaleMaskSoftmax in BioMegatron

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* included comments

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch for qa notebook

Signed-off-by: ericharper <complex451@gmail.com>

* Fix initializing weights from ptl ckpt with exclude (NVIDIA#4807)

Signed-off-by: sam1373 <samuelkriman@gmail.com>

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* Fix index error from addition of voiced_mask and p_voiced (NVIDIA#4811)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* T5 prompt learning fixes (NVIDIA#4771)

* RPE, hidden size and config fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update to reflect new config names

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Sentencepiece fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix finetuning

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add encoder seq len to gpt

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add finetune eval script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix name

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Backward compat

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Split rank for Enc-Dec models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* G2P docs (NVIDIA#4841)

* g2p docs added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix references

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* address review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix providing glue in seq2seq eval (NVIDIA#4843)

* Fix providing glue in seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Updated inference code and squad scripts (NVIDIA#4835)

* Updated inference code and squad scripts

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Reverted GPT & T5 inference files back to use NLPDDPlugin

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Overwrite frozen LM to use fused adam

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added padded vocab size

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed val check interval value

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python format fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Make t5 prompt learning preds write to file

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back dp=1 check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Set the number of workers to 0 for validation and test sets in all enc-dec models (NVIDIA#4790)

* Set workers to 0 for validation and test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert pin memory

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Fix Megatron NMT consumed samples and ckpt_to_nemo split rank (NVIDIA#4884)

* Fix nmt and ckpt_to_nemo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* added utf8 encoding (NVIDIA#4892)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* update readme with apex commit

Signed-off-by: ericharper <complex451@gmail.com>

* Add support for Apex distributed Adam optimizer with GPT-3 (NVIDIA#4487)

* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* fixed styles

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed unused import.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed duplicated func definition.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace 'r1.11.0' with 'main' in Jenkinsfile and all tutorials.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: PRE_RELEASE = 'rc0'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace branch name to main for asr_with_adapters.ipynb.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix Fastpitch mixertts tutorial format to align with main to distinguish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: correct path for tokenizers.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
jubick1337 pushed a commit to jubick1337/NeMo that referenced this pull request Oct 4, 2022
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* [TTS] bugfix for missing configs. (NVIDIA#4725)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix pynini install in TTS tutorials (NVIDIA#4729)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* [TTS] updated config with a German IPA phoneme tokenizer (NVIDIA#4756)

* [TTS] added a German IPA phoneme tokenizer
* [TTS][ASR] enabled customized arguments for trimming the leading and trailing silence.
* [TTS] disabled spline interpolation for beta-binomial distribution. Let it generate align prior and save to disk. Use a new phoneme tokenizer.
* [TTS] use consistent spline interpolation with fastpitch checkpoint when generating mel-spectrograms for hifigan finetune.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Update r1.11 to new heteronyms list (NVIDIA#4745)

* Update configs to new heteronyms list
* Remove old heteronyms list, add alt 'merchandise' pron to CMUdict
* Update remaining references to old heteronyms list

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix tutorial formatting (NVIDIA#4778)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* update branch and typos (NVIDIA#4788)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Adding support for models trained with full context for cache-aware streaming. (NVIDIA#4687)

* added support for models trained with full context.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* dropped seq_range

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed indexing in caching methods.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* updated docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* addressed comments.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Update megatron encoder decoder model to support py37 for colab (NVIDIA#4791)

* [ASR] Add pretrained ASR models for Croatian (NVIDIA#4682)

* [ASR] Add pretrained ASR models for Croatian

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Fix style for import

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* added/fixed export for Megatron models (NVIDIA#4712)

* added/fixed export for Megatron models

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed style

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed FusedScaleMaskSoftmax in BioMegatron

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* included comments

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch for qa notebook

Signed-off-by: ericharper <complex451@gmail.com>

* Fix initializing weights from ptl ckpt with exclude (NVIDIA#4807)

Signed-off-by: sam1373 <samuelkriman@gmail.com>

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* Fix index error from addition of voiced_mask and p_voiced (NVIDIA#4811)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* T5 prompt learning fixes (NVIDIA#4771)

* RPE, hidden size and config fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update to reflect new config names

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Sentencepiece fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix finetuning

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add encoder seq len to gpt

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add finetune eval script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix name

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Backward compat

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Split rank for Enc-Dec models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* G2P docs (NVIDIA#4841)

* g2p docs added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix references

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* address review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix providing glue in seq2seq eval (NVIDIA#4843)

* Fix providing glue in seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Updated inference code and squad scripts (NVIDIA#4835)

* Updated inference code and squad scripts

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Reverted GPT & T5 inference files back to use NLPDDPlugin

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Overwrite frozen LM to use fused adam

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added padded vocab size

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed val check interval value

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python format fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Make t5 prompt learning preds write to file

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back dp=1 check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Set the number of workers to 0 for validation and test sets in all enc-dec models (NVIDIA#4790)

* Set workers to 0 for validation and test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert pin memory

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Fix Megatron NMT consumed samples and ckpt_to_nemo split rank (NVIDIA#4884)

* Fix nmt and ckpt_to_nemo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* added utf8 encoding (NVIDIA#4892)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* update readme with apex commit

Signed-off-by: ericharper <complex451@gmail.com>

* Add support for Apex distributed Adam optimizer with GPT-3 (NVIDIA#4487)

* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* fixed styles

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed unused import.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed duplicated func definition.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace 'r1.11.0' with 'main' in Jenkinsfile and all tutorials.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: PRE_RELEASE = 'rc0'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace branch name to main for asr_with_adapters.ipynb.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix Fastpitch mixertts tutorial format to align with main to distinguish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: correct path for tokenizers.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Matvei Novikov <mattyson.so@gmail.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
Signed-off-by: Hainan Xu <hainanx@nvidia.com>
hainan-xv pushed a commit to hainan-xv/NeMo that referenced this pull request Nov 29, 2022
* update branch

Signed-off-by: ericharper <complex451@gmail.com>

* update package info and dockerfile

Signed-off-by: ericharper <complex451@gmail.com>

* [TTS] bugfix for missing configs. (NVIDIA#4725)

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix pynini install in TTS tutorials (NVIDIA#4729)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* [TTS] updated config with a German IPA phoneme tokenizer (NVIDIA#4756)

* [TTS] added a German IPA phoneme tokenizer
* [TTS][ASR] enabled customized arguments for trimming the leading and trailing silence.
* [TTS] disabled spline interpolation for beta-binomial distribution. Let it generate align prior and save to disks. Use a new phoneme tokenizer.
* [TTS] use consistent spline interpolation with fastpitch checkpoint when generating mel-spectrograms for hifigan finetune.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Update r1.11 to new heteronyms list (NVIDIA#4745)

* Update configs to new heteronyms list
* Remove old heteronyms list, add alt 'merchandise' pron to CMUdict
* Update remaining references to old heteronyms list

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* Fix tutorial formatting (NVIDIA#4778)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* update branch and typos (NVIDIA#4788)

Signed-off-by: ericharper <complex451@gmail.com>

Signed-off-by: ericharper <complex451@gmail.com>

* Adding support for models trained with full context for cache-aware streaming. (NVIDIA#4687)

* added support for models trained with full context.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* dropped seq_range

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed indexing in caching methods.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* updated docs.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* addressed comments.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* change frame-wise to cache-aware.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* fixed code style.

Signed-off-by: Vahid <vnoroozi@nvidia.com>

Signed-off-by: Vahid <vnoroozi@nvidia.com>

* Update megatron encoder decoder model to support py37 for colab (NVIDIA#4791)

* [ASR] Add pretrained ASR models for Croatian (NVIDIA#4682)

* [ASR] Add pretrained ASR models for Croatian

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

* Fix style for import

Signed-off-by: Ante Jukić <ajukic@nvidia.com>

Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Eric Harper <complex451@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>

* added/fixed export for Megatron models (NVIDIA#4712)

* added/fixed export for Megatron models

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed style

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* fixed FusedScaleMaskSoftmax in BioMegatron

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

* included comments

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>

Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update branch for qa notebook

Signed-off-by: ericharper <complex451@gmail.com>

* Fix initializing weights from ptl ckpt with exclude (NVIDIA#4807)

Signed-off-by: sam1373 <samuelkriman@gmail.com>

Signed-off-by: sam1373 <samuelkriman@gmail.com>

* Fix index error from addition of voiced_mask and p_voiced (NVIDIA#4811)

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>

* T5 prompt learning fixes (NVIDIA#4771)

* RPE, hidden size and config fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update to reflect new config names

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Sentencepiece fixes

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix finetuning

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add encoder seq len to gpt

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Add finetune eval script

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix name

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update Jenkinsfile

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update config

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update check

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Backward compat

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Update CI test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Split rank for Enc-Dec models

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>

* G2P docs (NVIDIA#4841)

* g2p docs added

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* fix references

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* address review feedback

Signed-off-by: ekmb <ebakhturina@nvidia.com>

Signed-off-by: ekmb <ebakhturina@nvidia.com>

* Fix providing glue in seq2seq eval (NVIDIA#4843)

* Fix providing glue in seq2seq eval

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Fix

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Updated inference code and squad scripts (NVIDIA#4835)

* Updated inference code and squad scripts

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Reverted GPT & T5 inference files back to use NLPDDPlugin

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Overwrite frozen LM to use fused adam

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added padded vocab size

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Fixed val check interval value

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Python format fix

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Make t5 prompt learning preds write to file

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added back dp=1 check

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>

* Set the number of workers to 0 for validation and test sets in all enc-dec models (NVIDIA#4790)

* Set workers to 0 for validation and test

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Revert pin memory

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>

* Fix Megatron NMT consumed samples and ckpt_to_nemo split rank (NVIDIA#4884)

* Fix nmt and ckpt_to_nemo

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* Style

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>

* added utf8 encoding (NVIDIA#4892)

Signed-off-by: Virginia Adams <vadams@nvidia.com>

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* update readme with apex commit

Signed-off-by: ericharper <complex451@gmail.com>

* Add support for Apex distributed Adam optimizer with GPT-3 (NVIDIA#4487)

* Add support for Apex distributed Adam optimizer with GPT-3

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug in grad clipping with dist Adam

Grad norm was computed over all params, not respecting model parallelism.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bug with DDP initialization

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Make distopt dependent on megatron_amp_o2

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix code formatting

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Handle dist Adam in optimizer unit tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Eric Harper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* update readme

Signed-off-by: ericharper <complex451@gmail.com>

* fixed styles

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed unused import.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* removed duplicated func definition.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace 'r1.11.0' with 'main' in Jenkinsfile and all tutorials.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: PRE_RELEASE = 'rc0'

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* replace branch name to main for asr_with_adapters.ipynb.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix Fastpitch mixertts tutorial format to align with main to distinguish diff

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

* fix: correct path for tokenizers.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>

Signed-off-by: ericharper <complex451@gmail.com>
Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
Signed-off-by: Jocelyn Huang <jocelynh@nvidia.com>
Signed-off-by: Vahid <vnoroozi@nvidia.com>
Signed-off-by: Ante Jukić <ajukic@nvidia.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: sam1373 <samuelkriman@gmail.com>
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Signed-off-by: ekmb <ebakhturina@nvidia.com>
Signed-off-by: Virginia Adams <vadams@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: ericharper <complex451@gmail.com>
Co-authored-by: Jocelyn <jocelynh@nvidia.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
Co-authored-by: Zhilin Wang <wangzhilin12061996@hotmail.com>
Co-authored-by: anteju <108555623+anteju@users.noreply.github.com>
Co-authored-by: Ante Jukić <ajukic@nvidia.com>
Co-authored-by: Nithin Rao <nithinrao.koluguri@gmail.com>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: David <amosalla@asu.edu>
Co-authored-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Co-authored-by: Samuel Kriman <samuelkriman@gmail.com>
Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>