Update README.rst #4

Closed
wants to merge 1 commit into from

okuchaiev
Member

No description provided.

@okuchaiev okuchaiev closed this Sep 12, 2019
@okuchaiev okuchaiev deleted the okuchaiev-patch-1 branch October 19, 2019 00:30
yzhang123 pushed a commit to yzhang123/NeMo that referenced this pull request May 1, 2023
allow as many tokens to be generated as max target
lhb8125 added a commit to lhb8125/NeMo that referenced this pull request Jul 19, 2023
* Create README.rst

* Update and rename README.rst to README.md

* Update README.md
zhehuaichen added a commit that referenced this pull request Oct 9, 2023

* add initial impl of ModularizedSpeechGPTModel and integration test

* fix typo in the test name (#1)

approve the nit change

* clean up an initial version of the example config; make sure it works by test (#2)

approve as no need to review

* add the test for training_step and fix the code correspondingly (test passed now) (#3)

* add test for validation_step (#4)

* move audio and text embedding concatenation to prepare_llm_input so that a test can guard the LLM input

* Merge heh and zhehuai's initial version of frozen am+llm (#5)

* Merge heh and zhehuai's initial version of frozen am+llm

The previous differences are summarized here:
https://docs.google.com/document/d/1zNI4hC6vJtUfcHbrUSPaMuYWRBQdN_36H0P2NiBiuPY/edit

This PR includes:
1. Merging of the model, dataset, and config code is finished.
2. Previous tests are still enabled and pass (prepare_llm_input, training_step, validation_step).
3. The example training script with LS960 has been run to make sure the training pipeline works.

The major remaining work items are listed here:
https://docs.google.com/document/d/1o0AM7v4gcTQkPZjE0Vl9TTX4vYnGTrbXEFGWh0UhGlk/edit#bookmark=id.pzvdadt5oxyw

---------

Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* fix a minor init bug that broke the test (#6)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Clean up implementation for SALM paper and sync to NeMo v1.20.0 (#18)

* wip

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix consumed_samples

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the training restart problem by storing the adapter and perception model and initializing them from the checkpoint

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refix state dict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support wer and inf

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nan guard

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* reimplement inference and fix a bug

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* multi loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* unfreeze lm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* flag for load am

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* tokenizer

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite vocab size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support bpe dropout

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add tarred datasets

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix sample_alpha

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bpe dropout bugs in the mismatched context in tokenization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add bleu metric

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update metrics

Signed-off-by: stevehuang52 <heh@nvidia.com>

* support inference and fix a bug in wer calculation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix bucketing dataset

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support question set file per dataset/data loader in preparation for
multitask understanding; also fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support simple random context for word boosting

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* use sacrebleu.corpus_bleu to be consistent with the rest

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make audio_file optional in the data loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add a tool to materialize mt and text data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* compatible with tar dataset

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for metric and speed up materialization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make num of context configurable

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* val_check_interval fix; make manifest dumping consistent with speech models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* random_context_positive_ratio configurable to control precision

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* bug fix: freeze_llm flag is not passed to the model cfg

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite tensor_model_parallel_size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support both stt and ssl models for loading audio encoder

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the inference config so as to use sampling; allow inference config update in training

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refactor and clean up code for preprocessing collections, dataset interface, and model inference; rename some classes to be consistent with the SALM paper.
Also make sure the tests pass

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Undo changes in megatron_gpt_peft_models.py and move them to speechllm_models.py; verify correctness with test_speechllm_models.py::TestModularizedAudioGPTModel::test_predict_step

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update default inference config and test golden value accordingly

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* integration test and minor fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit bug fix on manifest_filepath introduced by code cleanup

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update workspace/ files; consider moving to examples later

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* further remove unnecessary stuff in the inference implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert the update in default end_string to be compatible with legacy models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* rename 'ModularizedAudioGPTModel' to 'ModularAudioGPTLoRAModel'; move speechllm stuff under nemo/collections/multimodal/speechllm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update copyright; remove workspace/scripts and workspace/tools folders since the main branch has LLaMA support

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: Zhehuai Chen <chenzhehuai.sjtu@aispeech.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
zhehuaichen added a commit that referenced this pull request Oct 13, 2023
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request Nov 29, 2023
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request May 8, 2024
dcurran90 pushed a commit to dcurran90/NeMo that referenced this pull request Oct 15, 2024