
Parallel prompt tuning #3670

Merged: 16 commits merged into main on Feb 17, 2022

Conversation

@vadam5 (Contributor) commented Feb 15, 2022

What does this PR do?

This PR fixes prompt tuning to work with the updated pipeline parallel code. With this PR, a user can also prompt tune with tensor parallel > 1. More work is needed before prompt tuning is supported with pipeline parallel > 1.

Collection: BigNLP?

Changelog

  • Prompt tuning with tp=1 and pp=1 works again with the updated pipeline parallel code
  • Prompt tuning with tp > 1 works
  • Prompt tuning works with the updated complete and compute log probs methods
  • Added a prompt tuning config file with prompting-specific defaults set
  • Prompt tuning tests are added back
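For context on what is being parallelized: prompt tuning keeps the base model frozen and learns a small set of "virtual token" embeddings per task tag, which are prepended to the real input embeddings. The dependency-free sketch below illustrates only that idea; the class and method names are illustrative and are not NeMo's actual API.

```python
# Illustrative sketch of per-task virtual-token prompts (NOT NeMo's real API).
class PromptTable:
    def __init__(self, hidden_size):
        self.hidden_size = hidden_size
        self.prompts = {}  # task tag -> list of virtual-token embedding vectors

    def add_prompt(self, tag, num_tokens, init="random"):
        # "text" init would copy the embeddings of an init-text's tokens;
        # zero-init here keeps the sketch dependency-free.
        self.prompts[tag] = [[0.0] * self.hidden_size for _ in range(num_tokens)]

    def prepend(self, tag, input_embeddings):
        # Virtual tokens go in front of the real token embeddings, so the
        # frozen model attends to them like ordinary context.
        return self.prompts[tag] + input_embeddings

table = PromptTable(hidden_size=4)
table.add_prompt("Winogrande", num_tokens=10)
batch = table.prepend("Winogrande", [[1.0] * 4] * 3)  # 3 real tokens
assert len(batch) == 13  # 10 virtual tokens + 3 real tokens
```

Only the entries in `self.prompts` are trainable, which is why (per the commits below) they must be explicitly set up with the optimizer.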

Usage

python examples/nlp/language_modeling/megatron_gpt_prompt_tuning.py \
    --config-name=megatron_prompt_tuning_gpt \
    restore_from_path='/prompt-tuning/megatron_gpt.nemo' \
    trainer.val_check_interval=2 \
    trainer.max_steps=5 \
    model.new_prompt_tags=['Winogrande, BoolQ'] \
    model.new_prompt_init_text=['logic choose person name, None'] \
    model.new_prompt_init_methods=['text, random'] \
    model.data.train_ds='/prompt-tuning/wino_bool_prompt_tuning_train.json' \
    model.data.valid_ds='/prompt-tuning/wino_bool_prompt_tuning_val.json' \
    +model.data.test_ds='/prompt-tuning/wino_bool_prompt_tuning_val.json' \
    model.micro_batch_size=2 \
    model.global_batch_size=4 \
    model.optim.lr=2e-2 \
    model.optim.sched.min_lr=2e-3 \
    model.optim.sched.warmup_steps=2 \
    model.optim.sched.constant_steps=8 \
    model.encoder_seq_length=2048

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
  • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

@lgtm-com bot commented Feb 15, 2022

This pull request introduces 9 alerts and fixes 1 when merging 02d395a into aeeb0d2 - view on LGTM.com

new alerts:

  • 7 for Unused local variable
  • 1 for Wrong name for an argument in a call
  • 1 for Wrong name for an argument in a class instantiation

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com bot commented Feb 15, 2022

This pull request introduces 9 alerts and fixes 1 when merging 9381c4c into b98a07d - view on LGTM.com

new alerts:

  • 7 for Unused local variable
  • 1 for Wrong name for an argument in a call
  • 1 for Wrong name for an argument in a class instantiation

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com bot commented Feb 16, 2022

This pull request fixes 1 alert when merging a31a264 into b5012d0 - view on LGTM.com

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 16, 2022

This pull request introduces 1 alert and fixes 1 when merging 59af92b into 2ebca22 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com bot commented Feb 16, 2022

This pull request introduces 1 alert and fixes 1 when merging a8a2c3b into 8ffc92e - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 3 alerts and fixes 1 when merging 8be857f into c00bcd6 - view on LGTM.com

new alerts:

  • 3 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 3 alerts and fixes 1 when merging 724773f into 6dd4263 - view on LGTM.com

new alerts:

  • 3 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 3 alerts and fixes 1 when merging 085d2ed into 6dd4263 - view on LGTM.com

new alerts:

  • 3 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 1 alert and fixes 1 when merging 44f6e22 into 6dd4263 - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

@lgtm-com bot commented Feb 17, 2022

This pull request introduces 1 alert and fixes 1 when merging 901c9ee into 1b89a70 - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 1 alert and fixes 1 when merging f5c2b6e into 1b89a70 - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 1 for Wrong name for an argument in a class instantiation

Signed-off-by: Virginia Adams <vadams@nvidia.com>
@vadam5 vadam5 marked this pull request as ready for review February 17, 2022 21:03
@lgtm-com bot commented Feb 17, 2022

This pull request introduces 1 alert and fixes 3 when merging 312bcc4 into 1b89a70 - view on LGTM.com

new alerts:

  • 1 for Unused import

fixed alerts:

  • 2 for Variable defined multiple times
  • 1 for Wrong name for an argument in a class instantiation

@ericharper (Collaborator) left a comment:

I think there are a couple of imports that you did not want to add; see the comments.

@vadam5 vadam5 requested a review from ericharper February 17, 2022 22:51
Signed-off-by: Virginia Adams <vadams@nvidia.com>
@ericharper (Collaborator) left a comment:

LGTM. Thanks!

@lgtm-com bot commented Feb 17, 2022

This pull request fixes 3 alerts when merging 2667ad8 into 1b89a70 - view on LGTM.com

fixed alerts:

  • 2 for Variable defined multiple times
  • 1 for Wrong name for an argument in a class instantiation

@okuchaiev (Member) left a comment:

Thanks!

@okuchaiev okuchaiev merged commit c00e623 into main Feb 17, 2022
@ericharper ericharper deleted the parallel_prompt_tuning branch February 18, 2022 00:00
fayejf pushed a commit that referenced this pull request Mar 2, 2022
* Started combined tensor parallel and pipeline parallel changes

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Gets through validation sanity checks

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Still working through bugs

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Able to run training but virtual token parameters don't get updated

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Params weren't updating because they weren't set up with the optimizer

Signed-off-by: Virginia Adams <vadams@nvidia.com>
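The commit above hits a general pitfall: parameters created after the optimizer is constructed are invisible to it, so they never receive updates even though gradients flow. A toy, dependency-free illustration of the failure and the fix (this is NOT NeMo code, just the shape of the bug):

```python
class ToySGD:
    """Minimal SGD over plain Python lists, just to illustrate the pitfall."""
    def __init__(self, params, lr=0.1):
        self.params = params  # only these lists will ever be updated
        self.lr = lr

    def step(self, grads):
        for param, grad in zip(self.params, grads):
            for i in range(len(param)):
                param[i] -= self.lr * grad[i]

base = [1.0, 1.0]
opt = ToySGD([base])          # optimizer built before the prompt params exist
prompt = [0.0, 0.0]           # new virtual-token params: opt knows nothing of them
opt = ToySGD([base, prompt])  # the fix: set the optimizer up with ALL params
opt.step([[0.5, 0.5], [0.5, 0.5]])
# prompt is now updated (each entry moved by -lr * grad = -0.05)
```

With a real framework the fix is the same in spirit: make sure the freshly created prompt parameters are registered with the optimizer (rebuilt or added to a param group) before training steps run.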

* Parallel with single GPU is working!

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Tensor parallel = 2 is working

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Tensor parallel working and code cleaned up

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Added prompt tuning testing back in

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* Complete method works again for prompt tuned models

Signed-off-by: Virginia Adams <vadams@nvidia.com>

* removed random imports

Signed-off-by: Virginia Adams <vadams@nvidia.com>
3 participants