Akoumparouli/nemo ux update param name #10441
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
I am wondering what the reason for this change is. This will be a breaking change and will require us to update the user guide. Also, the use of `dir` is consistent with NeMo 1.
[edit] I saw there were concerns that the current naming could lead to silent bugs in the future, and this makes sense to me. Better to change it now than to run into bugs moving forward. Thanks for making this change.
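The thread does not spell out which silent bugs were meant. One plausible reading, offered here only as an assumption, is that `dir` is also the name of a Python builtin, so a missing or misspelled variable called `dir` still resolves to something instead of raising an error. The sketch below uses two hypothetical helper functions (`old_path_for` and `new_path_for`, not part of NeMo) to show the difference between the two names when the variable is accidentally left undefined.

```python
# Minimal sketch, assuming the "silent bug" concern is about shadowing the
# builtin dir(). The functions below are hypothetical and are not NeMo code.


def old_path_for(run_name):
    # Mistake: no local or parameter named `dir` exists in this scope, so the
    # lookup silently falls back to the builtin dir() function object.
    return f"{dir}/{run_name}"


def new_path_for(run_name):
    # Same mistake with the descriptive name: nothing called `log_dir` exists
    # anywhere, so Python raises NameError instead of producing a bad path.
    return f"{log_dir}/{run_name}"


def fails_loudly(func, *args):
    """Return True if calling func(*args) raises NameError."""
    try:
        func(*args)
    except NameError:
        return True
    return False


if __name__ == "__main__":
    # The old name yields a nonsensical path built from the builtin's repr.
    print(old_path_for("run1"))
    # The new name surfaces the mistake immediately.
    print(fails_loudly(new_path_for, "run1"))
```

With a name like `log_dir`, the same mistake is caught at the first call rather than showing up later as an odd directory on disk.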
* NeMoLogger: update dir to log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * NeMologger: update calls Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com> Signed-off-by: George Armstrong <georgea@nvidia.com>
* NeMoLogger: update dir to log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * NeMologger: update calls Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com>
* [NeMo-UX] Add token drop callback and optimize mixtral configs (#10361) * add token drop plugin Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * add checks Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * add expert parallel configs Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * Apply isort and black reformatting Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> * amend comment Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * Apply isort and black reformatting Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> * add comm overlap Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * fix rebase errors Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * Apply isort and black reformatting Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> * fix typo Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * add test configs Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> * fix Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> --------- Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com> Co-authored-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> Co-authored-by: Pablo Garay <palenq@gmail.com> Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: artbataev <artbataev@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * fix Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * remove run Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * rm Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * fixes Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * length fix Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * update 
pretrain_recipe_performance param dir -> ckpt_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * Akoumparouli/nemo ux update param name (#10441) * NeMoLogger: update dir to log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * NeMologger: update calls Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com> * pass ckpt_dir to log_dir for the default_log Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * param rename Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Bump `Dockerfile.ci` (2024-09-09) (#10423) * [🤠]: Howdy folks, let's bump `Dockerfile.ci` to 8307fcd ! Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * update TE import paths Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * Update parallelisms.rst fix sed typo. 
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> * fix for mcore dist opt refactor: move overlap_grad_reduce/overlap_param_gather to ddp config Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * remove overlap_grad_reduce overlap_param_gather from autoconfig Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * subclass TransformerConfig because megatronmodule expects it to have fp8 attr Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * fix Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * revert change; Use ModelParallelConfig & add fp8 Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * fix, set NVTE_APPLY_QK_LAYER_SCALIN=1 Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com> Co-authored-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: akoumpa <akoumpa@users.noreply.github.com> Co-authored-by: Pablo Garay <palenq@gmail.com> Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> * remove hf_resume for mixtral-8x3b Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * update mistral recipe Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * comment tests for non-merged recipes Signed-off-by: Alexandros 
Koumparoulis <akoumparouli@nvidia.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * Apply isort and black reformatting Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> * NeMoLogger uses log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * more fixes Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * more fixes Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * fix Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * fix param Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * Fix dockerfile build order Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Jimmy Zhang <jiemingz@nvidia.com> Signed-off-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Signed-off-by: artbataev <artbataev@users.noreply.github.com> Signed-off-by: akoumpa <akoumpa@users.noreply.github.com> Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com> Co-authored-by: JimmyZhang12 <67203904+JimmyZhang12@users.noreply.github.com> Co-authored-by: Jimmy Zhang <jiemingz@nvidia.com> Co-authored-by: JimmyZhang12 <JimmyZhang12@users.noreply.github.com> Co-authored-by: Pablo Garay <palenq@gmail.com> Co-authored-by: artbataev <artbataev@users.noreply.github.com> Co-authored-by: akoumpa <akoumpa@users.noreply.github.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com> Co-authored-by: oliver könig <okoenig@nvidia.com> Co-authored-by: pablo-garay <7166088+pablo-garay@users.noreply.github.com>
* NeMoLogger: update dir to log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * NeMologger: update calls Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com> Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>
* NeMoLogger: update dir to log_dir Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> * NeMologger: update calls Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> --------- Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com> Co-authored-by: Marc Romeyn <mromeijn@nvidia.com> Signed-off-by: Hainan Xu <hainanx@nvidia.com>
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information