Commit
Add support and recipes for HF models via AutoModelForCausalLM (NVIDIA#10962)

* initial hf_lit_module
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* make sft gpt dataset sanity check optional
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* HF sft example
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Rename HfLitModule to HfAutoModel
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* update default model id
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* move rank&world_size as params
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* fix mbs in example
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* fix for fsdp and logger
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* make loss_fn configurable
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
* remove optim from HfAutoModel
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add pytorch native optim
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add hfAutoModel pretrain nemorun recipe
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* remove debug
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* remove stale imports
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* remove stale import
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rm stale imports
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rm stale imports
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* tokenizer fix
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* update example
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rename pytorch_adam to pytorch_adam_with_cosine_annealing
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* small refactor
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* fix no_weight_decay_cond
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* fix
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* fix
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* switch to flat_lr optim for example
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
* remove imports & update docstrings
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add a tokenizer setter to allow it to work with nemo/collections/llm/api.py::_use_tokenizer
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* remove unused import
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* allow loss_mask to be none
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Add HF-dataset lightning module
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* check if pad_token_id is None
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rename hf_lit_module.py to hf_auto_model.py
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* class rename
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rename
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* update example
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* HfAutoModelForCausalLM
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rm stale import
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add option to start with random weights
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* add check in megatron-strategy
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* rename param
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* drop mcore sampler from squadmodule
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* make megatron_sampler optional in HfDatasetDataModule
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* copyright
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* use is_hf_model to mark hf classes
  Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* Apply isort and black reformatting
  Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Hainan Xu <hainanx@nvidia.com>