Should I do `accelerator.prepare()` again after `accelerator.unwrap_model()`? #948
Comments
Hello, below is the structure of the code I follow:
Let us know if that solves your query. For such queries in future, please use https://discuss.huggingface.co/c/accelerate/18
Thanks for your helpful reply.
Hello @DtYXs, can you please provide a minimal reproducible example of the issue that you are facing?
I'm having problems with this too. When using mixed precision with fp16 and no distributed training, `unwrap_model` (accelerate/src/accelerate/utils/other.py, line 55 at commit aaa2637) seems to affect the passed model and remove its autocast wrapper for the forward pass.
Yes it does, this was added recently by @muellerzr.
One would expect the wrappers to be removed on the returned object, but never on the object you pass to this function, no?
It's tricky since a model is a reference type, and we can't really create a whole copy of the model.
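To make the reference-semantics point concrete, here is a minimal, self-contained sketch in plain Python (not the real accelerate code; `add_fp32_wrapper` and `unwrap` are illustrative stand-ins) showing why stripping the forward wrapper from the returned model necessarily mutates the model you passed in:

```python
import functools

class TinyModel:
    def forward(self, x):
        return x * 2

def add_fp32_wrapper(model):
    # Stand-in for what mixed precision does: replace forward with a
    # wrapper function (the real one would run under autocast).
    original_forward = model.forward

    @functools.wraps(original_forward)  # sets wrapped_forward.__wrapped__
    def wrapped_forward(*args, **kwargs):
        return original_forward(*args, **kwargs)

    model.forward = wrapped_forward
    return model

def unwrap(model, keep_fp32_wrapper=False):
    # Stand-in for the unwrapping behaviour discussed above: unless told
    # otherwise, strip the wrapper IN PLACE on the object it was given.
    if not keep_fp32_wrapper and hasattr(model.forward, "__wrapped__"):
        model.forward = model.forward.__wrapped__
    return model

model = add_fp32_wrapper(TinyModel())
unwrapped = unwrap(model)

# Same object: Python passes the model by reference, so the caller's
# model lost its wrapper too -- there is no untouched copy to hand back.
assert unwrapped is model
assert not hasattr(model.forward, "__wrapped__")
```

Returning a pristine copy would require duplicating the whole model, which is exactly what the comment above says is impractical.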
@miquelmarti what Sylvain already describes is actually supported. Just do:

```python
model = accelerator.unwrap_model(model, keep_fp32_wrapper=True)
```

Will this work for you? :)
I understand, and yes, this flag does work for me. However, the behaviour before the change was equivalent to setting the flag to True, and IMO that should still be the default, so that unwrapping the model does not affect subsequent calls to forward. I think train -> checkpoint unwrapped model -> keep training is probably the most common use case.

EDIT: I guess #858 is the reason the hooks are removed by default. If I remove the hook using the flag, I have the same problem described in that issue when I do intermediate checkpoints; if I don't, training cannot continue in the same way. There should be a way to allow for both.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@muellerzr @sgugger Was any consensus reached on what to do here? I am facing the same dilemma as @miquelmarti and would prefer that unwrapping and checkpointing the model (using
For multi-GPU training:

```python
model = accelerator.prepare(model)
```

I want to save a checkpoint during training, so I do:

```python
model = accelerator.unwrap_model(model)
pipeline = Pipeline(model=model)
pipeline.save_pretrained(...)
```

Then I want to continue training. Should I call

```python
model = accelerator.prepare(model)
```

again after saving?
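Pulling the thread together, the flow the question describes can be sketched as below. This is a toy, runnable model of the behaviour only: `ToyAccelerator`, `ToyModel`, and the loop bounds are stand-ins, not the real accelerate API. The one detail it mirrors from the discussion above is that unwrapping with `keep_fp32_wrapper=True` leaves the forward wrapper intact, so training can continue without a second `prepare()`:

```python
import functools

class ToyModel:
    def forward(self, x):
        return x + 1

class ToyAccelerator:
    # Minimal stand-in modelling only the wrap/unwrap behaviour
    # discussed in this thread; not the real Accelerator class.
    def prepare(self, model):
        original = model.forward

        @functools.wraps(original)  # sets autocast_forward.__wrapped__
        def autocast_forward(*args, **kwargs):
            # the real wrapper would run the forward pass under autocast
            return original(*args, **kwargs)

        model.forward = autocast_forward
        return model

    def unwrap_model(self, model, keep_fp32_wrapper=False):
        # Default behaviour strips the wrapper in place; the flag keeps it.
        if not keep_fp32_wrapper and hasattr(model.forward, "__wrapped__"):
            model.forward = model.forward.__wrapped__
        return model

accelerator = ToyAccelerator()
model = accelerator.prepare(ToyModel())

checkpoints = []
for step in range(10):            # toy "training loop"
    _ = model.forward(step)       # toy "training step"
    if step % 5 == 0:
        # keep_fp32_wrapper=True: the wrapper survives, so training
        # continues afterwards without calling prepare() again.
        unwrapped = accelerator.unwrap_model(model, keep_fp32_wrapper=True)
        checkpoints.append(unwrapped)   # save_pretrained would go here

# The wrapper is still on the model: no second prepare() was needed.
assert hasattr(model.forward, "__wrapped__")
assert len(checkpoints) == 2
```

For the final save at the end of training, the default `keep_fp32_wrapper=False` is the natural choice, since no further forward passes are expected and the hooks can be dropped.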