TR-DPO bug #1971
I'm running DPO with `sync_ref_model` on. The DPO training script is adapted from the alignment-handbook with DeepSpeed stage 3 enabled. However, training crashed at the point of updating the reference model weights. After checking `SyncRefModelCallback`, it turns out that it only gathers the parameters of the DPO model being trained, but not those of the reference model.

I fixed that by first gathering the DPO model being trained and saving its parameter data in a list, then gathering the reference model and updating its parameters.

I'm not sure if this is actually a bug or if there is something wrong with my settings.

OS: Ubuntu 20.04.1
Python: 3.10
Packages:
trl==0.9.6
torch==2.3.1
torchaudio==2.4.0
torchvision==0.19.0
transformers==4.44.1
deepspeed==0.14.5
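The two-phase workaround described above might look roughly like the following sketch. This is hypothetical, not trl's code: it assumes DeepSpeed ZeRO-3 (parameters sharded across ranks) and a soft update of the form `ref <- alpha * policy + (1 - alpha) * ref`; the function name `sync_ref_model_zero3` is illustrative.

```python
import deepspeed


def sync_ref_model_zero3(model, ref_model, alpha):
    # Phase 1: gather the trained model's full parameters on every rank and
    # keep detached copies so they outlive the gather context (outside it,
    # each rank only holds its own shard again).
    with deepspeed.zero.GatheredParameters(list(model.parameters())):
        policy_params = [p.data.detach().clone() for p in model.parameters()]

    # Phase 2: gather the reference model and apply the soft update on rank 0;
    # with modifier_rank=0, rank 0's writes are scattered back to all shards.
    with deepspeed.zero.GatheredParameters(list(ref_model.parameters()), modifier_rank=0):
        if deepspeed.comm.get_rank() == 0:
            for ref_p, src in zip(ref_model.parameters(), policy_params):
                ref_p.data.mul_(1.0 - alpha).add_(src.to(ref_p.device), alpha=alpha)
```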
Comments

could be an edge case... do you mind sending a PR to check?
do you mean something like this:

```python
@staticmethod
def sync_target_model(model, target_model, alpha):
    deepspeed_plugin = AcceleratorState().deepspeed_plugin
    if deepspeed_plugin is not None and deepspeed_plugin.zero_stage == 3:
        # Under ZeRO-3 the weights are sharded, so gather the full
        # parameters of BOTH models before copying; modifier_rank=0
        # broadcasts rank 0's modifications back to all shards.
        with deepspeed.zero.GatheredParameters(
            list(model.parameters()) + list(target_model.parameters()), modifier_rank=0
        ):
            if deepspeed.comm.get_rank() == 0:
                SyncRefModelCallback._sync_target_model(model, target_model, alpha)
    else:
        SyncRefModelCallback._sync_target_model(model, target_model, alpha)
```
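For context, the `_sync_target_model` helper called above performs the weight mixing itself. Its body isn't shown in this thread; a minimal sketch, assuming it follows the usual TR-DPO Polyak form (this is an assumption, not necessarily trl's exact code):

```python
def _sync_target_model(model, target_model, alpha):
    # Assumed soft update: target <- alpha * model + (1 - alpha) * target,
    # applied parameter by parameter.
    for target_param, param in zip(target_model.parameters(), model.parameters()):
        target_param.data.mul_(1.0 - alpha).add_(param.data, alpha=alpha)
```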
here is how I solved it

@Bodoral can you check if it works with the current head?

tested this and it works perfectly
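For anyone reproducing the setup: the reference-model syncing discussed in this thread is toggled through the trainer config. A hedged example follows; the values are illustrative, and the knob names (`sync_ref_model`, `ref_model_mixup_alpha`, `ref_model_sync_steps`) are assumed to be the TR-DPO options exposed by trl's `DPOConfig` around this version.

```python
from trl import DPOConfig

training_args = DPOConfig(
    output_dir="dpo-output",    # illustrative path
    sync_ref_model=True,        # enable SyncRefModelCallback (TR-DPO)
    ref_model_mixup_alpha=0.6,  # alpha in ref <- alpha*policy + (1-alpha)*ref
    ref_model_sync_steps=512,   # sync the reference model every N steps
)
```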