
fix model to save in ppov2 #1776

Merged · 6 commits merged into huggingface:main on Aug 17, 2024
Conversation

mnoukhov (Contributor)
We are currently saving self.backup_model, but this should be self.model. self.backup_model is only a temporary model used to store the policy and value function, whereas self.model holds just the policy, which is what we want to save.

@vwxyzjn let me know if I'm off base

@@ -220,7 +220,7 @@ def save_model(self, output_dir: Optional[str] = None, _internal_call: bool = False):
             self.model = self.accelerator.unwrap_model(self.model).policy  # save only the policy
         if output_dir is None:
             output_dir = self.args.output_dir
-        state_dict = self.accelerator.get_state_dict(self.backup_model)
+        state_dict = self.accelerator.get_state_dict(self.model)
vwxyzjn (Contributor) commented on Jun 27, 2024

This is probably incorrect. There are two scenarios:

  1. We call trainer.save_model directly, in which case the `if not _internal_call` branch is triggered and self.model becomes the policy.
  2. We call trainer.push_to_hub, in which case push_to_hub sets self.model to the policy, and super().push_to_hub(**kwargs) calls save_model(..., _internal_call=True); at that point self.model is still the policy.

It's a bit unfortunate that the logic is convoluted... 🫠
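
For readers following along, here is a rough sketch of the flow being described, paraphrased from the discussion rather than copied from the source (the local-variable detail in push_to_hub is inferred from the comment that follows; unwrap_model and get_state_dict are the accelerate Accelerator methods):

def push_to_hub(self, **kwargs):
    backup_model = self.model                # note: a local, not self.backup_model
    self.model = self.accelerator.unwrap_model(self.model).policy  # swap in the bare policy
    super().push_to_hub(**kwargs)            # internally calls save_model(..., _internal_call=True)
    self.model = backup_model                # restore the policy+value wrapper

def save_model(self, output_dir=None, _internal_call=False):
    if not _internal_call:                   # scenario 1: user called save_model directly
        self.backup_model = self.model
        self.model = self.accelerator.unwrap_model(self.model).policy  # save only the policy
    if output_dir is None:
        output_dir = self.args.output_dir
    # The contested line: in scenario 1 self.backup_model is the full
    # policy+value wrapper, and in scenario 2 it was never assigned at all
    # (push_to_hub only keeps a local backup), hence the error reported below.
    state_dict = self.accelerator.get_state_dict(self.backup_model)
    ...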

mnoukhov (Contributor, Author)

But how is self.backup_model set in the _internal_call case? I don't see it being set anywhere, and I'm getting an error.


mnoukhov commented Jul 1, 2024

I looked into it a bit more and I don't think there's any need for a separate push_to_hub and save_model. The simplified logic (sketched below) works on my end and is fine with deepspeed. I haven't added the model_wrapped change that SageMaker would need, but otherwise this seems fine. Let me know if this works for you!
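
A minimal sketch of that unified approach, assuming only save_model needs to be overridden — illustrative rather than the exact merged code; the deepspeed swap reflects the "fine with deepspeed" remark, and is_deepspeed_enabled / self.deepspeed are the base transformers Trainer attributes:

def save_model(self, output_dir=None, _internal_call=False):
    backup_model = self.model
    self.model = self.accelerator.unwrap_model(self.model).policy  # save only the policy

    if self.is_deepspeed_enabled:
        # Point the trainer's deepspeed handle at the bare policy as well,
        # so the deepspeed save path serializes the right module.
        backup_deepspeed = self.deepspeed
        self.deepspeed = self.model

    super().save_model(output_dir, _internal_call)

    # Restore the policy+value wrapper (and the deepspeed handle) for training.
    self.model = backup_model
    if self.is_deepspeed_enabled:
        self.deepspeed = backup_deepspeed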

github-actions (bot)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions bot closed this on Aug 5, 2024
qgallouedec reopened this on Aug 17, 2024
HuggingFaceDocBuilderDev (bot)

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec (Member)

You can try with:

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoModelForSequenceClassification, AutoTokenizer
from trl.trainer.ppov2_trainer import PPOv2Config, PPOv2Trainer
from trl.trainer.utils import SIMPLE_QUERY_CHAT_TEMPLATE


def main():
    config = PPOv2Config(output_dir="tmp")
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1b-deduped", padding_side="left")
    tokenizer.add_special_tokens({"pad_token": "[PAD]"})
    tokenizer.chat_template = SIMPLE_QUERY_CHAT_TEMPLATE
    value_model = AutoModelForSequenceClassification.from_pretrained("EleutherAI/pythia-1b-deduped", num_labels=1)
    reward_model = AutoModelForSequenceClassification.from_pretrained("EleutherAI/pythia-1b-deduped", num_labels=1)
    ref_policy = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1b-deduped")
    policy = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1b-deduped")

    raw_datasets = load_dataset("trl-internal-testing/descriptiveness-sentiment-trl-style", split="descriptiveness")
    train_dataset = raw_datasets.select(range(50))

    def tokenize(element):
        outputs = tokenizer(element["prompt"], padding=False)
        return {"input_ids": outputs["input_ids"]}

    train_dataset = train_dataset.map(tokenize, batched=True, remove_columns=train_dataset.column_names)

    trainer = PPOv2Trainer(
        config=config,
        tokenizer=tokenizer,
        policy=policy,
        ref_policy=ref_policy,
        reward_model=reward_model,
        value_model=value_model,
        train_dataset=train_dataset,
    )
    trainer.save_model(config.output_dir)  # direct save path (_internal_call=False)
    trainer.push_to_hub()                  # internal-call save path via push_to_hub


if __name__ == "__main__":
    main()
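
For context: before the fix, trainer.save_model would gather the state dict of the full policy+value wrapper (via self.backup_model) instead of just the policy, and trainer.push_to_hub could hit the unset-self.backup_model error discussed above; after the fix, both paths save only the policy.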

qgallouedec merged commit bbdef00 into huggingface:main on Aug 17, 2024
9 checks passed
qgallouedec (Member)

lgtm thanks @mnoukhov!

qgallouedec mentioned this pull request on Aug 17, 2024