
FYI HuggingFace update (Transformers v4.46.3) appears to have made your default baseline Encoder Decoder Transformer demo non-functional ( https://github.com/frankaging/ReCOGS/blob/1b6eca8ff4dca5fd2fb284a7d470998af5083beb/README.md?plain=1#L75 ) (workaround: pip install transformers==v4.45.2 first) #1

Open
willy-b opened this issue Dec 17, 2024 · 1 comment


willy-b commented Dec 17, 2024

Hey, just FYI: a HuggingFace update (seen on Transformers v4.46.3) appears to have made your default baseline Encoder-Decoder Transformer demo non-functional (though `pip install transformers==v4.45.2` is enough to work around it for now if one is on an affected version like v4.46.3).

That is, if one does a clean clone and runs the recommended command from the README ("We only have a single training script `run_cogs.py`. You can use it to reproduce our Transformers result. Here is one example,"):

```bash
python run_cogs.py \
--model_name ende_transformer \
--gpu 1 \
--train_batch_size 128 \
--eval_batch_size 128 \
--lr 0.0001 \
--data_path ./cogs \
--output_dir ./results_cogs \
--lfs cogs \
--do_train \
--do_test \
--do_gen \
--max_seq_len 512 \
--output_json \
--epochs 300 \
--seeds "42;66;77;88;99"
```

it will currently fail after finishing training, when it attempts to generate from the model, with:

```
Traceback (most recent call last):
  File "/content/ReCOGS/run_cogs.py", line 300, in <module>
    outputs = trainer.model.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2215, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 3199, in _sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
  File "/content/ReCOGS/model/encoder_decoder_hf.py", line 867, in prepare_inputs_for_generation
    "past_key_values": decoder_inputs["past_key_values"],
KeyError: 'past_key_values'
```
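
For what it's worth, the failure is that `prepare_inputs_for_generation` in `model/encoder_decoder_hf.py` indexes `decoder_inputs["past_key_values"]` unconditionally, and newer Transformers only include that key in the decoder's prepared inputs when a cache is actually set. A minimal, untested sketch of a version-tolerant patch, assuming the repo's method mirrors the legacy HF `EncoderDecoderModel` implementation:

```python
# Hypothetical patch sketch for prepare_inputs_for_generation in
# model/encoder_decoder_hf.py, assuming it mirrors the legacy HF
# EncoderDecoderModel implementation (untested against this repo).
def prepare_inputs_for_generation(
    self, input_ids, past_key_values=None, attention_mask=None,
    use_cache=None, encoder_outputs=None, **kwargs
):
    decoder_inputs = self.decoder.prepare_inputs_for_generation(
        input_ids, past_key_values=past_key_values
    )
    return {
        "attention_mask": attention_mask,
        # .get() instead of [...]: newer Transformers omit these keys
        # from decoder_inputs when they are unset (e.g. on the first
        # generation step, before any cache exists).
        "decoder_attention_mask": decoder_inputs.get("attention_mask"),
        "decoder_input_ids": decoder_inputs["input_ids"],
        "encoder_outputs": encoder_outputs,
        "past_key_values": decoder_inputs.get("past_key_values"),
        "use_cache": use_cache,
    }
```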

An example notebook showing this new behavior (which I did not see on Google Colab in November, and which can be worked around by downgrading HF Transformers) is available at:
https://colab.research.google.com/drive/1pv4tqu4XunBMwyfPF43T8omkUhN3pkYB?usp=sharing

It can be worked around by reverting to an older version of HuggingFace Transformers, v4.45.2; for example, running

```bash
!pip install transformers==v4.45.2
```

(in a Colab notebook) is enough to avoid this.
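
Alternatively, until the generation code is updated, a guard near the top of `run_cogs.py` could fail fast with a pointer to the workaround instead of crashing after 300 epochs of training. A sketch (hypothetical, not in the repo; the 4.46.0 cutoff is an assumption, as I have only confirmed v4.46.3 broken and v4.45.2 working):

```python
# Hypothetical fail-fast guard (not in the repo). The cutoff is an
# assumption: v4.46.3 is confirmed broken, v4.45.2 confirmed working.
from packaging import version  # packaging ships as a transformers dependency

import transformers

if version.parse(transformers.__version__) >= version.parse("4.46.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} breaks generation in this "
        "script (KeyError: 'past_key_values'); try transformers==4.45.2."
    )
```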

I am writing a paper that uses this script as a baseline: I compare a different model against it, and I predict the specific errors made by the baseline Transformer you provide, based on an analysis of a Transformer-compatible rule-based model.
This does not break my model (which does not depend on this repo or on HuggingFace; it just uses the ReCOGS dataset to show the task can be done by a Transformer-compatible model), but I thought I would let you know, as I think the baseline is extremely useful for comparison and for studying Transformer behavior, and people may want to reproduce the results of your excellent paper, https://arxiv.org/abs/2303.13716 . Not everyone may think to run `pip install transformers==v4.45.2` to get around this, and the default version of Transformers on Google Colab will now break your script.

Thanks


willy-b commented Dec 17, 2024

Updated the issue description, as one sentence suggested I had not yet confirmed that downgrading HF Transformers to v4.45.2 was a workaround (it was confirmed while writing this issue). Thanks.
