
transformers 4.44.2 doesn't work with torch.compile and torch.export on T5 generate() #33283

Open · 2 of 4 tasks
yiming0416 opened this issue on Sep 3, 2024 · 3 comments
Labels: bug · Compilation (Issues related to torchdynamo and torchinductor) · Generation

@yiming0416
System Info

  • transformers version: 4.44.2
  • Platform: Linux-5.19.0-0_fbk12_hardened_11583_g0bef9520ca2b-x86_64-with-glibc2.34
  • Python version: 3.10.14
  • Huggingface_hub version: 0.24.6
  • Safetensors version: 0.4.4
  • Accelerate version: 0.33.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.0a0+git33ba952 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA PG509-210

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

The following code breaks:

import torch
import transformers
from transformers import AutoConfig, GenerationConfig

def generate_inputs_for_model(model_cls, model):
    # Random token ids of shape (batch_size=4, sequence_length=2048)
    eval_context = torch.randint(0, model.config.vocab_size, (4, 2048)).to("cuda")
    return {"input_ids": eval_context}

config = AutoConfig.from_pretrained("t5-small")
model_cls = getattr(transformers, "AutoModelForSeq2SeqLM")
model = model_cls.from_config(config).to("cuda")
example_inputs = generate_inputs_for_model(model_cls, model)
example_inputs = (example_inputs["input_ids"],)

generation_config = GenerationConfig(
    max_new_tokens=256,
    pad_token_id=0,
    eos_token_id=None,
    do_sample=False,
    num_beams=1,
    use_cache=True,
)

# Wrap generate() in an nn.Module so the whole decoding loop can be
# handed to torch.compile / torch.export.
class GenerationWrapper(torch.nn.Module):
    def __init__(self, model, generation_config):
        super().__init__()
        self.model = model
        self.generation_config = generation_config

    def forward(self, inputs):
        return self.model.generate(inputs, self.generation_config)

model = GenerationWrapper(model, generation_config)
# torch.compile repro
model_opt = torch.compile(model)
output = model_opt(*example_inputs)
# torch.export repro
torch.export.export(model, args=example_inputs, strict=False)

It fails with the following error:

ValueError: `decoder_start_token_id` or `bos_token_id` has to be defined for encoder-decoder generation.

If I manually add `decoder_start_token_id=0` to the `GenerationConfig`, both compile and export work, although they are very slow.
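For reference, the workaround looks like this (the value 0 matches T5's configured `decoder_start_token_id`, which is also its pad token):

generation_config = GenerationConfig(
    max_new_tokens=256,
    pad_token_id=0,
    eos_token_id=None,
    decoder_start_token_id=0,  # workaround: set explicitly instead of relying on the model config
    do_sample=False,
    num_beams=1,
    use_cache=True,
)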

Expected behavior

Expected `generate()` to work as before, without manually specifying `decoder_start_token_id` or `bos_token_id` in the `GenerationConfig`.

@LysandreJik (Member)

Thanks for the issue! cc @ArthurZucker

@ArthurZucker (Collaborator)

#33221 seems to be needed quite a lot.
cc @gante let's fix the generate issues!

@ArthurZucker added the Compilation and Generation labels on Sep 6, 2024
@gante (Member) commented on Sep 6, 2024

cc @zucchini-nlp, who will be converting BART and T5 to be compile-compatible (using EncoderDecoderCache, like we did on Whisper)
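For context, the Whisper recipe this refers to looks roughly like the sketch below (static KV caches wrapped in an `EncoderDecoderCache`, selected via `cache_implementation="static"`, plus a compiled forward). Whether T5 will end up exposing exactly the same interface is an assumption here:

# Sketch of the Whisper-style static-cache + compile pattern; T5 does not
# support this yet as of transformers 4.44.2.
model.generation_config.cache_implementation = "static"  # builds an EncoderDecoderCache of static caches internally
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)
output = model.generate(input_ids, max_new_tokens=256)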

@zucchini-nlp self-assigned this on Sep 6, 2024