'CohereModel' object has no attribute '_prune_heads' #33235

Open
4 tasks
mnauf opened this issue Aug 31, 2024 · 3 comments

Comments

mnauf commented Aug 31, 2024

System Info

The CoherePreTrainedModel class docstring says: "This model inherits from [PreTrainedModel]. Check the superclass documentation for the generic methods the library implements for all its model (such as downloading or saving, resizing the input embeddings, pruning heads etc.)". However, calling prune_heads() on a Cohere model fails because CohereModel has no _prune_heads method, even though the docstring explicitly lists head pruning among the supported methods.

Screenshot 2024-08-31 at 20 54 26
@ArthurZucker
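
To illustrate why the public method exists but the call still fails, here is a minimal sketch using stand-in classes. It assumes, based on the error message in the title, that prune_heads() on the base class delegates to a private _prune_heads hook on the concrete model. The class names BaseSketch, OlderModelSketch and NewerModelSketch are hypothetical and are not part of transformers; the real implementation may differ in detail.

```python
# Stand-in classes only: a simplified sketch of the delegation pattern,
# not the actual transformers source.


class BaseSketch:
    """Plays the role of PreTrainedModel in this sketch."""

    @property
    def base_model(self):
        return self

    def prune_heads(self, heads_to_prune):
        # The public method is inherited by every subclass, but the real
        # work is delegated to a private hook the subclass must provide.
        self.base_model._prune_heads(heads_to_prune)


class OlderModelSketch(BaseSketch):
    """Plays the role of an older architecture that implements the hook."""

    def __init__(self):
        self.pruned = {}

    def _prune_heads(self, heads_to_prune):
        # Record which heads were requested for each layer.
        for layer, heads in heads_to_prune.items():
            self.pruned.setdefault(layer, set()).update(heads)


class NewerModelSketch(BaseSketch):
    """Plays the role of a model such as CohereModel: no hook defined."""


older = OlderModelSketch()
older.prune_heads({1: [1]})  # works: the hook exists

try:
    NewerModelSketch().prune_heads({1: [1]})
    error_message = None
except AttributeError as exc:
    # Same shape of failure as reported in this issue.
    error_message = str(exc)

print(error_message)
```

Under that assumption, the docstring is accurate about the inherited public method but misleading about whether the model actually supports pruning.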

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

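import torch
from transformers import AutoTokenizer, AutoModel

model_id = "CohereForAI/aya-23-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)

# Ask to prune head 1 in layer 1; this raises:
# AttributeError: 'CohereModel' object has no attribute '_prune_heads'
model.prune_heads({1: [1]})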

Expected behavior

I expect prune_heads() to prune the specified heads of the Cohere model.

mnauf added the bug label Aug 31, 2024

mnauf commented Aug 31, 2024

#29622

mnauf commented Aug 31, 2024

@saurabhdash2512 Please have a look

@LysandreJik (Member)

Thanks for your issue, @mnauf! The documentation should be updated: head pruning is indeed only supported for older models, since newer models haven't been explored for it from a research perspective.
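
Until the documentation is updated, one way to avoid the AttributeError is to check for the hook before calling prune_heads(). The following is a minimal sketch, assuming that prune_heads() relies on a _prune_heads method on the base model, as the error message suggests. The helper names supports_head_pruning and prune_heads_if_supported are hypothetical and are not part of the transformers API.

```python
def supports_head_pruning(model):
    """Return True if `model` (or its base model) defines `_prune_heads`.

    Assumption: `prune_heads` delegates to `base_model._prune_heads`,
    as the error message in this issue suggests.
    """
    base = getattr(model, "base_model", model)
    return callable(getattr(base, "_prune_heads", None))


def prune_heads_if_supported(model, heads_to_prune):
    """Prune heads when the hook exists; otherwise leave the model untouched.

    Returns True if pruning was attempted, False if it was skipped.
    """
    if not supports_head_pruning(model):
        return False
    model.prune_heads(heads_to_prune)
    return True
```

With the model from the reproduction above, prune_heads_if_supported(model, {1: [1]}) would be expected to return False rather than raise, if the assumption holds.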
