Analysing a fairseq transformer model #37

Open
aprzez opened this issue Jan 12, 2023 · 4 comments

Comments
aprzez commented Jan 12, 2023

Hello,

I trained a fairseq transformer model on an inflection task, and I am now trying to use the NeuroX toolkit to extract the representations. However, I am not sure how to import the model into neurox.data.extraction.transformers_extractor; I only have the .pt files from the training checkpoints. Could you guide me?

fdalvi (Owner) commented Jan 15, 2023

Hello @aprzez,

Currently NeuroX only supports transformers models (from HuggingFace) for extraction, because extracting the representations requires access to, and knowledge of, the internals of the implementation. There are two possible ways to enable extraction for fairseq models:

  1. Implement a fairseq_extractor module. Depending on how fairseq implements models, this may be reasonably easy to code up. I see references to return_all_hiddens (https://github.com/facebookresearch/fairseq/blob/58cc6cca18f15e6d56e3f60c959fe4f878960a60/fairseq/models/transformer/transformer_encoder.py#L139) in the code, which is probably what we will end up using if we go down this path (a minimal loading sketch follows after this list). I am not sure when I can get to this myself; however, if you are interested, I'd be happy to mentor you through the process!
  2. Convert your fairseq model to a transformers model. I am not sure how well this works, but it seems others have tried this before with some success: https://discuss.huggingface.co/t/how-can-i-convert-a-model-created-with-fairseq/564/16
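
Whichever route you take, loading the trained checkpoint itself should be straightforward through fairseq's hub interface. A minimal sketch, assuming a standard fairseq setup; "checkpoints", "checkpoint_best.pt", and "data-bin/inflection" are placeholders for your own checkpoint directory, checkpoint file, and the binarized data directory used for training:

```python
from fairseq.models.transformer import TransformerModel

# Minimal sketch: loading a trained fairseq transformer checkpoint.
# The paths below are placeholders for your own checkpoint directory,
# checkpoint file, and binarized data directory.
hub = TransformerModel.from_pretrained(
    "checkpoints",
    checkpoint_file="checkpoint_best.pt",
    data_name_or_path="data-bin/inflection",
)
hub.eval()
model = hub.models[0]  # the underlying TransformerModel (encoder + decoder)
```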

Hope this helps!

aprzez (Author) commented Jan 29, 2023

Hello @fdalvi,

Thank you so much for responding!

If you have some guidance for the first option (implementing fairseq_extractor), that would be great.

Thank you!

aprzez (Author) commented Feb 6, 2023

Hi again,

I'm assuming I need to set return_all_hiddens to True so that all hidden states are saved during training. However, the model checkpoints look the same when return_all_hiddens is True as when it is False. I would be very grateful for some pointers on implementing the fairseq_extractor module; it's not clear to me what else I need to do to get this to work. Thank you in advance!

fdalvi (Owner) commented Mar 6, 2023

Apologies for the delayed response; I was away and this slipped my mind after coming back. return_all_hiddens changes what your inference-time outputs look like, not what happens during training. So after you set return_all_hiddens=True, the fourth output from the forward function will contain the encoder states (see the sketch below). Once you have these, we can work towards a detokenization procedure like the one we have in transformers_extractor. Please feel free to ask if something is unclear or you need some assistance!
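
To make this concrete, here is a rough sketch of that extraction step. This is an assumption-laden example, not NeuroX code: the paths are placeholders, and in recent fairseq versions the encoder returns a dict where the per-layer states live under "encoder_states" (in older versions the same list is the fourth field of the EncoderOut tuple):

```python
import torch
from fairseq.models.transformer import TransformerModel

# Hedged sketch of the extraction step; the directory names and checkpoint
# file below are placeholders for your own training artifacts.
hub = TransformerModel.from_pretrained(
    "checkpoints",
    checkpoint_file="checkpoint_best.pt",
    data_name_or_path="data-bin/inflection",
)
hub.eval()
model = hub.models[0]

tokens = hub.encode("example source sequence").unsqueeze(0)  # (1, T) token ids
lengths = torch.tensor([tokens.size(1)])

with torch.no_grad():
    enc_out = model.encoder(tokens, src_lengths=lengths, return_all_hiddens=True)

# In recent fairseq versions the encoder returns a dict; "encoder_states"
# holds one (T, B, C) tensor per encoder layer. In older versions this list
# is the fourth field of the EncoderOut named tuple.
layer_states = enc_out["encoder_states"]
print(len(layer_states), layer_states[0].shape)
```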

Best,
Fahim
