Analysing a fairseq transformer model #37

Open
aprzez opened this issue Jan 12, 2023 · 4 comments

Comments
aprzez commented Jan 12, 2023

Hello,

I trained a fairseq transformer model on an inflection task, and I am now trying to use the NeuroX toolkit to extract the representations. However, I am not sure how to import the model into neurox.data.extraction.transformers_extractor; I only have the .pt files from the training checkpoints. Could you guide me?

fdalvi (Owner) commented Jan 15, 2023

Hello @aprzez,

Currently NeuroX only supports transformers models (from HuggingFace) for extraction, because extracting the representations requires access to, and knowledge of, the internals of the implementation. There are two possible ways to enable extraction for fairseq models:

  1. Implement a fairseq_extractor module. Depending on how fairseq implements models, this may be reasonably easy to code up. I see references to return_all_hiddens (https://github.com/facebookresearch/fairseq/blob/58cc6cca18f15e6d56e3f60c959fe4f878960a60/fairseq/models/transformer/transformer_encoder.py#L139) in the code, which is probably what we will end up using if we go down this path (a minimal loading sketch follows after this list). I am not sure when I can get to this myself; however, if you are interested, I'd be happy to mentor you through the process!
  2. Convert your fairseq model to a transformers model. I am not sure how well this works, but it seems others have tried this before with some success: https://discuss.huggingface.co/t/how-can-i-convert-a-model-created-with-fairseq/564/16
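
Whichever route you take, loading the trained checkpoint itself should be straightforward through fairseq's hub interface. A minimal sketch, assuming a standard fairseq setup; "checkpoints", "checkpoint_best.pt", and "data-bin/inflection" are placeholders for your own checkpoint directory, checkpoint file, and the binarized data directory used for training:

```python
from fairseq.models.transformer import TransformerModel

# Minimal sketch: loading a trained fairseq transformer checkpoint.
# The paths below are placeholders for your own checkpoint directory,
# checkpoint file, and binarized data directory.
hub = TransformerModel.from_pretrained(
    "checkpoints",
    checkpoint_file="checkpoint_best.pt",
    data_name_or_path="data-bin/inflection",
)
hub.eval()
model = hub.models[0]  # the underlying TransformerModel (encoder + decoder)
```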

Hope this helps!

aprzez (Author) commented Jan 29, 2023

Hello @fdalvi,

Thank you so much for responding!

If you have some guidance for the first option (implementing fairseq_extractor), that would be great.

Thank you!

aprzez (Author) commented Feb 6, 2023

Hi again,

I'm assuming I need to set return_all_hiddens to True so that all hidden states are saved during training. However, the model checkpoints look the same when return_all_hiddens is True as when it is False. I would be very grateful for some pointers on implementing the fairseq_extractor module; it's not clear to me what else I need to do to get this to work. Thank you in advance!

fdalvi (Owner) commented Mar 6, 2023

Apologies for the delayed response; I was away and this slipped my mind after coming back. return_all_hiddens changes what your inference-time outputs look like, not what happens during training. So after you set return_all_hiddens=True, the fourth output from the forward function will contain the encoder states (see the sketch below). Once you have these, we can work towards a detokenization procedure like the one we have in transformers_extractor. Please feel free to ask if something is unclear or you need some assistance!
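
To make this concrete, here is a rough sketch of that extraction step. This is an assumption-laden example, not NeuroX code: the paths are placeholders, and in recent fairseq versions the encoder returns a dict where the per-layer states live under "encoder_states" (in older versions the same list is the fourth field of the EncoderOut tuple):

```python
import torch
from fairseq.models.transformer import TransformerModel

# Hedged sketch of the extraction step; the directory names and checkpoint
# file below are placeholders for your own training artifacts.
hub = TransformerModel.from_pretrained(
    "checkpoints",
    checkpoint_file="checkpoint_best.pt",
    data_name_or_path="data-bin/inflection",
)
hub.eval()
model = hub.models[0]

tokens = hub.encode("example source sequence").unsqueeze(0)  # (1, T) token ids
lengths = torch.tensor([tokens.size(1)])

with torch.no_grad():
    enc_out = model.encoder(tokens, src_lengths=lengths, return_all_hiddens=True)

# In recent fairseq versions the encoder returns a dict; "encoder_states"
# holds one (T, B, C) tensor per encoder layer. In older versions this list
# is the fourth field of the EncoderOut named tuple.
layer_states = enc_out["encoder_states"]
print(len(layer_states), layer_states[0].shape)
```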

Best,
Fahim
