Tasks

- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('facebook/xglm-564M')
# Some weights of XGLMForCausalLM were not initialized from the model checkpoint
# at facebook/xglm-564M and are newly initialized: ['model.embed_positions.weights']
# You should probably TRAIN this model on a down-stream task to be able to use it
# for predictions and inference.
```
Expected behavior
The warning should not be triggered.
The positional embedding for XGLM uses XGLMSinusoidalPositionalEmbedding, which does not have actual trainable parameters. My guess is that, since there are no parameters, the key is not actually stored in the checkpoint, which triggers the warning. This issue might exist in other models that similarly use an nn.Module with no trainable parameters.
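The suspected mechanism can be sketched with a minimal toy module standing in for XGLMSinusoidalPositionalEmbedding (the class and attribute names below are illustrative, not the actual transformers code): a deterministically computed table registered as a normal (persistent) buffer still lands in the state_dict, so loading a checkpoint saved without it reports a missing key, which from_pretrained surfaces as the "newly initialized" warning.

```python
import torch
import torch.nn as nn


class SinusoidalPositionalEmbedding(nn.Module):
    """Toy stand-in: the table is computed deterministically, never trained."""

    def __init__(self, num_positions=8, dim=4):
        super().__init__()
        # Registered as a persistent buffer (the default), so it appears in
        # state_dict even though it carries no trainable parameters.
        self.register_buffer("weights", self._build(num_positions, dim))

    @staticmethod
    def _build(num_positions, dim):
        pos = torch.arange(num_positions, dtype=torch.float32).unsqueeze(1)
        inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
        angles = pos * inv_freq
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        self.embed_positions = SinusoidalPositionalEmbedding()


model = ToyModel()
# The buffer shows up in the state_dict despite not being trainable...
assert "embed_positions.weights" in model.state_dict()

# ...so a checkpoint saved without it produces a missing key on load,
# which is what triggers the "newly initialized" warning.
checkpoint = {k: v for k, v in model.state_dict().items()
              if "embed_positions" not in k}
result = model.load_state_dict(checkpoint, strict=False)
assert result.missing_keys == ["embed_positions.weights"]
```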
Thanks for raising the issue @MattYoon. @ArthurZucker can confirm; I think the fix would be to make the weights of that module non-persistent (persistent=False) so that they won't get saved in the state_dict: https://github.com/huggingface/transformers/blob/main/src/transformers/models/xglm/modeling_xglm.py#L179
Indeed, there is no need to save them in the state dict (or to consider them when loading the weights), as they are created on the fly. If that fixes the issue, would you be happy to open a PR with the fix?
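As a sketch of the suggested change (again a toy module, not the actual modeling_xglm.py code): registering the precomputed table with persistent=False keeps it out of the state_dict entirely, so checkpoints neither save it nor complain about it being missing on load.

```python
import torch
import torch.nn as nn


class SinusoidalPositionalEmbedding(nn.Module):
    """Toy stand-in with the proposed fix applied."""

    def __init__(self, num_positions=8, dim=4):
        super().__init__()
        # persistent=False excludes the buffer from state_dict; the table is
        # recreated on the fly, so it never needs to round-trip a checkpoint.
        # (torch.zeros is a placeholder; the real table is sinusoidal.)
        self.register_buffer("weights",
                             torch.zeros(num_positions, dim),
                             persistent=False)


module = SinusoidalPositionalEmbedding()
# The buffer no longer appears in the state_dict...
assert "weights" not in module.state_dict()
# ...and loading a checkpoint without it reports no missing keys.
assert module.load_state_dict({}, strict=False).missing_keys == []
```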
System Info

transformers version: 4.32.0

Who can help?

@ArthurZucker @younesbelkada

But this is probably a general issue in the PreTrainedModel class.