Checking that the LM actually trained #3728
Yes: simply …
I'd check if 'GPT2' works by sampling from a simple prompt. E.g.:
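A minimal sketch of that kind of check, assuming the trained checkpoint was saved to a placeholder path `./my-gpt2` (the path and prompt are illustrative, not from the original comment):

```python
# Quick sanity check: sample a continuation from the trained checkpoint.
# "./my-gpt2" and the prompt below are placeholders.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("./my-gpt2")
model = GPT2LMHeadModel.from_pretrained("./my-gpt2")
model.eval()

input_ids = tokenizer.encode("The quick brown fox", return_tensors="pt")

# A model that actually trained should produce roughly coherent text;
# an untrained one tends to emit repetitive or random tokens.
output_ids = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```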
Thanks for clarifying! I was about to consider sending a PR for a GenerationPipeline.
I have a branch that implements a GenerationPipeline, and the initial version already works for GPT models. The implementation is based on the approach taken in run_generation.py, which means the forward pass uses the model's generate() method. So far, the code works smoothly for the GPT models. Sample code:
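The original sample code was not preserved in this thread; below is a sketch of how the branch's GenerationPipeline might be invoked, assuming its constructor and call signature mirror the existing pipelines (the class is only available on that branch, not on master at the time of this comment):

```python
# Assumed usage of the branch's GenerationPipeline; the (model, tokenizer)
# constructor and the call signature are assumptions, not the exact sample.
from transformers import AutoModelWithLMHead, AutoTokenizer
from transformers import GenerationPipeline  # only available on the feature branch

model = AutoModelWithLMHead.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

generator = GenerationPipeline(model=model, tokenizer=tokenizer)  # assumed signature
print(generator("The quick brown fox", max_length=30))            # assumed call
```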
However, the module still doesn't work with some of the other language models. I will do a root cause analysis on this and will send a PR as soon as I get it working on the rest of the language models that should support generation. For more details, you can check out this colab notebook, which shows the GPT models working so far and the remaining models failing in the later sections.
[UPDATE] The issues above have been resolved and I'm in the process of sending a PR. There is a Google Colab tutorial here for running the GenerationPipeline.
Your PR looks very nice so far :-) I will take a look early next week!
Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I have trained a GPT-2 from scratch following the approach described in this post: https://huggingface.co/blog/how-to-train .
In step 4, where the author checks whether the trained model actually works, he uses the "fill-mask" pipeline, but that only works for models trained with a masked language modeling objective.
Is there something similar to "fill-mask" that I could use for my case?
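For reference, a sketch of the kind of check being asked about, using the text-generation pipeline that later transformers releases provide as the causal-LM counterpart of "fill-mask" (`./my-gpt2` is a placeholder for the trained checkpoint's directory):

```python
# Causal-LM analogue of the "fill-mask" check: generate text from the
# trained checkpoint. "./my-gpt2" is a placeholder output directory.
from transformers import pipeline

generator = pipeline("text-generation", model="./my-gpt2")
print(generator("The sun rises over", max_length=30, do_sample=True))
```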