Repository with recipes how to pretrain model from scratch on my own data #2814
Comments
Hi @ksopyla, that's a great – but very broad – question. We just wrote a blog post that might be helpful: https://huggingface.co/blog/how-to-train. The post itself is on GitHub, so feel free to improve/edit it too.
Thank you, @julien-c. It will help to add new models to the transformers model repository :)
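For readers who don't follow the link: the post's recipe boils down to roughly the sketch below. The file paths, directory names, and hyperparameters are placeholders, and the exact API may differ slightly across `tokenizers`/`transformers` versions.

```python
# Sketch of the blog post's recipe: train a byte-level BPE tokenizer on raw
# text, then build a randomly initialized RoBERTa masked-LM with a matching
# config. Paths and hyperparameters below are placeholders.
import os

from tokenizers import ByteLevelBPETokenizer
from transformers import RobertaConfig, RobertaForMaskedLM

# 1. Train a tokenizer on plain-text files (one sequence per line).
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["my_corpus.txt"],  # placeholder path
    vocab_size=52_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
os.makedirs("my_model", exist_ok=True)
tokenizer.save_model("my_model")  # writes vocab.json and merges.txt

# 2. Build a RoBERTa model from a config: weights are random, not pretrained.
config = RobertaConfig(
    vocab_size=52_000,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,
)
model = RobertaForMaskedLM(config)
```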
Hi,
@ddofer You are right, this is in the process of being addressed at huggingface/blog#3. Feel free to help :)
@julien-c Is it possible to do another example using BERT to pretrain the LM instead of RoBERTa? I followed the steps, but it doesn't seem to work when I change the `model_type` to `bert`.
I am a new contributor and thought this might be a reasonable issue to start with. I'm happy to add an additional example of using BERT rather than RoBERTa to pretrain the LM. Please let me know if this would be helpful and/or if starting elsewhere would be better.
Great that you want to contribute; any help is welcome! Fine-tuning and pretraining BERT already seem to be covered in run_language_modeling.py, though, so your contribution should differ significantly from that functionality. Perhaps it can be written in a more educational rather than production-ready way? That would definitely be useful - explaining all the concepts from scratch and such. (But not an easy task.)
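As a rough illustration of the BERT-instead-of-RoBERTa question above: the from-scratch setup changes in two places, the tokenizer (BERT uses a WordPiece vocabulary) and the model/config classes. This is only a sketch with placeholder paths and sizes, not a verified drop-in for the blog notebook.

```python
# Hypothetical BERT variant of the same recipe: WordPiece tokenizer plus
# BertConfig/BertForMaskedLM. Paths and sizes are placeholders.
import os

from tokenizers import BertWordPieceTokenizer
from transformers import BertConfig, BertForMaskedLM

tokenizer = BertWordPieceTokenizer()
tokenizer.train(files=["my_corpus.txt"], vocab_size=30_522)  # placeholder corpus
os.makedirs("my_bert", exist_ok=True)
tokenizer.save_model("my_bert")  # writes vocab.txt

config = BertConfig(vocab_size=30_522)  # other sizes default to bert-base
model = BertForMaskedLM(config)         # randomly initialized, no pretrained weights
```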
First version of a notebook is up over at https://github.com/huggingface/blog/tree/master/notebooks
I'll give it a shot :)
Hey @laurenmoos, `model=Roberta()`
@aditya-malte I'd love to! I will work on that and evaluate the request for additional documentation afterwards. Is there an issue to jump on?
Let me know if you’re interested. I’d be excited to collaborate!
@aditya-malte yes!
Hi, did we make any progress on the feature discussed above? A Keras-like wrapper sounds awesome for Transformers. I would like to contribute to the development.
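To make the idea concrete, here is one purely hypothetical shape such a wrapper could take on top of the existing Trainer API. Nothing below exists in the library; the class and method names are invented for illustration.

```python
# Purely illustrative: a Keras-style fit() facade over transformers.Trainer.
from transformers import Trainer, TrainingArguments


class FitWrapper:
    """Hypothetical fit()-style wrapper around transformers.Trainer."""

    def __init__(self, model, output_dir="out"):
        self.model = model
        self.output_dir = output_dir

    def fit(self, train_dataset, epochs=1, batch_size=8, data_collator=None):
        # Translate the Keras-like arguments into TrainingArguments.
        args = TrainingArguments(
            output_dir=self.output_dir,
            num_train_epochs=epochs,
            per_device_train_batch_size=batch_size,
        )
        trainer = Trainer(
            model=self.model,
            args=args,
            train_dataset=train_dataset,
            data_collator=data_collator,
        )
        trainer.train()
        return trainer
```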
@julien-c Thanks for this. I have a question regarding this log line: `05/01/2020 17:44:01 - INFO - transformers.tokenization_utils - Didn't find file /<path-to-my-output-dir>/special_tokens_map.json. We won't load it.` This is not mentioned in the tutorial. Should we create this mapping file too?
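For context on that log line: `special_tokens_map.json` is normally not written by hand; it is one of the files a tokenizer's `save_pretrained` creates, and the INFO message only says the file was not found in the directory being loaded. A minimal sketch, with a placeholder output directory:

```python
# Sketch: saving a tokenizer with save_pretrained also writes
# special_tokens_map.json, so the "Didn't find file ..." INFO message should
# not appear when loading from that directory afterwards.
import os

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
os.makedirs("my_output_dir", exist_ok=True)
tokenizer.save_pretrained("my_output_dir")  # placeholder output directory
# my_output_dir/ now contains vocab.txt, tokenizer_config.json,
# and special_tokens_map.json
reloaded = BertTokenizer.from_pretrained("my_output_dir")
```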
Hi @dashayushman,
@julien-c @aditya-malte how can I do that? Also, how can I save the tokenized data?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi @BramVanroy @julien-c
I would also like to see an example of how to train a language model (like BERT) from scratch with TensorFlow on my own dataset, so I can fine-tune it later on a specific task.
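For the TensorFlow side, here is a minimal sketch of initializing a BERT masked-LM with random weights using the TF classes in transformers. The sizes are placeholders, and the actual MLM training loop (e.g. Keras fit with a masking objective) is left out.

```python
# Minimal TensorFlow sketch: a BERT masked-LM built from a config (random
# weights, no pretrained checkpoint), plus a dummy forward pass.
import tensorflow as tf
from transformers import BertConfig, TFBertForMaskedLM

config = BertConfig(vocab_size=30_522, num_hidden_layers=6)  # placeholder sizes
model = TFBertForMaskedLM(config)  # not from_pretrained: weights are random

dummy_ids = tf.constant([[101, 2023, 2003, 103, 102]])  # fake token ids
logits = model(dummy_ids)[0]  # index 0 works across transformers versions
print(logits.shape)           # (1, 5, 30522)
```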
ping @jplu ;)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
🚀 Feature request
It would be very useful to have documentation on how to train different models, not necessarily with transformers, but also with external libraries (like the original BERT implementation, fairseq, etc.).
Maybe another repository with READMEs or docs containing recipes from those who have already pretrained their models, so the procedure can be reproduced for other languages or domains.
There are many external resources (blog posts, arXiv articles), but they lack details and are very often not reproducible.
Motivation
Have a proven recipe for training the models and make it easy for others to train a custom model. The community will then be able to easily train language- or domain-specific models.
More models would become available in the transformers library.
There are many issues related to this: