
Is training from scratch possible now? #1283

Closed
Stamenov opened this issue Sep 18, 2019 · 9 comments

@Stamenov

Do the models support training from scratch, together with original (paper) parameters?

@Zacharias030

Zacharias030 commented Sep 18, 2019

You can just instantiate the models without .from_pretrained(), like so:

config = BertConfig()  # optionally pass your favorite parameters
model = BertForPreTraining(config)

I added a flag to run_lm_finetuning.py that gets checked in main(). Maybe this snippet helps (note: I am only using this with BERT, without next-sentence prediction).

# check whether to initialize the model freshly instead of loading a pretrained checkpoint
if args.do_fresh_init:
    config = config_class()
    tokenizer = tokenizer_class()
    if args.block_size <= 0:
        args.block_size = tokenizer.max_len  # Our input block size will be the max possible for the model
    args.block_size = min(args.block_size, tokenizer.max_len)
    model = model_class(config=config)
else:
    config = config_class.from_pretrained(args.config_name if args.config_name else args.model_name_or_path)
    tokenizer = tokenizer_class.from_pretrained(args.tokenizer_name if args.tokenizer_name else args.model_name_or_path)
    if args.block_size <= 0:
        args.block_size = tokenizer.max_len  # Our input block size will be the max possible for the model
    args.block_size = min(args.block_size, tokenizer.max_len)
    model = model_class.from_pretrained(args.model_name_or_path, from_tf=bool('.ckpt' in args.model_name_or_path), config=config)
model.to(args.device)
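
On the original question about the paper parameters: a fresh config can spell them out explicitly. A minimal sketch (assuming a recent transformers version where BertConfig accepts these keyword arguments; the values are the published BERT-base settings, with BERT-large noted in comments):

# Sketch: explicit BERT-base hyperparameters from the paper (BERT-large values in comments).
from transformers import BertConfig, BertForPreTraining

config = BertConfig(
    vocab_size=30522,             # WordPiece vocabulary size of the released BERT models
    hidden_size=768,              # 1024 for BERT-large
    num_hidden_layers=12,         # 24 for BERT-large
    num_attention_heads=12,       # 16 for BERT-large
    intermediate_size=3072,       # 4096 for BERT-large
    max_position_embeddings=512,
)
model = BertForPreTraining(config)   # weights are randomly initialized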

@Stamenov
Author

Hi,

thanks for the quick response.
I am more interested in the XLNet and TransformerXL models. Would they have the same interface?

@Zacharias030

Zacharias030 commented Sep 18, 2019 via email

@gooofy

gooofy commented Sep 21, 2019

I think XLNet requires a very specific training procedure, see #943 👍

"For XLNet, the implementation in this repo is missing some key functionality (the permutation generation function and an analogue of the dataset record generator) which you'd have to implement yourself."

@p-stefanov

#1283 (comment)

Hmm, tokenizers' constructors require a vocab_file parameter...
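
One way around that when initializing from scratch is to pass your own vocabulary file to the tokenizer constructor instead of calling .from_pretrained(); a minimal sketch (the path is a placeholder):

# Sketch: construct the tokenizer directly from a vocabulary file you provide.
from transformers import BertTokenizer

tokenizer = BertTokenizer(vocab_file="path/to/your/vocab.txt")  # placeholder path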

@stale

stale bot commented Nov 21, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 21, 2019
@stale stale bot closed this as completed Nov 28, 2019
@jbmaxwell

@Stamenov Did you figure out how to pretrain XLNet? I'm interested in that as well.

@Stamenov
Author

No, I haven't. According to a recent tweet, Hugging Face may prioritize putting more effort into providing interfaces for pre-training from scratch.

@julien-c
Member

julien-c commented Feb 14, 2020

You can now leave --model_name_or_path unset (i.e. None) in run_language_modeling.py to train a model from scratch.

See also https://huggingface.co/blog/how-to-train
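
Condensed from that blog post, the from-scratch setup looks roughly like this (a small RoBERTa-like model; the hyperparameters below are the post's illustrative ones, not requirements):

# Rough sketch following the "how to train" blog post: fresh config, fresh model.
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(
    vocab_size=52000,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,
)
model = RobertaForMaskedLM(config)  # randomly initialized, trained with the masked-LM objective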

julien-c added a commit to huggingface/blog that referenced this issue Feb 14, 2020