NMT Build Options

Damien Daspit edited this page Dec 14, 2023 · 1 revision

Overview

You can override the build settings for the NMT engine by setting the options property when starting a build. The options property accepts a JSON-formatted configuration, and any setting included in it overrides the engine's default for that setting. The NMT engine is based on Hugging Face Transformers, so the settings configure training and inference of the Hugging Face model. The following example configuration sets the maximum number of training steps to 10000:

{
    "train_params": {
        "max_steps": 10000
    }
}

Settings Reference

  • parent_model_name: the name of the parent model. See Hugging Face Hub for available models. The model must be an NMT model.
  • train_params: the training section. All available settings can be found in the Hugging Face documentation.
    • max_steps: the total number of training steps to perform. In Hugging Face Transformers, a positive value takes precedence over num_train_epochs.
    • num_train_epochs: the total number of training epochs to perform.
    • per_device_train_batch_size: the training batch size.
    • gradient_accumulation_steps: the number of update steps to accumulate the gradients for, before performing a backward/update pass.
    • optim: the optimizer.
    • learning_rate: the initial learning rate for the optimizer.
    • warmup_steps: the number of steps used for a linear warmup from 0 to learning_rate.
    • label_smoothing_factor: the label smoothing factor to use.
    • gradient_checkpointing: if true, use gradient checkpointing to save memory at the expense of a slower backward pass.
  • generate_params: the generation section.
    • num_beams: the beam width of the decoder search. A value of 1 makes the search greedy.
    • batch_size: the generation batch size.
  • tokenizer: the tokenizer section.
    • add_unk_src_tokens: if true, then all unknown characters in the source corpus will be added to the vocabulary.
    • add_unk_trg_tokens: if true, then all unknown characters in the target corpus will be added to the vocabulary.
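
As a rough illustration of how the sections above fit together, the following Python sketch builds a configuration that touches all three sections and serializes it to the JSON text that the options property accepts. The values are arbitrary placeholders chosen for the example. They are not the engine defaults and not tuning recommendations, and the variable names are hypothetical.

```python
import json

# Illustrative values only: these are NOT the engine defaults or recommendations.
options = {
    "train_params": {
        "max_steps": 5000,
        "per_device_train_batch_size": 16,
        "gradient_accumulation_steps": 4,
        "learning_rate": 5e-5,
        "warmup_steps": 500,
        "label_smoothing_factor": 0.1,
        "gradient_checkpointing": True,
    },
    "generate_params": {
        "num_beams": 2,
        "batch_size": 16,
    },
    "tokenizer": {
        "add_unk_src_tokens": True,
        "add_unk_trg_tokens": True,
    },
}

# Serialize to JSON; Python's True becomes JSON's true.
options_json = json.dumps(options, indent=4)
print(options_json)
```

With these placeholder values, gradients from 4 batches of 16 examples are accumulated before each optimizer update, so each update sees 64 examples per device.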

The default settings can be found in the default/huggingface section of the Machine.py job settings file. For training hyperparameters that are not specified in the file, the default values for the Seq2SeqTrainingArguments class are used.
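
The override behavior can be pictured with a small Python sketch. It assumes that overrides are merged into the defaults key by key within each section, which this page does not confirm; the actual merging is done by Machine.py and may differ. The merge_options helper and the default values shown are hypothetical and invented for the example.

```python
import copy


def merge_options(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied recursively.

    Assumption: nested sections are merged key by key rather than replaced whole.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge their settings.
            merged[key] = merge_options(merged[key], value)
        else:
            # Scalar setting or new key: the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, invented for illustration only.
defaults = {
    "train_params": {"max_steps": 20000, "warmup_steps": 1000},
    "generate_params": {"num_beams": 2},
}
overrides = {"train_params": {"max_steps": 10000}}

effective = merge_options(defaults, overrides)
print(effective)
```

Under that assumption, only max_steps changes; the other settings keep their default values, and the defaults dictionary itself is left unmodified.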