There are two issues with the scorer (used at validation time):
here: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/utils/scoring_utils.py#L49-L58
We remove all EOS/BOS/PAD tokens from the batch to reconstruct a "genuine" batch that will be passed to the translator.
However, we should only remove BOS/EOS in the way they were added (depending on the task, MT or LM, and on the side, SRC or TGT).
Otherwise, when such a token appears in the prefix or suffix strings, it is removed even though it should not be.
Example: nllb-200 requires a suffix like " deu_Latn".
When translating requires "translation options" (e.g. tgt_file_prefix, but this might be true for some others), they are not taken into account, since we are in "training mode" with respect to the "opts" cases in opts.py.
This may require some adaptation.
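
For the first issue, one possible direction is to strip only the special tokens that were added at the sequence boundaries, and leave any occurrence inside the sequence (for instance one coming from a prefix or suffix string) untouched. The sketch below is only an illustration: `strip_added_specials` is a hypothetical helper, not an existing OpenNMT-py function, the token strings `<s>`, `</s>` and `<blank>` are assumed defaults, and the `added_bos` / `added_eos` flags are assumed to be derived by the caller from the task and side.

```python
from typing import List


def strip_added_specials(
    tokens: List[str],
    bos: str = "<s>",        # assumed BOS string
    eos: str = "</s>",       # assumed EOS string
    pad: str = "<blank>",    # assumed PAD string
    added_bos: bool = False,  # True if BOS was prepended for this task/side
    added_eos: bool = False,  # True if EOS was appended for this task/side
) -> List[str]:
    """Hypothetical helper: undo only what batching added at the boundaries."""
    out = list(tokens)
    # Drop right-side padding (this sketch assumes right padding only).
    while out and out[-1] == pad:
        out.pop()
    # Remove at most one leading BOS, and only if it was actually added.
    if added_bos and out and out[0] == bos:
        out = out[1:]
    # Remove at most one trailing EOS, and only if it was actually added.
    if added_eos and out and out[-1] == eos:
        out = out[:-1]
    return out


# A special token that sits inside the sequence is kept.
example = ["Hello", "world", "</s>", "deu_Latn", "<blank>", "<blank>"]
print(strip_added_specials(example))
```

With this approach a blanket filter on EOS/BOS/PAD is replaced by a boundary-only strip, so a token that is part of the prefix or suffix survives the reconstruction.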
@l-k-11235