[bug] scorer_utils #2329

Closed
vince62s opened this issue Mar 16, 2023 · 0 comments · Fixed by #2544
@vince62s
Member

There are two issues with the scorer (used at validation time):

  1. Here: https://github.com/OpenNMT/OpenNMT-py/blob/master/onmt/utils/scoring_utils.py#L49-L58
    we remove all EOS/BOS/PAD tokens from the batch to reconstruct a "genuine" batch that will be passed to the translator.
    However, we should strip BOS/EOS only the way it was added (depending on the task, MT or LM, and on the side, SRC or TGT).
    Indeed, when such a token appears in the prefix or suffix strings, it is removed when it should not be.
    Example: nllb-200, which requires a suffix like " deu_Latn".

  2. When translating requires "translation options" (e.g. tgt_file_prefix, but this might be true for some others), they are not taken into account, since we are in "training mode" with respect to the "opts" cases in opts.py.
    This may require some adaptation.
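
To illustrate the first point, here is a minimal sketch contrasting a blanket removal of special tokens with a removal that only undoes what was added at the sequence boundaries. It is not the code in scoring_utils.py: the function names, the `added_bos` / `added_eos` flags, the assumption of right padding, and the example sentence are all hypothetical, and the token strings are assumed to be `<s>`, `</s>` and `<blank>`.

```python
from typing import List

# Assumed special-token strings, for illustration only.
BOS, EOS, PAD = "<s>", "</s>", "<blank>"


def strip_everywhere(tokens: List[str]) -> List[str]:
    """Blanket removal: drops every BOS/EOS/PAD, wherever it occurs."""
    return [t for t in tokens if t not in (BOS, EOS, PAD)]


def strip_as_added(tokens: List[str], added_bos: bool, added_eos: bool) -> List[str]:
    """Undo only what batching added: trailing PAD, then one boundary BOS/EOS.

    added_bos / added_eos are hypothetical flags that would be derived from
    the task (MT or LM) and the side (SRC or TGT). Right padding is assumed.
    """
    start, end = 0, len(tokens)
    # Remove trailing padding.
    while end > start and tokens[end - 1] == PAD:
        end -= 1
    # Remove a single leading BOS only if one was added.
    if added_bos and start < end and tokens[start] == BOS:
        start += 1
    # Remove a single trailing EOS only if one was added.
    if added_eos and end > start and tokens[end - 1] == EOS:
        end -= 1
    return tokens[start:end]


# Hypothetical source side whose prefix itself contains an EOS-like token.
example = ["</s>", "eng_Latn", "▁Hello", "</s>", "<blank>", "<blank>"]

print(strip_everywhere(example))
print(strip_as_added(example, added_bos=False, added_eos=True))
```

With the blanket version, the `</s>` that belongs to the prefix disappears along with the real end-of-sequence marker, while the boundary-aware version keeps it.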

@l-k-11235

vince62s added a commit to vince62s/OpenNMT-py that referenced this issue Jan 3, 2024
@vince62s vince62s mentioned this issue Jan 3, 2024
vince62s added a commit that referenced this issue Jan 3, 2024