I notice that the output starts with a leading space token, e.g. ['▁', 'New', '▁York'].
What is the purpose of this first spacer? It seems to bring no extra benefit. Thank you.
See https://github.com/google/sentencepiece/blob/master/src/sentencepiece_model.proto#L201
You can disable the dummy prefix by passing `--add_dummy_prefix=false` to `spm_train`.
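To illustrate what the dummy prefix buys you, here is a toy sketch in plain Python. It is not the SentencePiece algorithm: `to_pieces` is a hypothetical helper that splits only at whitespace markers, and it assumes the option exists so that a word is encoded the same way whether it appears at the start of the text or after a space.

```python
SPACE = "\u2581"  # "▁", the meta symbol that stands in for whitespace


def to_pieces(text, add_dummy_prefix=True):
    """Toy word-level splitter (hypothetical, NOT the real SentencePiece model)."""
    # Replace spaces with the meta symbol, as SentencePiece does.
    normalized = text.replace(" ", SPACE)
    # Optionally prepend a dummy whitespace marker to the whole text.
    if add_dummy_prefix:
        normalized = SPACE + normalized
    pieces = []
    current = ""
    for ch in normalized:
        # Start a new piece whenever a whitespace marker is seen.
        if ch == SPACE and current:
            pieces.append(current)
            current = ""
        current += ch
    if current:
        pieces.append(current)
    return pieces


# With the prefix, "York" is the same piece at the start and after a space.
print(to_pieces("York"), to_pieces("New York"))
# Without it, sentence-initial "York" and mid-sentence "▁York" are different pieces.
print(to_pieces("York", add_dummy_prefix=False),
      to_pieces("New York", add_dummy_prefix=False))
```

In this toy model, turning the prefix off means the vocabulary would need both "York" and "▁York" to cover the same word, which is the cost to weigh before disabling it.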