Add ConveRTTokenizer #4984
Awesome implementation! Left a couple of minor comments. I would suggest adding one more test to check that the embedding of the CLS token is correctly set to the sentence encoding.
@tabergma Can you also rename |
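The CLS-token check suggested above could look roughly like the following. This is only a minimal sketch, not the code from this PR: `append_cls_embedding` is a hypothetical stand-in for the featurizer step, and it assumes the CLS token is the last row of the sequence features. Plain Python lists are used in place of real embeddings so the example stays self-contained.

```python
from typing import List

Vector = List[float]


def append_cls_embedding(
    token_embeddings: List[Vector], sentence_encoding: Vector
) -> List[Vector]:
    """Hypothetical helper (not part of this PR).

    Returns the token embeddings with the sentence encoding appended as the
    final row, which is assumed here to be the position of the CLS token.
    """
    dim = len(sentence_encoding)
    if any(len(row) != dim for row in token_embeddings):
        raise ValueError("token and sentence embeddings must share one dimension")
    # Copy the rows so the caller's lists are not modified.
    return [list(row) for row in token_embeddings] + [list(sentence_encoding)]


def test_cls_embedding_is_sentence_encoding() -> None:
    token_embeddings = [[0.1, 0.2], [0.3, 0.4]]
    sentence_encoding = [0.9, 0.8]

    features = append_cls_embedding(token_embeddings, sentence_encoding)

    # One extra row for the CLS token.
    assert len(features) == len(token_embeddings) + 1
    # The CLS row (assumed to be last) equals the sentence encoding.
    assert features[-1] == sentence_encoding
    # The per-token rows are left unchanged.
    assert features[:-1] == token_embeddings


test_cls_embedding_is_sentence_encoding()
```

In the real test, the helper would be replaced by the actual featurizer call, and the comparison would be made against the sentence encoding returned by the model.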
Discussed. There is a bit more to do.
We need to add more functionality. Will open a new PR once implementation is done.
Proposed changes:
- Add `ConveRTTokenizer`
- `ConveRTFeaturizer` can now be used with `return_sequence` set to `True`
- `ConveRTTokenizer` as tokenizer in the pipeline `pretrained_embeddings_convert`
- closes #4978
Status (please check what you already did):
- black (please check Readme for instructions)