Fixing deprecation warnings in new versions of pyterrier and transformers #10
base: main
Conversation
Hi @CodingTil, we'll have a look at this. Does the change prevent older transformers versions from working? We try to be fairly conservative in upgrading transformers. Craig
Hey @cmacdonald. I've just checked this: unfortunately, I am unable to test whether transformers>=4.0.0,<4.3 works. The reason is that I have an M1 Mac, and during installation (specifically for the package tokenizers>=0.10.1,<0.11) a Rust compilation step is required, which does not consider aarch64-apple-darwin a valid target, although it is (nowadays) and is also part of my rustup toolchain. EDIT:
@cmacdonald After a lot of trial and error (see commits), I could finally replicate an environment from the time when transformers 4.0.0 was released (November 30th, 2020). I've created a new GitHub workflow job that checks whether pytest runs successfully with the old version - and it does. So, I've changed the
Hi @CodingTil, I'm sorry that this dropped off our radar, we'll try to get this merged soon. Sorry about that! Craig
In the latest version of `pyterrier`, it is recommended to create new transformers by extending `pt.Transformer` directly, instead of `pt.TransformerBase`.[1] The old approach now throws a deprecation warning. This MR fixes this by extending from `pt.Transformer`.
In the latest version of `transformers`, an issue regarding the T5 tokenizers has been fixed.[2] This MR in the `transformers` package changed how sentences are tokenized (regarding `model_max_length`, `max_length`, and `truncation` for auto-truncation), i.e., you must now be explicit. Legacy behavior is still supported and remains the default if `model_max_length` is not set, but it results in a deprecation warning. I've fixed this by adding `model_max_length` as a parameter to the `monoT5` and `duoT5` transformers, along with a parameter for the `truncation` setting - auto-truncation by default.[3]

I have not (yet) updated this package's version (currently 0.0.1). I am unsure if this is deliberate.
Furthermore, I have slightly updated the versions in the `readme.txt`, as well as added more build-tests to the GitHub Workflow.