Doc bug: wrong token argument name for Tokenizer.from_pretrained() #33183

Open
rravenel opened this issue Aug 29, 2024 · 6 comments

@rravenel

System Info

N/A for doc bug.

Who can help?

@stevhliu

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

https://huggingface.co/docs/tokenizers/api/tokenizer#tokenizers.Tokenizer.from_pretrained

from_pretrained
( identifier, revision = 'main', auth_token = None ) → Tokenizer

Parameters

identifier (str) — The identifier of a Model on the Hugging Face Hub, that contains a tokenizer.json file
revision (str, defaults to main) — A branch or commit id
auth_token (str, optional, defaults to None) — An optional auth token used to access private repositories on the Hugging Face Hub

Expected behavior

'auth_token' is incorrect.

'use_auth_token' works, but with the following warning:

/home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages/transformers/modeling_utils.py:3220: FutureWarning: The use_auth_token argument is deprecated and will be removed in v5 of Transformers. Please use token instead.

'token' also works.
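
To make the three cases above concrete, here is a minimal, self-contained sketch of the deprecation pattern the warning suggests. It is an assumption for illustration only: `load_tokenizer` and the dictionary it returns are hypothetical stand-ins, not the tokenizers or Transformers implementation, and nothing is downloaded.

```python
import warnings


def load_tokenizer(identifier, revision="main", token=None, use_auth_token=None):
    """Hypothetical stand-in for a from_pretrained-style loader.

    Accepts `token`, still accepts the deprecated `use_auth_token` with a
    FutureWarning, and rejects any other keyword such as `auth_token`.
    """
    if use_auth_token is not None:
        warnings.warn(
            "The `use_auth_token` argument is deprecated. "
            "Please use `token` instead.",
            FutureWarning,
            stacklevel=2,
        )
        if token is not None:
            raise ValueError("Pass only one of `token` and `use_auth_token`.")
        token = use_auth_token
    # A real loader would contact the Hub here; this sketch only echoes its inputs.
    return {"identifier": identifier, "revision": revision, "token": token}


if __name__ == "__main__":
    # 'token' works silently.
    print(load_tokenizer("some-org/some-model", token="hf_example"))

    # 'use_auth_token' works but emits a FutureWarning.
    print(load_tokenizer("some-org/some-model", use_auth_token="hf_example"))

    # 'auth_token' is not a parameter of this function, so Python raises TypeError.
    try:
        load_tokenizer("some-org/some-model", auth_token="hf_example")
    except TypeError as err:
        print("rejected:", err)
```

In this sketch the documented name `auth_token` fails because the function does not declare it, which matches the mismatch reported here between the docs and the accepted arguments.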

@rravenel rravenel added the bug label Aug 29, 2024
@LysandreJik
Member

Ah, indeed! The docs should be changed! Would you be down to open a PR to update this?

@rravenel
Author

I am brand spanking new here and don't know Rust.

Is this the right file: https://github.com/huggingface/tokenizers/blob/main/bindings/python/src/tokenizer.rs?

And I'm just removing 'auth_' between lines 572 and 613? Not sure about the instance of 'token' on line 605 - ignore?

I'm also not set up with your doc build tools. I can take a stab at this next week.

@stevhliu
Member

stevhliu commented Aug 29, 2024

Thanks for catching this! For the docs, you can edit the docstring here 🙂

@rravenel
Author

PR submitted.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@rravenel
Author

rravenel commented Sep 28, 2024 via email
