[Feature] Support Deepseek R series models #226

@Ubospica Ubospica commented Mar 5, 2025

This PR:

  1. Changes the detection logic for special tokens. Tokens that start with < and end with > are no longer treated as special tokens; only the empty string "" is treated as a special token. As a result, <...> tokens are not masked by default and can be generated, which allows patterns such as <think>...</think> to appear in the output.
  2. Adds tests for the DeepSeek R1 and distilled R1 models.
  3. Renames prepend_space_in_tokenization to add_prefix_space to align with the Hugging Face tokenizer.
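
As a rough illustration of item 1, the change in detection can be sketched as below. This is not the code from this PR: the function names (`is_special_before`, `is_special_after`, `generatable`) and the toy vocabulary are hypothetical, and the "before" rule is an assumption inferred from the description above.

```python
from typing import Callable, List


def is_special_before(token: str) -> bool:
    # Assumed old rule: the empty string and any <...>-shaped token are special.
    return token == "" or (token.startswith("<") and token.endswith(">"))


def is_special_after(token: str) -> bool:
    # New rule described in this PR: only the empty string is special.
    return token == ""


def generatable(vocab: List[str], is_special: Callable[[str], bool]) -> List[str]:
    # Tokens that are not special are left unmasked and can be generated.
    return [tok for tok in vocab if not is_special(tok)]


# Toy vocabulary, for illustration only.
vocab = ["", "<think>", "</think>", "hello", "<", " world"]

before = generatable(vocab, is_special_before)
after = generatable(vocab, is_special_after)
```

Under the assumed old rule, `<think>` and `</think>` are filtered out; under the new rule they stay in the generatable set, which is what lets a grammar match a `<think>...</think>` block.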

@Ubospica Ubospica force-pushed the main-dev/2025-02-06-allow-special-token-generated branch from 9bf387e to 9f26305 Compare March 5, 2025 22:48
@Ubospica Ubospica merged commit 7a8585d into mlc-ai:main Mar 5, 2025
8 checks passed