[Doc] Add note to gte-Qwen2 models (vllm-project#11808)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
DarkLight1337 authored and Ubuntu committed Jan 19, 2025
1 parent b6f3387 commit a7302a2
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/source/models/supported_models.md
@@ -430,6 +430,9 @@ You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask
On the other hand, its 1.5B variant (`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention,
despite being described otherwise on its model card.
Regardless of the variant, you need to enable `--trust-remote-code` for the correct tokenizer to be
loaded. See the [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
```
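The note added in this diff can be exercised with a command along the following lines. A minimal sketch, assuming the model is served as an embedding model via `--task embed`; only the model name, `--trust-remote-code`, and `--hf-overrides '{"is_causal": false}'` come from the note itself.

```shell
# Sketch: serve the 7B variant with bidirectional attention.
# --hf-overrides disables the causal mask, and --trust-remote-code
# lets the correct tokenizer load (per the note above).
vllm serve Alibaba-NLP/gte-Qwen2-7B-instruct \
  --task embed \
  --trust-remote-code \
  --hf-overrides '{"is_causal": false}'
```

For the 1.5B variant, which already uses causal attention, the `--hf-overrides` flag would be dropped while `--trust-remote-code` is still required.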

If your model is not in the above list, we will try to automatically convert the model using
