The same situation as #31377 occurred when using Qwen/Qwen2-VL-7B-Instruct #33399
Sorry, my expression may have caused a misunderstanding. I encountered a problem similar to issue #31377. However, given the differences in implementation logic between idefics2 and Qwen/Qwen2-VL-7B-Instruct, I'm unsure whether these similar symptoms share the same root cause. Even after pulling and building the latest main branch, the issue remains unresolved.
@toondata can you share the hash please? I can't find it. But I tried to run your code on the latest main:

```python
import requests
from PIL import Image
import torch
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor

model_path = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_path, torch_dtype=torch.bfloat16
).to("cuda:0")

min_pixels = 256 * 28 * 28
max_pixels = 1280 * 28 * 28
processor = AutoProcessor.from_pretrained(
    model_path, min_pixels=min_pixels, max_pixels=max_pixels
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Extract text from pdf"},
        ],
    }
]

image = Image.open(
    requests.get("https://www.ilankelman.org/stopsigns/australia.jpg", stream=True).raw
)
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# return_tensors="pt" is needed so the processor output holds tensors that .to() can move
inputs = processor(text=[text], images=[image], return_tensors="pt").to("cuda:0")

# Inference: generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
```
Here is my git hash: 96429e7.
I ran your code and image source with the device changed to MPS, and the issue remains the same, except that the tensor triggering the error has different dimensions:

```
File "/Users/dev/products/dev/workspaces/transformers/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1683, in forward
```
Maybe #30294 helps, it has a solution that worked for llava with MPS.
After looking at #30294, I feel the issue might not be related. I switched my local code to run on the CPU, and the problem is the same as with MPS:

```
File "/Users/dev/products/dev/workspaces/transformers/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1683, in forward
```
I could also get a Colab notebook working with the script, and the error on CPU might also happen as per the linked issue. Let me see if I can get an MPS device to reproduce it, will need some time to dig.
Thank you very much, looking forward to the results of your digging.
Met the same issue.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
hey @toondata,
Getting the same error:
My dataset was missing.
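(If anyone else hits this because of missing input files: a cheap guard before calling the processor, sketched here with a hypothetical `image_path`, fails fast instead of surfacing as the opaque shape mismatch deep inside forward:)

```python
from pathlib import Path
from PIL import Image

# Hypothetical example path; replace with whatever your dataset loader yields.
image_path = Path("documents_img/image.png")
if not image_path.exists():
    raise FileNotFoundError(f"Missing image file: {image_path}")
image = Image.open(image_path).convert("RGB")
```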
Commenting to get updates. I have a similar error using colqwen:

```python
from colpali_engine.models import ColQwen2, ColQwen2Processor
from colpali_engine.utils.torch_utils import get_torch_device
from transformers.models.qwen2_vl import Qwen2VLForConditionalGeneration, Qwen2VLProcessor
```

```
2024-12-02 14:21:56,934 - ERROR - Error processing documents_img/image.png:
Image features and image tokens do not match: tokens: 0, features 2160
Traceback (most recent call last):
  File "/content/vision-rag/vision-rag/colqwen.py", line 146, in process_single_document
    outputs = self.model(**inputs, output_hidden_states=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/vision-rag/vision-rag/colqwen.py", line 47, in forward
    return Qwen2VLForConditionalGeneration.forward(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1690, in forward
    raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 0, features 2160
```

```
/content/vision-rag# pip show transformers colpali-engine
Name: transformers
Version: 4.46.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /usr/local/lib/python3.10/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: colpali_engine, peft, sentence-transformers
---
Name: colpali_engine
Version: 0.3.4
Summary: The code used to train and run inference with the ColPali architecture.
Home-page: https://github.com/illuin-tech/colpali
Author:
Author-email: Manuel Faysse <manuel.faysse@illuin.tech>, Hugues Sibille <hugues.sibille@illuin.tech>, Tony Wu <tony.wu@illuin.tech>
License:
Location: /usr/local/lib/python3.10/dist-packages
Requires: gputil, numpy, peft, pillow, requests, torch, transformers
Required-by:
```
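A sanity check that may help narrow this down (just a sketch, assuming `inputs` is the processor output; Qwen2-VL stores the id of its image placeholder token on the model config as `image_token_id`):

```python
# If this prints 0 while pixel_values is non-empty, the prompt fed to the
# tokenizer is missing the image placeholder, which would explain
# "tokens: 0, features 2160".
image_token_id = model.config.image_token_id
num_image_tokens = (inputs["input_ids"] == image_token_id).sum().item()
print("image placeholder tokens:", num_image_tokens)
print("pixel_values shape:", tuple(inputs["pixel_values"].shape))
```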
Just a heads up, this issue is for the
Yes, I also have this problem when running Qwen2-VL on vLLM. The error always happens right when doing parallel requests. Edit:
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info

transformers version: 4.45.0.dev0

Who can help?

@zucchini-nlp @amyer

Information

Tasks

examples folder (such as GLUE/SQuAD, ...)

Reproduction

Run this code after git clone with the hash I specified above and pip install ./transformers

Expected behavior
File "/Users/dev/products/dev/workspaces/mixparse/llm/model/modelmanager.py", line 429, in _run_safetensors_inference
generated_ids = model.generate(**inputs, max_new_tokens=128)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/transformers/generation/utils.py", line 2015, in generate
result = self._sample(
^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/transformers/generation/utils.py", line 2965, in _sample
outputs = self(**model_inputs, return_dict=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/dev/anaconda3/envs/all-parse/lib/python3.12/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1683, in forward
inputs_embeds[image_mask] = image_embeds
RuntimeError: shape mismatch: value tensor of shape [630, 3584] cannot be broadcast to indexing result of shape [0, 3584]
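For anyone debugging this: the failing line is a masked scatter that overwrites the embedding positions of the image placeholder tokens with the vision tower's output. A minimal sketch of the failure mode with toy shapes (not the real model; the hidden size is 3584 and the feature count 630 in the actual trace):

```python
import torch

hidden = 4
inputs_embeds = torch.zeros(6, hidden)   # embeddings for 6 text tokens
image_embeds = torch.randn(3, hidden)    # vision tower produced 3 image features

# No positions in input_ids matched the image token id, so the mask is all False.
image_mask = torch.zeros(6, dtype=torch.bool)

# Same operation as modeling_qwen2_vl.py line 1683; it fails the same way:
# 3 feature rows cannot broadcast into 0 selected positions.
inputs_embeds[image_mask] = image_embeds
# RuntimeError: shape mismatch: value tensor of shape [3, 4] cannot be
# broadcast to indexing result of shape [0, 4]
```

So whenever the error reports `tokens: 0`, the prompt that reached the model contained no image placeholder tokens even though pixel values were passed in.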