
Failed to find image for token at index XXX for Qwen2-VL-7B #464

Open
jsoma opened this issue Feb 27, 2025 · 5 comments
Labels
more-info-needed Need more information to diagnose the problem

Comments


jsoma commented Feb 27, 2025

Which version of LM Studio?
0.3.10-6 (and earlier)

Which operating system?
macOS Sonoma

What is the bug?
Repeated Failed to find image for token at index XXX errors when using Qwen2-VL-7B models such as qwen2-vl-7b-instruct and allenai_olmocr-7b-0225-preview. It doesn't happen with every image or every prompt, but once it hits that error it hangs.

Logs

2025-02-26 21:11:37 [DEBUG] About to embed image
2025-02-26 21:11:39 [DEBUG] BeginProcessingPrompt
2025-02-26 21:11:40 [DEBUG] PromptProcessing: 64.0801
2025-02-26 21:11:41 [DEBUG] Failed to find image for token at index 280
2025-02-26 21:11:41 [DEBUG] PromptProcessing: 99.8748

The failing token index is the same across the two models mentioned above.

To Reproduce

Prompt:

Below is the image of one page of a document, as well as some raw textual content that was previously extracted for it. Just return the plain text representation of this document as if you were reading it naturally.
Do not hallucinate.
RAW_TEXT_START
Page dimensions: 612.0x792.0
[107x583]Chapter 2
[107x533]Mathematical Induction:
[107x503]"And so on . . . "
[107x444]2.1 Introduction
[107x421]This chapter marks our first big step toward investigating mathematical proofs
[107x410]more throughly and learning to construct our own. It is also an introduction
[107x398]to the first significant
[107x398]proof technique
[209x398]we will see. As we describe below,
[107x386]this chapter is meant to be an appetizer, a first taste, of what
[107x386]mathematical
[107x374]induction
[107x374]is and how to use it. A couple of chapters from now, we will we be
[107x362]able to rigorously define induction and
[107x362]prove
[278x362]that this technique is mathemati-
[107x350]cally valid. That's right, we'll actually prove how and why it works! For now,
[107x338]though, we'll continue our investigation of some interesting mathematical puz-
[107x326]zles, with these particular problems hand-picked by us for their use of inductive
[107x314]techniques.
[107x283]2.1.1 Objectives
[107x264]The following short sections in this introduction will show you how this chapter
[107x252]fits into the scheme of the book. They will describe how our previous work
[107x240]will be helpful, they will motivate why we would care to investigate the topics
[107x228]that appear in this chapter, and they will tell you our goals and what you
[107x216]should keep in mind while reading along to achieve those goals. Right now,
[107x204]we will summarize the main objectives of this chapter for you via a series of
[107x192]statements. These describe the skills and knowledge you should have gained by
[107x180]the conclusion of this chapter. The following sections will reiterate these ideas
[107x168]in more detail, but this will provide you with a brief list for future reference.
[107x156]When you finish working through this chapter, return to this list and see if you
[107x144]understand all of these objectives. Do you see why we outlined them here as
[107x132]being important? Can you define all the terminology we use? Can you apply
[107x120]the techniques we describe?
[271x95]101

RAW_TEXT_END

With the following image attached:

Image

It works fine with that image and "hello," though.

yagil (Member) commented Feb 27, 2025

Hi @jsoma thanks a lot for the bug report. A few questions:

  1. Can you please share a Hugging Face link to the model?
  2. Can you please attach a screenshot of Cmd + Shift + R?

Thanks

@yagil added the more-info-needed (Need more information to diagnose the problem) label on Feb 27, 2025

wbste commented Mar 1, 2025

I came from the OP's blog post, so hopefully I'm representing the requested information correctly.

  1. His post has two different links, but they're likely the same model: https://huggingface.co/lmstudio-community/olmOCR-7B-0225-preview-GGUF and https://huggingface.co/bartowski/allenai_olmOCR-7B-0225-preview-GGUF. FWIW I'm using the lmstudio-community one.
  2. I'm on Windows with the same error. LM Studio 0.3.10 Build 6. I have the latest runtimes (v1.17.1 Vulkan, v1.17.1 CUDA, v1.17.1 llama.cpp). Running on a 3090.
2025-02-28 20:45:07 [DEBUG] sampling: logits -> logit-bias -> penalties -> dry -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
generate: n_ctx = 8192, n_batch = 512, n_predict = 3000, n_keep = 831
2025-02-28 20:45:07 [DEBUG] About to embed image
2025-02-28 20:45:19 [DEBUG] BeginProcessingPrompt
2025-02-28 20:45:19 [DEBUG] PromptProcessing: 64.1604
2025-02-28 20:45:19 [DEBUG] Failed to find image for token at index 279
2025-02-28 20:45:19 [DEBUG] PromptProcessing: 99.8747

I have the same findings...not sure if it's related to GPU being disabled for CLIP? ggml-org/llama.cpp#10896


smahdink commented Mar 1, 2025

I'm running into the same issue. AllenAI released an official GGUF too; I might try that to see if anything is different.


wbste commented Mar 1, 2025

> I'm running into the same issue. The allenAI released an official GGUF too, I might try that to see if anything is different.

Looks like they forgot the mmproj-model-f16.gguf file. I used the one from the lmstudio-community one and it worked fine in the UI with a small test image.

Ran the same script on the same small and large files with the same results (the small one processed fine; the large one failed with Failed to find image for token at index X). I get INFO:openai._base_client:Retrying request to /chat/completions in 0.395332 seconds when this happens in the script.

I copied the query that gets sent and pasted it into the UI (just to make sure it wasn't something funky with the API/code), and it just hung at Processing Prompt... 100%. I assume it's due to the same issue.
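For context, the script in question is just an OpenAI-compatible chat/completions request against LM Studio's local server. A minimal sketch of the kind of payload that triggers this follows; the helper name, base URL, and `max_tokens` value are assumptions for illustration, not taken from the actual script:

```python
import base64


def build_vision_payload(image_bytes: bytes, prompt: str, model: str) -> dict:
    """Build an OpenAI-style chat/completions payload with an inline image.

    The image is attached as a base64 data URL, which is how vision prompts
    are typically sent to LM Studio's OpenAI-compatible endpoint.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
        "max_tokens": 3000,
    }


# With the openai client, this would be sent roughly as:
#   client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
#   client.chat.completions.create(**build_vision_payload(...))
```

When the server hits the token-index error and hangs, the client never gets a response and retries, which would match the `Retrying request to /chat/completions` log line above.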

jsoma (Author) commented Mar 2, 2025

Thanks for covering for me, @wbste, you're spot-on. I tried both of those models and got the same error. One's Q6_K, the other Q4_K_M, but the error is the same in either case. Just tried again with the lmstudio-community one, same error, and Cmd + Shift + R gives me:

Image

One gotcha: it doesn't break with every image or every prompt, but it's 100% reproducible in the cases where it does break (the image + text above, among plenty of others).
