
[Bug]: Fail to bind LLM used by RAPTOR #4126

Open
1 task done
dromeuf opened this issue Dec 19, 2024 · 6 comments
Labels
bug Something isn't working

Comments


dromeuf commented Dec 19, 2024

Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

8939206

RAGFlow image version

v0.15.0-slim

Other environment information

Linux Ubuntu 5.15.167.4-microsoft-standard-WSL2

Actual behavior

I use version 0.15.0-slim with local Ollama for the embedding model (snowflake-arctic-embed2) and the LLM (qwen2.5:14b), and parsing succeeds.

If I activate RAPTOR for a knowledge base, I get an error and parsing fails:


[ERROR]Fail to bind LLM used by RAPTOR: 3 vs. 4
[ERROR]handle_task got exception, please check log

logs:

2024-12-19 10:38:25,090 INFO     32 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-12-19T10:38:25.090195", "boot_at": "2024-12-19T09:11:49.795794", "pending": 1, "lag": 0, "done": 4, "failed": 1, "current": {"id": "927bb7e4bdec11efb92d0242ac120006", "doc_id": "af0101f6bde911ef8f240242ac120006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "4576b942bde911efb5920242ac120006", "parser_id": "paper", "parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n      {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}}, "name": "MNRAS_67P_ModelSAP_stae1290.pdf", "type": "pdf", "location": "MNRAS_67P_ModelSAP_stae1290.pdf", "size": 6289538, "tenant_id": "ee69c9acbde111efbe830242ac120006", "language": "English", "embd_id": "snowflake-arctic-embed2:latest@Ollama", "pagerank": 0, "img2txt_id": "llama3.2-vision:latest@Ollama", "asr_id": "", "llm_id": "qwen2.5:14b@Ollama", "update_time": 1734600923929, "task_type": "raptor"}}
2024-12-19 10:38:28,448 INFO     32 HTTP Request: POST http://host.docker.internal:11434/api/chat "HTTP/1.1 200 OK"                                                             
2024-12-19 10:38:28,488 INFO     32 HTTP Request: POST http://host.docker.internal:11434/api/embeddings "HTTP/1.1 200 OK"                                                       
2024-12-19 10:38:28,512 ERROR    32 summarize got exception                                                                                                                     
Traceback (most recent call last):                                                                                                                                              
  File "/ragflow/rag/raptor.py", line 92, in summarize                                                                                                                          
    chunks.append((cnt, self._embedding_encode(cnt)))                                                                                                                           
  File "/ragflow/rag/raptor.py", line 48, in _embedding_encode                                                                                                                  
    response = get_embed_cache(self._embd_model.llm_name, txt)                                                                                                                  
  File "/ragflow/graphrag/utils.py", line 104, in get_embed_cache                                                                                                               
    return np.array(json.loads(bin.decode("utf-8")))                                                                                                                            
AttributeError: 'str' object has no attribute 'decode'. Did you mean: 'encode'?                                                                                                 
2024-12-19 10:38:28,515 INFO     32 set_progress(927bb7e4bdec11efb92d0242ac120006), progress: -1, progress_msg: Page(100000001~100000001): [ERROR]Fail to bind LLM used by RAPTOR: 3 vs. 4
2024-12-19 10:38:28,545 ERROR    32 Fail to bind LLM used by RAPTOR: 3 vs. 4                                                                                                    
Traceback (most recent call last):                                                                                                                                              
  File "/ragflow/rag/svr/task_executor.py", line 438, in do_handle_task                                                                                                         
    chunks, token_count, vector_size = run_raptor(task, chat_model, embedding_model, progress_callback)                                                                         
  File "/ragflow/rag/svr/task_executor.py", line 370, in run_raptor                                                                                                             
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)                                                                                            
  File "/ragflow/rag/raptor.py", line 132, in __call__                                                                                                                          
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)                                                                                   
AssertionError: 3 vs. 4

Is the local Ollama LLM I'm using for RAPTOR incompatible, or is this another bug or problem?

Thanks for your great work.
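The `AttributeError` above suggests the embedding cache sometimes returns an already-decoded `str` where `get_embed_cache` expects `bytes`. A minimal sketch of a defensive decode, assuming the cached payload is a JSON list of floats (hypothetical helper name, not the actual RAGFlow code):

```python
import json

def decode_cached_embedding(cached):
    # The cache backend may hand back raw bytes or an already-decoded str;
    # calling .decode() unconditionally is what raised the AttributeError.
    if isinstance(cached, bytes):
        cached = cached.decode("utf-8")
    return json.loads(cached)  # list of floats; RAGFlow wraps this in np.array
```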

Expected behavior

No response

Steps to reproduce

I can't find any information about the RAPTOR LLM in the documentation, and the problem is identical for every document added.

Additional information

No response

@dromeuf dromeuf added the bug Something isn't working label Dec 19, 2024
@KevinHuSh KevinHuSh mentioned this issue Dec 20, 2024
1 task
KevinHuSh added a commit that referenced this issue Dec 20, 2024
### What problem does this PR solve?

#4126
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

dromeuf commented Dec 21, 2024

It's OK now with the nightly. Thanks.


dromeuf commented Dec 22, 2024

For 3 of the 51 documents I always get a RAPTOR error, but if I relaunch from the UI while keeping the chunks already computed (without clearing the existing chunks), it manages to finish and the document reaches success status. Unfortunately, since I launched about twenty parse operations simultaneously, I can't reliably locate the relevant console logs.

For 1 of the 51 it doesn't work at all (Rapport 2005_I.pdf).

Progress:
Page(0~12): reused previous task's chunks.
Page(12~24): reused previous task's chunks.
Page(24~36): reused previous task's chunks.
Page(36~48): reused previous task's chunks.
Page(48~60): reused previous task's chunks.
Page(60~72): reused previous task's chunks.
Page(72~84): reused previous task's chunks.
Page(84~96): reused previous task's chunks.
Page(96~108): reused previous task's chunks.
Page(108~117): reused previous task's chunks.
Start to do RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval).
Task has been received.
Page(100000001~100000001): Cluster one layer: 320 -> 7
Page(100000001~100000001): Cluster one layer: 7 -> 4
Page(100000001~100000001): [ERROR]Fail to bind LLM used by RAPTOR: 2 vs. 3
[ERROR]handle_task got exception, please check log

log console

2024-12-22 10:34:52,402 INFO     43360 HTTP Request: POST http://host.docker.internal:11434/api/chat "HTTP/1.1 200 OK"
2024-12-22 10:34:52,450 INFO     43360 HTTP Request: POST http://host.docker.internal:11434/api/embeddings "HTTP/1.1 200 OK"
2024-12-22 10:34:52,467 ERROR    43360 summarize got exception
Traceback (most recent call last):
  File "/ragflow/rag/raptor.py", line 92, in summarize
    chunks.append((cnt, self._embedding_encode(cnt)))
  File "/ragflow/rag/raptor.py", line 49, in _embedding_encode
    if response:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
2024-12-22 10:34:52,470 INFO     43360 set_progress(994223dac04711efaefd0242ac120006), progress: -1, progress_msg: Page(100000001~100000001): [ERROR]Fail to bind LLM used by RAPTOR: 2 vs. 3
2024-12-22 10:34:52,488 ERROR    43360 Fail to bind LLM used by RAPTOR: 2 vs. 3
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 438, in do_handle_task
    chunks, token_count, vector_size = run_raptor(task, chat_model, embedding_model, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 370, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 132, in __call__
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)
AssertionError: 2 vs. 3
2024-12-22 10:34:52,491 INFO     43360 set_progress(994223dac04711efaefd0242ac120006), progress: -1, progress_msg: [ERROR]handle_task got exception, please check log
2024-12-22 10:34:52,507 ERROR    43360 handle_task got exception for task {"id": "994223dac04711efaefd0242ac120006", "doc_id": "5a65db7cbfa011efbd950242ac120006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "d5f8d862bf9f11efb7f40242ac120006", "parser_id": "book", "parser_config": {"auto_keywords": 0, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n      {cluster_content}\nThe above is the content you need to summarize.", "max_token": 256, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}}, "name": "Rapport 2005_I.pdf", "type": "pdf", "location": "Rapport 2005_I.pdf", "size": 15740366, "tenant_id": "ee69c9acbde111efbe830242ac120006", "language": "English", "embd_id": "snowflake-arctic-embed2:latest@Ollama", "pagerank": 0, "img2txt_id": "llama3.2-vision:latest@Ollama", "asr_id": "", "llm_id": "qwen2.5:14b@Ollama", "update_time": 1734859921824, "task_type": "raptor"}
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 511, in handle_task
    do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 438, in do_handle_task
    chunks, token_count, vector_size = run_raptor(task, chat_model, embedding_model, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 370, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 132, in __call__
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)
AssertionError: 2 vs. 3
2024-12-22 10:35:01,784 INFO     43360 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-12-22T10:35:01.784164", "boot_at": "2024-12-21T23:47:57.830717", "pending": 0, "lag": 0, "done": 48, "failed": 4, "current": null}
2024-12-22 10:35:04,238 INFO     23 172.18.0.6 - - [22/Dec/2024 10:35:04] "GET /v1/document/list?kb_id=d5f8d862bf9f11efb7f40242ac120006&keywords=&page_size=100&page=1 HTTP/1.1" 200 -
2024-12-22 10:35:19,396 INFO     23 172.18.0.6 - - [22/Dec/2024 10:35:19] "GET /v1/document/list?kb_id=d5f8d862bf9f11efb7f40242ac120006&keywords=&page_size=100&page=1 HTTP/1.1" 200 -
2024-12-22 10:35:31,815 INFO     43360 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-12-22T10:35:31.815294", "boot_at": "2024-12-21T23:47:57.830717", "pending": 0, "lag": 0, "done": 48, "failed": 4, "current": null}

Thanks for your great work
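The `ValueError` in this run is the classic NumPy truth-value pitfall: `if response:` is ambiguous when `response` is an array with more than one element. A minimal sketch of the cache-hit check done with an explicit `None` comparison (hypothetical names, shown with plain lists so it stands alone):

```python
def embedding_from_cache(cached, compute):
    # `cached` is None on a cache miss, otherwise the stored vector.
    # With a NumPy array, `if cached:` raises ValueError for arrays with
    # more than one element, so test identity against None instead.
    if cached is not None:
        return cached
    return compute()  # fall back to calling the embedding model
```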


rplescia commented Jan 14, 2025

@KevinHuSh Hi, I'm still having this issue in 0.15.1. It happens when I use Ollama (llama3.1), but also when I use a Together.ai LLM (llama 3.3).

@rplescia

@KevinHuSh This is still a bug for me in 0.15.1 using either Ollama or Together.ai as an inference server.

2025-01-20 13:31:37,682 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-20 13:31:38,446 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 322
2025-01-20 13:31:39,179 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 322
2025-01-20 13:31:39,467 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-20 13:31:40,348 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 359
2025-01-20 13:31:41,174 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 359
2025-01-20 13:31:43,019 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 200 OK"
2025-01-20 13:31:44,021 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 411
2025-01-20 13:31:44,369 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
2025-01-20 13:31:44,370 ERROR 17 summarize got exception
Traceback (most recent call last):
  File "/ragflow/rag/raptor.py", line 82, in summarize
    cnt = self._chat("You're a helpful assistant.",
  File "/ragflow/rag/raptor.py", line 43, in _chat
    raise Exception(response)
Exception: ERROR: Error code: 500 - {'error': {'message': 'POST predict: Post "http://127.0.0.1:37507/completion": EOF', 'type': 'api_error', 'param': None, 'code': None}}
2025-01-20 13:31:44,478 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
2025-01-20 13:31:44,478 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
2025-01-20 13:31:44,479 INFO 17 HTTP Request: POST http://ollamainference:11434/v1/chat/completions "HTTP/1.1 500 Internal Server Error"
2025-01-20 13:31:44,480 ERROR 17 summarize got exception
Traceback (most recent call last):
  File "/ragflow/rag/raptor.py", line 82, in summarize
    cnt = self._chat("You're a helpful assistant.",
  File "/ragflow/rag/raptor.py", line 43, in _chat
    raise Exception(response)
Exception: ERROR: Error code: 500 - {'error': {'message': 'an error was encountered while running the model: unexpected EOF', 'type': 'api_error', 'param': None, 'code': None}}
2025-01-20 13:31:44,481 ERROR 17 summarize got exception
Traceback (most recent call last):
  File "/ragflow/rag/raptor.py", line 82, in summarize
    cnt = self._chat("You're a helpful assistant.",
  File "/ragflow/rag/raptor.py", line 43, in _chat
    raise Exception(response)
Exception: ERROR: Error code: 500 - {'error': {'message': 'an error was encountered while running the model: unexpected EOF', 'type': 'api_error', 'param': None, 'code': None}}
2025-01-20 13:31:44,481 ERROR 17 summarize got exception
Traceback (most recent call last):
  File "/ragflow/rag/raptor.py", line 82, in summarize
    cnt = self._chat("You're a helpful assistant.",
  File "/ragflow/rag/raptor.py", line 43, in _chat
    raise Exception(response)
Exception: ERROR: Error code: 500 - {'error': {'message': 'an error was encountered while running the model: unexpected EOF', 'type': 'api_error', 'param': None, 'code': None}}
2025-01-20 13:31:45,010 ERROR 17 LLMBundle.encode can't update token usage for e39eb6dccce011ef965e0242ac120006/EMBEDDING used_tokens: 411
2025-01-20 13:31:45,016 INFO 17 set_progress(a0892a42d73211ef9bad0242ac120006), progress: -1, progress_msg: 13:31:45 Page(100000001~100000001): [ERROR]Fail to bind LLM used by RAPTOR: 11 vs. 15
2025-01-20 13:31:45,022 ERROR 17 Fail to bind LLM used by RAPTOR: 11 vs. 15
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 440, in do_handle_task
    chunks, token_count, vector_size = run_raptor(task, chat_model, embedding_model, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 372, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 132, in __call__
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)
AssertionError: 11 vs. 15
2025-01-20 13:31:45,025 INFO 17 set_progress(a0892a42d73211ef9bad0242ac120006), progress: -1, progress_msg: 13:31:45 [ERROR]handle_task got exception, please check log
2025-01-20 13:31:45,031 ERROR 17 handle_task got exception for task {"id": "a0892a42d73211ef9bad0242ac120006", "doc_id": "210ca866d1ce11ef92290242ac120006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "e45d2d96d1cd11efb6c60242ac120006", "parser_id": "laws", "parser_config": {"auto_keywords": 10, "auto_questions": 0, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful and consistent with the numbering, do not make things up. Paragraphs as follows:\n {cluster_content}\nThe above is the content you need to summarize.", "max_token": 875, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "layout_recognize": true, "task_page_size": 12, "pages": [[1, 1024]]}, "name": "Project-Coach-Acquisition-Term-Facility-Agreement-EXECUTED-05.09.2022_Redacted.pdf", "type": "pdf", "location": "Project-Coach-Acquisition-Term-Facility-Agreement-EXECUTED-05.09.2022_Redacted.pdf", "size": 902627, "tenant_id": "e39eb6dccce011ef965e0242ac120006", "language": "English", "embd_id": "BAAI/bge-large-en-v1.5@FastEmbed", "pagerank": 0, "img2txt_id": "", "asr_id": "", "llm_id": "llama3.1___OpenAI-API@OpenAI-API-Compatible", "update_time": 1737379791346, "task_type": "raptor"}
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 513, in handle_task
    do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 440, in do_handle_task
    chunks, token_count, vector_size = run_raptor(task, chat_model, embedding_model, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 372, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 132, in __call__
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)
AssertionError: 11 vs. 15
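Taken together, the tracebacks suggest the `"{m} vs. {n}"` assertion fires because `summarize` fails (its exception is logged but swallowed), so fewer summaries get appended than there are clusters. A simplified sketch of that invariant, with hypothetical names rather than the actual RAPTOR code:

```python
def cluster_one_layer(chunks, clusters, summarize):
    # Each cluster should contribute exactly one summary chunk.
    end = len(chunks)
    n_clusters = len(clusters)
    for cluster in clusters:
        try:
            chunks.append(summarize(cluster))
        except Exception:
            # Swallowing a summarize failure here is what breaks the
            # invariant checked below ("2 vs. 3", "11 vs. 15", ...).
            continue
    assert len(chunks) - end == n_clusters, "{} vs. {}".format(len(chunks) - end, n_clusters)
    return chunks
```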

@KevinHuSh
Collaborator

Ollama seems to have this kind of issue here.
What about switching to another LLM?

@rplescia

I tried it with Together.ai using llama3.3, and the same thing happens.
