[Bug] Chunking failed #4975

wangxiaodong1021 · 2024-12-11T08:46:00Z

📦 Environment

Zeabur

📌 Version

v1.36.6

💻 Operating System

Ubuntu

🌐 Browser

Edge

🐛 Bug Description

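I saved a web page as a PDF, and chunking failed for it. The PDF opens and can be read normally in the knowledge base, and every other PDF I uploaded before chunked without problems. I have attached the PDF.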

 Diffusion Meets Flow Matching.pdf

📷 Recurrence Steps

No response

🚦 Expected Behavior

No response

📝 Additional Information

No response

lobehubbot · 2024-12-11T08:46:14Z

👀 @wangxiaodong1021

Thank you for raising an issue. We will investigate into the matter and get back to you as soon as possible.
Please make sure you have given us as much context as possible.

dosubot · 2024-12-11T08:48:01Z

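@wangxiaodong1021 Hello! I'm Dosu, a bot that can help you solve bugs, answer questions, and even become a contributor. I'm here to help while you wait for a human maintainer. Looking forward to solving this with you! 🤖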

The error message "invalid byte sequence for encoding 'UTF8': 0x00" suggests that there might be an issue with the character encoding of the PDF file when attempting to process it. This could be due to non-UTF8 characters or null bytes in the file.

Here are a few steps you can take to troubleshoot and potentially resolve this issue:

Check the PDF File: Ensure that the PDF file does not contain any corrupted data or non-standard characters that might cause encoding issues.
Encoding Conversion: Try converting the PDF file to ensure it is encoded in UTF-8. You can use tools like iconv or similar to convert the file encoding.
Update Dependencies: Make sure that all dependencies related to PDF processing in your environment are up to date. The lobe-chat project uses libraries such as pdf-parse, pdfjs-dist, and react-pdf for PDF handling, which might need updates [1].
Inspect Logs: Check the application logs for any additional error messages or stack traces that might provide more context about the failure.
Test with Other PDFs: Since other PDFs are working fine, compare the problematic PDF with those that work to identify any differences in structure or content.
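
The null-byte explanation above can be checked with a minimal sketch. It assumes the failure comes from NUL (0x00) characters in the text extracted from the PDF, and that the chunk text is written to a PostgreSQL text column, which rejects such characters with this error. The helper names `countNulChars` and `stripNulChars` are hypothetical and are not taken from the lobe-chat codebase; the sample strings are made up for illustration.

```typescript
// Hypothetical helpers for illustration only; not part of lobe-chat.

/** Count NUL (U+0000) characters in a piece of extracted text. */
function countNulChars(text: string): number {
  // Splitting on NUL yields one more piece than there are NUL characters.
  return text.split("\u0000").length - 1;
}

/** Return a copy of each chunk with all NUL characters removed. */
function stripNulChars(chunks: string[]): string[] {
  return chunks.map((chunk) => chunk.split("\u0000").join(""));
}

// Simulated output of a PDF text extractor (assumed data, not from the attached file).
const extracted: string[] = ["Diffusion\u0000 models", "Flow matching", "\u0000\u0000"];

// Diagnose: how many NUL characters does each chunk contain?
const nulCounts: number[] = extracted.map(countNulChars);

// Repair: strip NUL characters, then drop chunks that end up empty.
const cleaned: string[] = stripNulChars(extracted).filter((chunk) => chunk.length > 0);

console.log(nulCounts);
console.log(cleaned);
```

If the counts are non-zero for the failing PDF and zero for the PDFs that chunk correctly, that would support the null-byte explanation. Stripping the characters before storage is one possible workaround, not a confirmed fix.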

If the issue persists, you might need to delve deeper into the specific PDF processing logic in the lobe-chat codebase or consider reaching out to the maintainers for more detailed support.


157374010 · 2024-12-20T10:34:11Z

Seconded. I ran into the same problem with the same error. Windows 11, Chrome browser.

lobehubbot · 2024-12-20T10:34:25Z

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

Seconded, I also encountered the same problem and the same error. Windows 11, Chrome browser

wangxiaodong1021 added the 🐛 Bug Something isn't working | 缺陷 label Dec 11, 2024

github-project-automation bot added this to Lobe Chat Routine Dec 11, 2024

github-project-automation bot moved this to Roadmap - Chat 1.x in Lobe Chat Routine Dec 11, 2024

dosubot bot added the files 上传文件/知识库 label Dec 11, 2024
