Error: inconsistent dimensions in the vector table when changing the embedding model of an empty knowledge base and uploading documents #12482

Closed
5 tasks done
liyan36 opened this issue Jan 8, 2025 · 6 comments · Fixed by #12495
Labels
👻 feat:rag Embedding related issue, like qdrant, weaviate, milvus, vector database.

Comments


liyan36 commented Jan 8, 2025

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.15.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Change the embedding model of the empty knowledge base.
  2. Upload documents to the knowledge base.

✔️ Expected Behavior

Document indexing failed; retrying does not work.

❌ Actual Behavior

vector dimension conflict!

@dosubot dosubot bot added the 👻 feat:rag Embedding related issue, like qdrant, weaviate, milvus, vector database. label Jan 8, 2025

dosubot bot commented Jan 8, 2025

Hello @liyan36! I'm here to help you with any bugs, questions, or contributions you have for the Dify project. Let's work together to resolve the issue you're experiencing.

To resolve the "inconsistent dimensions in the vector table" error in Dify version 0.15.0, you can follow these steps:

  1. Ensure that the embedding model is set to 'text-embedding-ada-002' in the general dataset settings, as this model uses the expected embedding dimension of 1536.
  2. Check the settings of each document in your dataset to identify any that are using different embedding models, which may cause dimension mismatches (e.g., 1536 vs 3072).
  3. Reindex the documents using the same embedding model to ensure consistent dimensions across all documents. This will resolve the shape alignment error between different vector dimensions [1].

These steps should help resolve the error by ensuring all documents have consistent embedding dimensions.
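
The consistency check described in steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not Dify code: the function names `group_by_dimension` and `find_mismatched`, the document ids, and the in-memory `embeddings` dictionary are all hypothetical, and the dimensions 1536 and 3072 are simply the example values mentioned above.

```python
from collections import defaultdict


def group_by_dimension(embeddings):
    """Group document ids by the length of their embedding vector."""
    groups = defaultdict(list)
    for doc_id, vector in embeddings.items():
        groups[len(vector)].append(doc_id)
    return dict(groups)


def find_mismatched(embeddings, expected_dim):
    """Return the sorted ids of documents whose vector length differs from expected_dim."""
    return sorted(
        doc_id for doc_id, vector in embeddings.items() if len(vector) != expected_dim
    )


# Hypothetical data: two documents embedded with a 1536-dimension model,
# one embedded with a 3072-dimension model.
embeddings = {
    "doc-a": [0.0] * 1536,
    "doc-b": [0.0] * 3072,
    "doc-c": [0.0] * 1536,
}

# Documents listed here would need to be reindexed with the dataset's model.
mismatched = find_mismatched(embeddings, expected_dim=1536)
print(mismatched)
```

Any document reported by `find_mismatched` is a candidate for reindexing with the model configured on the dataset.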



liyan36 commented Jan 8, 2025

@jyong @dosu Requesting that the developers resolve this issue.


dosubot bot commented Jan 8, 2025

This is a tough nut to crack, could really use your help on this, @takatost.


@crazywoola
Member

For empty datasets, you cannot change the embedding model; see the image below.
image

@crazywoola crazywoola added the 🤔 cant-reproduce We can not reproduce it or the information is limited label Jan 8, 2025

liyan36 commented Jan 8, 2025

@crazywoola
Reproduction steps:

  1. Upload a document to the knowledge base and select a specific embedding model.
  2. Delete the document in the knowledge base but do not delete the knowledge base itself, resulting in an empty knowledge base.
  3. Change the embedding model of the knowledge base.
  4. Upload a new document.
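
One plausible reading of why these four steps fail can be sketched with a toy model. This is an assumption, not something the thread confirms: it supposes that the vector collection fixes its dimension on the first insert and keeps it after the last document is deleted. `ToyCollection`, `DimensionMismatchError`, and `embed` are hypothetical stand-ins rather than Dify or vector-database APIs, and 1536 and 3072 are example dimensions only.

```python
class DimensionMismatchError(ValueError):
    """Raised when a vector does not match the collection's fixed dimension."""


class ToyCollection:
    """Hypothetical in-memory stand-in for a vector-store collection."""

    def __init__(self):
        self.dimension = None  # fixed by the first inserted vector
        self.vectors = {}

    def add(self, doc_id, vector):
        if self.dimension is None:
            self.dimension = len(vector)
        elif len(vector) != self.dimension:
            raise DimensionMismatchError(
                f"expected {self.dimension} dimensions, got {len(vector)}"
            )
        self.vectors[doc_id] = vector

    def delete(self, doc_id):
        # Assumption: deleting documents empties the collection
        # but does not reset its dimension.
        self.vectors.pop(doc_id, None)


def embed(text, dim):
    """Hypothetical embedding call that returns a zero vector of length dim."""
    return [0.0] * dim


collection = ToyCollection()
collection.add("doc-1", embed("first document", 1536))  # step 1: upload with model A
collection.delete("doc-1")                              # step 2: knowledge base is now empty
try:
    # Steps 3 and 4: switch to a model with a different dimension, then upload.
    collection.add("doc-2", embed("second document", 3072))
    outcome = "indexed"
except DimensionMismatchError as exc:
    outcome = f"failed: {exc}"
print(outcome)
```

Under this assumption, the knowledge base looks empty but the stored dimension still belongs to the first model, so the new upload is rejected.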


crazywoola commented Jan 8, 2025

  3. Change the embedding model of the knowledge base.
  4. Upload a new document.

The embedding model in step 3 should be the same as step 4.
I reproduced this.

@crazywoola crazywoola removed the 🤔 cant-reproduce We can not reproduce it or the information is limited label Jan 8, 2025
@JohnJyong JohnJyong mentioned this issue Jan 8, 2025
5 tasks
crazywoola pushed a commit that referenced this issue Jan 8, 2025
alexcodelf pushed a commit to alexcodelf/dify that referenced this issue Jan 21, 2025
3 participants