Add metadata filtering support and fix multi-document and metadata issue #22
base: main
Hey, thanks for this, this PR is helpful! I have just a couple of change requests.
I'll play with this further in the next few days just to make sure nothing's broken. Thank you again!
@bclavie @Athe-kunal I was trying to use byaldi for my project and realized that this PR still has an issue with the metadata when there are multiple documents. In line 340, In line 406, we once again do `current_metadata = metadata[i] if metadata else None`, which will raise a `KeyError`. This happens specifically in v0.0.4.
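To illustrate the failure mode, here is a minimal standalone sketch, not byaldi's actual code. It assumes `metadata` arrives as a single dict rather than a list with one dict per document; the helper name `metadata_for` and the sample values are hypothetical, and only the conditional expression mirrors the one quoted above.

```python
# Hypothetical reproduction: `metadata` is a plain dict, not a list of dicts.
metadata = {"author": "alice", "year": 2024}


def metadata_for(i, metadata):
    # Same shape as the expression quoted above.
    return metadata[i] if metadata else None


try:
    metadata_for(0, metadata)
    outcome = "ok"
except KeyError as exc:
    # A dict has no key 0, so integer indexing raises KeyError.
    outcome = f"KeyError: {exc}"

print(outcome)  # prints "KeyError: 0"

# The same expression works when metadata is a list with one dict per document.
print(metadata_for(0, [metadata]))
```

The expression only behaves as intended when `metadata` is a sequence; with a dict, the integer index is treated as a key lookup.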
This PR includes the following changes:
- `add_to_index` will error out because it tries to index the i-th element of `metadata`, but it is a dictionary, hence it will raise `KeyError`. Fixed it by renaming `metadata` as `doc_metadata` and removing the indexing.
- Added a `filter_metadata` field in `search`, so ColPali will only search from the documents that match the filter.

Results
Now, if we check the doc ID to metadata mapping
Results
It only pulled from Doc ID 4, which we intended.
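For readers who want the idea behind the metadata filter without running the model, here is a rough sketch under stated assumptions. It is not the implementation in this PR: the function `matching_doc_ids`, the mapping layout, and the sample metadata are all hypothetical and made up for illustration. It only shows how an exact-match filter narrows the set of doc IDs to search.

```python
from typing import Any, Dict, List


def matching_doc_ids(
    doc_id_to_metadata: Dict[int, Dict[str, Any]],
    filter_metadata: Dict[str, Any],
) -> List[int]:
    """Return doc IDs whose metadata contains every key/value pair in the filter."""
    return [
        doc_id
        for doc_id, meta in doc_id_to_metadata.items()
        if all(key in meta and meta[key] == value
               for key, value in filter_metadata.items())
    ]


# Made-up mapping, for illustration only.
doc_id_to_metadata = {
    0: {"category": "report", "year": 2023},
    1: {"category": "report", "year": 2024},
    4: {"category": "invoice", "year": 2024},
}

allowed = matching_doc_ids(doc_id_to_metadata, {"category": "invoice"})
print(allowed)  # prints [4]

# Hypothetical scored results; keep only those from allowed documents.
results = [
    {"doc_id": 0, "score": 0.9},
    {"doc_id": 4, "score": 0.7},
]
filtered = [r for r in results if r["doc_id"] in set(allowed)]
print(filtered)
```

Filtering before scoring, rather than after as in the last step here, would avoid computing scores for excluded documents; which of the two the PR does is not shown in this thread.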
Please let me know if you have any suggestions. Looking forward to collaborating.