Add custom metadata to uploaded files #93


ronal2do
Contributor

Hi, not sure if this is intended, but users are not able to save custom metadata to the vector when uploading a file with RagChat.

Usage example:

await ragChat.context.add({
  type: "pdf",
  fileSource: "./data/physics_basics.pdf",
  options: { 
    namespace: "user-123-documents",
    metadata: {
      theme: "physics_basics",
    }
  },
});

Given a setup RagChat instance,
When adding a new context from sources (e.g., PDF),
And including custom metadata,
Then users should be able to visualize their custom metadata in the Upstash vector console.

Please refer to the attached screenshot for a visual example of the issue:
Screenshot 2024-11-13 at 21 55 06

The solution proposed in this PR extends mapDocumentsIntoInsertPayload to include config.options?.metadata, and turns it into a private method so that it can read the config.

@@ -0,0 +1,256 @@
/* eslint-disable @typescript-eslint/no-unsafe-assignment */
Contributor Author

I used data: expect.any(String) instead of adding the entire data as a string. The same goes for the ids, since they change on every run.

@@ -0,0 +1,256 @@
/* eslint-disable @typescript-eslint/no-unsafe-assignment */
/* eslint-disable @typescript-eslint/no-explicit-any */
Contributor Author

This was added to test a bad configuration. I know we should trust TypeScript, but the file already has many anys. It is only used for the last 2 tests, mostly to add more branch coverage.

id: nanoid(),
metadata: {
...(metadataMapper ? metadataMapper(document.metadata, index) : {}),
...this.config.options?.metadata,
Contributor Author

This is the biggest change here.

I'm spreading the user-provided metadata into the existing metadata object, so that it is accepted the same way it already is for raw text.
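
To make the merge order concrete, here is a minimal standalone sketch of the spread shown in the diff above. The types and the `buildMetadata` helper are hypothetical and exist only for illustration; they are not the library's actual definitions. The sketch assumes the same two-step spread: mapper output first, then `config.options?.metadata`.

```typescript
// Hypothetical types for illustration only; not the library's real definitions.
type Metadata = Record<string, unknown>;
type MetadataMapper = (metadata: Metadata, index: number) => Metadata;

interface SketchConfig {
  options?: { namespace?: string; metadata?: Metadata };
}

interface SketchDocument {
  pageContent: string;
  metadata: Metadata;
}

// Hypothetical helper mirroring the spread order from the diff:
// mapper output first, then the user-provided metadata on top.
function buildMetadata(
  config: SketchConfig,
  document: SketchDocument,
  index: number,
  metadataMapper?: MetadataMapper
): Metadata {
  return {
    ...(metadataMapper ? metadataMapper(document.metadata, index) : {}),
    // Spreading undefined adds no keys, so a missing `options` is safe here.
    ...config.options?.metadata,
  };
}

const doc: SketchDocument = {
  pageContent: "Newton's laws",
  metadata: { source: "physics_basics.pdf" },
};

const mapper: MetadataMapper = (metadata, index) => ({
  source: metadata.source,
  chunk: index,
  theme: "default",
});

// User metadata is spread last, so it wins when a key collides.
const merged = buildMetadata(
  { options: { metadata: { theme: "physics_basics" } } },
  doc,
  0,
  mapper
);

// No options at all: only the mapper output remains.
const withoutOptions = buildMetadata({}, doc, 1, mapper);

// No mapper: only the user-provided metadata remains.
const withoutMapper = buildMetadata(
  { options: { metadata: { theme: "physics_basics" } } },
  doc,
  2
);

console.log(merged, withoutOptions, withoutMapper);
```

Because the user-provided object is spread last, a key such as `theme` supplied in `options.metadata` overrides the same key produced by the mapper. Swapping the two spreads would reverse that precedence.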

@CahidArda
Contributor

closing this PR as it was merged through #95. Thanks for the contribution!

@CahidArda CahidArda closed this Nov 19, 2024