refactor: Async batch processing, limits, and configuration #80

Open
ScarFX wants to merge 2 commits into main

Conversation

ScarFX (Collaborator) commented Sep 30, 2024

Added async API calls to embed (process) and add chunks (documents) to vector storage.

Added env variables MAX_CHUNKS, EMBEDDING_TIMEOUT, BATCH_SIZE, and CONCURRENT_LIMIT to configure and limit the batching API class; descriptions are in the README. A sketch of the intended batching behavior follows the checklist below.

Tested with PGVector using Bedrock and amazon.titan-embed-text-v2:0. Also rebuilt successfully in Docker.

  1. Verified BATCH_SIZE and CONCURRENT_LIMIT affect the process calls to embed correctly
  2. Verified MAX_CHUNKS works as a limit and throws an exception when exceeded, with the correct default behavior of ignoring the limit when unset
  3. Verified EMBEDDING_TIMEOUT works within the limit and throws an exception when exceeded
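
As referenced above, a minimal sketch of how the four variables interact, assuming a provider coroutine `embed_batch` (the function names and default values here are illustrative, not the PR's actual code):

import asyncio
import os

# Env-var driven limits (names from the README; the bodies below are a
# hypothetical sketch, not the PR implementation).
MAX_CHUNKS = int(os.getenv("MAX_CHUNKS", "0"))            # 0 = no limit (default ignores)
EMBEDDING_TIMEOUT = float(os.getenv("EMBEDDING_TIMEOUT", "120"))
BATCH_SIZE = int(os.getenv("BATCH_SIZE", "75"))
CONCURRENT_LIMIT = int(os.getenv("CONCURRENT_LIMIT", "10"))

async def embed_all(docs, embed_batch):
    # MAX_CHUNKS: refuse oversized inputs up front
    if MAX_CHUNKS and len(docs) > MAX_CHUNKS:
        raise ValueError(f"{len(docs)} chunks exceeds MAX_CHUNKS={MAX_CHUNKS}")

    semaphore = asyncio.Semaphore(CONCURRENT_LIMIT)  # at most N batches in flight

    async def run_batch(batch):
        async with semaphore:
            # EMBEDDING_TIMEOUT applies per embed call; raises TimeoutError if exceeded
            return await asyncio.wait_for(embed_batch(batch), timeout=EMBEDDING_TIMEOUT)

    batches = [docs[i : i + BATCH_SIZE] for i in range(0, len(docs), BATCH_SIZE)]
    results = await asyncio.gather(*(run_batch(b) for b in batches))
    return [vec for batch in results for vec in batch]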

ScarFX self-assigned this Sep 30, 2024
ScarFX (Collaborator, Author) commented Sep 30, 2024

The ideal BATCH_SIZE varies by embeddings provider, and likely by model and file size. Provider-specific default values should be added in the future; right now the default is BATCH_SIZE=75, which is ideal for OpenAI but not for Bedrock.
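
One possible shape for those provider-specific defaults (the mapping and the Bedrock value below are placeholders, not measured recommendations):

import os

# Hypothetical per-provider defaults; 75 is the current global default
# (good for OpenAI per the comment above). The Bedrock entry is a
# placeholder to be tuned, not a measured value.
PROVIDER_BATCH_DEFAULTS = {"openai": 75, "bedrock": 25}

def default_batch_size(provider: str) -> int:
    env = os.getenv("BATCH_SIZE")
    if env:  # an explicit env var always wins
        return int(env)
    return PROVIDER_BATCH_DEFAULTS.get(provider, 75)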

process_docs.py Outdated
start_time = time.perf_counter()
for i in range(0, len(docs), BATCH_SIZE):
    batch = docs[i : min(i + BATCH_SIZE, len(docs))]
    # logger.info(f"Sending batch {i} to {i+len(batch)} / {len(docs)}")
Owner commented:
add back as logger.debug()
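
The requested change would look like this (assuming the module already configures a `logger`):

for i in range(0, len(docs), BATCH_SIZE):
    batch = docs[i : min(i + BATCH_SIZE, len(docs))]
    logger.debug(f"Sending batch {i} to {i + len(batch)} / {len(docs)}")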
