
Feedback Request: Embedding and Metadata Design Proposal for PyVisionAI #25

Open
MDGrey33 opened this issue Jan 18, 2025 · 0 comments
Labels: enhancement (New feature or request), question (Further information is requested)

MDGrey33 commented Jan 18, 2025

We're planning to enhance PyVisionAI with two major features: embedding generation and automatic metadata creation. Before jumping into development, we’d like your feedback on the proposed design to ensure it meets the needs of the community.

Current Features for Context

PyVisionAI already supports content extraction and custom prompts that guide how files and images are processed. Examples from the README:

File extraction with a prompt:

```bash
file-extract -t pdf -s document.pdf -o output_dir -p "Extract the key sections as bullet points."
```

Image description with a prompt:

```bash
describe-image -i image.jpg -p "Focus on describing the objects and their spatial arrangement."
```
This flexibility lets users tailor the library’s behavior to suit their workflows.

Proposed Features

1. Embedding Generation

Add support for generating embeddings as part of the document processing pipeline.

Proposed parameters:

- --embed-model: Specify the embedding model (e.g., openai or local). Setting this flag automatically triggers embedding generation.
- --generate-embed: Request embedding generation using the default model (e.g., OpenAI).

Example CLI usage:

```bash
file-extract -t pdf -s document.pdf -o output_dir --embed-model openai
```

or

```bash
file-extract -t pdf -s document.pdf -o output_dir --generate-embed
```
Example Library Usage:

```python
from pyvisionai import create_extractor

# Embedding with a specified model
extractor = create_extractor("pdf", embed_model="local")
extractor.extract("path/to/input.pdf", "output_dir")

# Embedding with the default model
extractor = create_extractor("pdf", generate_embed=True)
extractor.extract("path/to/input.pdf", "output_dir")
```
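For context on where the embedding step might sit in the pipeline, here is a deliberately toy sketch: `toy_embed` stands in for whatever real backend (OpenAI or a local model) would produce the vector, and `write_vec_file` writes a `.vec` file next to the markdown output. All names and the whitespace-separated `.vec` format are illustrative assumptions, not PyVisionAI's API.

```python
import hashlib
import struct
from pathlib import Path

def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embedding backend: derives `dim` floats in
    [0, 1) from a SHA-256 digest so the sketch runs with no extra deps."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    ints = struct.unpack(f"<{dim}I", digest[: dim * 4])
    return [i / 2**32 for i in ints]

def write_vec_file(markdown_path: str, out_dir: str) -> Path:
    """Embed the extracted markdown and save the vector as <stem>.vec."""
    text = Path(markdown_path).read_text(encoding="utf-8")
    vec = toy_embed(text)
    out = Path(out_dir) / (Path(markdown_path).stem + ".vec")
    out.write_text(" ".join(f"{v:.6f}" for v in vec), encoding="utf-8")
    return out
```

The key design point the sketch illustrates is that embedding happens after markdown extraction, on the extracted text, so swapping the backend does not affect the extraction stage.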
2. Automatic Metadata Creation

Generate a metadata file for every processed document. Metadata includes:

- Original file details (name, path, size, etc.)
- Markdown file location
- Embedding file path (if embeddings are generated)
- Embedding model used (if specified)
- Processing timestamps

Example metadata output (JSON):

```json
{
  "original_file": "document.pdf",
  "markdown_file": "output_dir/document.md",
  "embedding_file": "output_dir/document.vec",
  "embedding_model": "openai",
  "processing_time": "2025-01-18T12:00:00Z"
}
```
Feedback Needed

We’d love your thoughts on the following:

Embedding design:

- Does the parameter logic (--embed-model and --generate-embed) feel intuitive?
- Would you prefer an alternative way to enable embedding generation or select models?

Metadata:

- Are there additional fields or structures you think should be included in the metadata?
- Would the metadata structure fit well into your workflows?

General design:

- Are there any alternative approaches or improvements we should consider?

Next Steps
We’ll use your feedback to refine the design and ensure it aligns with community needs before starting implementation.

Thank you for helping shape the future of PyVisionAI! 🙏
