We're planning to enhance PyVisionAI with two major features: embedding generation and automatic metadata creation. Before jumping into development, we’d like your feedback on the proposed design to ensure it meets the needs of the community.
## Current Features for Context
PyVisionAI already supports content extraction and custom prompts to guide how files and images are processed. Examples from the README:
**File Extraction with Prompt:**

```bash
file-extract -t pdf -s document.pdf -o output_dir -p "Extract the key sections as bullet points."
```
**Image Description with Prompt:**

```bash
describe-image -i image.jpg -p "Focus on describing the objects and their spatial arrangement."
```
This flexibility lets users tailor the library’s behavior to suit their workflows.
## Proposed Features
### 1. Embedding Generation
Add support for generating embeddings as part of the document processing pipeline.
**Proposed Parameters:**

- `--embed-model`: Specify the embedding model (e.g., `openai` or `local`). Supplying this option automatically triggers embedding generation.
- `--generate-embed`: Request embedding generation using the default model (e.g., OpenAI).
**Example CLI Usage:**

```bash
file-extract -t pdf -s document.pdf -o output_dir --embed-model openai
```

or

```bash
file-extract -t pdf -s document.pdf -o output_dir --generate-embed
```
**Example Library Usage:**

```python
from pyvisionai import create_extractor

# Embedding with a specified model
extractor = create_extractor("pdf", embed_model="local")
extractor.extract("path/to/input.pdf", "output_dir")

# Embedding with the default model
extractor = create_extractor("pdf", generate_embed=True)
extractor.extract("path/to/input.pdf", "output_dir")
```
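To make the interaction between the two parameters concrete, here is a minimal sketch of one possible resolution rule, assuming an explicit `--embed-model` takes precedence over `--generate-embed`. The helper name, the default-model constant, and the precedence rule itself are illustrative assumptions for discussion, not part of the proposed API.

```python
# Hypothetical sketch: resolving embed_model / generate_embed to one choice.
# Names and the precedence rule are assumptions, not the final design.

DEFAULT_EMBED_MODEL = "openai"  # assumed default


def resolve_embed_model(embed_model=None, generate_embed=False):
    """Return the embedding model to use, or None to skip embeddings."""
    if embed_model is not None:
        return embed_model  # --embed-model implies embedding generation
    if generate_embed:
        return DEFAULT_EMBED_MODEL  # --generate-embed uses the default model
    return None  # neither option given: no embeddings


assert resolve_embed_model(embed_model="local") == "local"
assert resolve_embed_model(generate_embed=True) == "openai"
assert resolve_embed_model() is None
```

Under this reading, passing both options together would simply use the explicitly named model.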
### 2. Automatic Metadata Creation
Generate a metadata file for every processed document.
Metadata includes:

- Original file details (name, path, size, etc.)
- Markdown file location
- Embedding file path (if embeddings are generated)
- Embedding model used (if specified)
- Processing timestamps
**Example Metadata Output (JSON):**

```json
{
  "original_file": "document.pdf",
  "markdown_file": "output_dir/document.md",
  "embedding_file": "output_dir/document.vec",
  "embedding_model": "openai",
  "processing_time": "2025-01-18T12:00:00Z"
}
```
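As a point of reference, here is a minimal sketch of how such a metadata file could be written alongside the other outputs. The function name, the `.metadata.json` filename, and the `.vec` extension for embeddings are assumptions mirroring the example above, not a committed design.

```python
# Hypothetical sketch of metadata creation; names mirror the JSON example
# above and are assumptions, not part of the current PyVisionAI API.
import json
import os
from datetime import datetime, timezone


def write_metadata(source_path, output_dir, embedding_model=None):
    stem = os.path.splitext(os.path.basename(source_path))[0]
    metadata = {
        "original_file": os.path.basename(source_path),
        "markdown_file": os.path.join(output_dir, f"{stem}.md"),
        "processing_time": datetime.now(timezone.utc).isoformat(),
    }
    if embedding_model is not None:  # only present when embeddings were made
        metadata["embedding_file"] = os.path.join(output_dir, f"{stem}.vec")
        metadata["embedding_model"] = embedding_model
    metadata_path = os.path.join(output_dir, f"{stem}.metadata.json")
    with open(metadata_path, "w") as f:
        json.dump(metadata, f, indent=2)
    return metadata_path
```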
## Feedback Needed
We’d love your thoughts on the following:
**Embedding Design:**
- Does the parameter logic (`--embed-model` and `--generate-embed`) feel intuitive?
- Would you prefer an alternative way to enable embedding generation or select models?

**Metadata:**
- Are there additional fields or structures you think should be included in the metadata?
- Would the metadata structure fit well into your workflows?

**General Design:**
- Are there any alternative approaches or improvements we should consider?
## Next Steps
We’ll use your feedback to refine the design and ensure it aligns with community needs before starting implementation.
Thank you for helping shape the future of PyVisionAI! 🙏