We're planning to enhance PyVisionAI with two major features: embedding generation and automatic metadata creation. Before jumping into development, we’d like your feedback on the proposed design to ensure it meets the needs of the community.
## Current Features for Context
PyVisionAI already supports content extraction and custom prompts to guide how files and images are processed. Examples from the README:
**File Extraction with Prompt:**

```bash
file-extract -t pdf -s document.pdf -o output_dir -p "Extract the key sections as bullet points."
```
**Image Description with Prompt:**

```bash
describe-image -i image.jpg -p "Focus on describing the objects and their spatial arrangement."
```
This flexibility lets users tailor the library’s behavior to suit their workflows.
## Proposed Features
### 1. Embedding Generation
Add support for generating embeddings as part of the document processing pipeline.
**Proposed Parameters:**

- `--embed-model`: Specify the embedding model (e.g., `openai` or `local`). Supplying this option automatically triggers embedding generation.
- `--generate-embed`: Request embedding generation using the default model (e.g., OpenAI).
**Example CLI Usage:**

```bash
file-extract -t pdf -s document.pdf -o output_dir --embed-model openai
```

or

```bash
file-extract -t pdf -s document.pdf -o output_dir --generate-embed
```
**Example Library Usage:**

```python
from pyvisionai import create_extractor

# Embedding with a specified model
extractor = create_extractor("pdf", embed_model="local")
extractor.extract("path/to/input.pdf", "output_dir")

# Embedding with the default model
extractor = create_extractor("pdf", generate_embed=True)
extractor.extract("path/to/input.pdf", "output_dir")
```
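To make the interaction between the two parameters concrete, here is a minimal sketch of one possible resolution rule, assuming an explicit `--embed-model` takes precedence over `--generate-embed`. The helper name, the default-model constant, and the precedence rule itself are illustrative assumptions for discussion, not part of the proposed API.

```python
# Hypothetical sketch: resolving embed_model / generate_embed to one choice.
# Names and the precedence rule are assumptions, not the final design.

DEFAULT_EMBED_MODEL = "openai"  # assumed default


def resolve_embed_model(embed_model=None, generate_embed=False):
    """Return the embedding model to use, or None to skip embeddings."""
    if embed_model is not None:
        return embed_model  # --embed-model implies embedding generation
    if generate_embed:
        return DEFAULT_EMBED_MODEL  # --generate-embed uses the default model
    return None  # neither option given: no embeddings


assert resolve_embed_model(embed_model="local") == "local"
assert resolve_embed_model(generate_embed=True) == "openai"
assert resolve_embed_model() is None
```

Under this reading, passing both options together would simply use the explicitly named model.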
### 2. Automatic Metadata Creation
Generate a metadata file for every processed document.
Metadata includes:

- Original file details (name, path, size, etc.)
- Markdown file location
- Embedding file path (if embeddings are generated)
- Embedding model used (if specified)
- Processing timestamps
**Example Metadata Output (JSON):**

```json
{
  "original_file": "document.pdf",
  "markdown_file": "output_dir/document.md",
  "embedding_file": "output_dir/document.vec",
  "embedding_model": "openai",
  "processing_time": "2025-01-18T12:00:00Z"
}
```
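As a point of reference, here is a minimal sketch of how such a metadata file could be written alongside the other outputs. The function name, the `.metadata.json` filename, and the `.vec` extension for embeddings are assumptions mirroring the example above, not a committed design.

```python
# Hypothetical sketch of metadata creation; names mirror the JSON example
# above and are assumptions, not part of the current PyVisionAI API.
import json
import os
from datetime import datetime, timezone


def write_metadata(source_path, output_dir, embedding_model=None):
    stem = os.path.splitext(os.path.basename(source_path))[0]
    metadata = {
        "original_file": os.path.basename(source_path),
        "markdown_file": os.path.join(output_dir, f"{stem}.md"),
        "processing_time": datetime.now(timezone.utc).isoformat(),
    }
    if embedding_model is not None:  # only present when embeddings were made
        metadata["embedding_file"] = os.path.join(output_dir, f"{stem}.vec")
        metadata["embedding_model"] = embedding_model
    metadata_path = os.path.join(output_dir, f"{stem}.metadata.json")
    with open(metadata_path, "w") as f:
        json.dump(metadata, f, indent=2)
    return metadata_path
```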
## Feedback Needed
We’d love your thoughts on the following:
**Embedding Design:**
- Does the parameter logic (`--embed-model` and `--generate-embed`) feel intuitive?
- Would you prefer an alternative way to enable embedding generation or select models?

**Metadata:**
- Are there additional fields or structures you think should be included in the metadata?
- Would the metadata structure fit well into your workflows?

**General Design:**
- Are there any alternative approaches or improvements we should consider?
## Next Steps
We’ll use your feedback to refine the design and ensure it aligns with community needs before starting implementation.
Thank you for helping shape the future of PyVisionAI! 🙏