refactor: refactor source.py #351

Merged
merged 17 commits into from
Feb 20, 2025
4 changes: 4 additions & 0 deletions docs/how-to/document_search/search_documents.md
@@ -6,6 +6,7 @@
3. Do the search

This guide will walk you through all those steps and explain the details. Let's start with a minimalistic example to get the main idea:

```python
import asyncio
from pathlib import Path
@@ -16,6 +17,7 @@ from ragbits.document_search import DocumentSearch
from ragbits.document_search.documents.document import DocumentMeta
from ragbits.document_search.documents.sources import GCSSource


async def main() -> None:
    # Load documents (there are multiple possible sources)
    documents = [
@@ -47,11 +49,13 @@ if __name__ == "__main__":
Before doing any search we need to have some documents that will build our knowledge base. Ragbits offers a handy class `Document` that stores all the information needed for document loading.
Objects of this class are usually instantiated using `DocumentMeta` helper class that supports loading files from your local storage, GCS or HuggingFace.
You can easily add support for your custom sources by extending the `Source` class and implementing the abstract methods:

```python
from pathlib import Path

from ragbits.document_search.documents.sources import Source


class CustomSource(Source):
    @property
    def id(self) -> str:
2 changes: 1 addition & 1 deletion examples/document-search/multimodal.py
@@ -49,7 +49,7 @@

 def jpg_example(file_name: str) -> DocumentMeta:
     """
-    Create a document from a JPG file in the images directory.
+    Create a document from a JPG file in the image's directory.
     """
     return DocumentMeta(document_type=DocumentType.JPG, source=LocalFileSource(path=IMAGES_PATH / file_name))

@@ -1,4 +1,4 @@
-type: ragbits.document_search.documents.sources:HuggingFaceSource
+type: ragbits.document_search.documents.sources.hf:HuggingFaceSource
 config:
   path: "micpst/hf-docs"
   split: "train[:5]"
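
The `type` value in this config appears to follow a `module.path:ClassName` convention, which would explain why moving `HuggingFaceSource` into an `hf` submodule requires updating the string. Below is a minimal sketch of how such a string could be resolved with the standard library. It makes no claim about ragbits' actual loader: `resolve_type` is a hypothetical helper, and the demo uses a standard-library class so that it runs without ragbits installed.

```python
import importlib
from typing import Any


def resolve_type(type_path: str) -> Any:
    """Split a 'package.module:ClassName' string and import the named attribute.

    Hypothetical helper for illustration; not part of ragbits.
    """
    module_path, sep, attr_name = type_path.partition(":")
    if not sep or not module_path or not attr_name:
        raise ValueError(f"expected 'module:ClassName', got {type_path!r}")
    # import_module raises ModuleNotFoundError if the module path is stale,
    # which is the failure an un-migrated config string would hit.
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


# Demonstrated with a standard-library class rather than a ragbits one.
ordered_dict_cls = resolve_type("collections:OrderedDict")
print(ordered_dict_cls.__name__)
```

Under this reading, a config that still names the old module path would fail at import time or attribute lookup, so the string has to change together with the module layout.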
@@ -16,8 +16,8 @@
 from ragbits.core.vector_stores.base import VectorStoreOptions
 from ragbits.document_search.documents.document import Document, DocumentMeta
 from ragbits.document_search.documents.element import Element, ImageElement
-from ragbits.document_search.documents.source_resolver import SourceResolver
 from ragbits.document_search.documents.sources import Source
+from ragbits.document_search.documents.sources.base import SourceResolver
 from ragbits.document_search.ingestion.document_processor import DocumentProcessorRouter
 from ragbits.document_search.ingestion.processor_strategies import (
     ProcessingExecutionStrategy,
@@ -5,7 +5,8 @@

 from pydantic import BaseModel

-from ragbits.document_search.documents.sources import LocalFileSource, Source, SourceDiscriminator
+from ragbits.document_search.documents.sources import LocalFileSource, Source
+from ragbits.document_search.documents.sources.base import SourceDiscriminator


 class DocumentType(str, Enum):
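
Downstream code that imports `SourceResolver` from the old `source_resolver` module would need the same path change as shown in the hunks above. The following is a rough sketch of a text-level rewrite, not a tool shipped with this PR: `MOVED_MODULES` and `rewrite_imports` are hypothetical names, and the mapping contains only the one module move visible in this diff.

```python
import re

# Old module path -> new module path, taken only from the hunks in this diff.
MOVED_MODULES = {
    "ragbits.document_search.documents.source_resolver": "ragbits.document_search.documents.sources.base",
}


def rewrite_imports(source: str) -> str:
    """Rewrite `from <old module> import ...` lines to use the new module path."""
    for old, new in MOVED_MODULES.items():
        # Match only a full module path directly followed by `import`,
        # so longer module names sharing the prefix are left untouched.
        pattern = re.compile(
            rf"^(\s*from\s+){re.escape(old)}(\s+import\b)", re.MULTILINE
        )
        source = pattern.sub(rf"\g<1>{new}\g<2>", source)
    return source


before = "from ragbits.document_search.documents.source_resolver import SourceResolver\n"
print(rewrite_imports(before), end="")
```

This handles only whole-module moves. A name that moved out of a multi-name import, such as `SourceDiscriminator` in the last hunk, still needs a manual split into two import lines.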

This file was deleted.
