support embeddings via ollama #21

miku · 2024-04-12T10:03:08Z

Note: This may be made obsolete by #8.

This adds semantic_transforms.OllamaEmbeddings, which allows calculating embeddings locally using ollama (https://ollama.com/), following the API of OpenAIEmbeddings. Currently, ollama does not support batching (but it is on their roadmap, cf. https://ollama.com/blog/embedding-models).

The LocalSemanticIngestionPipeline shows how it can be used.

To test locally, install ollama, pull an embedding model such as https://ollama.com/library/mxbai-embed-large, then run:

from openparse import processing, DocumentParser

semantic_pipeline = processing.LocalSemanticIngestionPipeline(
    url="http://localhost:11434",
    model="mxbai-embed-large",
)
parser = DocumentParser(
    processing_pipeline=semantic_pipeline,
)
parsed = parser.parse("path/to/file.pdf")
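
For readers unfamiliar with ollama, here is a minimal sketch of what computing embeddings without batching can look like. It is not the code from this PR. It assumes ollama's REST endpoint `/api/embeddings`, which takes a `model` and a single `prompt` and returns an `embedding` list; the helper names `post_json` and `embed_texts` are hypothetical and chosen for illustration only.

```python
import json
from typing import Callable, List
from urllib import request


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON response."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def embed_texts(
    texts: List[str],
    url: str = "http://localhost:11434",
    model: str = "mxbai-embed-large",
    post: Callable[[str, dict], dict] = post_json,
) -> List[List[float]]:
    """Embed texts one request at a time (assumed: one prompt per call)."""
    endpoint = url.rstrip("/") + "/api/embeddings"
    vectors = []
    for text in texts:
        # Assumed request/response shape: {"model", "prompt"} -> {"embedding"}
        reply = post(endpoint, {"model": model, "prompt": text})
        vectors.append(reply["embedding"])
    return vectors
```

With a running ollama server and the model pulled, `embed_texts(["hello", "world"])` would issue two requests and return two vectors. The `post` parameter exists only so the HTTP call can be swapped out, for example in tests.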


Filimoa · 2024-04-12T17:09:19Z

Thanks for taking the time to create this!

We'll be integrating embedding modules in the next few days, which should let people use many different embedding providers through a single interface (choosing which ones they install).

You can track progress in PR #23.

miku · 2024-04-19T15:38:19Z

Thanks for your work on open-parse - closing this in favor of #23.

Bruce337f · 2024-05-09T09:07:29Z

Any updates here please?

miku closed this Apr 19, 2024

Kydlaw mentioned this pull request Apr 24, 2024

Ollama integration #30

Closed
