To work with Archyve, any LLM server client (like a chat app) needs to be modified to use the Archyve API to retrieve prompt augmentation info. This is hard to do and unlikely to be done by any projects unless Archyve becomes widely used.
Instead, Archyve should provide an API that, when called, proxies the call to the Ollama API. If the call is a `chat` or `generate` request, it will augment the prompt before sending the request on to Ollama.
This will let Archyve be used by all existing Ollama clients.
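For illustration, here's a minimal sketch of the proxy idea. FastAPI and httpx are stand-ins, not Archyve's actual stack, and `retrieve_chunks()` is a hypothetical placeholder for the similarity search:

```python
# Sketch only: FastAPI/httpx stand in for Archyve's real stack, and
# retrieve_chunks() is a hypothetical placeholder for the similarity search.
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import Response

OLLAMA_URL = "http://localhost:11434"
app = FastAPI()

def retrieve_chunks(query: str) -> list[str]:
    return []  # placeholder: would run Archyve's similarity search

def augment(prompt: str) -> str:
    chunks = retrieve_chunks(prompt)
    if not chunks:
        return prompt
    return "Answer using this context:\n" + "\n".join(chunks) + "\n\n" + prompt

@app.post("/api/{endpoint}")
async def proxy(endpoint: str, request: Request) -> Response:
    body = await request.json()
    # Only chat and generate requests get augmented; everything else passes through.
    if endpoint == "generate" and "prompt" in body:
        body["prompt"] = augment(body["prompt"])
    elif endpoint == "chat" and body.get("messages"):
        body["messages"][-1]["content"] = augment(body["messages"][-1]["content"])
    async with httpx.AsyncClient(timeout=None) as client:
        upstream = await client.post(f"{OLLAMA_URL}/api/{endpoint}", json=body)
    return Response(content=upstream.content,
                    status_code=upstream.status_code,
                    media_type=upstream.headers.get("content-type"))
```

Note that this sketch buffers Ollama's streamed NDJSON response instead of streaming it through; that's exactly the kind of implementation quirk discussed under Challenges below.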
Challenges
There are some challenges to implementing this, none of them insurmountable.
While Archyve may match the HTTP methods, headers, and bodies of requests to Ollama, there are many implementation-level HTTP quirks that may reveal to clients that Archyve is not actually Ollama. Keep this in mind and identify these cases by testing with many Ollama clients.
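One cheap way to surface such quirks: run the proxy next to a real Ollama and diff the responses. A rough sketch (the `:3300` port for Archyve is made up here; `/api/tags` is a real Ollama endpoint):

```python
# Assumes a real Ollama on :11434 and the Archyve proxy on :3300 (made-up port).
import httpx

def diff_headers(path: str = "/api/tags") -> None:
    real = httpx.get(f"http://localhost:11434{path}")
    proxied = httpx.get(f"http://localhost:3300{path}")
    for name in sorted(set(real.headers) | set(proxied.headers)):
        if real.headers.get(name) != proxied.headers.get(name):
            print(f"{name}: ollama={real.headers.get(name)!r} "
                  f"archyve={proxied.headers.get(name)!r}")
```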
Because the client will not be aware it's even using Archyve, Archyve will need to select the Collections to search for each request itself. This will probably require the Knowledge Graph (Merge Knowledge Graph feature #50), and it remains to be seen whether collection selection can be done reliably. An alternative is to always search all Collections and let the similarity search surface the useful results, but this may produce unexpected responses.
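A sketch of that fallback, assuming each Collection exposes a `similarity_search(query, k)` returning (score, chunk) pairs; these names are illustrative, not Archyve's actual interface:

```python
def search_all_collections(collections, query: str, k: int = 5,
                           min_score: float = 0.75) -> list[str]:
    # Gather candidate chunks from every Collection, then keep only strong
    # matches so irrelevant Collections contribute nothing to the prompt.
    hits = []
    for collection in collections:
        hits.extend(collection.similarity_search(query, k=k))
    hits = [(score, chunk) for score, chunk in hits if score >= min_score]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in hits[:k]]
```

A score cutoff like this is what would keep the "search everything" approach from injecting noise, though tuning `min_score` per embedding model is its own problem.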
Because Archyve aims to support the OpenAI API as well, this will also need to be done for that API, which has a richer history and far more endpoints. It's likely that Archyve would need to publish how much of that API it supports, at least at first. This may be a lot of work, and generate a lot of bug reports if people try it out.
Features
An Archyve user should be able to easily see, in the Archyve UI, their incoming chat request, what Archyve augmented it with, and the LLM server response.
Archyve should optionally return metadata with the normal LLM server response that shows the above info. If it's just a new key in a JSON object, most clients will probably ignore it, but clients that become Archyve aware can show this to their users.
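For example, the extra key could look something like this in an `/api/chat` response. The top-level fields are Ollama's; everything under `archyve` is a made-up shape, purely for illustration:

```json
{
  "model": "llama3",
  "created_at": "2024-08-28T12:00:00Z",
  "message": { "role": "assistant", "content": "..." },
  "done": true,
  "archyve": {
    "augmented": true,
    "collections_searched": ["product-docs"],
    "chunks": [{ "source": "setup.md", "score": 0.91 }]
  }
}
```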
An Archyve user should be able to control which Collections are available for augmentation.
Issues
Base feature is in `main`, but there are some issues:
- Conversations from OPP are prefixed with "(OPP)" instead of having actual metadata and a badge
- Conversations from Ollama Proxy are associated with the first user, with no link to the user that used the proxy
- There are no stats on proxied requests
Going to leave this one until after launch. It's more important to get feedback on what's there than to try to implement everything the first time it's mentioned.