To work with Archyve, any LLM server client (like a chat app) needs to be modified to use the Archyve API to retrieve prompt augmentation info. This is hard to do and unlikely to be done by any projects unless Archyve becomes widely used.
Instead, Archyve should provide an API that, when called, proxies the call to the Ollama API. If the call is a `chat` or `generate` request, it will augment the prompt before sending the request on to Ollama.
This will let Archyve be used by all existing Ollama clients.
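For illustration, here's a minimal sketch of the proxy idea. FastAPI and httpx are stand-ins, not Archyve's actual stack, and `retrieve_chunks()` is a hypothetical placeholder for the similarity search:

```python
# Sketch only: FastAPI/httpx stand in for Archyve's real stack, and
# retrieve_chunks() is a hypothetical placeholder for the similarity search.
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import Response

OLLAMA_URL = "http://localhost:11434"
app = FastAPI()

def retrieve_chunks(query: str) -> list[str]:
    return []  # placeholder: would run Archyve's similarity search

def augment(prompt: str) -> str:
    chunks = retrieve_chunks(prompt)
    if not chunks:
        return prompt
    return "Answer using this context:\n" + "\n".join(chunks) + "\n\n" + prompt

@app.post("/api/{endpoint}")
async def proxy(endpoint: str, request: Request) -> Response:
    body = await request.json()
    # Only chat and generate requests get augmented; everything else passes through.
    if endpoint == "generate" and "prompt" in body:
        body["prompt"] = augment(body["prompt"])
    elif endpoint == "chat" and body.get("messages"):
        body["messages"][-1]["content"] = augment(body["messages"][-1]["content"])
    async with httpx.AsyncClient(timeout=None) as client:
        upstream = await client.post(f"{OLLAMA_URL}/api/{endpoint}", json=body)
    return Response(content=upstream.content,
                    status_code=upstream.status_code,
                    media_type=upstream.headers.get("content-type"))
```

Note that this sketch buffers Ollama's streamed NDJSON response instead of streaming it through; that's exactly the kind of implementation quirk discussed under Challenges below.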
Challenges
There are some challenges to implementing this, none of them insurmountable.
While Archyve may match the HTTP methods, headers, and bodies of requests to Ollama, there are many implementation-level HTTP quirks that may reveal to clients that Archyve is not actually Ollama. Keep this in mind and identify these cases by testing with many Ollama clients.
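One cheap way to surface such quirks: run the proxy next to a real Ollama and diff the responses. A rough sketch (the `:3300` port for Archyve is made up here; `/api/tags` is a real Ollama endpoint):

```python
# Assumes a real Ollama on :11434 and the Archyve proxy on :3300 (made-up port).
import httpx

def diff_headers(path: str = "/api/tags") -> None:
    real = httpx.get(f"http://localhost:11434{path}")
    proxied = httpx.get(f"http://localhost:3300{path}")
    for name in sorted(set(real.headers) | set(proxied.headers)):
        if real.headers.get(name) != proxied.headers.get(name):
            print(f"{name}: ollama={real.headers.get(name)!r} "
                  f"archyve={proxied.headers.get(name)!r}")
```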
Because the client will not be aware it's even using Archyve, Archyve will need to select the Collections to search for each request itself. This will probably require the Knowledge Graph (Merge Knowledge Graph feature #50), and it remains to be seen whether collection selection can be done reliably. An alternative is to always search all Collections and let the similarity search surface the useful results, but this may produce unexpected responses.
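A sketch of that fallback, assuming each Collection exposes a `similarity_search(query, k)` returning (score, chunk) pairs; these names are illustrative, not Archyve's actual interface:

```python
def search_all_collections(collections, query: str, k: int = 5,
                           min_score: float = 0.75) -> list[str]:
    # Gather candidate chunks from every Collection, then keep only strong
    # matches so irrelevant Collections contribute nothing to the prompt.
    hits = []
    for collection in collections:
        hits.extend(collection.similarity_search(query, k=k))
    hits = [(score, chunk) for score, chunk in hits if score >= min_score]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in hits[:k]]
```

A score cutoff like this is what would keep the "search everything" approach from injecting noise, though tuning `min_score` per embedding model is its own problem.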
Because Archyve aims to support the OpenAI API as well, this will also need to be done for that API, which has a richer history and far more endpoints. It's likely that Archyve would need to publish how much of that API it supports, at least at first. This may be a lot of work, and generate a lot of bug reports if people try it out.
Features
An Archyve user should be able to easily see, in the Archyve UI, their incoming chat request, what Archyve augmented it with, and the LLM server response.
Archyve should optionally return metadata with the normal LLM server response that shows the above info. If it's just a new key in a JSON object, most clients will probably ignore it, but clients that become Archyve aware can show this to their users.
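For example, the extra key could look something like this in an `/api/chat` response. The top-level fields are Ollama's; everything under `archyve` is a made-up shape, purely for illustration:

```json
{
  "model": "llama3",
  "created_at": "2024-08-28T12:00:00Z",
  "message": { "role": "assistant", "content": "..." },
  "done": true,
  "archyve": {
    "augmented": true,
    "collections_searched": ["product-docs"],
    "chunks": [{ "source": "setup.md", "score": 0.91 }]
  }
}
```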
An Archyve user should be able to control which Collections are available for augmentation.
Issues
Base feature is in `main`, but there are some issues:
- Conversations from OPP are prefixed with "(OPP)" instead of having actual metadata and a badge
- Conversations from Ollama Proxy are associated with the first user, with no link to the user that used the proxy
- There are no stats on proxied requests
Going to leave this one until after launch. It's more important to get feedback on what's there than to try to implement everything the first time it's mentioned.