
Provide transparent proxy API for Ollama #59

Open · 1 of 6 tasks
nickthecook opened this issue Aug 19, 2024 · 1 comment

nickthecook commented Aug 19, 2024

To work with Archyve, any LLM server client (like a chat app) needs to be modified to use the Archyve API to retrieve prompt-augmentation info. This is hard to do, and unlikely to happen in other projects unless Archyve becomes widely used.

Instead, Archyve should provide an API that, when called, proxies the call to the Ollama API. If the call is a chat or generate request, it will augment the prompt before sending the request on to Ollama.

This will let Archyve be used by all existing Ollama clients.
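
For illustration, here is a minimal sketch of the proxy flow, in Python rather than Archyve's actual stack. The endpoint paths /api/chat and /api/generate are Ollama's real ones; OLLAMA_URL and augment_prompt() are hypothetical stand-ins for the upstream address and Archyve's augmentation logic:

```python
# Sketch only: a transparent proxy that forwards everything to Ollama,
# rewriting the prompt on chat/generate requests before passing them on.
import requests
from flask import Flask, Response, request

OLLAMA_URL = "http://localhost:11434"  # assumed upstream Ollama address

app = Flask(__name__)


def augment_prompt(prompt: str) -> str:
    # Hypothetical: look up relevant chunks in Archyve's Collections and
    # prepend them as context. Identity function here.
    return prompt


@app.route("/<path:path>", methods=["GET", "POST"])
def proxy(path):
    body = request.get_json(silent=True)
    if body and path in ("api/chat", "api/generate"):
        if "prompt" in body:                       # /api/generate style
            body["prompt"] = augment_prompt(body["prompt"])
        elif body.get("messages"):                 # /api/chat style
            body["messages"][-1]["content"] = augment_prompt(
                body["messages"][-1]["content"]
            )
    upstream = requests.request(
        request.method,
        f"{OLLAMA_URL}/{path}",
        json=body if body else None,
        stream=True,
    )
    # Stream the upstream response back so streaming clients still work.
    return Response(
        upstream.iter_content(chunk_size=None),
        status=upstream.status_code,
        content_type=upstream.headers.get("Content-Type"),
    )
```

All other endpoints (model listing, embeddings, etc.) pass through untouched, which is what lets unmodified Ollama clients work against Archyve.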

Challenges

There are some challenges to implementing this, none of them insurmountable.

  1. While Archyve may match the HTTP methods, headers, and bodies of requests to Ollama, there are many quirks of HTTP implementation that could reveal to clients that Archyve is not actually Ollama. Keep this in mind and identify such cases by testing with many Ollama clients.
  2. Because the client will not be aware it's even using Archyve, Archyve will need to select the Collections to search for each request itself. This will probably require the Knowledge Graph (Merge Knowledge Graph feature #50) to work, and it remains to be seen whether this can be done reliably. An alternative is to always search all Collections and let the similarity search surface the useful results (sketched after this list), but this may produce unexpected responses.
  3. Because Archyve aims to support the OpenAI API as well, this will also need to be done for that API, which has a richer history and far more endpoints. It's likely that Archyve would need to publish how much of that API it supports, at least at first. This may be a lot of work, and a lot of bug reports if people try it out.
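
As a thought experiment on the fallback in challenge 2, a minimal sketch, assuming chunk embeddings are already computed; the tuple layout and top_chunks() are illustrative, not Archyve's actual data model:

```python
import numpy as np


def top_chunks(query_emb, chunks, k=5):
    """Rank chunks from ALL Collections by cosine similarity to the query.

    chunks: list of (collection_name, chunk_text, embedding) tuples --
    a hypothetical layout; Archyve's real storage will differ.
    """
    q = np.asarray(query_emb)
    scored = []
    for name, text, emb in chunks:
        e = np.asarray(emb)
        score = float(q @ e / (np.linalg.norm(q) * np.linalg.norm(e)))
        scored.append((score, name, text))
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]
```

The risk noted above is visible here: nothing constrains the search to relevant Collections, so a top-scoring chunk from an unrelated Collection can still leak into the augmented prompt.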

Features

  • An Archyve user should be able to easily see, in the Archyve UI, their incoming chat request, what Archyve augmented it with, and the LLM server response.
  • Archyve should optionally return metadata with the normal LLM server response that shows the above info (see the example payload after this list). If it's just a new key in a JSON object, most clients will probably ignore it, but clients that become Archyve-aware can show it to their users.
  • An Archyve user should be able to control which Collections are available for augmentation.
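
One hypothetical shape for that optional metadata, shown alongside a normal Ollama /api/generate response; the "archyve" key and its field names are invented here for illustration:

```json
{
  "model": "llama3",
  "created_at": "2024-08-19T12:00:00Z",
  "response": "...",
  "done": true,
  "archyve": {
    "collections_searched": ["docs"],
    "chunks_added": 2,
    "augmented_prompt": "..."
  }
}
```

Clients that don't know about the extra key will ignore it; Archyve-aware clients can render the augmentation details to their users.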

Issues

Base feature is in main, but there are some issues:

  • Conversations from OPP are prefixed with "(OPP)" instead of having actual metadata and a badge
  • Conversations from Ollama Proxy are associated with the first user, with no link to the user that used the proxy
  • There are no stats on proxied requests

nickthecook commented:

Going to leave this one until after launch. It's more important to get feedback on what's there than to try to implement everything the first time it's mentioned.

@nickthecook nickthecook changed the title Prove transparent proxy API for Ollama Provide transparent proxy API for Ollama Aug 28, 2024