Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Support for local LLMs like Ollama #18

Closed
mmmeff opened this issue Oct 17, 2023 · 19 comments · Fixed by #97
Closed

[Feature Request] Support for local LLMs like Ollama #18

mmmeff opened this issue Oct 17, 2023 · 19 comments · Fixed by #97
Labels
roadmap Planned features

Comments

@mmmeff
Copy link

mmmeff commented Oct 17, 2023

title

@tjweir
Copy link

tjweir commented Oct 17, 2023

The ability to use local LLMs would be great.

@cpacker
Copy link
Collaborator

cpacker commented Oct 18, 2023

added to the roadmap!

@cpacker cpacker added the roadmap Planned features label Oct 18, 2023
@Aaronminer1
Copy link

using it withe LM studios local api server would be great spent half the day trying to get it to connect but alas no good results it should be possible since the server is suppose to be a drop in replacement for openai api https://lmstudio.ai/

@molander
Copy link

LM Studio is interesting but in keeping with the spirit of open-source, a better solution would be https://github.com/go-skynet/LocalAI, a fully open drop-in OpenAI API replacement that includes support for functions. I am cloning memGPT now and have a localAI installation so perhaps I can see this weekend what would be required.

@giuliogatto
Copy link

Support for local LLMs would be a game changer, in particular being able to use Mistral 7B

@jackfood
Copy link

using it withe LM studios local api server would be great spent half the day trying to get it to connect but alas no good results it should be possible since the server is suppose to be a drop in replacement for openai api https://lmstudio.ai/

Any luck on running with LM Stuido?

@cronobjs
Copy link

using it withe LM studios local api server would be great spent half the day trying to get it to connect but alas no good results it should be possible since the server is suppose to be a drop in replacement for openai api https://lmstudio.ai/

Any luck on running with LM Stuido?

I have LM Studio And Im trying to figure this out but its so confusing. If anyone out there has able to get any Llama versions to run with MemGPT that would be Helpful.

@dataswifty
Copy link

added to the roadmap!

Thank you!

@cpacker
Copy link
Collaborator

cpacker commented Oct 21, 2023

We are actively working on this (allowing pointing MemGPT at your own hosted LLM backend that supports function calling), more updates to come soon.

@d0rc
Copy link

d0rc commented Oct 21, 2023

sorry, for asking obvious questions - isn't it possible to just start local OpenAI API, following llama.cpp-python bindings documentation, or bringing it up with LMStudio and override OPENAI_API_ENDPOINT environment variable or something like that?

@nbollman
Copy link

We are actively working on this (allowing pointing MemGPT at your own hosted LLM backend that supports function calling), more updates to come soon.

Can't wait to see what comes of your proposed Mistral 7B fine-tune, I hope it's intended use is to allow for system ai interdependence and release constrait to external OpenAI processing... I imagine a model that could call a subject matter expert model into vram for specific questioning, or just be able to conduct web research and put together reports of their own accord. Organize the data into its own fine tune safensor or lora depending on your AI core update interval... the future is coming.

@d0rc
Copy link

d0rc commented Oct 21, 2023

Ok, so the idea is not fine-tune some model to be more aligned to call MemGPT functions? But where did you get that info? Please, share.

@cpacker
Copy link
Collaborator

cpacker commented Oct 21, 2023

@d0rc

We are doing both:

  1. Adding official support for using your own LLM backend that supports function calling (this can be as simple as setting the openai.api_base property to point towards your server if the backend if configured properly, but we want to add better support for this with examples and some reference models). This will also make is easier for the community to try new function calling LLMs with MemGPT (since new ones are getting released quite frequently) to see which work best.
  2. Working on our own finetuned models that are finetuned specifically for MemGPT functions (with the idea that these should hopefully perform better than open models finetuned on general function call data, and thus help approach the performance of MemGPT+gpt4).

This issue is for tracking (1), and discussion for (2) is here: #67 (though the content of the two threads is overlapping).

@ishaan-jaff
Copy link

made a PR for this: #86

@molander
Copy link

OPENAI_API_BASE=http://localhost:8080/v1 python main.py --persona syn.txt --model wizardcoder-python-34b.gguf
Running... [exit by typing '/exit']
Warning - you are running MemGPT with wizardcoder-python-34b.gguf, which is not officially supported (yet). Expect bugs!
💭 Bootup sequence complete. Persona activated. Testing messaging functionality.
Hit enter to begin (will request first MemGPT message)hello!
💭 None
🤖 Hello, Chad! I'm Synthia. How can I assist you today?
Hi Syn, I am Matt.
and so on...

Hahaha, fantastic! Yeah, using LocalAI (single docker command and I have models lying all over the place but if that wasn't the case, LocalAI project can pull them from HuggingFace or Model Gallery that they have setup automagically at runtime)

I started off a little rocky as I spent the majority of my time on FreeBSD getting memGPT going (I will file a pr if I cant get it) but moved to a Linux box to see some forward motion and to check if one can indeed just change the endpoint on a properly config'd backend and sail away. Yes, you sure can! My first try or 2, I didn't have large enough context window, typo'd my model template, etc., but once I stopped spazzing out, it fired right up and started working straight away. Yay! Nice project, kudos you guys and great paper btw. Congrats!

Here's a horrible first proof of life video before I chop it into an actual success video later:
http://demonix.io:9000/index.php?p=&view=memgpt-localai.mp4

@garyblankenship
Copy link

Testing with LM Studio.

OPENAI_API_BASE=http://localhost:1234/v1 python3 main.py

[2023-10-22 11:50:02.528] [ERROR] Error: 'messages' array must only contain objects with a 'role' field that is either 'user', 'assistant', or 'system'.

@jackfood
Copy link

OPENAI_API_BASE=http://localhost:8080/v1 python main.py --persona syn.txt --model wizardcoder-python-34b.gguf Running... [exit by typing '/exit'] Warning - you are running MemGPT with wizardcoder-python-34b.gguf, which is not officially supported (yet). Expect bugs! 💭 Bootup sequence complete. Persona activated. Testing messaging functionality. Hit enter to begin (will request first MemGPT message)hello! 💭 None 🤖 Hello, Chad! I'm Synthia. How can I assist you today? Hi Syn, I am Matt. and so on...

Hahaha, fantastic! Yeah, using LocalAI (single docker command and I have models lying all over the place but if that wasn't the case, LocalAI project can pull them from HuggingFace or Model Gallery that they have setup automagically at runtime)

I started off a little rocky as I spent the majority of my time on FreeBSD getting memGPT going (I will file a pr if I cant get it) but moved to a Linux box to see some forward motion and to check if one can indeed just change the endpoint on a properly config'd backend and sail away. Yes, you sure can! My first try or 2, I didn't have large enough context window, typo'd my model template, etc., but once I stopped spazzing out, it fired right up and started working straight away. Yay! Nice project, kudos you guys and great paper btw. Congrats!

Here's a horrible first proof of life video before I chop it into an actual success video later: http://demonix.io:9000/index.php?p=&view=memgpt-localai.mp4

I tried too, unable to get it right.
'OPENAI_API_BASE' is not recognized as an internal or external command,
operable program or batch file.

Any assistant on this will be great on using LM Studio.

@garyblankenship
Copy link

garyblankenship commented Oct 23, 2023

I'm on Mac OS 14.0 Sonoma with an M2.

I was able to get the llama.cpp server working with

  • the llama.cpp/examples/server/api_like_OAI.py file
  • the llama.cpp/server file

The problem I ran into was I didn't find a model that supported function calling yet.

Some of the steps I took are:

  1. export OPENAI_API_KEY=123456
  2. export OPENAI_REVERSE_PROXY=http://127.0.0.1:8081/v1/chat/completions (maybe?)
  3. python api_like_OAI.py --api-key 123456 --host 127.0.0.1 --user-name "user" --system-name "assistant"
  4. ./server -c 4000 --host 0.0.0.0 -t 12 -ngl 1 -m models/airoboros-l2-13b-3.1.1.Q4_K_M.gguf --embedding --alias gpt-3.5-turbo -v

@garyblankenship
Copy link

I tried too, unable to get it right. 'OPENAI_API_BASE' is not recognized as an internal or external command, operable program or batch file.

Any assistant on this will be great on using LM Studio.

Just a note to say the OPENAI_API_BASE=host:port is just a way to set an environment variable when you run the python command. MemGPT must check for it and swap the api base url.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
roadmap Planned features
Projects
None yet
Development

Successfully merging a pull request may close this issue.