Describe the bug
I managed to load a local Mistral 7B model and was able to chat back and forth with the interpreter, but after I send a message it takes an extremely long time before tokens begin streaming, even though I get decent tokens/s once streaming starts.
On closer inspection of memory usage in Task Manager, I see that every time I press enter and send a message, the program loads a brand new Ooba server and a GGUF model. I confirmed this by adding print("Ooba starting and model loading!") before line 84 in Ooba's llm.py, and by observing that an additional Python process pops up in Task Manager and occupies 8 GB of RAM after every message I send.
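For context, this pattern (long delay before streaming, plus a fresh 8 GB process per message) is what you would see if the server were constructed inside the per-message path instead of being cached. A minimal sketch of the once-per-session behavior I would expect; the class and function names here are illustrative stand-ins, not Open Interpreter's or Ooba's actual API:

```python
# Sketch of once-per-session loading. FakeOobaServer is a stand-in for the
# Ooba server process that loads a GGUF model; it is NOT the real API.
load_count = 0

class FakeOobaServer:
    def __init__(self):
        global load_count
        load_count += 1  # each construction simulates an ~8 GB model load

    def generate(self, message):
        return f"response to: {message}"

_server = None  # cached across messages for the whole session

def get_server():
    # Expected behavior: construct (and load the model) only once per session.
    global _server
    if _server is None:
        _server = FakeOobaServer()
    return _server

def chat(message):
    # Reusing the cached server avoids the long delay before streaming starts.
    return get_server().generate(message)

chat("hello")
chat("how are you?")
print(load_count)  # prints 1 - the model was loaded only once
```

The bug I'm seeing behaves as if the equivalent of FakeOobaServer() runs on every call to chat, not just the first.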
Please see the screenshots for details.
Reproduce
1. Run interpreter --local
2. Chat with it and note how long it takes before the response begins streaming.
3. Monitor your RAM usage; the model is loaded into RAM every time you send a message.
Expected behavior
The model should load only once per session.
Screenshots
Open Interpreter version
0.1.10
Python version
3.11.6
Operating System name and version
Windows 10
Additional context
I am also getting a NoneType error for interpreter.procedures, for some reason, but I don't think these issues are related. I tried putting procedures_db.json in the same directory as get_relevant_procedures_string.py and loading it manually to make sure it was present. The NoneType error went away, but a new model is still loaded after every message.
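As an aside, the manual workaround I used for the NoneType error amounted to loading the file defensively. A rough sketch of that idea; the path and the fallback value are guesses based on the error, not the project's actual code:

```python
import json
import os

def load_procedures(path="procedures_db.json"):
    # Guard against a missing file so callers get a dict instead of None,
    # which is what appeared to trigger the NoneType error.
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```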