
Changing parameters once Llama is initialized makes it not consistent #82

Closed
adriacabeza opened this issue Apr 16, 2023 · 3 comments
Labels: bug (Something isn't working)

@adriacabeza

Hi people 👋🏾 !

While using langchain and llama-cpp-python I've noticed that I had to initialise two instances of the model (one for the embeddings and another one for the inference). I wanted to change that (see this issue: langchain-ai/langchain#2630) by allowing the same llama "client" to be passed when initialising both LlamaCpp and LlamaCppEmbeddings, setting the embedding parameter to true/false as needed (a sketch of what I had in mind follows). However, I've noticed that this is not something that can currently be done with the library:
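
Roughly the intended usage (a sketch only: the client parameter below is hypothetical and does not exist in langchain today; adding something like it is what langchain-ai/langchain#2630 asks for):

from llama_cpp import Llama
from langchain.llms import LlamaCpp
from langchain.embeddings import LlamaCppEmbeddings

# One model, loaded once, shared by both wrappers.
shared_client = Llama(model_path=MODEL_PATH, embedding=True)
llm = LlamaCpp(client=shared_client, model_path=MODEL_PATH)                 # hypothetical kwarg
embedder = LlamaCppEmbeddings(client=shared_client, model_path=MODEL_PATH)  # hypothetical kwarg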

Reproducible code

from llama_cpp import Llama
llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome") # it works
llama.embedding = True
embeddings = llama.embed("Spain is awesome") # it does not work

throwing this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 315, in embed
    return list(map(float, self.create_embedding(input)["data"][0]["embedding"]))
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 272, in create_embedding
    raise RuntimeError(
RuntimeError: Llama model must be created with embedding=True to call this method
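
For reference, the error comes from a guard in create_embedding that reads the flag captured at construction time. Roughly (a paraphrase, not verbatim, and the exact attribute it checks is an assumption here):

# Paraphrase of the guard behind the traceback above
# (llama_cpp/llama.py, create_embedding). The flag is read from the
# context params fixed at model-load time, which is why assigning
# llama.embedding = True on the Python object afterwards has no effect.
class Llama:
    def create_embedding(self, input):
        if not self.params.embedding:  # assumption: exact attribute name may differ
            raise RuntimeError(
                "Llama model must be created with embedding=True to call this method"
            )
        ...  # tokenize, evaluate, and return the embedding data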

Interestingly, if you initialise it with embedding=True, you can do both: perform inference and get the embeddings:

from llama_cpp import Llama
llama = Llama(model_path=model_path, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome") # it works
response = llama(prompt="Brasil is awesome") # it also works

Is there any reason not to always set the embedding parameter to true? Why can't the parameter be changed dynamically?

@adriacabeza changed the title from "Changing parameters once Llama is initialized makes it fail" to "Changing parameters once Llama is initialized makes it not consistent" on Apr 16, 2023
@gjmulder added the bug label on May 12, 2023
@gjmulder
Contributor

Any update?

@abetlen
Owner

abetlen commented May 21, 2023

@adriacabeza unfortunately the parameters cannot be updated after the model is instantiated. This would require re-loading the entire model anyways, so currently I would recommend just calling the constructor again with your updated parameters.
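
A minimal sketch of that workaround (note it reloads the model from disk each time; MODEL_PATH as in the reproduction above):

from llama_cpp import Llama

# Inference-only instance.
llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome")

# To get embeddings, drop the old instance and construct a new one with
# embedding=True, so two copies of the model are not resident at once.
del llama
llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome")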

@gjmulder
Contributor

> This would require re-loading the entire model anyways, so currently I would recommend just calling the constructor again with your updated parameters.

With the caveat that you probably won't want to do this if you're using a GPU until the upstream VRAM libllama memory leak bug identified in #223 is fixed.

@gjmulder closed this as not planned on May 23, 2023