Hi people 👋🏾!

While using langchain and llama-cpp-python, I noticed that I had to initialise two instances of the model (one for the embeddings and another one for inference). I wanted to change that (see this issue: langchain-ai/langchain#2630) by allowing the same llama "client" to be passed when initialising both `LlamaCpp` and `LlamaCppEmbeddings`, and setting the `embedding` parameter to true/false as needed. However, I've noticed that this is not something that can currently be done with the library.
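For reference, this is roughly the two-instance pattern that langchain currently forces (a sketch; exact import paths may vary by langchain version, and the model path is hypothetical):

```python
# Sketch of the current two-instance pattern in langchain: each object
# loads its own copy of the weights, doubling memory use.
from langchain.llms import LlamaCpp
from langchain.embeddings import LlamaCppEmbeddings

MODEL_PATH = "ggml-model.bin"  # hypothetical path

llm = LlamaCpp(model_path=MODEL_PATH)                 # first model load
embedder = LlamaCppEmbeddings(model_path=MODEL_PATH)  # second model load
```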
Reproducible code
```python
from llama_cpp import Llama

llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome")  # it works

llama.embedding = True
embeddings = llama.embed("Spain is awesome")  # it does not work
```
throwing this error:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 315, in embed
    return list(map(float, self.create_embedding(input)["data"][0]["embedding"]))
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 272, in create_embedding
    raise RuntimeError(
RuntimeError: Llama model must be created with embedding=True to call this method
```
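Judging from the traceback, the guard in `create_embedding` checks the context parameters captured at construction time, so flipping a Python attribute afterwards has no effect. Presumably it looks something like this (a reconstruction inferred from the traceback, not the actual source):

```python
# Hypothetical reconstruction of the guard in llama_cpp/llama.py:
# the flag being checked lives in the context params set when the model
# was loaded, so `llama.embedding = True` merely creates an unrelated
# Python attribute that nothing reads.
class Llama:
    def create_embedding(self, input):
        if not self.params.embedding:
            raise RuntimeError(
                "Llama model must be created with embedding=True to call this method"
            )
        ...  # compute embeddings via the C API
```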
Interestingly, if you initialise it with `embedding=True`, you can do both: perform inference and get the embeddings:
```python
from llama_cpp import Llama

llama = Llama(model_path=model_path, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome")  # it works
response = llama(prompt="Brasil is awesome")  # it also works
```
Is there a reason not to always set the `embedding` parameter to true? And why can't the parameter be changed dynamically?
**adriacabeza** changed the title from *Changing parameters once Llama is initialized makes it fail* to *Changing parameters once Llama is initialized makes it not consistent* on Apr 16, 2023.
@adriacabeza Unfortunately the parameters cannot be updated after the model is instantiated. This would require re-loading the entire model anyway, so currently I would recommend just calling the constructor again with your updated parameters.
> This would require re-loading the entire model anyway, so currently I would recommend just calling the constructor again with your updated parameters.
With the caveat that you probably won't want to do this if you're using a GPU until the upstream libllama VRAM memory-leak bug identified in #223 is fixed.
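A minimal sketch of the recommended workaround, re-creating the object when the configuration needs to change (note that this reloads the full model from disk each time):

```python
from llama_cpp import Llama

# Inference-only instance.
llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome")

# To get embeddings, rebuild the object with embedding=True;
# the whole model is reloaded, so avoid this on GPU until #223 is fixed.
del llama
llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome")
```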