[Question]: How to use Vertex AI Llama 3 endpoint on GCP using Llamaindex? #15825
Comments
Hey @abhirooptalasila! I'm here to help you with any bugs, questions, or becoming a contributor. Let's squash those bugs together!

The LlamaIndex codebase includes a fallback mechanism that defaults to a specific LLM if the specified model is not available. If the provided model name is not found, the code attempts to use the default OpenAI model. If the OpenAI model cannot be loaded (e.g., due to an invalid API key), it raises a ValueError. Here is the relevant part of the code:

def resolve_llm(
    llm: Optional[LLMType] = None, callback_manager: Optional[CallbackManager] = None
) -> LLM:
    if llm == "default":
        # if testing return mock llm
        if os.getenv("IS_TESTING"):
            llm = MockLLM()
            llm.callback_manager = callback_manager or Settings.callback_manager
            return llm

        # return default OpenAI model. If it fails, return LlamaCPP
        try:
            from llama_index.llms.openai import OpenAI
            from llama_index.llms.openai.utils import validate_openai_api_key

            llm = OpenAI()
            validate_openai_api_key(llm.api_key)
        except ImportError:
            raise ImportError(
                "`llama-index-llms-openai` package not found, "
                "please run `pip install llama-index-llms-openai`"
            )
        except ValueError as e:
            raise ValueError(
                "\n******\n"
                "Could not load OpenAI model. "
                "If you intended to use OpenAI, please check your OPENAI_API_KEY.\n"
                "Original error:\n"
                f"{e!s}"
                "\nTo disable the LLM entirely, set llm=None."
                "\n******"
            )
    elif llm is None:
        print("LLM is explicitly disabled. Using MockLLM.")
        llm = MockLLM()

    assert isinstance(llm, LLM)
    llm.callback_manager = callback_manager or Settings.callback_manager
    return llm

Additionally, the LlamaIndex codebase includes support for a default ModelScope model: if no specific model is given there, it falls back to a preset ModelScope model. This fallback mechanism ensures that if the specified model is not available, the system will attempt to use a default model, which might explain why you are still getting a completed output.
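If the goal is to rule out this OpenAI fallback entirely, the global LLM can be pinned (or disabled) before any index or chat engine is built. A minimal sketch, assuming the Settings object from llama_index.core and the llama-index-llms-vertex package; the credentials path and model name are placeholders:

from llama_index.core import Settings
from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account

# Load GCP service-account credentials (the path is a placeholder).
credentials = service_account.Credentials.from_service_account_file(
    "path/to/your/service-account.json"
)

# Pin the global LLM so resolve_llm("default") never reaches the OpenAI fallback.
Settings.llm = Vertex(
    model="chat-bison",  # must be a model name the Vertex integration recognizes
    project=credentials.project_id,
    credentials=credentials,
)

# Or disable the LLM entirely; resolve_llm(None) then returns a MockLLM.
# Settings.llm = None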
@dosu I got that, but I don't have any OpenAI keys in my env.
I still get a completed output when I try out the above code using the GCP JSON credentials.
You are getting a completed output because the Vertex class is using your GCP JSON credentials to call the default text-bison model, which is a Google-hosted foundation model on Vertex AI and does not need to be deployed in your project. Here is the relevant code snippet from the documentation:

from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account
filename = "vertex-407108-37495ce6c303.json"
credentials: service_account.Credentials = (
    service_account.Credentials.from_service_account_file(filename)
)
Vertex(
    model="text-bison", project=credentials.project_id, credentials=credentials
)

This code shows that the Vertex class is initialized with the text-bison model and your service account credentials, which is why the completion call succeeds even though you have not deployed a model yourself.
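As a quick sanity check on which model such an instance will actually call, the object itself can be inspected. A minimal sketch, assuming the Vertex class keeps the model name on its model field and fills in the standard metadata.model_name (both are assumptions worth verifying on your installed version):

from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "vertex-407108-37495ce6c303.json"
)
llm = Vertex(
    model="text-bison", project=credentials.project_id, credentials=credentials
)

# Both lines should report the PaLM 2 model name (text-bison), not a Llama model.
print(llm.model)
print(llm.metadata.model_name)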
@dosu Then how do I use a Llama 3 model instead?
To use a Llama 3 model with the Vertex class, you can try passing the Llama 3 model name when initializing it:

from llama_index.llms.vertex import Vertex
from google.oauth2 import service_account
# Load your GCP credentials
filename = "path/to/your/service-account.json"
credentials = service_account.Credentials.from_service_account_file(filename)
# Initialize the Vertex class with the desired Llama 3 model
llm = Vertex(
model="llama-3", # Replace with the specific Llama 3 model name
project=credentials.project_id,
credentials=credentials
)
# Use the complete method to generate a response
output = llm.complete("Say foo:")
print(output.text)

Replace "llama-3" with the exact Llama 3 model name available in your Vertex AI project.
@dosu "llama-3" is not an available option in the Vertex class. |
To use a model with the Vertex class, the model name has to be one the integration recognizes, and "llama-3" is not currently in that list. Here is an example of how to use a supported model with the Vertex class:

from llama_index.llms.vertex import Vertex
# Set up necessary variables
credentials = {
"project_id": "INSERT_PROJECT_ID",
"api_key": "INSERT_API_KEY",
}
# Create an instance of the Vertex class with a supported model
llm = Vertex(
model="text-bison", # Replace with a supported model name
project=credentials["project_id"],
credentials=credentials,
)
# Access the complete method from the instance
response = llm.complete("Hello world!")
print(str(response))

If you need to use a specific model, ensure it is listed in the supported models or check whether it can be added to the integration.
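To see exactly which model names the integration will accept, the model tables that the Vertex integration checks against can be printed directly. A hedged sketch; the module path llama_index.llms.vertex.utils and the constant names below are assumptions to verify against the integration source, which is why the lookup is written defensively:

# Assumption: the Vertex integration keeps its supported-model tables in
# llama_index.llms.vertex.utils under names like these; verify against the source.
from llama_index.llms.vertex import utils as vertex_utils

for name in ("TEXT_MODELS", "CHAT_MODELS", "CODE_MODELS", "CODE_CHAT_MODELS"):
    models = getattr(vertex_utils, name, None)
    if models is not None:
        print(name, "->", sorted(models))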
@logan-markewich Any idea how I can find out which model it is defaulting to? I don't have any other models deployed on GCP. I've compared outputs from my local run and from GCP directly, and they are different.
The source code is here: llama_index/llama-index-integrations/llms/llama-index-llms-vertex/llama_index/llms/vertex/base.py, line 98 (commit 6645910).
I did go through that but as you must've noticed, the Llama models are not included in the models list. |
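For anyone landing here with the same goal: since the Vertex class only recognizes the PaLM- and Gemini-family model names, one workaround is to call the Llama 3 models that Vertex AI Model Garden serves through its OpenAI-compatible endpoint, using LlamaIndex's OpenAILike wrapper. A hedged sketch only; the endpoint URL format, the region, and the meta/llama3-405b-instruct-maas model ID are assumptions to check against the current Vertex AI documentation, and it requires pip install llama-index-llms-openai-like:

from google.auth.transport.requests import Request
from google.oauth2 import service_account
from llama_index.llms.openai_like import OpenAILike

# Placeholders; project, region, and model ID must match your Vertex AI setup.
PROJECT = "your-gcp-project"
REGION = "us-central1"
MODEL = "meta/llama3-405b-instruct-maas"  # assumed MaaS model ID format

# Mint a short-lived access token from the service-account credentials.
credentials = service_account.Credentials.from_service_account_file(
    "path/to/your/service-account.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
credentials.refresh(Request())

llm = OpenAILike(
    model=MODEL,
    api_base=(
        f"https://{REGION}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{PROJECT}/locations/{REGION}/endpoints/openapi"
    ),
    api_key=credentials.token,  # access tokens expire, so refresh for long sessions
    is_chat_model=True,
)

print(llm.complete("Say foo:").text)

Since OpenAILike is a regular LlamaIndex LLM, it should also slot into an existing chat engine setup.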
Question
I'm following this page to integrate a Llama 3 endpoint into my existing workflow using chat engines.
I've added my service account JSON credentials, and while trying to instantiate a Vertex instance, I can see that the only model options are chat-bison, chat-bison-32k, chat-bison@001, etc. Using the default text-bison model name, I can create an instance, and the completion API works, too. But that shouldn't work: I don't have a PaLM 2 instance deployed. Is there any other logic making it default to an available LLM?