Passing tool results to the LM #2606
Hi @anakin87, thanks a lot for reporting this issue! I managed to reproduce the bug with Mistral models. However, I tried with meta-llama/Llama-3.1-8B-Instruct and HuggingFaceH4/zephyr-7b-beta (both served with TGI) and they work fine. Here is the script:

```python
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct")  # or "HuggingFaceH4/zephyr-7b-beta"

messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {
        "role": "user",
        "content": "What's the weather like in San Giustino (Italy) in Celsius?",
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

# First call: the model should respond with a tool call.
output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)

# Replay the tool call and its result, then ask the model to continue.
tool_call = {"name": "get_current_weather", "arguments": {"location": "San Giustino, Italy", "format": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}], "content": "22.0"})
messages.append({"role": "tool", "name": "get_current_weather", "content": "22.0"})
output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)
```

I suspect an issue with Mistral chat templates. I think you can open an issue in mistralai/Mistral-Nemo-Instruct-2407; I will also report this internally and get back to you if there is a better workaround.
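One way to narrow this down is to render the same conversation through the model's chat template locally and see how (or whether) tool messages are serialized. A minimal sketch, reusing the `messages` and `tools` lists from the script above and assuming access to the gated tokenizer:

```python
# Sketch: apply the chat template locally to inspect how the tool call and
# tool result are rendered, or to reproduce a template error directly.
# Assumes `messages` and `tools` are defined as in the script above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    tokenize=False,
)
print(prompt)
```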
I see a similar issue working with LangGraph and Mistral Nemo hosted by HF.
I tried the same model served directly by Mistral, and I confirm it assigns alphanumeric tool call IDs, unlike HF, which assigns 0-indexed IDs.
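If the ID format is the root cause, one client-side workaround is to rewrite the 0-indexed IDs before replaying the conversation to a Mistral-served endpoint. A sketch with hypothetical helpers (Mistral's API expects tool call IDs of exactly 9 alphanumeric characters):

```python
# Hypothetical helpers, not part of huggingface_hub: remap 0-indexed tool
# call IDs to Mistral-style 9-character alphanumeric IDs before replaying.
import secrets
import string

def mistral_tool_call_id() -> str:
    """Generate a random 9-character alphanumeric ID."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(9))

def remap_tool_call_ids(messages: list[dict]) -> list[dict]:
    """Rewrite tool call IDs in place, keeping call/result pairs consistent."""
    id_map: dict[str, str] = {}
    for message in messages:
        for call in message.get("tool_calls") or []:
            if "id" in call:
                call["id"] = id_map.setdefault(call["id"], mistral_tool_call_id())
        if message.get("role") == "tool" and message.get("tool_call_id") in id_map:
            message["tool_call_id"] = id_map[message["tool_call_id"]]
    return messages
```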
Hi @igor-davidyuk, can you try passing an explicit alphanumeric tool call ID, as in the following script?

```python
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mistral-Nemo-Instruct-2407")

messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {
        "role": "user",
        "content": "What's the weather like in San Giustino (Italy) in Celsius?",
    },
]
tools = [
    {
        "type": "function",
        "id": "abcdef123",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the user's location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)

tool_call = {"name": "get_current_weather", "arguments": {"location": "San Giustino, Italy", "format": "celsius"}}
messages.append(
    {
        "role": "assistant",
        "tool_calls": [{"type": "function", "function": tool_call, "id": "abcdef123"}],
        "content": "22.0",
    }
)
output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)
```

If the error persists, I suggest opening an issue in mistralai/Mistral-Nemo-Instruct-2407. I'm closing this issue, but feel free to comment if you have any further questions.
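Note that the script above carries the tool result in the assistant message's `content` and never sends a `tool`-role message. For comparison, a sketch of the OpenAI-style shape (whether a backend accepts it depends on the model's chat template, which is what this issue is about):

```python
# Sketch: OpenAI-style tool result message, echoing the ID of the tool call
# it answers.
messages.append(
    {
        "role": "tool",
        "tool_call_id": "abcdef123",  # must match the assistant's tool call ID
        "name": "get_current_weather",
        "content": "22.0",
    }
)
```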
I opened a similar issue regarding tool_calls here yesterday. (Honestly not sure if it's the right repo.) I see three unexpected behaviors:
At this point, all the HF models I've tried so far with the inference clients fail at function calling. They recognize a function needs to be called, but they don't follow through once the return values are provided:
For example, the code you provided, @hanouticelina, results in unexpected behavior type 1 (above) for me:

```
ChatCompletionOutput(choices=[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content=None, tool_calls=[ChatCompletionOutputToolCall(function=ChatCompletionOutputFunctionDefinition(arguments={'format': 'celsius', 'location': 'San Giustino, Italy'}, name='get_current_weather', description=None), id='0', type='function')]), logprobs=None)], created=1738593582, id='', model='mistralai/Mistral-Nemo-Instruct-2407', system_fingerprint='3.0.1-sha-bb9095a', usage=ChatCompletionOutputUsage(completion_tokens=32, prompt_tokens=239, total_tokens=271))
```

Note the `content=None`. I would think, similar to OpenAI, the expected behavior is a chat completion with textual content to the effect of, "The weather in San Giustino, Italy is currently 22 degrees Celsius." It is odd, though, that in my code, Mistral models will still complain about the 9-character ID requirement. In GPT, models return the IDs they require, and they're attached to the messages, not the tool definitions. I think I'm discovering that even the schema for function calling isn't consistent between HF-hosted models.
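For reference, a sketch of the round trip I would expect, built from the actual response object rather than hard-coded IDs (assuming the backend echoes usable tool call IDs and the chat template accepts `tool` messages; `messages` and `tools` as defined earlier):

```python
# Sketch of the expected OpenAI-style round trip: take the tool call from the
# first response, run the tool, return its result under the same ID, and ask
# the model to continue.
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mistral-Nemo-Instruct-2407")

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500)
call = output.choices[0].message.tool_calls[0]

# Echo the model's own tool call back to it...
messages.append(
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": call.id,
                "type": "function",
                "function": {"name": call.function.name, "arguments": call.function.arguments},
            }
        ],
    }
)
# ...then supply the tool result under the same ID.
messages.append(
    {"role": "tool", "tool_call_id": call.id, "name": call.function.name, "content": "22.0"}
)

followup = client.chat_completion(messages=messages, tools=tools, max_tokens=500)
print(followup.choices[0].message.content)  # ideally prose, e.g. "It is 22 °C in San Giustino."
```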
I moved my bug report here: #2829
Describe the bug
Let's start by thanking you for this great resource 💙
The `InferenceClient` supports tool calling, as explained here. In many use cases, it is useful to pass the tool call back to the Language Model, along with the tool result in a message from the `tool` role. In this way, the LM can, for example, respond in a human-readable way.
This is supported in HF Transformers.
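For context, here is a minimal sketch of that flow in Transformers (the model name is the one used in the Transformers tool-use docs; illustrative only):

```python
# Sketch of the Transformers-side flow: the chat template serializes both the
# assistant's tool call and the tool result.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Hermes-2-Pro-Llama-3-8B")
messages = [
    {"role": "user", "content": "What's the weather like in Paris?"},
    {
        "role": "assistant",
        "tool_calls": [
            {
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": {"location": "Paris, France", "format": "celsius"},
                },
            }
        ],
    },
    {"role": "tool", "name": "get_current_weather", "content": "22.0"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
print(prompt)
```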
When using the `InferenceClient` (for the Serverless Inference API or TGI), I'm struggling to find a way to reproduce this desired behavior. (I mostly experimented with Mistral and Llama models supporting tool/function calling, with similar results.)

@Wauplin @hanouticelina Is this supported or planned? Is there any workaround you suggest? So far, I've only tried to wrap the tool result in a message from the `user` role, and this somehow works... (a sketch follows below).

Probably related issue (in TGI): huggingface/text-generation-inference#2461
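A minimal sketch of that workaround, assuming a `client`, `messages`, and `tools` set up as in the comments above (the wording of the injected message is my own):

```python
# Sketch of the workaround: pass the tool result back as a plain user message
# instead of a tool-role message. Less structured, but it sidesteps chat
# templates that reject tool-role messages.
messages.append(
    {
        "role": "user",
        "content": "The get_current_weather tool returned: 22.0. "
        "Please answer the original question using this result.",
    }
)
output = client.chat_completion(messages=messages, tools=tools, max_tokens=500)
print(output.choices[0].message.content)
```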
Reproduction
System info