
Passing tool results to the LM #2606

Closed
anakin87 opened this issue Oct 15, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@anakin87

anakin87 commented Oct 15, 2024

Describe the bug

Let's start by thanking you for this great resource 💙

The InferenceClient supports tool calling, as explained here.

In many use cases, it is useful to pass the tool call back to the Language Model, along with the tool result in a message with the tool role.
This way, the LM can, for example, respond in a human-readable way.
This is supported in HF Transformers.
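
For context, the message sequence I have in mind is roughly the following (a sketch in the Transformers/OpenAI style; the tool name, arguments, and result value are illustrative):

messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": {"name": "get_current_weather", "arguments": {"location": "Paris", "format": "celsius"}}}]})
# The tool is executed client-side, then its result is passed back with the "tool" role:
messages.append({"role": "tool", "name": "get_current_weather", "content": "22.0"})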

When using the InferenceClient (for the Serverless Inference API or TGI), I'm struggling to find a way to achieve this behavior.
(I mostly experimented with Mistral and Llama models that support tool/function calling, with similar results.)

@Wauplin @hanouticelina Is this supported or planned?
Is there any workaround you would suggest? So far, I've only tried wrapping the tool result in a message with the user role, and this somehow works...
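
For reference, that workaround looks roughly like this (a sketch; the wording of the user message is improvised):

# Workaround: smuggle the tool result into the conversation as a user message.
messages.append({"role": "user", "content": "The get_current_weather tool returned: 22.0"})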

Probably related issue (in TGI): huggingface/text-generation-inference#2461

Reproduction

from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mistral-Nemo-Instruct-2407")

messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {
        "role": "user",
        "content": "What's the weather like in San Giustino (Italy) in Celsius?",
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }]

client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)

# this works great and produces output similar to the following:
# ChatCompletionOutput(choices=[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content=None, tool_calls=[ChatCompletionOutputToolCall(function=ChatCompletionOutputFunctionDefinition(arguments={'format': 'celsius', 'location': 'San Giustino, Italy'}, name='get_current_weather', description=None), id='0', type='function')]), logprobs=None)], ...)

# TRYING TO PASS BACK TOOL CALLS AND THE TOOL RESULT
new_messages = [el for el in messages]  # shallow copy of the conversation so far
id_ = "9Ae3bDc2F"  # fake 9-character alphanumeric ID needed to use Mistral models

tool_call = {"name": "get_current_weather", "arguments": {"location": "San Giustino, Italy", "format": "celsius"}}
new_messages.append({"role": "assistant", "content": "", "tool_calls": [{"type": "function", "function": tool_call, "id": id_}]})
new_messages.append({"role": "tool", "name": "get_current_weather", "content": "22.0", "tool_call_id": id_})

client.chat_completion(messages=new_messages, tools=tools, max_tokens=500, temperature=0.3)

# HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions (Request ID: ...)
# Template error: unknown filter: filter string is unknown (in <string>:79)

System info

- huggingface_hub version: 0.25.2
@anakin87 anakin87 added the bug Something isn't working label Oct 15, 2024
@anakin87 anakin87 changed the title Passing tool calls and tool results to the LM Passing tool results to the LM Oct 15, 2024
@hanouticelina
Contributor

Hi @anakin87, thanks a lot for reporting this issue! I managed to reproduce the bug with Mistral models. However, I tried with meta-llama/Llama-3.1-8B-Instruct and HuggingFaceH4/zephyr-7b-beta (both served with TGI) and it works fine. Here is the script:

from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct") # or "HuggingFaceH4/zephyr-7b-beta"

messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {
        "role": "user",
        "content": "What's the weather like in San Giustino (Italy) in Celsius?",
    },
]
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)
tool_call = {"name": "get_current_weather", "arguments": {"location": "San Giustino, Italy", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}], "content": "22.0"})
messages.append({"role": "tool", "name": "get_current_weather", "content": "22.0"})

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)

I suspect an issue with the Mistral chat template. I think you can open an issue in mistralai/Mistral-Nemo-Instruct-2407; I will also report this internally and get back to you if there is a better workaround.
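
If you want to poke at the template locally, something like this should surface the same error without going through the API (a sketch, assuming a transformers version whose apply_chat_template accepts tools):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
# new_messages and tools are the same objects as in the reproduction above;
# rendering the conversation locally applies the same Jinja chat template.
prompt = tokenizer.apply_chat_template(new_messages, tools=tools, tokenize=False, add_generation_prompt=True)
print(prompt)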

@igor-davidyuk

I see a similar issue working with LangGraph and Mistral Nemo hosted by HF.
In my case, the error is a bit more specific; it says:

HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions

Template error: syntax error: Tool call IDs should be alphanumeric strings with length 9! (in <string>:81)

I tried the same model served directly by Mistral, and I can confirm it assigns alphanumeric tool call IDs, unlike HF, which assigns 0-indexed IDs.
My guess is that it is either a huggingface_hub message-parsing problem or a miss on Mistral's side, but I haven't managed to locate where the issue originates.
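
As a stopgap, an ID that satisfies the template check can be generated like this (simple sketch):

import random
import string

# The Mistral chat template expects tool call IDs of exactly 9 alphanumeric characters.
tool_call_id = "".join(random.choices(string.ascii_letters + string.digits, k=9))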

@hanouticelina
Contributor

Hi @igor-davidyuk,
I managed to reproduce the bug, and it's definitely related to Mistral models, which require tool call IDs: these must be included in your tool calls and tool results, and they must be exactly 9 alphanumeric characters. This works on my side:

from huggingface_hub import InferenceClient


client = InferenceClient("mistralai/Mistral-Nemo-Instruct-2407")

messages = [
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    },
    {
        "role": "user",
        "content": "What's the weather like in San Giustino (Italy) in Celsius?",
    },
]
tools = [
    {
        "type": "function",
        "id": "abcdef123",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    }
]

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
tool_call = {"name": "get_current_weather", "arguments": {"location": "San Giustino, Italy", "unit": "celsius"}}
messages.append(
    {
        "role": "assistant",
        "tool_calls": [{"type": "function", "function": tool_call, "id": "abcdef123"}],
        "content": "22.0",
    }
)

output = client.chat_completion(messages=messages, tools=tools, max_tokens=500, temperature=0.3)
print(output)

If the error persists, I suggest opening an issue in mistralai/Mistral-Nemo-Instruct-2407.

I'm closing this issue, but feel free to comment if you have any further questions.

@awmartin

awmartin commented Feb 3, 2025

I opened a similar issue regarding tool_calls here yesterday. (Honestly not sure if it's the right repo.) I see three unexpected behaviors:

  1. The final message just contains the original tool_call message repeated, instead of finishing inference with text-token content.
  2. For the Node.js client specifically, an error saying, "An error occurred while fetching the blob".
  3. For Mistral models, even when providing a 9-char tool_call_id, they still error out claiming a 9-char ID is required.

At this point, all the HF models I've tried so far with the inference clients fail at function calling. They recognize a function needs to be called, but they don't follow through once the return values are provided:

  • mistralai/Mistral-7B-Instruct-v0.3
  • NousResearch/Hermes-3-Llama-3.1-8B
  • Qwen/Qwen2.5-72B-Instruct
  • meta-llama/Meta-Llama-3-8B-Instruct

For example, the code you provided, @hanouticelina, results in unexpected behavior type 1 (above) for me:

ChatCompletionOutput(choices=[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content=None, tool_calls=[ChatCompletionOutputToolCall(function=ChatCompletionOutputFunctionDefinition(arguments={'format': 'celsius', 'location': 'San Giustino, Italy'}, name='get_current_weather', description=None), id='0', type='function')]), logprobs=None)], created=1738593582, id='', model='mistralai/Mistral-Nemo-Instruct-2407', system_fingerprint='3.0.1-sha-bb9095a', usage=ChatCompletionOutputUsage(completion_tokens=32, prompt_tokens=239, total_tokens=271))

Note the content=None in ChatCompletionOutputMessage(role='assistant', content=None, ...). This occurs even with meta-llama/Meta-Llama-3-8B-Instruct.

I would think, similar to OpenAI, the expected behavior is a chat completion with textual content to the effect of, "The weather in San Giustino, Italy is currently 22 degrees Celsius."

It is odd, though, that in my code, Mistral models still complain about the 9-character requirement. With OpenAI's GPT models, the model returns the IDs it requires, and they're attached to the messages, not the tool definitions. I think I'm discovering that even the schema for function calling isn't consistent across HF-hosted models.
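
For comparison, the flow I expected (mirroring OpenAI) is roughly this, reusing the attribute names visible in the output above; the tool result value is made up:

response = client.chat_completion(messages=messages, tools=tools, max_tokens=500)
call = response.choices[0].message.tool_calls[0]  # the model supplies the ID here
messages.append({"role": "assistant", "content": "", "tool_calls": [{"type": "function", "function": {"name": call.function.name, "arguments": call.function.arguments}, "id": call.id}]})
# Echo the model-provided ID back alongside the tool result.
messages.append({"role": "tool", "name": call.function.name, "content": "22.0", "tool_call_id": call.id})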

  • macOS 15.2
  • Python 3.13.1
  • huggingface_hub 0.28.1

@awmartin

awmartin commented Feb 3, 2025

I moved my bug report here: #2829
