
Function/tool calling never resolves #2829

Closed
awmartin opened this issue Feb 3, 2025 · 1 comment
Labels
bug Something isn't working

Comments


awmartin commented Feb 3, 2025

Describe the bug

Description

When using the inference client with function calling, models seem to never resolve their function calls.

(I think I opened this issue against the wrong repo, so closing that one and reposting here.)

Typically, with the OpenAI pattern, the simplest function/tool call is a series of messages of various roles (system, user, assistant, tool), organized like this:

system → user ("what's the weather?") → assistant (tool_calls) → tool (result: "4°C") → assistant (content: "it's 4°C today")
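
For concreteness, here is roughly what that sequence looks like as a Python list of OpenAI-style messages. This is a minimal sketch: the system/user wording is illustrative, and the get_current_temperature call mirrors the arguments visible in the output further down.

```python
# Sketch of the message sequence described above (OpenAI-style schema).
messages = [
    {"role": "system", "content": "You are a helpful weather assistant."},
    {"role": "user", "content": "What's the weather in Philadelphia?"},
    # The model answers with a tool call instead of text content.
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "0",
                "type": "function",
                "function": {
                    "name": "get_current_temperature",
                    "arguments": '{"location": "Philadelphia, PA, US", "unit": "Celsius"}',
                },
            }
        ],
    },
    # We run the function ourselves and send the result back, referencing the call id.
    {"role": "tool", "tool_call_id": "0", "name": "get_current_temperature", "content": "4"},
    # Expected final step: an assistant message with plain text content,
    # e.g. "It's 4°C in Philadelphia today."
]
```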

The HF docs seem to indicate this is the same pattern, although the messages have some minor differences (e.g. description: null, which never happens with OpenAI). When using the Python inference client, these tool_calls never resolve even after functions are called and their return values are included and seemingly properly referenced. Instead, they look like this:

system → user ("what's the weather?") → assistant (tool_calls) → tool (result: "4°C") → assistant (tool_calls) …

Instead of returning a text completion, the HF inference client returns another "assistant" message containing tool_calls. With OpenAI, once the function calls have been satisfied and no further calls are required, this step resolves to a typical "assistant" message with text content.
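
In code, the step that misbehaves is the follow-up completion made after the tool result has been appended. A minimal sketch with huggingface_hub (the model name and parameters are just examples; `messages` and `tools` are the list and schema shown elsewhere in this issue):

```python
import os

from huggingface_hub import InferenceClient

client = InferenceClient("NousResearch/Hermes-3-Llama-3.1-8B", token=os.environ["HF_TOKEN"])

# `messages` already ends with the {"role": "tool", ...} result shown above;
# `tools` is the get_current_temperature schema (see the Reproduction section).
response = client.chat_completion(messages=messages, tools=tools, max_tokens=200)
message = response.choices[0].message

# Expected: message.content is a string like "It's 4°C today" and tool_calls is None.
# Observed: message.content is None and message.tool_calls repeats the original call.
print(message.content, message.tool_calls)
```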

Models used that exhibit this behavior:

  • NousResearch/Hermes-3-Llama-3.1-8B
  • Qwen/Qwen2.5-72B-Instruct
  • meta-llama/Meta-Llama-3-8B-Instruct

It's worth noting that Mistral models also error out, reporting that a 9-character alphanumeric string is required for the tool_call_id. The models themselves don't provide such IDs, so we need to supply them ourselves, but even when doing so the same error occurs: the 9-character identifiers are reported as missing. (e.g. mistralai/Mistral-7B-Instruct-v0.3)
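
For the Mistral case, the IDs I supplied were generated with something like the helper below (a hypothetical sketch; the models' own tool_calls come back with ids like "0"):

```python
import random
import string


def make_tool_call_id(length: int = 9) -> str:
    """Generate the 9-character alphanumeric id the Mistral endpoint asks for."""
    return "".join(random.choices(string.ascii_letters + string.digits, k=length))


# Use the same id on the assistant tool_call and the matching tool message.
call_id = make_tool_call_id()
```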

The JavaScript client also fails with the above errors, and also a third: "An error occurred while fetching the blob".

Reproduction

Gist of sample code is here: https://gist.github.com/awmartin/c64c84fbbdc3a9f0c2ce6e5ae0dab3dc

  1. Provide an API token and assign it to the HF_TOKEN variable
  2. python inference-tool-calls.py
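
The gist has the full script; condensed, the core of it is roughly the following (a sketch, not the gist verbatim; the model is one of those listed above and the tool schema mirrors the arguments visible in the output below):

```python
import os

from huggingface_hub import InferenceClient

HF_TOKEN = os.environ["HF_TOKEN"]

# Schema for the single tool the model is allowed to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "description": "Get the current temperature for a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City, state, country, e.g. Philadelphia, PA, US"},
                    "unit": {"type": "string", "enum": ["Celsius", "Fahrenheit"]},
                },
                "required": ["location", "unit"],
            },
        },
    }
]

client = InferenceClient("NousResearch/Hermes-3-Llama-3.1-8B", token=HF_TOKEN)

messages = [
    {"role": "system", "content": "You are a helpful weather assistant."},
    {"role": "user", "content": "What's the weather in Philadelphia?"},
]

# First turn: the model requests the tool call, as expected.
first = client.chat_completion(messages=messages, tools=tools, max_tokens=200)
print(first.choices[0].message.tool_calls)

# Second turn (after appending the assistant tool_calls message and the
# {"role": "tool", ...} result) is where the same tool_calls message comes
# back again instead of a text answer.
```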

The resulting message is unexpected. I expected a typical message with string content, something like "It's 4 degrees today." Instead, it just repeats the assistant message with the original tool_calls:

[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content=None, tool_calls=[ChatCompletionOutputToolCall(function=ChatCompletionOutputFunctionDefinition(arguments={'unit': 'Celsius', 'location': 'Philadelphia, PA, US'}, name='get_current_temperature', description=None), id='0', type='function')]), logprobs=None)]

Expected behavior

I expected a message that resolved to something similar to "It's 4 degrees Celsius today" rather than the tool_call message repeated.

One user suggested that manually tracking all the tool_calls with generated IDs, then omitting the tools=tools argument once they are satisfied, could be an approach. This doesn't seem to work, however, as the model inference often ignores the function call results. I would assume this depends heavily on the model.
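
For reference, that suggested workaround looks roughly like this (a sketch continuing from the snippets above; whether the model actually uses the tool results seems to vary):

```python
# Once every tool_call has a matching {"role": "tool", ...} message, make the
# follow-up request without the tools argument so the model can only answer in text.
followup = client.chat_completion(messages=messages, max_tokens=200)
print(followup.choices[0].message.content)  # often ignores the tool results entirely
```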

Logs

System info

- huggingface_hub version: 0.28.1
- Platform: macOS-15.2-arm64-arm-64bit-Mach-O
- Python version: 3.13.1
@hanouticelina (Contributor) commented:

Hi @awmartin, sorry about the inconvenience. It seems to fail also when using the OpenAI client, which means it is likely an issue with TGI and not with the InferenceClient.
I see you already opened an issue in https://github.com/huggingface/text-generation-inference, so I'm closing this one as it's not related to huggingface_hub.

@hanouticelina closed this as not planned on Feb 4, 2025