Function/tool calling never resolves #2986

Open
awmartin opened this issue Feb 2, 2025 · 8 comments

awmartin commented Feb 2, 2025

Description

When using the inference client with function calling, models never seem to resolve their tool calls.

With the OpenAI pattern, the simplest function/tool call is typically a series of messages of various roles (system, user, assistant, tool), organized like this:

system → user ("what's the weather?") → assistant (tool_calls) → tool (result: "4ºC") → assistant (content: "it's 4ºC")
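
Spelled out as OpenAI-style message objects, that resolved exchange looks roughly like the sketch below (the weather tool name and values are purely illustrative):

```python
# Illustrative only: the resolved tool-calling exchange as OpenAI-style messages.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather?"},
    {   # assistant asks for a tool call
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "0",
            "type": "function",
            "function": {
                "name": "get_current_temperature",
                "arguments": '{"location": "Philadelphia, PA, US", "unit": "Celsius"}',
            },
        }],
    },
    {   # tool result, referencing the call id
        "role": "tool",
        "tool_call_id": "0",
        "name": "get_current_temperature",
        "content": "4",
    },
    # expected final turn:
    {"role": "assistant", "content": "It's 4ºC."},
]
```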

The HF docs seem to indicate this is the same pattern, although the messages have some minor differences (e.g. description: null, which never appears with OpenAI). When using the Python inference client, however, these tool_calls never resolve, even after the functions are called and their return values are included and seemingly properly referenced. Instead, the exchange looks like this:

system → user ("what's the weather?") → assistant (tool_calls) → tool (result: "4ºC") → assistant (tool_calls) …

Instead of returning a text completion, the HF inference client returns the same "assistant" message specifying the required tool_calls. With OpenAI, the calls resolve to a typical "assistant" message with text content once the function calls have been satisfied and no further calls are required.

Models used that exhibit this behavior:

  • NousResearch/Hermes-3-Llama-3.1-8B
  • Qwen/Qwen2.5-72B-Instruct
  • meta-llama/Meta-Llama-3-8B-Instruct

It's worth noting that Mistral models (e.g. mistralai/Mistral-7B-Instruct-v0.3) error out instead, complaining that a 9-character alphanumeric string is required for the tool_call_id. The models themselves don't provide such IDs, so we have to supply them ourselves, but even when we do, the same error occurs: the 9-character identifiers are reported as missing.
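
For reference, this is the kind of ID I supplied (a hypothetical helper; the client does not provide one):

```python
import random
import string

def make_tool_call_id(length: int = 9) -> str:
    """Generate a random 9-character alphanumeric ID of the kind the Mistral endpoint asks for."""
    return "".join(random.choices(string.ascii_letters + string.digits, k=length))

call_id = make_tool_call_id()  # e.g. 'aB3kZ9qLm', used for both the tool_call and the matching tool message
```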

The JavaScript client fails with the same errors, plus a third: "An error occurred while fetching the blob".

System Info

  • macOS 15.2
  • Python 3.13.1
  • huggingface_hub 0.28.1

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Gist of sample error code is here: https://gist.github.com/awmartin/c64c84fbbdc3a9f0c2ce6e5ae0dab3dc

  1. Provide API token
  2. python inference-tool-calls.py
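
For convenience, here is a minimal sketch along the lines of the gist (the tool schema and values are illustrative; the gist is the authoritative reproduction):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="Qwen/Qwen2.5-72B-Instruct", token="hf_...")  # provide your API token

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Get the current temperature for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["Celsius", "Fahrenheit"]},
            },
            "required": ["location", "unit"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Philadelphia?"},
]

# First request: the model correctly responds with a tool_call.
first = client.chat_completion(messages=messages, tools=tools, max_tokens=256)
call = first.choices[0].message.tool_calls[0]

# Append the assistant tool_call turn and the tool's return value, then ask again.
messages.append({
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": call.id,
        "type": "function",
        "function": {"name": call.function.name, "arguments": call.function.arguments},
    }],
})
messages.append({"role": "tool", "tool_call_id": call.id,
                 "name": call.function.name, "content": "4"})

# Second request: expected a plain-text answer, but the same tool_call comes back.
second = client.chat_completion(messages=messages, tools=tools, max_tokens=256)
print(second.choices[0].message)
```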

The resulting message is unexpected. I expected a typical message with string content, something like "It's 4 degrees today." Instead, the client just repeats the assistant message containing the original tool_call:

[ChatCompletionOutputComplete(finish_reason='stop', index=0, message=ChatCompletionOutputMessage(role='assistant', content=None, tool_calls=[ChatCompletionOutputToolCall(function=ChatCompletionOutputFunctionDefinition(arguments={'unit': 'Celsius', 'location': 'Philadelphia, PA, US'}, name='get_current_temperature', description=None), id='0', type='function')]), logprobs=None)]

Expected behavior

I expected a message that resolved to something similar to "It's 4 degrees Celsius today" rather than the tool_call message repeated.


awmartin commented Feb 3, 2025

I think I opened this issue in the wrong repo. Moving it to here: huggingface/huggingface_hub#2829

awmartin closed this as completed Feb 3, 2025

awmartin commented Feb 3, 2025

Reopening as I'm more convinced this is an error with the inference API and not the clients. All the clients (HF JS, HF PY, and OpenAI) fail in the same way.

awmartin reopened this Feb 3, 2025

calycekr commented Feb 5, 2025

@awmartin TGI's OpenAI API compatibility is still lacking compared to vLLM.


awmartin commented Feb 6, 2025

@calycekr Thanks, I'll check it out!

My workaround for this bug(?) is to remove the "tools" definitions from the follow-up chat completion call that supplies the tool responses/return values (see the sketch below). It seems to work for now for short chats, but I suspect there are edge cases that will fail.
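
Roughly, assuming the same client, messages, and tools as in the reproduction above, the workaround is just:

```python
# Follow-up request that supplies the tool result: omit `tools` entirely,
# so the model answers in plain text instead of re-emitting the same tool_call.
second = client.chat_completion(messages=messages, max_tokens=256)
```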

Relatedly, I need to do the same for vision models that accept OpenAI "image_url" messages. When an image_url is supplied, tools are always triggered, seemingly at random, even though the semantics of the prompt have nothing to do with the tool descriptions (see the example below). That seems like another bug to report, but I'm not sure whether HF's intent is to be OpenAI-compatible, or to let you provide prompts, images, and tools and have them triggered properly in a more general or more HF-specific sense.
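
For context, this is the kind of request I mean (illustrative URL; the tools list is the same weather schema as above):

```python
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ]},
]

# Passing `tools` alongside an image_url message triggers a tool_call even though
# the prompt has nothing to do with the tool descriptions.
response = client.chat_completion(messages=messages, tools=tools, max_tokens=256)
```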


LikeSundayLikeRain commented Feb 7, 2025

I suspect this is because the input message doesn't support the tool_calls field, so the model doesn't know it has already generated a tool_call response and therefore returns a tool_call again.

https://github.com/huggingface/text-generation-inference/blob/main/router/src/lib.rs#L1180
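
If that's the case, the assistant turn in the follow-up request, something like the illustrative payload below, would lose its tool_calls on the way in, so the templated prompt never shows the model that the call already happened:

```python
# Assistant turn from the follow-up request; if the router's input Message type
# has no tool_calls field, this part would be silently dropped before templating.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "0",
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "arguments": {"location": "Philadelphia, PA, US", "unit": "Celsius"},
        },
    }],
}
```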


awmartin commented Feb 8, 2025

Further description of the problem and my workaround here. These kinds of workarounds will work for simple cases, but they will likely fall short when multiple tool calls are required or when images should trigger a tool call, as they do in OpenAI.


qdrddr commented Feb 20, 2025

Would it be of any help that LM Studio has implemented MLX? There is also the Anemll ANE library for working with MLX, which is MIT licensed, and FastMLX, which is under an Apache 2.0 license.

awmartin commented

@qdrddr Thanks. I do use LM Studio and MLX models, but I'm not blocked on getting tool calling working in general; I'm hindered by getting it working with HF as well as it works with OpenAI. HF's inference API appears to be broken, as @LikeSundayLikeRain may have found.

The app I'm building isn't macOS-specific; it's web-based and intended to support OpenAI, HF, and arbitrary inference endpoints. These suggestions may well work for local inference setups on macOS like mine, but I haven't tested tool calls on them as extensively.

But if the MLX implementation serves as a clue for how to resolve this bug in HF, that's great. Tool behavior is highly model-dependent, but this bug may prevent correct behavior even when the model responds properly.
