Function/tool calling never resolves #2986
Comments
I think I opened this issue in the wrong repo. Moving it to here: huggingface/huggingface_hub#2829
Reopening, as I'm more convinced this is an error with the Inference API and not the clients. All the clients (HF JS, HF PY, and OpenAI) fail in the same way.
@awmartin TGI's OpenAI API compatibility is still lacking compared to vLLM.
@calycekr Thanks, I'll check it out! My workaround for this bug(?) is to remove the "tools" definitions from the follow-up chat completion request that supplies the tool responses/return values. It seems to work for now for short chats, but I suspect there are edge cases that will fail.

Relatedly, I need to do this for vision models that accept OpenAI "image_url" messages as well. When supplying an image_url, tools are always triggered, seemingly at random, even though the semantics of the prompt have nothing to do with the tool descriptions. That seems like another bug to report, but I'm not sure whether HF's intent is to be OpenAI-compatible, or to let callers provide prompts, images, and tools and have them triggered properly in a more general or more HF-specific sense.
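The workaround above (dropping the `tools` definitions from the follow-up request) can be sketched as a small helper that builds the request kwargs. This is only an illustration of the workaround, not library code; the function name and dict shape are my own:

```python
def build_chat_request(messages: list[dict], tools: list[dict]) -> dict:
    """Build chat-completion kwargs, omitting `tools` on follow-up calls.

    Workaround sketch: once the conversation already contains tool
    results (a message with role "tool"), drop the tools definitions
    so the server returns a text completion instead of repeating the
    tool_calls message.
    """
    kwargs = {"messages": messages}
    has_tool_results = any(m.get("role") == "tool" for m in messages)
    if not has_tool_results:
        kwargs["tools"] = tools
    return kwargs
```

The returned dict would then be passed to the client's chat-completion method (e.g. `client.chat_completion(**kwargs)`).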
I suspect this is because the input message doesn't support https://github.com/huggingface/text-generation-inference/blob/main/router/src/lib.rs#L1180 |
Further description of the problem and my workaround here. These kinds of workarounds will work for simple cases, but when multiple tool calls are required or when images should trigger a tool call, as in OpenAI, they will likely fall short. |
Would it be of any help that LM Studio has implemented MLX? There is also the Anemll ANE library for working with MLX, which is MIT licensed, and FastMLX, which has an Apache 2.0 license.
@qdrddr Thanks. I do use LM Studio and MLX models, but I'm not blocked on getting tool calling working in general; I'm hindered by getting it working as well as OpenAI with HF specifically. HF's Inference API appears to be broken, as @LikeSundayLikeRain may have found. The app I'm building isn't macOS-specific; it's web-based, and it's intended to support OpenAI, HF, and arbitrary inference endpoints. So these suggestions may well work for local inference setups on macOS like mine, but I haven't tested tool calls on them as extensively yet. Still, if the MLX implementation serves as a clue for how to resolve this bug in HF, that's great. Tool behaviors are highly model-dependent, but this bug may block the correct behavior even when the model responds properly.
Description
When using the inference client with function calling, models never seem to resolve their tool calls.
Typically, with the OpenAI pattern, the simplest function/tool call is a series of messages with various roles (system, user, assistant, tool), organized like this:
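A sketch of that message sequence (the tool name, ID, and values here are illustrative, not from the actual run):

```python
# OpenAI-style tool-calling message sequence: system -> user ->
# assistant (tool_calls) -> tool (result). Names/values are examples.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the temperature in Paris?"},
    # The model responds with a tool call instead of text content:
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location": "Paris"}',
                },
            }
        ],
    },
    # The caller executes the function and returns the result,
    # referencing the tool call by its id:
    {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": '{"temperature_celsius": 4}',
    },
]
```

With the full sequence supplied, the next completion is expected to be a plain assistant message with text content.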
The HF docs seem to indicate this is the same pattern, although the messages have some minor differences (e.g. description: null, which never happens with OpenAI). When using the Python inference client, these tool_calls never resolve even after functions are called and their return values are included and seemingly properly referenced. Instead, they look like this:
Instead of returning a text completion, the HF inference client returns the same "assistant" message specifying the required tool_calls. In OpenAI, they resolve to a typical "assistant" message with text content once the function calls have been satisfied and no further calls are required.
Models used that exhibit this behavior:
It's worth noting that Mistral models also error out, specifying that a 9-character alphanumeric string is required for the `tool_call_id`. The models themselves don't provide such IDs, so we need to supply them ourselves. But even when doing so, the same error occurs: the 9-character identifiers are reported missing (e.g. with mistralai/Mistral-7B-Instruct-v0.3).

The JavaScript client also fails with the above errors, plus a third: "An error occurred while fetching the blob".
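Since the models don't return IDs in this format, they have to be generated client-side. A minimal sketch, assuming the 9-character alphanumeric requirement stated in the error message:

```python
import random
import string


def make_tool_call_id() -> str:
    """Generate a 9-character alphanumeric tool_call_id.

    The length/charset follow the requirement reported in the Mistral
    error message; this is a client-side stopgap, not an official API.
    """
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choices(alphabet, k=9))
```

Even with IDs generated this way attached to both the assistant `tool_calls` entry and the matching `tool` message, the same error reportedly occurs.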
System Info
Information
Tasks
Reproduction
Gist of sample error code is here: https://gist.github.com/awmartin/c64c84fbbdc3a9f0c2ce6e5ae0dab3dc
The resulting message is unexpected. I expected a typical message with string content, something like "It's 4 degrees today." Instead, it just repeats the assistant message with the original tool_call:
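The failure mode can be checked with a small helper that distinguishes a resolved text response from a repeated tool_calls response; field names follow the OpenAI-style response shape used above:

```python
def resolved_to_text(assistant_message: dict) -> bool:
    """Return True if the assistant message carries ordinary text
    content, False if it (re-)requests tool calls, which is the
    buggy behavior observed in this issue."""
    if assistant_message.get("tool_calls"):
        return False
    return bool(assistant_message.get("content"))
```

Against the HF inference client, this check returns False even after the tool results have been supplied, where OpenAI returns True.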
Expected behavior
I expected a message that resolved to something similar to "It's 4 degrees Celsius today" rather than the tool_call message repeated.