Tool call and template bug: Mistral NeMo Instruct 2407 - Failed to infer a tool call example (possible template bug) #14038

@broadbit-hu

Description

Name and Version

llama.cpp version: release b5589

Command Line:

./build/bin/llama-cli -m ../models/Mistral-Nemo-Instruct-2407/Mistral-Nemo-12B-Instruct-2407-F16.gguf -c 1024

Log events:

load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
Failed to infer a tool call example (possible template bug)

Tool call test:

Server:

./build/bin/llama-server -m ../models/Mistral-Nemo-Instruct-2407/Mistral-Nemo-12B-Instruct-2407-F16.gguf -c 1024 --host 0.0.0.0 --port 18087 --jinja

Request:

curl http://192.168.253.167:18087/v1/chat/completions -d '{
  "model": "gpt-3.5-turbo",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "python",
        "description": "Runs code in an ipython interpreter and returns the result of the execution after 60 seconds.",
        "parameters": {
          "type": "object",
          "properties": {
            "code": {
              "type": "string",
              "description": "The code to run in the ipython interpreter."
            }
          },
          "required": ["code"]
        }
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "Print a hello world message with python."
    }
  ]
}' | jq .choices
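For scripted testing across builds, the same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `fetch_choices` are hypothetical, and the base URL is the host and port from the server command above.

```python
import json
import urllib.request


def build_chat_request(base_url: str) -> urllib.request.Request:
    """Build the same POST request as the curl command in this report."""
    payload = {
        "model": "gpt-3.5-turbo",
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "python",
                    "description": "Runs code in an ipython interpreter and returns "
                    "the result of the execution after 60 seconds.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "code": {
                                "type": "string",
                                "description": "The code to run in the ipython interpreter.",
                            }
                        },
                        "required": ["code"],
                    },
                },
            }
        ],
        "messages": [
            {"role": "user", "content": "Print a hello world message with python."}
        ],
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_choices(base_url: str, timeout: float = 120.0) -> list:
    """Send the request and return the "choices" array (what `jq .choices` prints)."""
    with urllib.request.urlopen(build_chat_request(base_url), timeout=timeout) as resp:
        return json.load(resp)["choices"]
```

Calling `fetch_choices("http://192.168.253.167:18087")` against a running server should return the same structure as the results shown in this report. The timeout value is an arbitrary choice for CPU-only inference, not something taken from llama.cpp.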

Results:

[
  {
    "finish_reason": "stop",
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    }
  }
]

Last working version

The last working version is b4738 (https://github.com/ggml-org/llama.cpp/releases/tag/b4738)

Result:

[
  {
    "finish_reason": "tool_calls",
    "index": 0,
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "python",
            "arguments": "{\"code\":\"print('Hello, World!')'}}]```python\\nprint('Hello, World!')\\n```\\nHello, World!```\"}"
          },
          "id": "c9a2248d9"
        }
      ]
    }
  }
]
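The difference between the b5589 and b4738 results can be checked mechanically, which helps when bisecting between releases. This is a minimal sketch; `summarize_choice` is a hypothetical helper, and the two sample dictionaries are trimmed copies of the results in this report (the `arguments` and `id` fields are omitted for brevity).

```python
def summarize_choice(choice: dict) -> str:
    """Label a chat completion choice as a tool call or a plain-text reply."""
    message = choice.get("message", {})
    tool_calls = message.get("tool_calls") or []
    if choice.get("finish_reason") == "tool_calls" and tool_calls:
        names = ", ".join(call["function"]["name"] for call in tool_calls)
        return "tool call: " + names
    return "plain text: " + repr(message.get("content"))


# Trimmed copy of the b5589 result (no tool call is produced).
b5589_choice = {
    "finish_reason": "stop",
    "index": 0,
    "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
    },
}

# Trimmed copy of the b4738 result (tool call is produced).
b4738_choice = {
    "finish_reason": "tool_calls",
    "index": 0,
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [{"type": "function", "function": {"name": "python"}}],
    },
}

print(summarize_choice(b5589_choice))
print(summarize_choice(b4738_choice))
```

A build counts as "good" for bisection purposes when the first choice is labelled as a tool call; this only checks the response shape, not whether the generated arguments are valid.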

Operating systems

Linux

GGML backends

CPU

Hardware

AMD Ryzen 5 4500

Models

Mistral Nemo Instruct 2407

Problem description & steps to reproduce

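With llama.cpp b5589, llama-server started with --jinja no longer returns tool calls for Mistral NeMo Instruct 2407. The request shown above, which includes a "python" tool, produces a plain-text reply with finish_reason "stop" instead of a tool_calls response, and loading the model logs "Failed to infer a tool call example (possible template bug)". The same request returns a tool call with b4738.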

First Bad Commit

Commit 63e489c

tool-call: refactor common chat / tool-call api (+ tests / fixes) (https://github.com/ggml-org/llama.cpp/pull/11900)
* tool-call refactoring: moved common_chat_* to chat.h, common_chat_templates_init return a unique_ptr to opaque type

* addressed clang-tidy lints in [test-]chat.*

* rm minja deps from util & common & move it to common/minja/

* add name & tool_call_id to common_chat_msg

* add common_chat_tool

* added json <-> tools, msgs conversions to chat.h

* fix double bos/eos jinja avoidance hack (was preventing inner bos/eos tokens)

* fix deepseek r1 slow test (no longer <think> opening w/ new template)

* allow empty tools w/ auto + grammar

* fix & test server grammar & json_schema params w/ & w/o --jinja

First bad release: b4739 (https://github.com/ggml-org/llama.cpp/releases/tag/b4739)

Relevant log output

load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect

Failed to infer a tool call example (possible template bug)
