RuntimeError: Cannot convert token with models.Exllamav2 #1319

Open

Qasimk555 opened this issue Dec 4, 2024 · 0 comments
Describe the issue as clearly as possible:

I'm getting a runtime error whenever I call generate.json with an ExLlamaV2 model. The error is as follows:

RuntimeError: Cannot convert token (127815) to bytes: �
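
For anyone triaging, here is a minimal sketch (my addition, not part of the original report) to inspect the offending token id directly. It assumes the exl2 quant directory ships the base model's tokenizer files, which exl2 repos normally do:

```python
from transformers import AutoTokenizer

# Assumption: the exl2 folder includes tokenizer.json from the base model.
tok = AutoTokenizer.from_pretrained("./Llama-3.2-3B-Instruct-exl2")
print(repr(tok.convert_ids_to_tokens(127815)))  # raw vocabulary entry for the failing id
print(repr(tok.decode([127815])))               # yields U+FFFD if the token's bytes are partial UTF-8
```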

Steps/code to reproduce the bug:

from outlines import models, generate, samplers
#import exllamav2
from pydantic import BaseModel
import outlines

class User(BaseModel):
    name: str
    last_name: str
    id: int

@outlines.prompt
def chat_template(messages, bos_token="<|begin_of_text|>", custom_tools=None, tools_in_user_message=True, 
                 date_string=None, strftime_now=None, tools=None, add_generation_prompt=False):
    """{{- bos_token }}
{%- if custom_tools is defined %}
    {%- set tools = custom_tools %}
{%- endif %}
{%- if not tools_in_user_message is defined %}
    {%- set tools_in_user_message = true %}
{%- endif %}
{%- if not date_string is defined %}
    {%- if strftime_now is defined %}
        {%- set date_string = strftime_now("%d %b %Y") %}
    {%- else %}
        {%- set date_string = "26 Jul 2024" %}
    {%- endif %}
{%- endif %}
{%- if not tools is defined %}
    {%- set tools = none %}
{%- endif %}

{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
    {%- set system_message = messages[0]['content']|trim %}
    {%- set messages = messages[1:] %}
{%- else %}
    {%- set system_message = "" %}
{%- endif %}

{#- System message #}
{{- "<|start_header_id|>system<|end_header_id|>\\n\\n" }}
{%- if tools is not none %}
    {{- "Environment: ipython\\n" }}
{%- endif %}
{{- "Cutting Knowledge Date: December 2023\\n" }}
{{- "Today Date: " + date_string + "\\n\\n" }}
{%- if tools is not none and not tools_in_user_message %}
    {{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }}
    {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
    {{- "Do not use variables.\\n\\n" }}
    {%- for t in tools %}
        {{- t | tojson(indent=4) }}
        {{- "\\n\\n" }}
    {%- endfor %}
{%- endif %}
{{- system_message }}
{{- "<|eot_id|>" }}

{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
    {#- Extract the first user message so we can plug it in here #}
    {%- if messages | length != 0 %}
        {%- set first_user_message = messages[0]['content']|trim %}
        {%- set messages = messages[1:] %}
    {%- else %}
        {{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
    {%- endif %}
    {{- '<|start_header_id|>user<|end_header_id|>\\n\\n' -}}
    {{- "Given the following functions, please respond with a JSON for a function call " }}
    {{- "with its proper arguments that best answers the given prompt.\\n\\n" }}
    {{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
    {{- "Do not use variables.\\n\\n" }}
    {%- for t in tools %}
        {{- t | tojson(indent=4) }}
        {{- "\\n\\n" }}
    {%- endfor %}
    {{- first_user_message + "<|eot_id|>"}}
{%- endif %}

{%- for message in messages %}
    {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
        {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n'+ message['content'] | trim + '<|eot_id|>' }}
    {%- elif 'tool_calls' in message %}
        {%- if not message.tool_calls|length == 1 %}
            {{- raise_exception("This model only supports single tool-calls at once!") }}
        {%- endif %}
        {%- set tool_call = message.tool_calls[0].function %}
        {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' -}}
        {{- '{"name": "' + tool_call.name + '", ' }}
        {{- '"parameters": ' }}
        {{- tool_call.arguments | tojson }}
        {{- "}" }}
        {{- "<|eot_id|>" }}
    {%- elif message.role == "tool" or message.role == "ipython" %}
        {{- "<|start_header_id|>ipython<|end_header_id|>\\n\\n" }}
        {%- if message.content is mapping or message.content is iterable %}
            {{- message.content | tojson }}
        {%- else %}
            {{- message.content }}
        {%- endif %}
        {{- "<|eot_id|>" }}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}
{%- endif %}"""

# Example usage
example_messages = [
    {"role": "system", "content": "You are a helpful assistant"},

    #{"role": "user", "content": "Hello."}
    {"role": "user", "content": "Create a user profile with the fields name, last_name and id."}
]

prompt = chat_template(
    messages=example_messages,
    date_string="04 Dec 2024",
    add_generation_prompt=True,
)
print(prompt)

model = models.exl2(model_path="./Llama-3.2-3B-Instruct-exl2", max_seq_len=2048)

sampler = samplers.multinomial(temperature=0.1)

#generator = generate.text(model, sampler)
generator = generate.json(model, User)
# stop_conditions: ["<|eot_id|>"] not working - ???
kwargs = {"stop_conditions": ["<|eot_id|>"], "max_new_tokens": 512, "completion_only": True}
result = generator(prompt, **kwargs)
print(result)

Expected result:

{"name": <some name>,
"last_name": <some name>,
"id": <some id>}

Error message:

Traceback (most recent call last):
  File "C:\Users\legio\Desktop\llm_sm\outlines_exlv2.py", line 130, in <module>
    generator = generate.json(model,User)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\functools.py", line 889, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines\generate\json.py", line 54, in json
    generator = regex(model, regex_str, sampler)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\functools.py", line 889, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines\generate\regex.py", line 34, in regex
    logits_processor = RegexLogitsProcessor(regex_str, tokenizer=model.tokenizer)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines\processors\structured.py", line 152, in __init__
    guide = RegexGuide.from_regex(regex_string, tokenizer)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines\fsm\guide.py", line 92, in from_regex
    return super().from_regex(
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines_core\fsm\guide.py", line 212, in from_regex
    ) = _create_states_mapping(
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines\fsm\guide.py", line 76, in cached_create_states_mapping
    return uncached_create_states_mapping(regex_string, tokenizer, *args, **kwargs)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines_core\fsm\guide.py", line 141, in create_states_mapping
    return create_states_mapping_from_fsm(regex_fsm, tokenizer, frozen_tokens)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines_core\fsm\guide.py", line 178, in create_states_mapping_from_fsm
    states_to_token_maps, empty_token_ids = create_fsm_index_tokenizer(
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines_core\fsm\regex.py", line 471, in create_fsm_index_tokenizer
    tokens_to_token_ids, empty_token_ids = reduced_vocabulary(tokenizer)
  File "C:\Users\legio\anaconda3\envs\llm_sm\lib\site-packages\outlines_core\fsm\regex.py", line 424, in reduced_vocabulary
    raise RuntimeError(
RuntimeError: Cannot convert token `` (127815) to bytes:  �
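
For context, a short illustration of what is likely happening (my addition, not from the report): Llama 3's byte-level BPE vocabulary contains tokens that cover only part of a multi-byte UTF-8 character, and such a token cannot be decoded to text on its own, which matches the `�` in the message above.

```python
# A 4-byte emoji split in half: neither half is valid UTF-8 on its own.
grape = "🍇".encode("utf-8")  # b'\xf0\x9f\x8d\x87'
fragment = grape[:2]          # the kind of byte span a single BPE token can hold
print(fragment.decode("utf-8", errors="replace"))  # prints the replacement character '�'
```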

Outlines/Python version information:

Version information

0.1.7

Python 3.10.15 | packaged by Anaconda, Inc. | (main, Oct 3 2024, 07:22:19) [MSC v.1929 64 bit (AMD64)]

aiohttpx==0.0.12
airportsdata==20241001
annotated-types==0.7.0
anyio==4.6.2.post1
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1698341106958/work
async-lru==2.0.4
async_openai==0.0.52
attrs==24.2.0
backoff==2.2.1
certifi==2024.8.30
charset-normalizer==3.4.0
cloudpickle==3.1.0
colorama @ file:///home/conda/feedstock_root/build_artifacts/colorama_1666700638685/work
comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1710320294760/work
cramjam==2.9.0
debugpy @ file:///C:/b/abs_c0y1fjipt2/croot/debugpy_1690906864587/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
diskcache==5.6.3
distro==1.9.0
einops==0.8.0
exceptiongroup @ file:///home/conda/feedstock_root/build_artifacts/exceptiongroup_1720869315914/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1725214404607/work
exllamav2 @ git+https://github.com/lapp0/exllamav2@ce08f16674a67dac9ea6a770650eb02248b8364a
fastparquet==2024.11.0
filelock==3.16.1
flash_attn @ file:///C:/Users/legio/Desktop/llm_sm/flash_attn-2.6.3%2Bcu122torch2.4.1cxx11abiFALSE-cp310-cp310-win_amd64.whl#sha256=0eea9204c7b67d3e5829f10fcce05e11d14d7f264c28c39f24a9357ea76e5601
frozendict==2.4.6
fsspec==2024.10.0
h11==0.14.0
httpcore==1.0.7
httpx==0.27.2
huggingface-hub==0.26.3
idna==3.10
importlib_metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1726082825846/work
interegular==0.3.3
ipykernel @ file:///D:/bld/ipykernel_1719845595208/work
ipython @ file:///D:/bld/ipython_1729866374643/work
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1731317204262/work
Jinja2==3.1.4
jiter==0.7.1
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1726610684920/work
jupyter_core @ file:///D:/bld/jupyter_core_1710257272359/work
lark==1.2.2
lazyops==0.2.84
loguru==0.7.2
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1713250518406/work
mdurl==0.1.2
MooreLLM==0.1.7
mpmath==1.3.0
nest_asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1705850609492/work
networkx==3.4.2
ninja==1.11.1.2
numpy==2.1.3
openai==1.54.5
outlines==0.1.7
outlines_core==0.1.17
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1731802491770/work
pandas==2.2.3
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1712320355065/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1726613481435/work
prompt_toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1727341649933/work
psutil @ file:///C:/Windows/Temp/abs_b2c2fd7f-9fd5-4756-95ea-8aed74d0039flsd9qufz/croots/recipe/psutil_1656431277748/work
pure_eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1721585709575/work
pycountry==24.6.1
pydantic==2.9.2
pydantic-settings==2.6.1
pydantic_core==2.23.4
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1714846767233/work
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1731919281354/work
python-dotenv==1.0.1
pytz==2024.2
pywin32==305.1
PyYAML==6.0.2
pyzmq @ file:///D:/bld/pyzmq_1666828590571/work
referencing==0.35.1
regex==2024.11.6
requests==2.32.3
rich==13.9.4
rpds-py==0.22.0
safetensors==0.4.5
sentencepiece==0.2.0
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
sniffio==1.3.1
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
sympy==1.13.1
tiktoken==0.8.0
tokenizers==0.20.3
torch==2.4.1+cu121
tornado @ file:///D:/bld/tornado_1666788744359/work
tqdm==4.67.0
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1713535121073/work
transformers==4.46.3
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1717802530399/work
tzdata==2024.2
urllib3==2.2.3
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1704731205417/work
websockets==14.1
win32-setctime==1.1.0
zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1731262100163/work



Context for the issue:

_No response_
Qasimk555 added the bug label Dec 4, 2024