[Bug]: vLLM throws error when sampling from Cerebras GPT Models #11224

Closed
RylanSchaeffer opened this issue Dec 16, 2024 · 19 comments
Labels
bug Something isn't working

Comments

@RylanSchaeffer

Your current environment

The output of `python collect_env.py`
python -u collect_env.py 
Collecting environment information...
Traceback (most recent call last):
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 765, in <module>
    main()
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 744, in main
    output = get_pretty_env_info()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 739, in get_pretty_env_info
    return pretty_str(get_env_info())
                      ^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 568, in get_env_info
    vllm_version = get_vllm_version()
                   ^^^^^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 273, in get_vllm_version
    from vllm import __version__, __version_tuple__
ImportError: cannot import name '__version_tuple__' from 'vllm' (/lfs/skampere1/0/rschaef/miniconda3/envs/llmonk/lib/python3.11/site-packages/vllm/__init__.py)

Model Input Dumps

No response

🐛 Describe the bug

vLLM throws an error when attempting to use Cerebras's models. Here is a minimal reproduction:

from vllm import LLM, SamplingParams

model = LLM(model="cerebras/Cerebras-GPT-1.3B", dtype="bfloat16")

model_sampling_params = SamplingParams(
    n=1,
    temperature=1.0,
    max_tokens=64,
    seed=0,
)

output = model.generate(
    prompts=["Please continue the following sentence: The quick brown fox jumps "],
    sampling_params=model_sampling_params,
)

The error is: TypeError: 'NoneType' object is not iterable

It arises here:

    def _verify_embedding_mode(self) -> None:
        architectures = getattr(self.hf_config, "architectures", [])
        self.embedding_mode = any(
            ModelRegistry.is_embedding_model(arch) for arch in architectures)

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@RylanSchaeffer added the bug label Dec 16, 2024
@RylanSchaeffer changed the title from "[Bug]: Unable to Use Cerebras GPT Models" to "[Bug]: vLLM throws error when sampling from Cerebras GPT Models" Dec 16, 2024
@DarkLight1337
Member

DarkLight1337 commented Dec 16, 2024

cerebras/Cerebras-GPT-1.3B doesn't have a valid config.json file, I think. It should have the architectures field like in cerebras/Cerebras-GPT-13B.
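
A quick way to confirm (assuming transformers is installed):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("cerebras/Cerebras-GPT-1.3B")
print(config.architectures)  # None for this checkpoint; the 13B config reports ['GPT2Model']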

@DarkLight1337
Member

Although HF does keep mappings from model_type to architecture in https://github.com/huggingface/transformers/blob/5615a393691c81e00251e420c73e4d04c6fe22e5/src/transformers/models/auto/modeling_auto.py#L1564, it's not always clear which mapping should be used.
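
For example (with a recent transformers), "gpt2" resolves differently depending on which table you consult:

from transformers.models.auto.modeling_auto import (
    MODEL_FOR_CAUSAL_LM_MAPPING_NAMES,
    MODEL_MAPPING_NAMES,
)

print(MODEL_MAPPING_NAMES["gpt2"])                # 'GPT2Model' (bare transformer)
print(MODEL_FOR_CAUSAL_LM_MAPPING_NAMES["gpt2"])  # 'GPT2LMHeadModel' (adds the LM head)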

@DarkLight1337
Member

To solve this, I suggest you pass the architecture name explicitly via --hf-overrides.

@RylanSchaeffer
Author

@DarkLight1337 could you please explain how to modify my above minimal example?

When I try something like:

model = LLM(
    model="cerebras/Cerebras-GPT-1.3B",
    hf_overrides={"model_name_or_path": "cerebras/Cerebras-GPT-1.3B"},
    dtype="bfloat16",
)

I hit the error: `TypeError: EngineArgs.__init__() got an unexpected keyword argument 'hf_overrides'`

@DarkLight1337
Member

What is your vLLM version? You might have to update it for this to be supported.

@RylanSchaeffer
Author

0.5.4

@DarkLight1337
Member

Yeah, pretty sure you need to update vLLM.

@RylanSchaeffer
Author

Is there any guarantee of backwards consistency? I've been generating data for a couple of months, and I need to make sure there's no distribution shift if I change the vLLM version.

@DarkLight1337
Member

DarkLight1337 commented Jan 15, 2025

hf_overrides is a new option that was only added recently. It doesn't change old behavior.

@RylanSchaeffer
Author

I've updated to 0.6.6.post1. Can you please now tell me how to correctly call Cerebras using the LLM() class in Python?

@RylanSchaeffer
Author

I'm currently trying:

model = LLM(
    model="cerebras/Cerebras-GPT-1.3B",
    hf_overrides={"architecture": "cerebras/Cerebras-GPT-1.3B"},
    dtype="bfloat16",
)

But this throws:

vllm/model_executor/models/registry.py", line 416, in inspect_model_cls
    for arch in architectures:
TypeError: 'NoneType' object is not iterable

@RylanSchaeffer
Author

To ask a related but separate follow-up question, when I try:

model = LLM(
    model="cerebras/Cerebras-GPT-13B",
    hf_overrides={"architecture": "model_type"},
    dtype="bfloat16",
)

I receive the following error: ValueError: Model architectures ['GPT2Model'] are not supported for now.

Since I believe all of the Cerebras models are based on GPT2, what would you advise?

@DarkLight1337
Member

The "architecture" field should be class name of the model that's implemented in vLLM. In this case, it should be GPT2LMHeadModel as shown in the list of supported models.

@RylanSchaeffer
Author

Can you please provide a correctly functioning minimal working example?

@RylanSchaeffer
Author

model = LLM(
    model="cerebras/Cerebras-GPT-1.3B",
    hf_overrides={"architecture": "GPT2LMHeadModel"},
    dtype="bfloat16",
)

throws the error:

vllm/model_executor/models/registry.py", line 416, in inspect_model_cls
    for arch in architectures:
TypeError: 'NoneType' object is not iterable

@DarkLight1337
Member

DarkLight1337 commented Jan 16, 2025

Also, the key should be "architectures" (plural) and you need to pass a list to it. It is basically the same format as HF config.json.

@RylanSchaeffer
Author

Can you please give a functioning minimal working example?

@RylanSchaeffer
Author

model = LLM(
    model="cerebras/Cerebras-GPT-1.3B",
    hf_overrides={"architectures": ["GPT2LMHeadModel"]},
    dtype="bfloat16",
)
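
Combined with the sampling code from my original report, the full repro (not yet verified on my end) becomes:

from vllm import LLM, SamplingParams

# Override the missing "architectures" field so vLLM can resolve the model class.
model = LLM(
    model="cerebras/Cerebras-GPT-1.3B",
    hf_overrides={"architectures": ["GPT2LMHeadModel"]},
    dtype="bfloat16",
)

sampling_params = SamplingParams(n=1, temperature=1.0, max_tokens=64, seed=0)

output = model.generate(
    prompts=["Please continue the following sentence: The quick brown fox jumps "],
    sampling_params=sampling_params,
)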

@RylanSchaeffer
Author

I'm currently testing this. If it works, I'll close the issue.
