[Bug]: Unable to run meta-llama/Llama-Guard-3-8B-INT8 #6756

Closed
xfalcox opened this issue Jul 24, 2024 · 5 comments · Fixed by #7445
Labels
bug Something isn't working

Comments

xfalcox commented Jul 24, 2024

Your current environment

Latest Docker image, RTX 4090

🐛 Describe the bug

```
docker run --gpus all vllm/vllm-openai:latest --model meta-llama/Llama-Guard-3-8B-INT8
...
[rank0]:     raise ValueError(f"Cannot find any of {keys} in the model's "
[rank0]: ValueError: Cannot find any of ['adapter_name_or_path'] in the model's quantization config.
```
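
For reference, the same failure reproduces through the Python API; a minimal sketch, assuming vLLM is installed locally (the error comes from vLLM's parsing of the checkpoint's bitsandbytes quantization config):

```python
from vllm import LLM

# Minimal sketch of the same failure path as the Docker command above.
# vLLM auto-detects the bitsandbytes quantization_config in the checkpoint
# and, before the fix, expected an 'adapter_name_or_path' key that this
# 8-bit checkpoint does not carry.
llm = LLM(model="meta-llama/Llama-Guard-3-8B-INT8")
# -> ValueError: Cannot find any of ['adapter_name_or_path'] in the
#    model's quantization config.
```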
xfalcox added the bug label Jul 24, 2024
mgoin (Collaborator) commented Jul 24, 2024

@thesues @chenqianfzh It looks like this is an 8bit BNB model. Would it be easy to add support for these checkpoints as well?

chenqianfzh (Contributor) commented

> @thesues @chenqianfzh It looks like this is an 8bit BNB model. Would it be easy to add support for these checkpoints as well?

It won't be difficult. I will work on it as a high priority.

meihui commented Aug 6, 2024

It seems version 0.5.4+cu124 works with the bnb 4-bit model.

But it says:

```
WARNING 08-06 06:27:07 config.py:254] bitsandbytes quantization is not fully optimized yet. The speed can be slower than non-quantized models.
```

Will that be an easy fix too?
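
For context, here is a sketch of the 4-bit path described above, assuming vLLM 0.5.4; the checkpoint name is illustrative, and any bitsandbytes 4-bit model should follow the same pattern:

```python
from vllm import LLM, SamplingParams

# Sketch of the working 4-bit bitsandbytes path on vLLM 0.5.4.
# "unsloth/llama-3-8b-bnb-4bit" is an illustrative pre-quantized
# checkpoint; substitute any bnb 4-bit model.
llm = LLM(
    model="unsloth/llama-3-8b-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)
outputs = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```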

chenqianfzh (Contributor) commented
With this PR (#7445), meta-llama/Llama-Guard-3-8B-INT8 is supported.
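
Once that PR is in a build, the original model should load; a sketch under that assumption (the explicit flags may be redundant if auto-detection handles 8-bit configs, but they make the bitsandbytes path unambiguous):

```python
from vllm import LLM

# Sketch assuming a vLLM build that includes #7445, so the 8-bit
# bitsandbytes checkpoint loads instead of raising on the missing
# 'adapter_name_or_path' key.
llm = LLM(
    model="meta-llama/Llama-Guard-3-8B-INT8",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)
```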

chenqianfzh (Contributor) commented

> The speed can be slower than non-quantized models.

A lot of quantization methods here are not in the optimized list yet. Optimizing speed is not our top priority right now, as we are working to support more bnb quantization features first.
