Closed
Description
Hi, I saw the AWQ model of gemma-2-9b-it on HF (solidrust/gemma-2-9b-it-AWQ), and it works well on vLLM. However, when I tried to quantize the same model (google/gemma-2-9b-it) with the code you uploaded on GitHub, the output doesn't work well. I get this error when validating the quantized model:
'Gemma2LikeModel' object has no attribute '_prepare_4d_causal_attention_mask_with_cache_position'
How can I solve it?