What's Changed
⚡ Offload tokenizer fixes to Toke(n)icer pkg.
⚡ Optimized `lm_head` quant time and VRAM usage.
⚡ Optimized DeepSeek v3/R1 model quant VRAM usage.
⚡ 3x speed-up for the Torch kernel when using PyTorch >= 2.5.0 with model.compile().
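Since the faster kernel path requires PyTorch >= 2.5.0, callers may want to gate the `model.compile()` call on the installed version. Below is a minimal pure-Python version gate; the commented-out usage is a sketch only, assuming `model` was already loaded via the library:

```python
def at_least(version: str, minimum: str) -> bool:
    """Compare dotted version strings numerically, ignoring local
    suffixes such as "+cu121" (so "2.10.0" correctly sorts above "2.5.0")."""
    parse = lambda v: tuple(int(p) for p in v.split("+")[0].split(".")[:3])
    return parse(version) >= parse(minimum)

# Hypothetical usage (sketch -- assumes `model` is an already-loaded model):
# import torch
# if at_least(torch.__version__, "2.5.0"):
#     model.compile()  # enables the ~3x faster Torch kernel path
```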
⚡ New `calibration_dataset_concat_size` option to enable calibration data concat mode, which mimics the original GPTQ data packing strategy and may improve quant speed and accuracy for datasets like wikitext2.
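Concat mode packs all calibration samples into one continuous token stream and slices it into fixed-size blocks, instead of treating each sample as its own padded sequence. A rough pure-Python illustration of the packing idea (token ids as plain lists; dropping the trailing partial block is an assumption of this sketch, and the real option is simply passed at quantization time):

```python
def concat_pack(samples, block_size):
    """Concatenate tokenized samples into one stream, then split the
    stream into fixed-size blocks, mimicking the original GPTQ data
    packing strategy. Trailing tokens that do not fill a whole block
    are dropped in this sketch."""
    stream = [tok for sample in samples for tok in sample]
    return [stream[i:i + block_size]
            for i in range(0, len(stream) - block_size + 1, block_size)]

# Example: three short "samples" packed into blocks of 4 tokens
blocks = concat_pack([[1, 2, 3], [4, 5], [6, 7, 8, 9]], 4)
# -> [[1, 2, 3, 4], [5, 6, 7, 8]]
```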
🐛 Fixed Optimum compat and XPU/IPEX auto kernel selection regression in v1.8.1.
- Fix init arg order and `optimum` compat by @CSY-ModelCloud in #1240
- [FIX][Optimize] lm_head quantize by @ZX-ModelCloud in #1239
- [Model] [DeepSeek] un-merge `gate_proj` and `up_proj` by @LRL-ModelCloud in #1241
- Use Toke(n)icer by @CL-ModelCloud in #1242
- #1244
- Add Tokenicer Test by @CL-ModelCloud in #1245
- prepare for 1.8.2 release by @Qubitium in #1243
- simplify calls to tokenicer by @CL-ModelCloud in #1246
- Update requirements.txt by @Qubitium in #1248
- fix trust_remote was lost by @CSY-ModelCloud in #1249
- fix trust_remote was lost by @CSY-ModelCloud in #1250
- prepare for 1.8.5 release by @Qubitium in #1251
- fix unit tests & tweak logic for selecting backends by @CSY-ModelCloud in #1253
- install tokenicer from git & do ruff by @CSY-ModelCloud in #1254
- fix k,v is not a dict by @CSY-ModelCloud in #1255
- fix not enough values to unpack (expected 2, got 1) by @CSY-ModelCloud in #1256
- fix sglang test requires numpy<2.0 by @CSY-ModelCloud in #1258
- fix ipex backend by @jiqing-feng in #1259
- ipex should be packable, reverted pr #1259 importer.py changes by @CSY-ModelCloud in #1260
- remove sentencepiece by @CSY-ModelCloud in #1261
- speed up torch dequantize by @Qubitium in #1262
- Add `calibration_dataset_concat_size` option/mode by @LRL-ModelCloud in #1257
- add transformers test by @CSY-ModelCloud in #1264
- Add kernel torch.compile hook by @Qubitium in #1265
- [FIX]fix vl model prepare_dataset by @LRL-ModelCloud in #1266
Full Changelog: v1.8.1...v1.9.0