Commit

Update README.md
qwopqwop200 authored Mar 28, 2023
1 parent eff2102 commit 4c15f16
Showing 1 changed file with 1 addition and 1 deletion.
README.md (1 addition, 1 deletion):

```diff
@@ -7,7 +7,7 @@ GPTQ is SOTA one-shot weight quantization method
 
 ## New Features
 **Changed to use only pytorch instead of the current cuda kernel.
-It has no impact on memory usage. There is a slowdown below 128 length(If you use Transformers' use_cache, seq_len is effectively close to 1.), but much faster at 128 and above.**
+It has no impact on memory usage. There is a slowdown below 128 length(If you use Transformers' use_cache, length is effectively close to 1.), but much faster at 128 and above.**
 
 Changed to support new features proposed by [GPTQ](https://github.com/IST-DASLab/gptq#new-features).
```
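The changed sentence says that with Transformers' use_cache the effective sequence length per decoding step is close to 1, which is the short-sequence regime where the pytorch kernels are slowest. A minimal sketch of why that is (the function name and counts are illustrative, not from this repository):

```python
# Illustrative sketch: with a KV cache (use_cache=True), each generation
# step after the prompt feeds only the newest token to the model, so the
# per-step sequence length collapses to 1. Without a cache, the whole
# growing sequence is re-processed every step.

def tokens_processed_per_step(prompt_len, new_tokens, use_cache):
    """Return the sequence length seen by each forward pass."""
    lengths = []
    for step in range(new_tokens):
        if use_cache:
            # Prompt is processed once; afterwards only the new token.
            lengths.append(prompt_len if step == 0 else 1)
        else:
            # Entire sequence so far is re-run through the model.
            lengths.append(prompt_len + step)
    return lengths

print(tokens_processed_per_step(8, 4, use_cache=True))   # [8, 1, 1, 1]
print(tokens_processed_per_step(8, 4, use_cache=False))  # [8, 9, 10, 11]
```

So during cached generation almost every forward pass runs at sequence length 1, well below the 128-token crossover mentioned in the diff.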
