Add minimum capability requirement for AWQ #1064
Conversation
@WoosukKwon I think the setup.py file might need a compute capability check to decide whether or not to build the quant kernel.
@esmeetu Thanks for the suggestion! I've instead added a guard to prevent compilation on unsupported GPUs.
Left some comments.
namespace vllm {
namespace awq {
Why is this namespace needed?
They are optional, but follow better coding convention. From the Google C++ Style Guide:
With few exceptions, place code in a namespace.
Namespaces prevent naming conflicts, so they're pretty useful for external code like the AWQ kernels.
capability = torch.cuda.get_device_capability()
capability = capability[0] * 10 + capability[1]
if capability < quant_config.get_min_capability():
    raise ValueError(
        f"The quantization method {model_config.quantization} is not "
        "supported for the current GPU. "
        f"Minimum capability: {quant_config.get_min_capability()}. "
        f"Current capability: {capability}.")
Why do we need the assert false in C++ if we have the check here?
Just saw the comments in the PR. Can we just change setup.py instead of the C++ files?
It's problematic when we want to build the wheel for all GPU architectures (e.g., for PyPI publication or building a Docker image). In such a case, we cannot selectively include the extension according to the architecture. Therefore, I believe this is an easier solution; in fact, we already use this kind of guard for the bfloat16 attention kernels, which do not support Turing and Volta GPUs.
LGTM! Thanks for the fix!
Closes #1063