[Kernel] Tuned FP8 Kernels for Ada Lovelace (vllm-project#6677)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2 people authored and kylesayrs committed Aug 17, 2024
1 parent f2362f1 commit 18087ae
Showing 6 changed files with 877 additions and 490 deletions.
2 changes: 1 addition & 1 deletion benchmarks/cutlass_benchmarks/w8a8_benchmarks.py
@@ -13,7 +13,7 @@
from vllm import _custom_ops as ops
from vllm.utils import FlexibleArgumentParser

-DEFAULT_MODELS = list(WEIGHT_SHAPES.keys())[1:]
+DEFAULT_MODELS = list(WEIGHT_SHAPES.keys())
DEFAULT_BATCH_SIZES = [1, 16, 32, 64, 128, 256, 512]
DEFAULT_TP_SIZES = [1]
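
The hunk above removes the `[1:]` slice, so the default model list covers every key of `WEIGHT_SHAPES` instead of skipping the first one. A minimal sketch of the effect, assuming a hypothetical three-entry mapping (the real `WEIGHT_SHAPES` is imported elsewhere in the benchmark and its contents are not shown in this diff):

```python
# Hypothetical stand-in for the WEIGHT_SHAPES mapping used by the benchmark.
# The model names and shape entries are illustrative placeholders only.
WEIGHT_SHAPES = {
    "model-a": [([4096, 4096], 1)],
    "model-b": [([8192, 8192], 0)],
    "model-c": [([5120, 5120], 1)],
}

# Before the change: the first key is skipped by the [1:] slice.
old_default_models = list(WEIGHT_SHAPES.keys())[1:]

# After the change: every key is included.
new_default_models = list(WEIGHT_SHAPES.keys())

print(old_default_models)
print(new_default_models)
```

Because Python dictionaries preserve insertion order, the slice always dropped whichever model was defined first in the mapping.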
