Update benchmarks.sh
jainapurva committed Oct 4, 2024
1 parent b16772d commit 2bb4c89
Showing 2 changed files with 1 addition and 3 deletions.
1 change: 1 addition & 0 deletions torchao/_models/llama/benchmarks.sh
@@ -21,6 +21,7 @@ export MODEL_REPO=meta-llama/Meta-Llama-3.1-8B
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --write_result benchmark_results.txt
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --quantization int8wo --write_result benchmark_results.txt
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --quantization int4wo-64 --write_result benchmark_results.txt
+# Runs on H100; float8 is not supported on CUDA arch < 8.9
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --quantization float8wo --write_result benchmark_results.txt
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --quantization float8dq-tensor --write_result benchmark_results.txt
 python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --quantization float8dq-wo --write_result benchmark_results.txt
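The comment added above records that the float8 benchmarks require CUDA compute capability 8.9 or newer (Ada/Hopper; the H100 reports 9.0). A minimal sketch of a guard that scripts could use before running the float8 variants — the helper name `supports_float8` is hypothetical and not part of the repo; in a real script the capability tuple would come from `torch.cuda.get_device_capability()`:

```python
def supports_float8(capability):
    """Return True if a CUDA device can run float8 kernels.

    capability is a (major, minor) tuple; float8 needs >= (8, 9).
    For example, H100 reports (9, 0) while A100 reports (8, 0).
    In a real script this tuple would come from
    torch.cuda.get_device_capability().
    """
    return tuple(capability) >= (8, 9)

# A100 (compute capability 8.0) cannot run float8; H100 (9.0) can.
print(supports_float8((8, 0)))  # False
print(supports_float8((9, 0)))  # True
```

Gating the three float8 invocations on such a check would let benchmarks.sh run to completion on older GPUs instead of failing partway through.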
3 changes: 0 additions & 3 deletions torchao/quantization/README.md
@@ -139,9 +139,6 @@ change_linear_weights_to_int8_dqtensors(model)
 from torchao.quantization import quantize_, float8_dynamic_activation_float8_weight
 from torchao.quantization.observer import PerTensor
 quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerTensor()))
-from torchao.quantization.observer import PerTensor
-quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerTensor()))
-
```

#### A16W6 Floating Point WeightOnly Quantization
