
Fix : fix doc fp8 (#36173)
* fix

* fix
MekkCyber authored Feb 13, 2025
1 parent b079dd1 commit b41591d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/source/en/quantization/finegrained_fp8.md

````diff
@@ -39,10 +39,10 @@ pip install --upgrade accelerate torch
 By default, the weights are loaded in full precision (torch.float32) regardless of the actual data type the weights are stored in such as torch.float16. Set `torch_dtype="auto"` to load the weights in the data type defined in a model's `config.json` file to automatically load the most memory-optimal data type.
 
 ```py
-from transformers import FP8Config, AutoModelForCausalLM, AutoTokenizer
+from transformers import FineGrainedFP8Config, AutoModelForCausalLM, AutoTokenizer
 
 model_name = "meta-llama/Meta-Llama-3-8B"
-quantization_config = FP8Config()
+quantization_config = FineGrainedFP8Config()
 quantized_model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto", quantization_config=quantization_config)
 
 tokenizer = AutoTokenizer.from_pretrained(model_name)
````
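For context, a minimal sketch of how the corrected snippet might be used end to end, assuming the `FineGrainedFP8Config` API shown in the diff and a GPU with FP8 support; the prompt and generation settings below are illustrative placeholders, not part of the doc change:

```py
# Usage sketch: load the FP8-quantized model and run a short generation.
# Assumes a CUDA GPU with FP8 support; prompt and max_new_tokens are illustrative.
from transformers import FineGrainedFP8Config, AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B"
quantization_config = FineGrainedFP8Config()

# Load weights in the dtype from config.json and apply fine-grained FP8 quantization.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize a prompt, generate a few tokens, and decode the result.
inputs = tokenizer("Quantization reduces memory usage by", return_tensors="pt").to(quantized_model.device)
outputs = quantized_model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```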
