Commit: [Doc] Update examples to remove SparseAutoModelForCausalLM (vllm-project#12062)

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Bowen Wang <abmfy@icloud.com>
kylesayrs authored and abmfy committed Jan 24, 2025
1 parent 23735fb commit 4218bba
Showing 2 changed files with 8 additions and 10 deletions.
11 changes: 5 additions & 6 deletions docs/source/features/quantization/fp8.md
````diff
@@ -54,16 +54,15 @@ The quantization process involves three main steps:
 
 ### 1. Loading the Model
 
-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:
 
 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM
 
 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
 
-model = SparseAutoModelForCausalLM.from_pretrained(
-    MODEL_ID, device_map="auto", torch_dtype="auto")
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID, device_map="auto", torch_dtype="auto",
+)
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
 ```
````
7 changes: 3 additions & 4 deletions docs/source/features/quantization/int8.md
````diff
@@ -30,14 +30,13 @@ The quantization process involves four main steps:
 
 ### 1. Loading the Model
 
-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:
 
 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM
 
 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
-model = SparseAutoModelForCausalLM.from_pretrained(
+model = AutoModelForCausalLM.from_pretrained(
     MODEL_ID, device_map="auto", torch_dtype="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
````
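For readers updating their own scripts, both files receive the same substitution. The sketch below wraps the new loading pattern in a hypothetical `load_model_and_tokenizer` helper (the helper name is not part of the diff); it assumes `transformers` is installed, and `device_map="auto"` additionally requires the `accelerate` package:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"


def load_model_and_tokenizer(model_id: str = MODEL_ID):
    """Load a causal LM and tokenizer with the standard transformers classes.

    This mirrors the updated docs: the model is loaded directly via
    AutoModelForCausalLM, so the removed SparseAutoModelForCausalLM
    wrapper from llmcompressor is no longer needed at load time.
    """
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", torch_dtype="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer
```

Note that loading the Llama 3 checkpoint itself requires accepting its license on the Hugging Face Hub; the helper applies equally to any causal LM model ID.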
