From 06df81f26223af5794a0f4d4f12851acdcc41b51 Mon Sep 17 00:00:00 2001
From: Kyle Sayers
Date: Wed, 15 Jan 2025 01:36:01 -0500
Subject: [PATCH] [Doc] Update examples to remove SparseAutoModelForCausalLM
 (#12062)

Signed-off-by: Kyle Sayers
---
 docs/source/features/quantization/fp8.md  | 11 +++++------
 docs/source/features/quantization/int8.md |  7 +++----
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/docs/source/features/quantization/fp8.md b/docs/source/features/quantization/fp8.md
index da49cd2747228..1398e8a324201 100644
--- a/docs/source/features/quantization/fp8.md
+++ b/docs/source/features/quantization/fp8.md
@@ -54,16 +54,15 @@ The quantization process involves three main steps:
 
 ### 1. Loading the Model
 
-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:
 
 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM
 
 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
-
-model = SparseAutoModelForCausalLM.from_pretrained(
-    MODEL_ID, device_map="auto", torch_dtype="auto")
+model = AutoModelForCausalLM.from_pretrained(
+    MODEL_ID, device_map="auto", torch_dtype="auto",
+)
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
 ```
 
diff --git a/docs/source/features/quantization/int8.md b/docs/source/features/quantization/int8.md
index 82a15d76d352f..592a60d3988b2 100644
--- a/docs/source/features/quantization/int8.md
+++ b/docs/source/features/quantization/int8.md
@@ -30,14 +30,13 @@ The quantization process involves four main steps:
 
 ### 1. Loading the Model
 
-Use `SparseAutoModelForCausalLM`, which wraps `AutoModelForCausalLM`, for saving and loading quantized models:
+Load your model and tokenizer using the standard `transformers` AutoModel classes:
 
 ```python
-from llmcompressor.transformers import SparseAutoModelForCausalLM
-from transformers import AutoTokenizer
+from transformers import AutoTokenizer, AutoModelForCausalLM
 
 MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
-model = SparseAutoModelForCausalLM.from_pretrained(
+model = AutoModelForCausalLM.from_pretrained(
     MODEL_ID, device_map="auto", torch_dtype="auto",
 )
 tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
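
Note for reviewers: on the fp8.md page this patch touches, the loading step above is followed by the actual quantization and saving steps. A minimal end-to-end sketch of how the updated loading code composes with llm-compressor's `oneshot` flow follows; the `QuantizationModifier` recipe, the `FP8_DYNAMIC` scheme, and the `SAVE_DIR` naming are drawn from llm-compressor's examples of this period and are illustrative context, not part of this diff.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot

MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"

# Step 1 (as updated by this patch): load with the standard
# transformers AutoModel classes instead of SparseAutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Step 2: quantize all Linear layers to FP8 with dynamic per-token
# activation scales, leaving the lm_head output layer untouched.
recipe = QuantizationModifier(
    targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"]
)
oneshot(model=model, recipe=recipe)

# Step 3: save in compressed-tensors format, which vLLM loads directly.
SAVE_DIR = MODEL_ID.split("/")[1] + "-FP8-Dynamic"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```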