What is the system requirement to run the sample code? #531
Comments
The CPU needs to support the AVX512 instruction set, and we recommend more than 16GB of RAM.
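If you are on Linux, one way to check both points is to read `/proc/cpuinfo` and `/proc/meminfo`. The snippet below is only a sketch and not part of the library: the helper names `cpu_flags` and `mem_total_gib` are made up for illustration, and it assumes an x86 machine where the AVX512 foundation feature is reported as the `avx512f` flag.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        # On x86 the line looks like "flags\t\t: fpu vme ... avx2 avx512f ..."
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def mem_total_gib(meminfo_text: str) -> float:
    """Return MemTotal from /proc/meminfo text, converted from kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)
    return 0.0


if __name__ == "__main__":
    cpuinfo, meminfo = Path("/proc/cpuinfo"), Path("/proc/meminfo")
    if cpuinfo.exists() and meminfo.exists():
        flags = cpu_flags(cpuinfo.read_text())
        print("AVX512 (avx512f):", "avx512f" in flags)
        print(f"RAM: {mem_total_gib(meminfo.read_text()):.1f} GiB")
    else:
        print("This sketch only works on Linux (/proc not found).")
```

The same `flags` set can be tested for other names (for example `avx2`) if you want to see what else the CPU reports.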
Ok.... My CPU does not support avx512. Maybe I should have bought an AMD CPU. What can I use?
Don't worry, we just added support for AVX2 fp32 (
AVX_VNNI support has been added in #565. You can enable it by setting @yuchengliu1 are working to squeeze some more performance based on the hybrid architecture. Feel free to reopen the issue if you have any further questions.
Hi @DDEle and @zhenwei-intel,
Even tried disabling AVX 512 as, Can you please help? Thanks.
Hi @amir1m,
What is the system requirement to run the following sample code?
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig
model_name = "Intel/neural-chat-7b-v1-1" # Hugging Face model_id or local model
config = WeightOnlyQuantConfig(compute_dtype="int8", weight_dtype="int4")
prompt = "Once upon a time, a little girl"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=config)
gen_tokens = model.generate(inputs, max_new_tokens=300)
outputs = tokenizer.batch_decode(gen_tokens)