This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

What are the system requirements to run the sample code? #531

Closed
sungkim11 opened this issue Oct 23, 2023 · 6 comments

@sungkim11

What are the system requirements to run the following sample code?

from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig

model_name = "Intel/neural-chat-7b-v1-1"  # Hugging Face model_id or local model path
config = WeightOnlyQuantConfig(compute_dtype="int8", weight_dtype="int4")  # int4 weights, int8 compute
prompt = "Once upon a time, a little girl"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids

# Quantize the weights at load time, then generate up to 300 new tokens.
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=config)
gen_tokens = model.generate(inputs, max_new_tokens=300)
outputs = tokenizer.batch_decode(gen_tokens)

@zhenwei-intel
Contributor

The CPU needs to support the AVX512 instruction set, and we recommend more than 16 GB of RAM.
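
One quick way to check for AVX512 on Linux (a minimal sketch, not part of the library; it just scans the CPU flags in /proc/cpuinfo):

# Check the CPU flags for the avx512f feature (Linux only).
with open("/proc/cpuinfo") as f:
    print("AVX512 supported:", "avx512f" in f.read())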

@sungkim11
Author

OK... My CPU does not support AVX512. Maybe I should have bought an AMD CPU. What can I use?

@DDEle
Contributor

DDEle commented Oct 24, 2023

Don't worry, we just added support for AVX2 fp32 inference (compute_dtype="fp32") in #493. In addition, if you are using a 12th-gen or newer Core™ processor, stay tuned: we are adding int8 inference with AVX_VNNI.
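
For example, adapting the config from the snippet above (a sketch; keeping weight_dtype="int4" alongside fp32 compute is an assumption here):

from intel_extension_for_transformers.transformers import WeightOnlyQuantConfig

# fp32 compute stays on the AVX2 path, so no AVX512 is required.
config = WeightOnlyQuantConfig(compute_dtype="fp32", weight_dtype="int4")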

@DDEle
Contributor

DDEle commented Nov 16, 2023

AVX_VNNI support has been added in #565. You can enable it by setting compute_dtype to int8, which should significantly outperform llama.cpp on 12th/13th-gen Core CPUs.
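
Concretely, that is the config from the original snippet (a sketch):

from intel_extension_for_transformers.transformers import WeightOnlyQuantConfig

# int8 compute takes the AVX_VNNI kernels added in #565 on 12th/13th-gen Core CPUs.
config = WeightOnlyQuantConfig(compute_dtype="int8", weight_dtype="int4")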

@yuchengliu1 is working to squeeze out more performance from the hybrid architecture.

Feel free to reopen the issue if you have any further questions.

@DDEle DDEle closed this as completed Nov 16, 2023
@amir1m

amir1m commented Nov 20, 2023

Hi @DDEle and @zhenwei-intel,
I am trying to build the graph. I followed the steps in the README. After cmake .. -G Ninja, when I run ninja, I get the following error:

/home/datascience/intel-extension-for-transformers/intel_extension_for_transformers/llm/library/jblas/jblas/kernel_avx512_bf16.h:24:32: error: attribute(target("avx512bf16")) is unknown
#pragma GCC target("avx512bf16")

I even tried disabling AVX512 with cmake -G Ninja -DNE_AVX512=OFF -DNE_AVX512_VBMI=OFF -DNE_AVX512_VNNI=OFF, but I get the same error.
I am running Linux on an Intel(R) Xeon(R) Platinum 8167M CPU @ 2.00GHz.

Can you please help?

Thanks.

@DDEle
Contributor

DDEle commented Nov 21, 2023

Hi @amir1m,
Please check #726 (comment) for a detailed explanation and solution.
