This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[LLM Runtime] Extend API for GGUF #1218

Merged
merged 4 commits into main on Feb 2, 2024

Conversation

@Zhenzhong1 (Contributor) commented Jan 31, 2024

Type of Change

[LLM Runtime] Extend API for GGUF

Description

Add the model_type argument:
model_type can be set manually or left empty.
If model_type is empty, it is set automatically from the Hugging Face configuration.
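The fallback described above can be sketched in plain Python. This is a hypothetical illustration (the helper name, the supported-type set, and the config shape are assumptions, not the PR's actual code):

```python
# Hypothetical sketch: resolve model_type from the user argument, falling
# back to the Hugging Face config, and reject unsupported values.
SUPPORTED_MODEL_TYPES = {"llama", "gptj", "mpt", "falcon"}  # illustrative subset


def resolve_model_type(model_type=None, hf_config=None):
    if not model_type:
        # No explicit value: read it from the HF configuration dict.
        if hf_config is None or "model_type" not in hf_config:
            raise ValueError("model_type was not given and could not be read from the HF config")
        model_type = hf_config["model_type"]
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return model_type


print(resolve_model_type(hf_config={"model_type": "llama"}))  # llama
```

This mirrors the two paths exercised below: an invalid explicit value ("PIG") raises an error, while omitting model_type succeeds via the config.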

from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "TheBloke/Mistral-7B-v0.1-GGUF"
model_file = "mistral-7b-v0.1.Q4_0.gguf"
tokenizer_name = "mistralai/Mistral-7B-v0.1"

prompt = "Once upon a time"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file, model_type="llama")
outputs = model.generate(inputs, max_new_tokens=300)

Error test (invalid model_type):

model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file, model_type="PIG")

[screenshot: error raised for unsupported model_type]

Correct test (model_type auto-detected):

model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file)

[screenshot: model loads successfully with auto-detected model_type]

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

Manually.

Dependency Change?

N/A

@a32543254 (Contributor) left a comment


LGTM

@hshen14 (Contributor) commented Jan 31, 2024

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

@Zhenzhong1 Zhenzhong1 force-pushed the zhenzhong/APIextend branch 3 times, most recently from 581435a to 4034e21 on February 1, 2024 03:23
@Zhenzhong1 (Contributor, Author)

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

Fixed.

@Zhenzhong1 (Contributor, Author)

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

@hshen14 @a32543254
I have checked it. If we remove the streamer, there is no text output, so we need to print the output explicitly:

outputs = model.generate(inputs, max_new_tokens=300)
print(outputs)

[screenshot: printed output is a tensor of token ids]

But as the screenshot shows, the output consists of token ids, so we still need a decode step if we want the output as text:

outputs = model.generate(inputs, max_new_tokens=300)
print(outputs)
for i in outputs:
    print(tokenizer.decode(i))

[screenshot: decoded text output]
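As a toy illustration of why the decode loop above is needed (the vocabulary here is invented, not the real Mistral tokenizer): generate() returns integer token ids, and decoding maps them back to text.

```python
# Toy decoder: maps token ids back to strings. The real tokenizer.decode()
# does this with the model's actual vocabulary and merge rules.
toy_vocab = {0: "Once", 1: "upon", 2: "a", 3: "time"}


def toy_decode(ids):
    return " ".join(toy_vocab[i] for i in ids)


outputs = [[0, 1, 2, 3]]  # what generate() conceptually returns: ids, not text
for ids in outputs:
    print(toy_decode(ids))  # prints: Once upon a time
```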

@VincyZhang VincyZhang merged commit 7733d44 into main Feb 2, 2024
15 checks passed
@VincyZhang VincyZhang deleted the zhenzhong/APIextend branch February 2, 2024 07:17
4 participants