This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[LLM Runtime] Extend API for GGUF #1218

Merged
merged 4 commits into main on Feb 2, 2024

Conversation

@Zhenzhong1 (Contributor) commented Jan 31, 2024

Type of Change

[LLM Runtime] Extend API for GGUF

Description

Add the model_type argument:
model_type can be set manually or left empty.
If model_type is empty, it is set automatically from the Hugging Face configuration.
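The fallback described above can be sketched in plain Python. This is a hypothetical illustration (the helper name, the supported-type set, and the config shape are assumptions, not the PR's actual code):

```python
# Hypothetical sketch: resolve model_type from the user argument, falling
# back to the Hugging Face config, and reject unsupported values.
SUPPORTED_MODEL_TYPES = {"llama", "gptj", "mpt", "falcon"}  # illustrative subset


def resolve_model_type(model_type=None, hf_config=None):
    if not model_type:
        # No explicit value: read it from the HF configuration dict.
        if hf_config is None or "model_type" not in hf_config:
            raise ValueError("model_type was not given and could not be read from the HF config")
        model_type = hf_config["model_type"]
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return model_type


print(resolve_model_type(hf_config={"model_type": "llama"}))  # llama
```

This mirrors the two paths exercised below: an invalid explicit value ("PIG") raises an error, while omitting model_type succeeds via the config.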

from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "TheBloke/Mistral-7B-v0.1-GGUF"
model_file = "mistral-7b-v0.1.Q4_0.gguf"
tokenizer_name = "mistralai/Mistral-7B-v0.1"

prompt = "Once upon a time"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file, model_type="llama")
outputs = model.generate(inputs, max_new_tokens=300)

Error test (invalid model_type):

model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file, model_type="PIG")

[screenshot: error raised for unsupported model_type]

Correct test (model_type auto-detected):

model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file)

[screenshot: model loads successfully with auto-detected model_type]

Expected Behavior & Potential Risk

N/A

How has this PR been tested?

Manually.

Dependency Change?

N/A

@a32543254 (Contributor) left a comment


LGTM

@hshen14 (Contributor) commented Jan 31, 2024

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

@Zhenzhong1 Zhenzhong1 force-pushed the zhenzhong/APIextend branch 3 times, most recently from 581435a to 4034e21 on February 1, 2024 03:23
@Zhenzhong1 (Contributor, Author)

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

Fixed.

@Zhenzhong1 (Contributor, Author)

Suggest removing the streamer from the sample code to simplify the example further. We've already done this on the main page.

@hshen14 @a32543254
I have checked it. If we remove the streamer, there is no text output, so we need to print the output explicitly:

outputs = model.generate(inputs, max_new_tokens=300)
print(outputs)

[screenshot: printed output is a tensor of token ids]

But as the screenshot shows, the output consists of token ids, so we still need a decode step if we want the output as text:

outputs = model.generate(inputs, max_new_tokens=300)
print(outputs)
for i in outputs:
    print(tokenizer.decode(i))

[screenshot: decoded text output]
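As a toy illustration of why the decode loop above is needed (the vocabulary here is invented, not the real Mistral tokenizer): generate() returns integer token ids, and decoding maps them back to text.

```python
# Toy decoder: maps token ids back to strings. The real tokenizer.decode()
# does this with the model's actual vocabulary and merge rules.
toy_vocab = {0: "Once", 1: "upon", 2: "a", 3: "time"}


def toy_decode(ids):
    return " ".join(toy_vocab[i] for i in ids)


outputs = [[0, 1, 2, 3]]  # what generate() conceptually returns: ids, not text
for ids in outputs:
    print(toy_decode(ids))  # prints: Once upon a time
```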

@VincyZhang VincyZhang merged commit 7733d44 into main Feb 2, 2024
15 checks passed
@VincyZhang VincyZhang deleted the zhenzhong/APIextend branch February 2, 2024 07:17
4 participants