This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[LLM] Support woq model save and load #1211

Merged
merged 4 commits into from
Feb 1, 2024

Conversation

Contributor

@changwangss changwangss commented Jan 30, 2024

Type of Change

The PR is ready for review and merge. Here is how to use it:

#save
numactl -m 0 -C 0-55 python run_generation.py --model facebook/opt-125m --woq
#load and run accuracy or benchmark
numactl -m 0 -C 0-55 python run_generation.py --model "saved_results" --accuracy --batch_size 56
numactl -m 0 -C 0-55 python run_generation.py --model "saved_results" --benchmark --batch_size 1
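The commands above first quantize and save the model, then reload the saved artifact for evaluation. As a conceptual illustration of that save/load round-trip, here is a minimal stdlib-only sketch of weight-only quantization persisted to disk. All names (`quantize_weights`, `save_woq`, `load_woq`, the JSON layout) are hypothetical and do not reflect the library's actual API, which this PR implements in `run_generation.py`:

```python
import json
import os
import tempfile

def quantize_weights(weights, n_bits=4):
    # Symmetric per-tensor quantization (illustrative only):
    # map floats to signed n_bits integers with a single scale.
    qmax = 2 ** (n_bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    qweights = [round(w / scale) for w in weights]
    return qweights, scale

def save_woq(path, qweights, scale, n_bits):
    # Persist the integer weights plus the metadata needed to dequantize.
    with open(path, "w") as f:
        json.dump({"qweights": qweights, "scale": scale, "n_bits": n_bits}, f)

def load_woq(path):
    # Reload and dequantize back to floats for inference.
    with open(path) as f:
        state = json.load(f)
    return [q * state["scale"] for q in state["qweights"]]

weights = [0.12, -0.53, 0.97, -0.08]
qweights, scale = quantize_weights(weights)
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "saved_results.json")
    save_woq(path, qweights, scale, 4)
    restored = load_woq(path)
# Round-trip error is bounded by half the quantization scale.
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2)
```

In the real workflow, `--woq` triggers quantization and the results land in `saved_results`, which the later `--model "saved_results"` runs consume.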

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Wang, Chang <chang1.wang@intel.com>