This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[LLM] Support woq model save and load #1211

Merged
merged 4 commits into from
Feb 1, 2024

Conversation

Contributor

@changwangss changwangss commented Jan 30, 2024

Type of Change

The PR is ready for review and merge. Here is how to use it:

#save
numactl -m 0 -C 0-55 python run_generation.py --model facebook/opt-125m --woq
#load and run accuracy or benchmark
numactl -m 0 -C 0-55 python run_generation.py --model "saved_results" --accuracy --batch_size 56
numactl -m 0 -C 0-55 python run_generation.py --model "saved_results" --benchmark --batch_size 1
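The commands above first quantize and save the model, then reload the saved artifact for evaluation. As a conceptual illustration of that save/load round-trip, here is a minimal stdlib-only sketch of weight-only quantization persisted to disk. All names (`quantize_weights`, `save_woq`, `load_woq`, the JSON layout) are hypothetical and do not reflect the library's actual API, which this PR implements in `run_generation.py`:

```python
import json
import os
import tempfile

def quantize_weights(weights, n_bits=4):
    # Symmetric per-tensor quantization (illustrative only):
    # map floats to signed n_bits integers with a single scale.
    qmax = 2 ** (n_bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax
    qweights = [round(w / scale) for w in weights]
    return qweights, scale

def save_woq(path, qweights, scale, n_bits):
    # Persist the integer weights plus the metadata needed to dequantize.
    with open(path, "w") as f:
        json.dump({"qweights": qweights, "scale": scale, "n_bits": n_bits}, f)

def load_woq(path):
    # Reload and dequantize back to floats for inference.
    with open(path) as f:
        state = json.load(f)
    return [q * state["scale"] for q in state["qweights"]]

weights = [0.12, -0.53, 0.97, -0.08]
qweights, scale = quantize_weights(weights)
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "saved_results.json")
    save_woq(path, qweights, scale, 4)
    restored = load_woq(path)
# Round-trip error is bounded by half the quantization scale.
print(max(abs(a - b) for a, b in zip(weights, restored)) <= scale / 2)
```

In the real workflow, `--woq` triggers quantization and the results land in `saved_results`, which the later `--model "saved_results"` runs consume.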

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Wang, Chang <chang1.wang@intel.com>