Qbits woq ref impl for debug #1248

zhewang1-intc · 2024-02-02T07:55:01Z

Type of Change

Feature, bug fix, documentation, or other: feature
API changed or not: yes
Adds an interface for querying packed-weight information, which returns metadata such as N, K, BLKSIZE, and G_IDX.
For usage, please refer to the packq unit test.
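
To make the metadata fields concrete, here is a minimal sketch of how N, K, BLKSIZE, and G_IDX are commonly interpreted in block-wise weight-only quantization. The `PackedWeightInfo` class, its field names, and the `dequantize` helper are hypothetical illustrations and are not the QBits interface; the real API is the one exercised in the packq unit test.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PackedWeightInfo:
    """Hypothetical metadata record; NOT the actual QBits structure."""
    n: int                             # number of output features
    k: int                             # number of input features
    blksize: int                       # rows along K that share one scale
    g_idx: Optional[List[int]] = None  # optional per-row group index (assumed GPTQ-style)

    def group_of(self, row: int) -> int:
        # With an explicit g_idx, each row names its own group;
        # otherwise groups are consecutive blocks of `blksize` rows.
        if self.g_idx is not None:
            return self.g_idx[row]
        return row // self.blksize


def dequantize(qweight, scales, info):
    """Turn K x N integer weights into floats using per-group scales (groups x N)."""
    return [
        [qweight[r][c] * scales[info.group_of(r)][c] for c in range(info.n)]
        for r in range(info.k)
    ]


info = PackedWeightInfo(n=2, k=4, blksize=2)
qweight = [[1, -2], [3, 4], [-5, 6], [7, -8]]
scales = [[0.5, 0.25], [2.0, 1.0]]
weight = dequantize(qweight, scales, info)
```

In this sketch, rows 0 and 1 use the first row of `scales` and rows 2 and 3 use the second, because BLKSIZE is 2 and no G_IDX is given.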

Description

detail description
JIRA ticket: https://jira.devtools.intel.com/browse/NLPTOOLKIU-1187

Expected Behavior & Potential Risk

Expected behavior triggered by this PR:
woq_linear dispatches to the reference implementation when the QBITS_DEBUG environment variable is set.
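
The dispatch described above can be sketched in a few lines of Python. This is only an illustration of the pattern: the functions `woq_linear_ref`, `woq_linear_optimized`, and `debug_enabled` are hypothetical, and the rule for which values of QBITS_DEBUG count as "set" is an assumption, since the PR does not spell it out.

```python
import os


def woq_linear_ref(x, weight):
    """Hypothetical reference path: plain matmul, x (M x K) times weight (K x N)."""
    return [
        [sum(a * w for a, w in zip(row, col)) for col in zip(*weight)]
        for row in x
    ]


def woq_linear_optimized(x, weight):
    """Stand-in for the optimized kernel; here it simply reuses the reference math."""
    return woq_linear_ref(x, weight)


def debug_enabled() -> bool:
    # Assumption: any value other than unset, empty, or "0" enables debug mode.
    return os.environ.get("QBITS_DEBUG", "0") not in ("", "0")


def woq_linear(x, weight):
    # Pick the implementation at call time so the variable can be toggled per run.
    impl = woq_linear_ref if debug_enabled() else woq_linear_optimized
    return impl(x, weight)
```

Because both paths are expected to produce the same result, comparing outputs with and without the variable set is a direct way to check the optimized kernel against the reference.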

How has this PR been tested?

How to reproduce the test (including hardware information): tested on Intel Xeon 8480+ and Intel Core i9 10900

Dependency Change?

any library dependency introduced or removed: No

a32543254

LGTM

changwangss · 2024-02-04T02:39:31Z

Validated the PR; the results are the same with and without debug mode.
OPT WOQ:

numactl -m 0 -C 0-55 python run_generation.py --model facebook/opt-125m --woq --benchmark --batch_size 1

['Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun. She liked to go to the movies, and she liked to go to the beach. She liked to go to the movies, and she liked to go to the']

export QBITS_DEBUG=1

['Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun. She liked to go to the movies, and she liked to go to the beach. She liked to go to the movies, and she liked to go to the']

GPTQ OPT WOQ:

python run_generation.py --model facebook/opt-125m --woq --woq_algo "GPTQ" --gptq_pad_max_length 128 --gptq_use_max_length --gptq_block_size 16 --woq_weight_dtype "int4_clip" --output_dir "gptqq" --benchmark --batch_size 1

['Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun. When she was a little girl, she liked to go to the movies. She liked to go to the movies. She liked to go to the movies. She']

export QBITS_DEBUG=1

['Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun. When she was a little girl, she liked to go to the movies. She liked to go to the movies. She liked to go to the movies. She']

Co-authored-by: Lv, Liang1 <liang1.lv@intel.com>

zhewang1-intc added 3 commits February 2, 2024 12:00

packw info acquire sekelton

5356354

packw info acquire done.

d9f36e5

add woq ref impl in qbits frontend

fcaad87

zhewang1-intc requested review from PenghuiCheng and a32543254 as code owners February 2, 2024 07:55

update ref

e4901f5

zhewang1-intc requested a review from changwangss February 2, 2024 08:21

zhewang1-intc force-pushed the qbits-enhancement branch from 70fd727 to 758a9d7 Compare February 2, 2024 09:52

typo

c17854b

zhewang1-intc force-pushed the qbits-enhancement branch from 758a9d7 to c17854b Compare February 3, 2024 01:07

a32543254 approved these changes Feb 4, 2024

changwangss approved these changes Feb 4, 2024

PenghuiCheng approved these changes Feb 4, 2024

VincyZhang merged commit 18d36ef into main Feb 4, 2024
15 checks passed

VincyZhang deleted the qbits-enhancement branch February 4, 2024 03:38

VincyZhang pushed a commit to VincyZhang/intel-extension-for-transformers that referenced this pull request Feb 5, 2024

fix hard code bug for llama model (intel#1248)

037ce8b

Co-authored-by: Lv, Liang1 <liang1.lv@intel.com>
