This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[LLM Runtime] Optimized dropout operator #707

Merged
merged 7 commits into from
Nov 20, 2023

Conversation

zhewang1-intc

Type of Change

feature or bug fix or documentation or others: feature
API changed or not: Yes
Users can call the optimized forward/backward dropout operators in QBits, which give a 3X+ performance improvement over PyTorch dropout.
For example:

torch.ops.qbits_customop.qbits_dropout_fwd(weight, p)
torch.ops.qbits_customop.qbits_dropout_bwd(grad, mask)
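
For readers unfamiliar with the forward/backward split, here is a minimal reference sketch of the dropout contract in plain Python. It is not the QBits implementation. It assumes that the forward pass produces a mask holding 0 for dropped elements and 1/(1-p) for kept ones, and that the backward pass multiplies the incoming gradient by that mask. The PR text only shows that `qbits_dropout_bwd` takes `(grad, mask)`; the mask layout, the return values, and the function names `dropout_fwd_ref` and `dropout_bwd_ref` are assumptions made for illustration.

```python
import random


def dropout_fwd_ref(x, p, rng=None):
    """Reference forward pass: returns (output, mask) for a flat list x."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - p)
    # Assumed mask layout: 0.0 for dropped elements, 1/(1-p) for kept ones,
    # so the expected value of the output matches the input.
    mask = [scale if rng.random() >= p else 0.0 for _ in x]
    out = [xi * mi for xi, mi in zip(x, mask)]
    return out, mask


def dropout_bwd_ref(grad, mask):
    """Reference backward pass: gradient flows only through kept elements."""
    return [g * m for g, m in zip(grad, mask)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    out, mask = dropout_fwd_ref(x, 0.5, random.Random(0))
    grad_in = dropout_bwd_ref([1.0] * len(x), mask)
    print(out, mask, grad_in)
```

A reference like this can serve as a correctness check for an optimized kernel: given the same mask, the optimized backward result should match `grad * mask` element by element.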

Description

detail description
JIRA ticket: https://jira.devtools.intel.com/browse/NLPTOOLKIU-958
Optimized fwd/bwd dropout operators for the AVX512 and AVX2 ISAs, including a high-performance random number generator.
Changed QBits logging from cout to torch::LOG.
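
The description does not say which random number generator the operator uses. Purely as an illustration, the sketch below shows xorshift32, a generator that needs only shifts and XORs and therefore maps naturally onto independent SIMD lanes. Whether the PR uses xorshift or something else is not stated; the function names, the lane model, and the threshold comparison are all hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits, as a SIMD lane would


def xorshift32_step(state):
    """Advance one 32-bit xorshift state (shift triple 13, 17, 5)."""
    state ^= (state << 13) & MASK32
    state ^= state >> 17
    state ^= (state << 5) & MASK32
    return state & MASK32


def step_lanes(states):
    """Advance every lane independently, as one vector instruction sequence would."""
    return [xorshift32_step(s) for s in states]


def keep_flags(states, p):
    """Turn raw 32-bit draws into keep/drop decisions for drop probability p."""
    threshold = int(p * 2**32)
    return [s >= threshold for s in states]


if __name__ == "__main__":
    lanes = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical nonzero per-lane seeds
    lanes = step_lanes(lanes)
    print(keep_flags(lanes, 0.5))
```

Each lane must start from a nonzero seed, since a xorshift state of zero stays zero forever.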

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)
Tested on 335T (SPR) and a desktop (Core i9-10900).

Dependency Change?

any library dependency introduced or removed
No.

@VincyZhang

Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>
@VincyZhang VincyZhang self-requested a review as a code owner November 17, 2023 09:50
@VincyZhang

Something goes wrong when building the binary; please check, @zhewang1-intc:
https://inteltf-jenk.sh.intel.com/job/nlp-toolkit-release-wheel-build/6928/consoleFull

@hshen14 hshen14 merged commit 7d276cd into main Nov 20, 2023
15 checks passed
@hshen14 hshen14 deleted the optmized_dropout branch November 20, 2023 13:01
zhenwei-intel added a commit that referenced this pull request Nov 21, 2023
sywangyi pushed a commit that referenced this pull request Nov 21, 2023
* pass ut on avx2/avx512
* add relative tests
* fix gcc11 compile
Signed-off-by: Wenxin Zhang <wenxin.zhang@intel.com>