[bf16] Refine BF16 amp-o1 logic #39815

Merged
merged 7 commits into from
Feb 28, 2022

Contributor

@zhangbo9674 zhangbo9674 commented Feb 22, 2022

PR types

New features

PR changes

Others

Describe

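Refine the implementation logic of BF16 amp-o1:

  • Original logic: in O1 mode, white-list ops run in BF16 and all other ops run in FP32 (pr

This logic is incompatible with the view mechanism. For example, the unsqueeze op is not in the white list, so under amp-o1, if its input is BF16, a cast is automatically inserted before execution to convert the input to FP32, which conflicts with the view mechanism.

  • New logic: white-list ops run in BF16. For an op that is not in the white list, the kernel type is determined by the types of its inputs: if all inputs are BF16, the BF16 kernel is run; otherwise the FP32 kernel is run. A black-list strategy is also provided.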
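
The new dispatch rule can be sketched in plain Python as follows. This is only an illustration of the decision described above, not the code in this PR: the function name `select_kernel_dtype`, the string dtype labels, and the example op names are hypothetical. The description does not spell out how the black list behaves, so the sketch assumes that black-list ops always run in FP32 and that the black list takes precedence over the white list.

```python
from typing import Iterable, Set

BF16 = "bfloat16"
FP32 = "float32"


def select_kernel_dtype(
    op_name: str,
    input_dtypes: Iterable[str],
    white_list: Set[str],
    black_list: Set[str] = frozenset(),
) -> str:
    """Return the kernel dtype an op would run with under the refined O1 rule."""
    # Assumption: black-list ops are forced to FP32 and win over the white list.
    if op_name in black_list:
        return FP32
    # White-list ops always run in BF16.
    if op_name in white_list:
        return BF16
    # Other ops follow their inputs: BF16 only if every input is BF16.
    dtypes = list(input_dtypes)
    # Assumption: an op with no inputs falls back to FP32.
    if dtypes and all(d == BF16 for d in dtypes):
        return BF16
    return FP32


white = {"matmul"}
print(select_kernel_dtype("matmul", [FP32, FP32], white))    # white-list op
print(select_kernel_dtype("unsqueeze", [BF16], white))       # follows its BF16 input
print(select_kernel_dtype("elementwise_add", [BF16, FP32], white))  # mixed inputs
```

Under this rule an op such as unsqueeze with a BF16 input keeps running in BF16, so no cast to FP32 is inserted in front of it.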


@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Contributor

@zhiqiu zhiqiu left a comment

LGTM

@zhangbo9674 zhangbo9674 merged commit 18ee051 into PaddlePaddle:develop Feb 28, 2022
@zhangbo9674 zhangbo9674 deleted the dev/bf16_refine_o1 branch March 2, 2023 02:59