
[Semi Auto] Softmax SPMD Rule #55196

Merged
68 commits merged into PaddlePaddle:develop on Jul 12, 2023

Conversation

JZ-LIANG (Contributor) commented Jul 6, 2023

PR types

Function optimization

PR changes

Others

Description

Pcard-70448

SPMD rule for softmax-like ops (softmax, log_softmax).
Sharding on the normalized axis will be supported in the future.
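
For intuition, here is a minimal, standalone C++ sketch of the forward inference this rule performs, assuming the usual dims_mapping convention (-1 means replicated, a value >= 0 means sharded on that mesh dimension); the function and variable names are illustrative, not the actual Paddle API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Infer the output dims_mapping of softmax from the input's dims_mapping.
// Sharding on the normalized (softmax) axis is not supported yet, so that
// axis falls back to replicated.
std::vector<int64_t> InferSoftmaxOutputDimsMapping(
    std::vector<int64_t> x_dims_mapping, int axis) {
  int ndim = static_cast<int>(x_dims_mapping.size());
  if (axis < 0) axis += ndim;  // normalize a negative axis
  assert(axis >= 0 && axis < ndim);
  x_dims_mapping[axis] = -1;   // replicate the normalized axis
  return x_dims_mapping;       // the output simply follows the adjusted input
}

int main() {
  // e.g. [batch, seq, class] logits sharded on batch (mesh dim 0) and the
  // last axis (mesh dim 1), with softmax applied over the last axis:
  std::vector<int64_t> out = InferSoftmaxOutputDimsMapping({0, -1, 1}, -1);
  // out == {0, -1, -1}: the normalized axis is forced to be replicated.
  return out[2] == -1 ? 0 : 1;
}
```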

paddle-bot bot commented Jul 6, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

weight_ndim,
weight_dims_mapping.size(),
phi::errors::InvalidArgument(
"Mismatch of Y's tensor size: [%d] and Y's dims_mapping size [%d].",
Contributor

Y -> w?

Contributor Author

fixed
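
For illustration, a self-contained sketch of the consistency check quoted above, with the message fixed to refer to the weight tensor rather than "Y"; this only mimics the behavior of the real PADDLE_ENFORCE_EQ / phi::errors::InvalidArgument check and is not the actual Paddle code:

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Check that the weight tensor's rank matches the length of its dims_mapping.
void CheckWeightDimsMapping(int64_t weight_ndim,
                            const std::vector<int64_t>& weight_dims_mapping) {
  if (weight_ndim != static_cast<int64_t>(weight_dims_mapping.size())) {
    char msg[128];
    std::snprintf(msg, sizeof(msg),
                  "Mismatch of Weight's tensor size: [%lld] and Weight's "
                  "dims_mapping size [%zu].",
                  static_cast<long long>(weight_ndim),
                  weight_dims_mapping.size());
    throw std::invalid_argument(msg);
  }
}

int main() {
  CheckWeightDimsMapping(2, {0, -1});  // consistent: rank 2, two mesh dims
  return 0;
}
```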

Comment on lines 71 to 72
// padding_idx is not supported by the c_embedding kernel.
// (TODO) it might be possible to reshard as replicated when padding_idx != -1
Contributor

hard to understand

Contributor Author

The implementation of the c_embedding kernel requires the default padding behavior, which means padding_idx = -1 and the user is not allowed to set it.

To support vocab parallelism of the embedding op efficiently, we need to use the c_embedding kernel. Therefore, we add a precondition check that padding_idx = -1 when the embedding table is sharded along the vocab axis.
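
A standalone sketch of the precondition described above, assuming dims_mapping index 0 is the vocab (row) axis of the embedding table; the names are illustrative, not the actual Paddle rule:

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reject sharding configurations the c_embedding kernel cannot handle:
// when the table's vocab (row) axis is sharded, padding_idx must stay -1.
void CheckVocabParallelPrecondition(
    const std::vector<int64_t>& weight_dims_mapping, int64_t padding_idx) {
  bool vocab_sharded =
      !weight_dims_mapping.empty() && weight_dims_mapping[0] >= 0;
  if (vocab_sharded && padding_idx != -1) {
    throw std::invalid_argument(
        "padding_idx is not supported by the c_embedding kernel; it must be "
        "-1 when the embedding table is sharded along the vocab axis.");
  }
}

int main() {
  CheckVocabParallelPrecondition({0, -1}, /*padding_idx=*/-1);  // accepted
  // CheckVocabParallelPrecondition({0, -1}, 3);  // would throw
  return 0;
}
```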

<< str_join(out_dims_mapping) << "], partial_on_dims: ["
<< str_join(partial_on_dims) << "]";

return {{x_dist_attr_src, weight_dist_attr_src}, {output_dist_attr_dst}};
Contributor

infer inputs?

Contributor Author

done
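
Presumably the fix is to also return the inferred (dst) distributed attributes for the inputs, not the source ones, so callers know how each input should be resharded; a minimal sketch of that return convention, with TensorDistAttr as a stand-in type rather than Paddle's actual class:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for Paddle's distributed tensor attribute.
struct TensorDistAttr {
  std::vector<int64_t> dims_mapping;
};

// {inferred dist_attrs for the inputs, inferred dist_attrs for the outputs}
using SpmdInfo =
    std::pair<std::vector<TensorDistAttr>, std::vector<TensorDistAttr>>;

SpmdInfo MakeSpmdInfo(const TensorDistAttr& x_dist_attr_dst,
                      const TensorDistAttr& weight_dist_attr_dst,
                      const TensorDistAttr& output_dist_attr_dst) {
  // Return the inferred (dst) attributes for inputs as well as outputs, so
  // the caller knows how each input should be resharded before execution.
  return {{x_dist_attr_dst, weight_dist_attr_dst}, {output_dist_attr_dst}};
}

int main() {
  TensorDistAttr x{{0, -1}}, w{{-1, 1}}, out{{0, 1}};
  SpmdInfo info = MakeSpmdInfo(x, w, out);
  return info.first.size() == 2 ? 0 : 1;
}
```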

zhiqiu (Contributor) left a comment

LGTM

JZ-LIANG merged commit 885d1ae into PaddlePaddle:develop on Jul 12, 2023
cqulilujia pushed a commit to cqulilujia/Paddle that referenced this pull request Jul 24, 2023
* resolve input sharding conflict maybe

* fixed comment

---------

Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com>
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
wz1qqx pushed a commit to wz1qqx/Paddle that referenced this pull request Jul 31, 2023