Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add number count op #39224

Merged
merged 11 commits into from
Mar 15, 2022
Merged

add number count op #39224

merged 11 commits into from
Mar 15, 2022

Conversation

sljlp
Copy link
Contributor

@sljlp sljlp commented Jan 25, 2022

PR types

New features

PR changes

OPs

Describe

新增 number count op该OP用于对输入的number计数
目前主要用于MOE中,作为对每个expert所处理的tokens计数

@paddle-bot-old
Copy link

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot-old
Copy link

paddle-bot-old bot commented Feb 4, 2022

Sorry to inform you that d249ceb's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

from paddle.fluid.framework import in_dygraph_mode

__all__ = [ #noqa
'expert_count'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这种属于对外暴露的接口,内部开发中的api不用放到__all__
开发中的api先放到paddle.incubate底下吧,
或者前面加下划线
paddle.distributed.utils._expert_count

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

]


def expert_count(gate_idx, n_expert):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这种op太专用了,能不能用通用的tensor操作的op来实现count功能?
会导致op数量过多,在做多硬件适配的时候是很大的负担。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

可以通过多个已有的通用tensor操作的组合实现,但是性能和显存上不如直接调用这种op算子更高效。
实际上,这个op的功能是对输入的数据计数,虽然命名带有expert前缀,但是相对来说还是比较通用的。我可以通过把这个OP名改成一个更合理的名字,使他成为一个通用的OP。
对于其他的算子,我们需要再斟酌一下。

Copy link
Member

@ForFishes ForFishes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

from paddle.fluid.framework import in_dygraph_mode


def _number_count(gate_idx, upper_range):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OP定义与PR描述不符,另外PR描述过于简单,请说清该PR的作用与目的


@unittest.skipIf(not core.is_compiled_with_cuda(),
"core is not compiled with CUDA")
class TestExpertCountAPI(unittest.TestCase):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

单测名称没有修改

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,下个PR修改,MOE模块会有多个算子的PR需要提交

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

以及再仔细检查下其他地方

@sljlp sljlp changed the title add expert count op add number count op Mar 15, 2022
@@ -0,0 +1,55 @@
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.

@@ -0,0 +1,80 @@
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

@Ligoml Ligoml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for docs

Copy link
Contributor

@wanghuancoder wanghuancoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@ForFishes ForFishes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ForFishes ForFishes merged commit 9bdee43 into PaddlePaddle:develop Mar 15, 2022
liqitong-a pushed a commit to liqitong-a/Paddle that referenced this pull request Mar 17, 2022
* add expert count op

add ut for expert_count

* update UT only for cuda

* fix for rocm

* update ut

* add moe module

* add expert count op

add ut for expert_count

* update UT only for cuda

* update ut

* add moe module

* make expert count private

* rename expert count op

Co-authored-by: hlygit66666 <2570058140@qq.com>
@sljlp sljlp deleted the expert_count_op branch July 22, 2022 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants