[Semi Auto] Refactor Replicated Rule #56839

JZ-LIANG · 2023-08-31T06:18:44Z

PR types

Function optimization

PR changes

Others

Description

Pcard-70448

default_data_parallel rule: a stopgap rule for quickly supporting hybrid parallelism; it should be mapped to a specific op deliberately.
replicated rule: the fallback rule for all ops that have no specific rule.
Adapt the API to the PHI format in dygraph (dynamic graph) mode.
Adapt the API to distinguish a single-tensor argument from a vector argument that holds a single tensor in static graph mode.
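
To make the difference between the two rules concrete, here is a minimal sketch of what each one does to a set of dims mappings. It is not the code from this PR: `DimsMapping`, `InferReplicated`, and `InferDefaultDataParallel` are hypothetical names, and the sketch assumes that -1 marks an unsharded axis, that axis 0 is the batch axis, and that all sharded batch axes agree on the mesh dimension.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One entry per tensor axis: -1 means unsharded (replicated), any value >= 0
// names the mesh dimension the axis is sharded along. Hypothetical alias.
using DimsMapping = std::vector<int64_t>;

// Replicated fallback: every axis of every tensor becomes unsharded.
std::vector<DimsMapping> InferReplicated(
    const std::vector<DimsMapping>& tensors) {
  std::vector<DimsMapping> out;
  out.reserve(tensors.size());
  for (const auto& dm : tensors) {
    out.push_back(DimsMapping(dm.size(), int64_t{-1}));
  }
  return out;
}

// Default data parallel: if any tensor shards its first (batch) axis, that
// sharding is propagated to the first axis of every tensor; all other axes
// are set to -1.
std::vector<DimsMapping> InferDefaultDataParallel(
    const std::vector<DimsMapping>& tensors) {
  int64_t batch_mesh_dim = -1;
  for (const auto& dm : tensors) {
    if (dm.empty() || dm[0] == -1) continue;
    // Simplifying assumption: sharded batch axes never disagree.
    assert(batch_mesh_dim == -1 || batch_mesh_dim == dm[0]);
    batch_mesh_dim = dm[0];
  }
  std::vector<DimsMapping> out = InferReplicated(tensors);
  for (auto& dm : out) {
    if (!dm.empty()) dm[0] = batch_mesh_dim;
  }
  return out;
}
```

For the inputs `{-1, 1}` and `{0, -1, 1}`, the replicated sketch yields all -1 entries, while the data-parallel sketch yields `{0, -1}` and `{0, -1, -1}`.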

paddle-bot · 2023-08-31T06:18:50Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot · 2023-08-31T06:18:52Z

❌ The PR is not created using PR's template. You can refer to this Demo.
Please use PR's template, it helps save our maintainers' time so that more developers get helped.

chenwhql

LGTM overall

chenwhql · 2023-09-07T04:06:36Z

paddle/fluid/pybind/auto_parallel_py.cc

@@ -375,6 +376,44 @@ void BindAutoParallel(py::module *m) {
             }
             return self.InferForward(ctx);
           })
+      .def("infer_forward", /*for op that have vector argument*/
+           [](const phi::distributed::SpmdRule &self,
+              const std::vector<std::pair<int, int>> input_ranges,


Should this be passed by reference?

chenwhql · 2023-09-07T04:07:43Z

paddle/fluid/pybind/auto_parallel_py.cc

+             paddle::small_vector<phi::distributed::DistMetaTensor,
+                                  phi::kInputSmallVectorSize>
+                 ins;
+             for (auto range : input_ranges) {


Should this also use a reference to avoid a copy?

Fixed, thanks~
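
The two review comments above concern the same cost: a by-value parameter copies the whole container, and `for (auto range : ...)` copies each element again. The following sketch makes the copies visible with a counter. `Range`, `g_copies`, and both functions are hypothetical and exist only for illustration; the real binding uses `std::pair<int, int>`, whose copies are cheap but still avoidable.

```cpp
#include <cassert>
#include <vector>

// Global counter incremented by every Range copy (illustration only).
int g_copies = 0;

// Hypothetical stand-in for the (start, end) pairs passed to infer_forward.
struct Range {
  int start;
  int end;
  Range(int s, int e) : start(s), end(e) {}
  Range(const Range& other) : start(other.start), end(other.end) {
    ++g_copies;
  }
};

// By value: the vector is copied on the call, and each element is copied
// again by the loop variable.
int TotalLengthByValue(const std::vector<Range> ranges) {
  int total = 0;
  for (auto r : ranges) {
    assert(r.start <= r.end);
    total += r.end - r.start;
  }
  return total;
}

// By const reference: neither the vector nor its elements are copied.
int TotalLengthByRef(const std::vector<Range>& ranges) {
  int total = 0;
  for (const auto& r : ranges) {
    assert(r.start <= r.end);
    total += r.end - r.start;
  }
  return total;
}
```

Both functions return the same total; only the by-value version increments `g_copies`.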

chenwhql · 2023-09-07T05:11:46Z

paddle/phi/infermeta/spmd_rules/default_data_parallel.h

+ * all the input and ouput tensors is the batch dimension (broadcast dimension),
+ * therefore, if any tensor's first dimension is sharded, the sharding would be
+ * propagating to all the other tensors (for tensor first dimension). All the
+ *other axes of tensors would be set as unshard (-1).


Suggest adding a space at the start of this comment line so it aligns with the others.

chenwhql · 2023-09-07T05:14:02Z

paddle/phi/infermeta/spmd_rules/replicated.h

+
+#include "paddle/phi/core/distributed/auto_parallel/dist_meta_tensor.h"
+#include "paddle/phi/core/distributed/type_defs.h"
+#include "paddle/phi/infermeta/spmd_rules/utils.h"


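Could utils.h just be included in the .cc file instead?

Not really. The definitions of the phi interface methods below have to instantiate the helper template, so the implementation of the helper template must be included here; the helper cannot be brought in through a forward declaration.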
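
The constraint described in the reply can be sketched in a single file. The names below (`ElementCount`, `CountArgs`, `ExampleUsage`) are hypothetical and are not taken from utils.h or replicated.h; the sketch only assumes that a header-level function calls a variadic template helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- Part that would sit in a helper header (the role utils.h plays) ----

// How many values one argument contributes: 1 for a single value,
// size() for a list of values.
inline std::size_t ElementCount(int) { return 1; }
inline std::size_t ElementCount(const std::vector<int>& v) { return v.size(); }

// Variadic helper template. Its full definition must be visible wherever it
// is instantiated with a new set of argument types.
template <typename... Args>
std::size_t CountArgs(const Args&... args) {
  std::size_t total = 0;
  // C++11-compatible pack expansion that visits every argument in order.
  using expand = int[];
  (void)expand{0, ((void)(total += ElementCount(args)), 0)...};
  return total;
}

// ---- Part that would sit in the rule's own header ----

// This call instantiates CountArgs<int, std::vector<int>> right here, so the
// template body above has to be included rather than only declared.
inline void ExampleUsage() {
  assert(CountArgs(7, std::vector<int>{1, 2, 3}) == 4);
}
```

If only a declaration of `CountArgs` were visible, the compiler could not generate the body for these argument types in this translation unit, and the build would typically fail at link time unless an explicit instantiation existed elsewhere.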

zhiqiu · 2023-09-07T12:30:06Z

paddle/fluid/pybind/auto_parallel_py.cc

+              const std::vector<phi::Attribute> &attrs) {
+             /*
+             to distingish between single tensor argument and vector argument of
+             one tensor: start - end == 0: single tensor start - end == 1:


Commonly, the range is [start, end), i.e., start is inclusive and end is exclusive. Maybe we need another way to distinguish between tensor and [tensor].

Yes, we plan to use PyObject* to handle variadic arguments on the Python side and parse them on the C++ side in the next PR.
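
The ambiguity raised here can be sketched as follows. Under the half-open convention, a single tensor and a one-element list both cover exactly one slot, so the range alone cannot tell them apart. The explicit `is_list` flag below is a hypothetical alternative used only for illustration; it is not the PyObject*-based approach planned for the next PR, and `ArgSlot`, `SingleTensor`, `TensorList`, and `Size` are made-up names.

```cpp
#include <cassert>

// Half-open range [start, end) over a flat list of input tensors, plus an
// explicit tag saying whether the argument was a list. Hypothetical type.
struct ArgSlot {
  int start;     // inclusive
  int end;       // exclusive
  bool is_list;  // true for [tensor, ...], false for a bare tensor
};

// A bare tensor occupies exactly one slot.
ArgSlot SingleTensor(int index) { return {index, index + 1, false}; }

// A list occupies `count` consecutive slots; count may be zero.
ArgSlot TensorList(int start, int count) {
  assert(count >= 0);
  return {start, start + count, true};
}

// Number of tensors covered by the slot.
int Size(const ArgSlot& slot) { return slot.end - slot.start; }
```

With this layout, `SingleTensor(0)` and `TensorList(0, 1)` have the same size but differ in `is_list`, and an empty list remains representable as a range of size zero.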

zhiqiu · 2023-09-07T12:37:59Z

paddle/phi/core/distributed/auto_parallel/inferspmd_utils.cc

  inputs_.emplace_back(std::move(input));
+  int index = static_cast<int>(inputs_.size());


Swap lines 21 and 22 to get the right index?

Fixed, it was a bug. Thanks~
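
The off-by-one behind this comment can be reproduced in isolation. `InputStore` is a hypothetical stand-in for the context class in inferspmd_utils.cc, and the sketch assumes the index is meant to point at the element that was just added.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container that hands back an index for each added input.
class InputStore {
 public:
  // Order from the diff: size() is read after the insertion, so the result
  // is one past the new element.
  int AddThenReadSize(std::string input) {
    inputs_.emplace_back(std::move(input));
    return static_cast<int>(inputs_.size());
  }

  // Swapped order: size() is read first, so the result is the position the
  // new element will occupy.
  int ReadSizeThenAdd(std::string input) {
    int index = static_cast<int>(inputs_.size());
    inputs_.emplace_back(std::move(input));
    return index;
  }

  const std::string& At(int index) const {
    assert(index >= 0 && index < static_cast<int>(inputs_.size()));
    return inputs_[static_cast<std::size_t>(index)];
  }

  int Size() const { return static_cast<int>(inputs_.size()); }

 private:
  std::vector<std::string> inputs_;
};
```

Reading the size before `emplace_back` returns a valid index into `inputs_`; reading it afterwards returns a value equal to the new size, which is out of range for `At`.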

zhiqiu

LGTM

risemeup1

LGTM

XiaoguangHu01

LGTM

* adapt general spmd rule
* polish details
* add new rules
* bugfix for set_partial
* bugfix
* unitest
* adapt for argument for tensor and vector of tensor
---------
Co-authored-by: Chen Weihang <chenweihang@baidu.com>

chenwhql and others added 4 commits August 31, 2023 02:12

adapt general spmd rule

ba1e4bf

polish details

5917bb2

add new rules

e9a4455

bugfix

8a58468

JZ-LIANG added 11 commits August 31, 2023 14:30

register

459d633

bugfix for set_partial

4f21444

Merge remote-tracking branch 'cwh/ap/add_general_spmd_rule_utils' int…

5526841

…o semi-auto/default-rule

bugfix

3c744a7

unitest

69fce04

unitest

78d6d90

static done

92aadb5

adapt for dy graph

c6571f9

clean

44605fb

adapt for argument for tensor and vector of tensor

946c991

Merge remote-tracking branch 'upstream/develop' into semi-auto/defaul…

4a328d4

…t-rule

JZ-LIANG requested a review from chenwhql September 7, 2023 02:40

JZ-LIANG assigned zhiqiu Sep 7, 2023

chenwhql previously approved these changes Sep 7, 2023

pass as reference

5e2a99b

JZ-LIANG dismissed chenwhql’s stale review via 5e2a99b September 7, 2023 08:06

zhiqiu reviewed Sep 7, 2023

JZ-LIANG added 3 commits September 11, 2023 14:13

bugfix

c690246

forward declaration

7a6a8d4

bugfix

a63a1ca

zhiqiu approved these changes Sep 12, 2023

risemeup1 approved these changes Sep 12, 2023

XiaoguangHu01 approved these changes Sep 12, 2023

JZ-LIANG merged commit 29900a0 into PaddlePaddle:develop Sep 12, 2023
