
[Semi-Auto] Add reshape spmd rule #55177

Merged: 8 commits merged into PaddlePaddle:develop from reshape_rule on Aug 14, 2023

Conversation

pkuzyc
Contributor

@pkuzyc pkuzyc commented Jul 5, 2023

PR types

New features

PR changes

Others

Description

Pcard-70448
Add reshape spmd rule for auto parallel. This rule infers the output's distributed attribute with the following two steps:

  1. Compute the transformation from the original shape to the target shape.
  2. Compute the output's distributed attribute according to the transformation from step 1.
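
To make the two steps concrete, here is a minimal, self-contained sketch (illustrative only, not the PR's implementation; all names are made up). It aligns source and target dims into groups by matching prefix products (step 1), then forwards a shard on the first source dim of each group to the first target dim of that group (step 2); the real rule additionally checks divisibility against the sharded mesh dimension, which is omitted here.

#include <cstdint>
#include <iostream>
#include <vector>

std::vector<int64_t> InferReshapeOutDimsMapping(
    const std::vector<int64_t>& src_shape,
    const std::vector<int64_t>& tgt_shape,
    const std::vector<int64_t>& src_dims_mapping) {
  std::vector<int64_t> out_dims_mapping(tgt_shape.size(), -1);
  size_t i = 0, j = 0;
  while (i < src_shape.size() && j < tgt_shape.size()) {
    size_t gi = i, gj = j;  // first source/target dim of the current group
    // Step 1: find the transformation groups by aligning prefix products.
    int64_t prod_src = src_shape[i++], prod_tgt = tgt_shape[j++];
    while (prod_src != prod_tgt) {
      if (prod_src < prod_tgt) {
        prod_src *= src_shape[i++];
      } else {
        prod_tgt *= tgt_shape[j++];
      }
    }
    // Step 2: a shard on the group's first source dim is carried to the
    // group's first target dim; the real rule also checks that this dim is
    // divisible by the sharded mesh dimension (omitted in this sketch).
    if (src_dims_mapping[gi] != -1) {
      out_dims_mapping[gj] = src_dims_mapping[gi];
    }
  }
  return out_dims_mapping;
}

int main() {
  // [6, 12] sharded on dim 0, reshaped to [6, 4, 3]: the shard stays on dim 0.
  for (int64_t d : InferReshapeOutDimsMapping({6, 12}, {6, 4, 3}, {0, -1})) {
    std::cout << d << " ";  // prints: 0 -1 -1
  }
  std::cout << "\n";
  return 0;
}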

@paddle-bot

paddle-bot bot commented Jul 5, 2023

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@pkuzyc pkuzyc closed this Jul 5, 2023
@pkuzyc pkuzyc reopened this Jul 5, 2023
@pkuzyc pkuzyc force-pushed the reshape_rule branch 2 times, most recently from a5efe9d to 4370e87 Compare July 12, 2023 09:21
@paddle-ci-bot

paddle-ci-bot bot commented Jul 20, 2023

Sorry to inform you that it has been more than 7 days since 4370e87's CIs passed. To prevent PR conflicts, you need to re-run all CIs manually.

@paddle-ci-bot

paddle-ci-bot bot commented Aug 1, 2023

Sorry to inform you that it has been more than 7 days since 7b96d26's CIs passed. To prevent PR conflicts, you need to re-run all CIs manually.

const std::vector<int64_t>& src_shape,
const std::vector<int64_t>& tgt_shape) {
std::vector<DimTrans*> ret;
int64_t src_size = std::accumulate(
Contributor

src_size --> src_numel / src_nelem,
since src_size is ambiguous alongside src_shape.

Contributor Author

Done, use total_elem_num_src now.
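
For reference, a small sketch of how the renamed count can be computed (the helper name is made up; only the std::accumulate usage mirrors the snippet above):

#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Total number of elements of a shape, e.g. what total_elem_num_src holds.
int64_t TotalElemNum(const std::vector<int64_t>& shape) {
  return std::accumulate(shape.begin(), shape.end(),
                         static_cast<int64_t>(1), std::multiplies<int64_t>());
}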

// get the size of each output dimension, and get the
// map from sharded input dimensions to output dimensions.
std::vector<int64_t> dim_map_src2tgt(ndim, -1);
std::vector<int64_t> out_shape(dim_trans.size());
Contributor

It is redundant to compute the output shape again here;
only DimTrans::Type::SPLIT needs to maintain the output shape segment.

Contributor Author

Done, removed the code that recomputes the output shape.

int64_t split_id() const;

// get the splitted shape of the split_id_ dimension
int64_t local_split_shape();
Contributor

Would local_split_shape --> local_axis_size be better?
shape: [a, b, c]
axis_size: the value of a, b, or c

Contributor Author

Done, renamed to "local_splitted_shape_value".
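
For illustration, a simplified sketch of what such a split node could look like after the rename (assumed layout and member names; the actual class in the PR may differ):

#include <cstdint>
#include <utility>
#include <vector>

// A "split" transformation: one input dim is split into several output dims,
// and this node represents the split_id_-th piece of the split.
class Split {
 public:
  Split(std::vector<int64_t> splitted_shape, int64_t split_id)
      : splitted_shape_(std::move(splitted_shape)), split_id_(split_id) {}

  int64_t split_id() const { return split_id_; }

  // Size of the piece this node stands for: for a split into [a, b, c],
  // split_id_ == 1 returns b.
  int64_t local_splitted_shape_value() const {
    return splitted_shape_[split_id_];
  }

 private:
  std::vector<int64_t> splitted_shape_;  // sizes of all pieces of the split
  int64_t split_id_;                     // index of the piece this node is
};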

}

// if one input dimension is sharded on an
// unshardable mesh we need to reshard the input.
Contributor

It is too tricky to calculate input_dims_mapping_dst this way.
There is no need to introduce "reshard" into InferSPMD;
directly use the shardable vector to remove the "sharded" dims in input_dims_mapping_src.

Contributor Author

Done, removed the word "reshard".
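
A sketch of the approach suggested above (the function and variable names are assumptions, not the PR's API): dims that are sharded but marked unshardable are simply reset to replicated when building the destination dims mapping, so no resharding concept is needed inside InferSPMD.

#include <cstdint>
#include <vector>

// Reset sharded-but-unshardable dims back to replicated (-1).
std::vector<int64_t> ResetUnshardableDims(
    const std::vector<int64_t>& dims_mapping_src,
    const std::vector<bool>& shardable) {
  std::vector<int64_t> dims_mapping_dst(dims_mapping_src);
  for (size_t i = 0; i < dims_mapping_dst.size(); ++i) {
    if (dims_mapping_dst[i] != -1 && !shardable[i]) {
      dims_mapping_dst[i] = -1;  // drop the shard on this dim
    }
  }
  return dims_mapping_dst;
}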

std::vector<int64_t> dim_map_src2tgt(ndim, -1);
std::vector<int64_t> out_shape(dim_trans.size());
for (int64_t i = 0, n = dim_trans.size(); i < n; i++) {
std::pair<int64_t, DimTrans*> dim_size =
Contributor

It is too tricky to calculate output_dims_mapping this way.
Should maintain dim_map_tgt2src and an unshardable map for the output axes.

Contributor Author

@pkuzyc pkuzyc Aug 11, 2023

Done, removed the redundant code. Using "dim_map_tgt2src" hits some bugs when input_dims_mapping needs to be set to replicated, so "dim_map_src2tgt" is kept for now; it is also more intuitive.
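
As a reading aid, a sketch of how dim_map_src2tgt can be used to fill the output dims mapping (names are illustrative, not the exact PR code): each input dim records which output dim it ends up in, and its shard is forwarded there.

#include <cstdint>
#include <vector>

// dim_map_src2tgt[i] is the output dim that input dim i maps to, or -1.
std::vector<int64_t> BuildOutDimsMapping(
    const std::vector<int64_t>& in_dims_mapping,
    const std::vector<int64_t>& dim_map_src2tgt,
    size_t out_ndim) {
  std::vector<int64_t> out_dims_mapping(out_ndim, -1);
  for (size_t i = 0; i < in_dims_mapping.size(); ++i) {
    if (in_dims_mapping[i] != -1 && dim_map_src2tgt[i] != -1) {
      out_dims_mapping[dim_map_src2tgt[i]] = in_dims_mapping[i];
    }
  }
  return out_dims_mapping;
}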

TensorDistAttr output_dist_attr(input_specs[0].dist_attr());
output_dist_attr.set_dims_mapping(dims_mapping_vec[1]);

VLOG(4) << "Reshape: input_shape: [" << str_join(src_shape)
Contributor

TODO: think about printing useful info about tensor axes and dims_mapping for debugging:

Idea 1: construct an einsum-like notation for debugging, giving corresponding axes of the input and output the same character, so the user can see which axes are related.

Idea 2: print out the DimTrans and make the info readable.

Contributor Author

Done, print the transformation info.
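
For example, the transformation log could look roughly like the following (a hedged sketch: VLOG and str_join appear in the snippet above, but a to_string() method on DimTrans is only assumed here for illustration):

// Log the shapes plus one readable line per output dim's transformation.
VLOG(4) << "Reshape: input_shape: [" << str_join(src_shape)
        << "] output_shape: [" << str_join(tgt_shape) << "]";
for (int64_t i = 0, n = static_cast<int64_t>(dim_trans.size()); i < n; i++) {
  VLOG(4) << "  output dim " << i << ": " << dim_trans[i]->to_string();
}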


self.attrs = {"shape": [1, 72, 48, 4, 6]}

def test_reshape_infer_forward(self):
Contributor

The unit test should include the following cases:

  1. an input axis maps directly to an output axis
  2. multiple input axes merge into a single output axis, with the shard on the first input axis
  3. multiple input axes merge into a single output axis, with the shard on an axis other than the first input axis
  4. a single input axis splits into multiple output axes, with the first output axis divisible
  5. a single input axis splits into multiple output axes, with the first output axis non-divisible
  6. multiple input axes transform into multiple output axes, combining: shard on the first input axis / shard on an input axis other than the first / first output axis divisible / first output axis non-divisible

Contributor Author

Done

namespace distributed {
namespace auto_parallel {

static std::vector<DimTrans*> all_dim_trans;
Contributor

Why use a static global vector?

Contributor Author

@pkuzyc pkuzyc Aug 11, 2023

Here "static" indicates that the global variable can only be used in this file. The global vector is used to store all transformation objects so that we can free them after inferring distributed attributes.

@JZ-LIANG JZ-LIANG merged commit a97b507 into PaddlePaddle:develop Aug 14, 2023
@pkuzyc pkuzyc deleted the reshape_rule branch February 6, 2024 02:43