[AutoParallel] Add sequence parallel for llama #59822
Conversation
Your PR has been submitted successfully. Thank you for your contribution to this open source project!
Force-pushed from 5972d6f to d6c38d9
@@ -192,6 +192,26 @@ def forward(
        shape=target_key_value_shape
    )

    if self.config.sequence_parallel:
        query_states = dist.reshard(
The reshard should come before the projection, while the transpose comes after the projection.
Done, thx~
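For reference, a minimal sketch of the ordering the reviewer asks for, not the PR's final code: the mesh helper call `get_mesh(self.ipp)`, the projection attribute names, and the placements are assumptions that mirror the snippets quoted in this review.

```python
import paddle
import paddle.distributed as dist

def qkv_with_sequence_parallel(self, hidden_states):
    # Under sequence parallel the block input is laid out as [S, B, H] with the
    # sequence dimension sharded, so the reshard that gathers the sequence
    # dimension happens before the q/k/v projections ...
    if self.config.sequence_parallel:
        # get_mesh(self.ipp) and the placements are assumed here
        hidden_states = dist.reshard(
            hidden_states, get_mesh(self.ipp), [dist.Shard(1), dist.Replicate()]
        )
    query_states = self.q_proj(hidden_states)
    key_states = self.k_proj(hidden_states)
    value_states = self.v_proj(hidden_states)
    # ... while the [S, B, H] -> [B, S, H] transpose happens after them.
    if self.config.sequence_parallel:
        query_states = paddle.transpose(query_states, [1, 0, 2])
        key_states = paddle.transpose(key_states, [1, 0, 2])
        value_states = paddle.transpose(value_states, [1, 0, 2])
    return query_states, key_states, value_states
```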
@@ -238,6 +250,12 @@ def forward(
    else:
        attn_output = outputs

    if self.config.sequence_parallel:
        attn_output = paddle.transpose(attn_output, [1, 0, 2])
        attn_output = dist.reshard(
The reshard should be after the out_projection.
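A minimal sketch of that reordering: the output-projection attribute name `o_proj`, the `get_mesh(self.ipp)` helper, and the exact placements are assumptions, not the PR's final code.

```python
# attn_output keeps its batch-first layout through the output projection; only
# after out_projection is it transposed back to [S, B, H] and resharded into
# the sequence-parallel layout (sequence dimension split across the mp axis).
attn_output = self.o_proj(attn_output)
if self.config.sequence_parallel:
    # [B, S, H] -> [S, B, H]
    attn_output = paddle.transpose(attn_output, [1, 0, 2])
    # re-split the sequence dimension; these placements are an assumption
    attn_output = dist.reshard(
        attn_output, get_mesh(self.ipp), [dist.Shard(1), dist.Shard(0)]
    )
```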
@@ -731,6 +772,13 @@ def forward(

    # if labels is None, means we need full output, instead of tensor_parallel_output
    logits = self.lm_head(hidden_states)
    if self.config.sequence_parallel:
# reshard should be before lm_head
if self.config.sequence_parallel:
    hidden_states = dist.reshard(
        hidden_states, get_mesh(-1), [dist.Shard(1), dist.Replicate()]
    )
    # [S, B, H] -> [B, S, H]
    hidden_states = paddle.transpose(hidden_states, [1, 0, 2])
logits = self.lm_head(hidden_states)
LGTM
PR types
Others
PR changes
Others
Description
Pcard-73145
Add sequence parallel for llama.
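A rough, self-contained sketch of the idea behind the change, with the mesh shape, dimension names, and placements below chosen purely for illustration: under sequence parallel the activations are kept in [S, B, H] layout with the sequence dimension sharded across the model-parallel mesh axis, and are gathered with dist.reshard right before ops that need the full sequence (attention, lm_head), as discussed in the review above.

```python
import paddle
import paddle.distributed as dist

# 2 x 2 mesh: "dp" (data parallel) x "mp" (model parallel); needs 4 ranks to run
mesh = dist.ProcessMesh([[0, 1], [2, 3]], dim_names=["dp", "mp"])

# sequence-parallel layout: batch sharded over dp, sequence sharded over mp
hidden_states = dist.shard_tensor(
    paddle.randn([1024, 2, 4096]),  # [S, B, H]
    mesh,
    [dist.Shard(1), dist.Shard(0)],
)

# gather the sequence dimension before an op that needs the whole sequence
gathered = dist.reshard(hidden_states, mesh, [dist.Shard(1), dist.Replicate()])

# back to [B, S, H] for layers that expect batch-first tensors
gathered = paddle.transpose(gathered, [1, 0, 2])
```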