[AutoParallel] Add paddle.distributed.reshard python API. #57293

Merged: 8 commits merged into PaddlePaddle:develop on Sep 19, 2023

Conversation

@GhostScreaming (Contributor) commented Sep 13, 2023

PR types
Others

PR changes
APIs

Description
Pcard-73145

Add the paddle.distributed.reshard(dist_tensor, dist_attr) Python API, which reshards a DistTensor according to the given distributed attribute.
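For context, a minimal usage sketch of the new API (not taken from the PR itself; the ProcessMesh / DistAttr / shard_tensor calls reflect the auto-parallel interfaces of that development cycle, and their exact signatures are assumptions):

```python
# Hedged sketch, intended to run under a 2-card
# `python -m paddle.distributed.launch` job.
import paddle
import paddle.distributed as dist

# Two-card, one-dimensional process mesh.
mesh = dist.ProcessMesh([0, 1], dim_names=["x"])

# Build a DistTensor sharded along dim 0 across the two cards.
src_attr = dist.DistAttr(mesh=mesh, sharding_specs=["x", None])
dist_tensor = dist.shard_tensor(paddle.ones([4, 8]), dist_attr=src_attr)

# Reshard it to a fully replicated layout on the same mesh.
dst_attr = dist.DistAttr(mesh=mesh, sharding_specs=[None, None])
out = dist.reshard(dist_tensor, dst_attr)
```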

@chenwhql (Contributor) left a comment

When a new API is under review, a corresponding Chinese documentation PR also needs to be opened in the docs repo.

paddle/fluid/pybind/auto_parallel_py.cc (outdated; review thread resolved)
python/paddle/distributed/auto_parallel/api.py (outdated; review thread resolved)
@GhostScreaming (Contributor, Author)

Chinese documentation PR: PaddlePaddle/docs#6192

@chenwhql (Contributor) left a comment

Please check whether the unit tests under auto_parallel all have to be added to CMakeLists manually; the current coverage shortfall may be because CI is not running this unit test.

@@ -67,6 +67,7 @@
from .auto_parallel import shard_op # noqa: F401
from .auto_parallel.api import shard_tensor # noqa: F401
from .auto_parallel.api import dtensor_from_fn # noqa: F401
from .auto_parallel.api import reshard # noqa: F401
Contributor:

Don't we also need to add reshard to the __all__ list here?

Contributor Author:

Done, thx.
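For illustration, a hedged sketch of what registering the API in the package's __all__ list might look like; the neighboring entries are assumptions based on the imports shown above, and only the reshard entry comes from this PR:

```python
# Hypothetical excerpt of python/paddle/distributed/__init__.py
__all__ = [
    "shard_tensor",     # assumed pre-existing entry
    "dtensor_from_fn",  # assumed pre-existing entry
    "reshard",          # added so the new API is part of the public namespace
]
```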

@@ -186,6 +186,8 @@ if(WITH_DISTRIBUTE AND WITH_GPU)
py_test_modules(test_engine_save_load MODULES test_engine_save_load)
py_test_modules(test_rule_based_tuner MODULES test_rule_based_tuner)
py_test_modules(test_dist_tensor MODULES test_dist_tensor)
py_test_modules(test_reshard_api MODULES test_reshard_api)
Contributor:

This CMakeLists groups unit tests by category. Since reshard is a multi-card unit test, please check whether it should be placed together with the other reshard_x_to_x unit tests, and preferably add a timeout limit as well; multi-card unit tests are prone to timing out.

Contributor Author:

Done, thx.

@chenwhql (Contributor) left a comment

LGTM

@LiYuRio (Contributor) left a comment

LGTM

@GhostScreaming merged commit 4c7fc29 into PaddlePaddle:develop on Sep 19, 2023
3 checks passed
Frida-a pushed a commit to Frida-a/Paddle that referenced this pull request Oct 14, 2023
[AutoParallel] Add paddle.distributed.reshard python API. (PaddlePaddle#57293)

* Add paddle.distributed.reshard API. It supports reshard for DistTensor.

* Polish code with review comments.

* Fix problem of in_dynamic_mode

* Fix some problems according to review comments.

* Set test_reshard_api as multi-cards testcase. And set its timeout.
danleifeng pushed a commit to danleifeng/Paddle that referenced this pull request Nov 14, 2023
[AutoParallel] Add paddle.distributed.reshard python API. (PaddlePaddle#57293)

* Add paddle.distributed.reshard API. It supports reshard for DistTensor.

* Polish code with review comments.

* Fix problem of in_dynamic_mode

* Fix some problems according to review comments.

* Set test_reshard_api as multi-cards testcase. And set its timeout.