Feature/Recv op python with guard #7793
Conversation
LGTM, please clean up the code before merging.
if __name__ == "__main__":
    unittest.main()
    # test = TestRecvOp()
Please delete the commented-out code.
Done. Since the current CI does not turn on WITH_DISTRIBUTE, I also added a switch to turn off the recv_op test.
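The switch described above can be sketched with a standard unittest guard. This is only an illustrative sketch, not the PR's actual code: the environment-variable name WITH_DISTRIBUTE and the way the build exports it to the test environment are assumptions here.

```python
import os
import unittest

# Assumed convention: the build exports WITH_DISTRIBUTE=ON into the test
# environment when the distributed operators were compiled in.
WITH_DISTRIBUTE = os.environ.get("WITH_DISTRIBUTE", "OFF") == "ON"


# Skip the whole test case when distributed ops are unavailable, so CI
# configurations without WITH_DISTRIBUTE still pass.
@unittest.skipUnless(WITH_DISTRIBUTE, "recv_op requires WITH_DISTRIBUTE=ON")
class TestRecvOp(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```

A class-level skip keeps the test discoverable (it shows up as "skipped" rather than silently disappearing), which is usually preferable to deleting or commenting out the test.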
o = layers.scale(x=x, scale=10.0)
main.global_block().create_var(
    name=o.name, persistable=False, dtype=o.dtype, shape=o.shape)
print main
Don't use print in the unit test.
    append_batch_size=False)
fluid.initializer.Constant(value=1.0)(x, main.global_block())
layers.Send("127.0.0.1:6174", [x], [x])
print main
Delete print.
… recv_op_python_with_guard
Great! Does this PR fix #7700?
python/paddle/v2/fluid/layers/io.py
Outdated
self.outputs = []
self.endpoint = endpoint
self.fan_in = fan_in
# FIXME(typhoonzero): Add this switch is stupid
Maybe write a more precise comment? I could not understand what this means :)
Sure.
Oh, thanks. If you are busy I can take #7700 as well. It's not high priority anyway.
Okay, I'll focus on performance optimization.
@@ -161,7 +161,6 @@ class RecvOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  RecvOpMaker(OpProto *proto, OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
-    AddInput("RX", "(Tensor) Input tensor to be optimized").AsDuplicable();
Why is this line removed? I thought the op needs to explicitly specify its input; otherwise the framework won't know about this dependency.
The current recv_op should really be called listen_and_serv: it is often the only operator in a server program's main block (block 0). Unlike send/recv, it does not appear in the calculation blocks.
Thanks for the explanation, I think I misunderstood the code.
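The dependency concern discussed above (an op must declare its inputs or the framework cannot see the edge) can be illustrated with a toy topological scheduler. This is purely illustrative and is not Paddle's actual executor; the op names and the `Op` structure are hypothetical.

```python
from collections import namedtuple

# Hypothetical op record: each op declares the variables it reads and writes.
Op = namedtuple("Op", ["name", "inputs", "outputs"])


def topo_order(ops):
    """Order ops so every producer runs before its consumers.

    The edges come solely from the declared inputs/outputs: if an op
    omits an input declaration, the scheduler never sees that edge and
    may run the op before its data is ready.
    """
    produced = {out: op for op in ops for out in op.outputs}
    order, visited = [], set()

    def visit(op):
        if op.name in visited:
            return
        visited.add(op.name)
        for var in op.inputs:          # follow declared data dependencies
            if var in produced:
                visit(produced[var])   # run the producer first
        order.append(op.name)

    for op in ops:
        visit(op)
    return order


# "send" declares "x_scaled" as input, so "scale" is forced to run first.
ops = [
    Op("send", inputs=["x_scaled"], outputs=[]),
    Op("scale", inputs=["x"], outputs=["x_scaled"]),
]
```

If "send" did not declare `x_scaled`, nothing would constrain its position relative to "scale", which is exactly the risk the reviewer raised about dropping AddInput.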
@@ -161,7 +161,6 @@ class RecvOpMaker : public framework::OpProtoAndCheckerMaker {
 public:
  RecvOpMaker(OpProto *proto, OpAttrChecker *op_checker)
      : OpProtoAndCheckerMaker(proto, op_checker) {
-    AddInput("RX", "(Tensor) Input tensor to be optimized").AsDuplicable();
Deleting the Inputs of RecvOp may cause all the book demos under the distributed_book folder to fail to run.
Done.
… recv_op_python_with_guard
LGTM++
This is part of the work for #6508.