
Port fully connected operator #3927

Merged — 17 commits, merged Sep 18, 2017
Conversation

@Xreki Xreki (Contributor) commented Sep 6, 2017

resolve #3926

  • fix SegFault during checking gradient
    • In Python, when we define an Operator, all of its inputs and outputs, including the AsIntermediate outputs, must be explicitly specified. In TestFCGradOp, the operator should be defined as:
      op = Operator(
          "fc",
          X="X",
          W="W",
          b="b",
          Out="Out",
          mul_out="mul_out",
          add_out="add_out",
          activation="sigmoid")
  • support multiple inputs and weights
    • Do not call operators with in-place arguments, such as Y = X + Y. In my first implementation, I constructed FCOp as the following series of operators:

      1. mul_out = X[0] * W[0]
      2. add_out = X[1] * W[1]
      3. mul_out = mul_out + add_out
      4. add_out = mul_out + b
      5. Y = Act(add_out)
    • The unit test failed when constructing an add operator during the backward pass. I found that two add operators were automatically created; their DebugStrings were:

Op(add), inputs:{X[mul_out@GRAD@RENAME@1@0, mul_out@GRAD@RENAME@1@1], outputs:{Out[mul_out@GRAD]}}
Op(add), inputs:{X[add_out@GRAD@RENAME@1@0, add_out@GRAD@RENAME@1@1], outputs:{Out[add_out@GRAD]}}

Thus I changed the implementation of FCOp to:
1. mul_out[0] = X[0] * W[0]
2. mul_out[1] = X[1] * W[1]
3. sum_out = mul_out[0] + mul_out[1]
4. add_out = sum_out + b
5. Y = Act(add_out)
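For clarity, the revised composition (every step writing to a fresh output, with no in-place reuse) can be sketched in NumPy. This is only an illustrative reference implementation, not Paddle's actual kernels; `fc_forward` is a hypothetical helper name.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fc_forward(xs, ws, b, activation="sigmoid"):
    # Steps 1-2: mul_out[i] = X[i] * W[i], one fresh output per input pair
    mul_out = [x @ w for x, w in zip(xs, ws)]
    # Step 3: sum_out = mul_out[0] + mul_out[1] + ... (a separate sum output)
    sum_out = np.sum(mul_out, axis=0)
    # Step 4: add_out = sum_out + b (row-wise broadcast of the bias)
    add_out = sum_out + b
    # Step 5: Y = Act(add_out)
    return sigmoid(add_out) if activation == "sigmoid" else add_out

rng = np.random.default_rng(0)
X0, W0 = rng.random((4, 3)), rng.random((3, 2))
X1, W1 = rng.random((4, 5)), rng.random((5, 2))
b = rng.random(2)
Y = fc_forward([X0, X1], [W0, W1], b)  # shape (4, 2)
```

Because every intermediate has its own buffer, the backward pass never has to disambiguate a variable that was overwritten, which is exactly what triggered the @RENAME gradient accumulation above.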

@Xreki Xreki changed the title Port fully connected operator [WIP] Port fully connected operator Sep 6, 2017
@qingqing01 qingqing01 requested review from reyoung and removed request for xinghai-sun September 11, 2017 13:26
@Xreki Xreki changed the title [WIP] Port fully connected operator Port fully connected operator Sep 13, 2017
@reyoung reyoung (Collaborator) left a comment:

Excellent work, except for some naming conventions.

"sum", {{"X", {mul_out}}}, {{"Out", {Output("SumOut")}}}, {}));
} else {
AppendOp(framework::OpRegistry::CreateOp(
"identity", {{"X", {mul_out[0]}}}, {{"Y", {Output("SumOut")}}}, {}));
Collaborator:

The identity operator is not needed by fc. We could just remove SumOut, like

outputs_["SumOut"].clear();

But it could be done in another PR.

Xreki (Contributor, Author):

I tried that kind of implementation, renaming SumOut to EMPTY and using mul_out[0] instead, but unfortunately I ran into some problems. I will try again.

Xreki (Contributor, Author):

It works. I think there must have been an error in my previous implementation.


auto activation = Attr<std::string>("activation");
AppendOp(framework::OpRegistry::CreateOp(
activation, {{"X", {Output(add_out)}}}, {{"Y", {Output("Y")}}}, {}));
Collaborator:

Please reference https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/name_convention.md

It seems that we prefer Out for the output name.

Xreki (Contributor, Author):

I noticed that, but I prefer to use Y 😞. Anyway, I will rename it.

Thus, the output Out is a 2-D matrix of size (M x N).
Activation type can be set to `identity` (default), `sigmoid` or `softmax`.
)DOC");
}
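The three activation choices named in the doc string can be illustrated with a small NumPy sketch (hypothetical reference functions, not Paddle's kernel implementations):

```python
import numpy as np

def identity(z):   # default: pass-through
    return z

def sigmoid(z):    # element-wise logistic function
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):    # row-wise, with max subtraction for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

z = np.array([[0.0, 1.0], [2.0, -1.0]])
rows = softmax(z).sum(axis=1)  # each row of a softmax output sums to 1
```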
Contributor:

Very good comments.

Xreki (Contributor, Author):

Thanks.

USE_OP(rowwise_add);
USE_NO_KERNEL_OP(identity);
USE_OP(sigmoid);
USE_OP(softmax);
Contributor:

Lines 179–183 no longer need to be added; see #4119 for reference.

Xreki (Contributor, Author):

Done.

"Output(Out) of FCOp should not be null.");

auto x = Inputs("X");
auto w = Inputs("W");
Member:

The following code only uses x.size() and w.size(), so here you can write:

size_t x_size = Inputs("X").size();
size_t w_size = Inputs("W").size();

Xreki (Contributor, Author):

X and W are duplicatable inputs, and x[i] and w[i] are used at line 73, which creates the mul operators. w.size() is only used in the PADDLE_ENFORCE_EQ statement at line 39.

@Xreki Xreki merged commit ec9a55a into PaddlePaddle:develop Sep 18, 2017
@Xreki Xreki deleted the core_add_fc_op branch November 14, 2018 02:39