Implement Inception-v4 for image classification #539
Conversation
return paddle.layer.concat(input=[b0_pool0, b1_conv1, b2_conv3])

def inception_v4(input, class_dim):
I have tried this configuration and it runs without errors, but there are a few points I am not sure about. The original Inception-v4 model uses an input size of 3 * 299 * 299, while the 3 * 224 * 224 used here leads to some differences (e.g. padding sizes), so the two are not rigorously the same. Additionally, comparing this configuration with the TF implementation, some blocks like Inception_A and Reduction_A seem to have different structures from the TF implementation. I am not sure whether this is appropriate.
Thank you very much for your review. The original Inception-v4 model proposed in the paper (https://arxiv.org/pdf/1602.07261.pdf) indeed uses 3 * 299 * 299 as the input image size. However, the Paddle image classification examples use the flowers dataset for training, whose images are 3 * 224 * 224, so I changed the input size of Inception-v4 accordingly. The structure of the model is otherwise exactly the same as in the original paper.
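For context, a minimal sketch (assumed for illustration, not the exact code in this PR) of how the 3 * 224 * 224 input is wired into the network with the Paddle v2 API; the layer name `image` and the class count 102 of the flowers dataset are my assumptions:

```python
import paddle.v2 as paddle

# flowers images are 3 x 224 x 224; the paper uses 3 x 299 x 299
DATA_DIM = 3 * 224 * 224
CLASS_DIM = 102  # flowers-102 categories (assumed here for illustration)

image = paddle.layer.data(
    name="image", type=paddle.data_type.dense_vector(DATA_DIM))
out = inception_v4(image, class_dim=CLASS_DIM)
```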
Please add a note to the README to explain this difference.
Added the note in the README:
The Inception-v4 model can be obtained with the code below. The model in this example uses an input size of `3 * 224 * 224` (the original paper uses an input size of `3 * 299 * 299`).
def Inception_A(input, depth):
    b0_pool0 = paddle.layer.img_pool(
I just checked against the TF implementation, and it seems there are some differences in inception_a; for example, branch0 excludes pooling and branch3 includes pooling there.
I checked the TF implementation of Inception-v4 (https://github.com/tensorflow/models/blob/master/research/slim/nets/inception_v4.py). This structure is in fact the same as the one in TF, because the order of the branches does not affect the structure. I added the branches following Figure 4 in the original paper (https://arxiv.org/pdf/1602.07261.pdf) from left to right, whereas the TF implementation does not follow the order in the figure. Since the branch order does not matter, the structures are in fact the same.
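To make the branch-order point concrete, here is a minimal sketch of an Inception_A-style block in the Paddle v2 API, with the branches written left to right as in Figure 4 (the filter counts are taken from that figure; this is an illustrative sketch, not the exact PR code). Swapping the order of the tensors passed to `paddle.layer.concat` only permutes the output channels; the computation graph is the same.

```python
def inception_a_sketch(input):
    # branch 0: 3x3 average pooling -> 1x1 conv (96 filters)
    b0_pool0 = paddle.layer.img_pool(
        input=input, pool_size=3, stride=1, padding=1,
        pool_type=paddle.pooling.Avg())
    b0_conv0 = paddle.layer.img_conv(
        input=b0_pool0, filter_size=1, num_filters=96, padding=0)
    # branch 1: 1x1 conv (96 filters)
    b1_conv0 = paddle.layer.img_conv(
        input=input, filter_size=1, num_filters=96, padding=0)
    # branch 2: 1x1 conv (64) -> 3x3 conv (96)
    b2_conv0 = paddle.layer.img_conv(
        input=input, filter_size=1, num_filters=64, padding=0)
    b2_conv1 = paddle.layer.img_conv(
        input=b2_conv0, filter_size=3, num_filters=96, padding=1)
    # branch 3: 1x1 conv (64) -> 3x3 conv (96) -> 3x3 conv (96)
    b3_conv0 = paddle.layer.img_conv(
        input=input, filter_size=1, num_filters=64, padding=0)
    b3_conv1 = paddle.layer.img_conv(
        input=b3_conv0, filter_size=3, num_filters=96, padding=1)
    b3_conv2 = paddle.layer.img_conv(
        input=b3_conv1, filter_size=3, num_filters=96, padding=1)
    # reordering this list only changes channel ordering, not the structure
    return paddle.layer.concat(input=[b0_conv0, b1_conv0, b2_conv1, b3_conv2])
```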
I see. Sorry for the confusion, and for the late reply.
return paddle.layer.concat(input=[b0_conv0, b1_conv0, b2_conv1, b3_conv2])

def Inception_B(input, depth):
Similar to Inception_A, the network structure here has some differences from the TF implementation of Inception_B, e.g. branch0 excludes pooling and branch3 includes pooling there.
Same as above.
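For reference, a short sketch of how one 1x7 -> 7x1 factorized branch of an Inception_B-style block can be written with the Paddle v2 asymmetric-filter arguments; the filter counts 192 -> 224 -> 256 follow Figure 5 of the paper, and the variable names are illustrative:

```python
# filter_size is the filter width, filter_size_y its height (Paddle v2)
b2_conv0 = paddle.layer.img_conv(
    input=input, filter_size=1, num_filters=192, padding=0)
b2_conv1 = paddle.layer.img_conv(  # 1x7 conv: 1 high, 7 wide
    input=b2_conv0, filter_size=7, filter_size_y=1,
    num_filters=224, padding=3, padding_y=0)
b2_conv2 = paddle.layer.img_conv(  # 7x1 conv: 7 high, 1 wide
    input=b2_conv1, filter_size=1, filter_size_y=7,
    num_filters=256, padding=0, padding_y=3)
```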
return paddle.layer.concat(input=[b0_conv0, b1_conv0, b2_conv2, b3_conv4])

def Inception_C(input, depth):
Similar to Inception_A, the network structure here has some differences from the TF implementation of Inception_C, e.g. branch0 excludes pooling and branch3 includes pooling there.
Same as above.
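As a side note on the six-way concat below: in Inception_C, branches 2 and 3 each fork into a 1x3 and a 3x1 conv, which is why six tensors are concatenated. A sketch of the forked branch 2 in the Paddle v2 API (the filter counts 384/256 follow Figure 6 of the paper; names are illustrative, not the PR's exact code):

```python
b2_conv0 = paddle.layer.img_conv(
    input=input, filter_size=1, num_filters=384, padding=0)
b2_conv1 = paddle.layer.img_conv(  # 1x3 conv applied to b2_conv0
    input=b2_conv0, filter_size=3, filter_size_y=1,
    num_filters=256, padding=1, padding_y=0)
b2_conv2 = paddle.layer.img_conv(  # 3x1 conv, also applied to b2_conv0
    input=b2_conv0, filter_size=1, filter_size_y=3,
    num_filters=256, padding=0, padding_y=1)
# both b2_conv1 and b2_conv2 go into the block's final concat
```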
        input=[b0_conv0, b1_conv0, b2_conv1, b2_conv2, b3_conv3, b3_conv4])

def Reduction_A(input):
The network structure here has some differences from the TF implementation of Reduction_A.
Same as above.
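For reference, a minimal sketch of a Reduction_A-style block in the Paddle v2 API; the filter counts (k, l, m, n) = (192, 224, 256, 384) are the Inception-v4 values from Table 1 of the paper, and the sketch is illustrative rather than the exact PR code:

```python
def reduction_a_sketch(input):
    # branch 0: 3x3 max pooling, stride 2, no padding ("valid")
    b0_pool0 = paddle.layer.img_pool(
        input=input, pool_size=3, stride=2, padding=0,
        pool_type=paddle.pooling.Max())
    # branch 1: 3x3 conv, n = 384 filters, stride 2
    b1_conv0 = paddle.layer.img_conv(
        input=input, filter_size=3, num_filters=384, stride=2, padding=0)
    # branch 2: 1x1 conv (k = 192) -> 3x3 conv (l = 224)
    #           -> 3x3 conv (m = 256, stride 2)
    b2_conv0 = paddle.layer.img_conv(
        input=input, filter_size=1, num_filters=192, padding=0)
    b2_conv1 = paddle.layer.img_conv(
        input=b2_conv0, filter_size=3, num_filters=224, stride=1, padding=1)
    b2_conv2 = paddle.layer.img_conv(
        input=b2_conv1, filter_size=3, num_filters=256, stride=2, padding=0)
    # all three branches halve the spatial size, so they concat cleanly
    return paddle.layer.concat(input=[b0_pool0, b1_conv0, b2_conv2])
```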
return paddle.layer.concat(input=[b0_pool0, b1_conv0, b2_conv2])

def Reduction_B(input):
The network structure here has some differences from the TF implementation of Reduction_B.
Same as above.
LGTM
Resolves #538