Add inception-resnet-v2 model #64
base: master
Conversation
Cool! |
Regarding momentum, I think we match Caffe-style momentum by setting the dampening to 0, i.e. with dampening set to 0 we compute: |
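The formula from the comment above is not preserved on this page. As a rough illustration only, the sketch below assumes the commonly documented forms of the two updates: a Torch-style rule `v = momentum * v + (1 - dampening) * g; x = x - lr * v` and a Caffe-style rule `v = momentum * v + lr * g; x = x - v`. The function names, the scalar parameter, and the gradient values are hypothetical, and the first-step buffer initialization is simplified. Under these assumptions and a constant learning rate, the two rules give the same result when dampening is 0.

```python
def torch_style_sgd(grads, lr, momentum, dampening=0.0):
    """Assumed Torch-style update (hypothetical helper, scalar parameter)."""
    x, v = 0.0, 0.0  # simplification: the velocity buffer starts at zero
    for g in grads:
        v = momentum * v + (1.0 - dampening) * g  # dampening scales the new gradient
        x -= lr * v                               # learning rate applied at the step
    return x


def caffe_style_sgd(grads, lr, momentum):
    """Assumed Caffe-style update (hypothetical helper, scalar parameter)."""
    x, v = 0.0, 0.0
    for g in grads:
        v = momentum * v + lr * g  # learning rate folded into the velocity
        x -= v
    return x


grads = [0.5, -1.0, 2.0]  # made-up gradient sequence, for illustration only
a = torch_style_sgd(grads, lr=0.1, momentum=0.9, dampening=0.0)
b = caffe_style_sgd(grads, lr=0.1, momentum=0.9)
damped = torch_style_sgd(grads, lr=0.1, momentum=0.9, dampening=0.9)
print(a, b, damped)
```

With a nonzero dampening the Torch-style trajectory diverges from the Caffe-style one, which is why the comment suggests setting it to 0.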
Thank you for comments! I will check the training result with |
… all layers in inception-resnet-v2
@lim0606 why change CAddTable(true) to CAddTable()? |
Hi, Since identity layers pass the memory addresses of their input tensors directly to the next layer, CAddTable(true) seems to cause a problem: it changes the values of the inputs in the residual layers. For the classification task the effect was minor, so I didn't notice the problem for a long time... ;( However, when I applied the model to other types of tasks, with additional layers that branch the feature output into several paths, the model gave me NaN within some iterations. Sincerely, Jaehyun Lim |
@lim0606 |
Hi, Since the first child is the identity layer, which directly refers to its input as its output, writing values over the first child's output in CAddTable(true) means writing values over the input of the second child. Furthermore, resnet as well as inception-resnet-v2 consist of residual layers that have an identity path; therefore, the CAddTable(true) modules at different layers access the same memory addresses and change their values in a single forward pass. Best regards, Jaehyun |
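The aliasing described above can be mimicked without Torch. The following is a minimal sketch in plain Python, where lists stand in for tensors; `identity`, `branch`, `add_table`, and `residual_block` are hypothetical stand-ins, not the Torch modules themselves. It assumes the in-place variant reuses the first input's storage as its output, as the comment describes.

```python
def identity(x):
    # Returns the very same list object, like an identity layer that
    # forwards its input tensor without copying.
    return x


def branch(x):
    # Stand-in for the residual branch: produces a fresh list.
    return [2.0 * v for v in x]


def add_table(tensors, inplace):
    # inplace=True accumulates into the first input's storage;
    # inplace=False accumulates into a copy.
    out = tensors[0] if inplace else list(tensors[0])
    for t in tensors[1:]:
        for i, v in enumerate(t):
            out[i] += v
    return out


def residual_block(x, inplace):
    return add_table([identity(x), branch(x)], inplace)


x = [1.0, 2.0]
y = residual_block(x, inplace=False)
print(x, y)    # the block input keeps its original values

x2 = [1.0, 2.0]
y2 = residual_block(x2, inplace=True)
print(x2, y2)  # the block input now holds the block output
```

In the in-place case the block's input and output are the same object, so any other path that still reads the input (for example an extra branch on the same features) sees the overwritten values.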
@lim0606 |
Out of personal interest, my friend Sunghun Kang (shuni@kaist.ac.kr) and I trained inception-resnet-v2 (http://arxiv.org/abs/1602.07261) from scratch in Torch, based mainly on Facebook's training scripts (https://github.com/facebook/fb.resnet.torch).
This PR might not be a proper one for this repository, but I’d like to share it with anyone who may be interested in this model too.
See https://github.com/lim0606/torch-inception-resnet-v2 for more details about the trained model.