
What is the reason for having separate LayerSetUp and Reshape? #1385

Closed
netheril96 opened this issue Oct 31, 2014 · 9 comments


@netheril96
Contributor

The latest dev branch of caffe splits the initialization code of each layer into two functions called LayerSetUp and Reshape. However, from the documentation I fail to infer the reason for this split or the distinction between the two. They are always called consecutively, so what purpose does separating them serve? And when I am writing a new layer, what exactly should be put in LayerSetUp and what goes into Reshape?

I tried reading the source code for guidance. Here is the documentation for LayerSetUp:

This method should do one-time layer specific setup. This includes reading and processing relevant parameters from the layer_param_. Setting up the shapes of top blobs and internal buffers should be done in Reshape, which will be called before the forward pass to adjust the top blob sizes.
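In other words, as I understand the intended contract (a minimal sketch with a hypothetical MyLayer; the exact signatures are my assumption, taken from the current layer interface):

template <typename Dtype>
class MyLayer : public Layer<Dtype> {
 public:
  explicit MyLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  // One-time setup: read and validate parameters from layer_param_.
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
                          const vector<Blob<Dtype>*>& top) {
    scale_ = this->layer_param_.power_param().scale();  // example parameter
  }
  // Called before every forward pass: size the tops (and any internal
  // buffers) from the current bottom shapes.
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
                       const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }
 protected:
  Dtype scale_;
};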

And here is the implementation of caffe::SoftmaxWithLossLayer< Dtype >::LayerSetUp:

LossLayer<Dtype>::LayerSetUp(bottom, top);
softmax_bottom_vec_.clear();
softmax_bottom_vec_.push_back(bottom[0]);
softmax_top_vec_.clear();
softmax_top_vec_.push_back(&prob_);
softmax_layer_->SetUp(softmax_bottom_vec_, softmax_top_vec_);

Contrary to the documentation, caffe::SoftmaxWithLossLayer<Dtype>::LayerSetUp does set up the internal "buffers" (here, another layer embedded within), and it calls SetUp, which in turn calls Reshape, so the top blobs do have their shapes changed in this function.

I am completely baffled.

@bhack
Contributor

bhack commented Oct 31, 2014

Have you seen https://github.com/BVLC/caffe/wiki/Development?

@netheril96
Contributor Author

@bhack That wiki page only reiterates what has already been stated in the code comments. It does not answer why such a distinction is needed in the first place, nor does it clarify the seemingly contradictory implementation in SoftmaxWithLossLayer.

@bhack
Contributor

bhack commented Oct 31, 2014

The wiki page tells you that setup is optional and is called only one time. Reshape is not one-time.

@netheril96
Contributor Author

@bhack Is it guaranteed that the same vectors of blobs bottom and top will be passed every time?

@bhack
Contributor

bhack commented Oct 31, 2014

Reshape is called whenever the network needs a reshape.

@longjon
Contributor

longjon commented Nov 1, 2014

  • Reshape is called before every forward pass; LayerSetUp is only called once at initialization. This allows networks to change their blob shapes while running.
  • The inclusion of a SoftmaxLayer within SoftmaxWithLossLayer is not a buffer per se; other layers (e.g., ConvolutionLayer) have buffers whose shapes depend on the shapes of their bottoms, and these are the buffers that need to be resized in Reshape (see the sketch after this list).
  • The call to SetUp instead of LayerSetUp when including SoftmaxLayer does involve a redundant call to Reshape, but it also includes some checks that SetUp performs but LayerSetUp doesn't. Reshape is idempotent; calling it an extra time is not an error (though a cleaner refactoring that doesn't do that might be possible).
  • For the gory details and history, see On-the-fly net resizing, without reallocation (where possible) #594.
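For concreteness, the per-pass counterpart of the LayerSetUp quoted above looks roughly like this (a simplified sketch of SoftmaxWithLossLayer's Reshape, not the verbatim source):

template <typename Dtype>
void SoftmaxWithLossLayer<Dtype>::Reshape(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  LossLayer<Dtype>::Reshape(bottom, top);
  // The wiring of softmax_bottom_vec_ and softmax_top_vec_ done once in
  // LayerSetUp is reused; only shapes are refreshed, so prob_ tracks the
  // current shape of bottom[0].
  softmax_layer_->Reshape(softmax_bottom_vec_, softmax_top_vec_);
}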

@netheril96
Contributor Author

@longjon It is nice to hear an explanation from the person who originally proposed the change. But I still have some questions. In caffe::SoftmaxWithLossLayer<Dtype>::LayerSetUp there is this line:

softmax_bottom_vec_.push_back(bottom[0]);

But what if the next call (or a later call) to Reshape is made with a different bottom, so that bottom[0] points to another blob? In that case the embedded layer will still compute the loss from the old blob. Won't that be a potential bug?

If instead it is actually a precondition of Reshape that it be called with the same arguments as LayerSetUp, it would be better to state that explicitly in the code comments.

I ask because I am writing a layer similar to SoftmaxWithLossLayer in that it also has an embedded layer. I need to know whether I can count on the vector of bottom blobs not changing, even if the shapes of the individual blobs may change.

@longjon
Contributor

longjon commented Nov 2, 2014

Yes, the layers do in general assume that the bottom and top blobs will not change between calls, and the net implementation guarantees this. We should probably either document this fact or tweak the layer interface to make it more obvious; we could have the layers store the top and bottom passed into setup, making this assumption impossible to violate.
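Such a tweak might look roughly like this (a hypothetical sketch, not the actual Layer base class):

template <typename Dtype>
class Layer {
 public:
  void SetUp(const vector<Blob<Dtype>*>& bottom,
             const vector<Blob<Dtype>*>& top) {
    bottom_ = bottom;  // cached once; never re-supplied by the net
    top_ = top;
    LayerSetUp(bottom_, top_);
    Reshape(bottom_, top_);
  }
  // Argument-free Reshape: it always operates on the cached blobs, so
  // the "same bottom/top on every call" assumption cannot be violated.
  void Reshape() { Reshape(bottom_, top_); }
 protected:
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
                          const vector<Blob<Dtype>*>& top) {}
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
                       const vector<Blob<Dtype>*>& top) = 0;
  vector<Blob<Dtype>*> bottom_, top_;
};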

@netheril96
Contributor Author

@longjon Thanks. That clarifies my last bit of confusion.
