
What is the reason for having separate LayerSetUp and Reshape? #1385

Closed
netheril96 opened this issue Oct 31, 2014 · 9 comments


@netheril96
Contributor

The latest dev branch of caffe splits the initialization code of each layer into two functions called LayerSetUp and Reshape. However, from the documentation I fail to infer the reason for this split or the distinction between the two. They are always called consecutively, so what purpose does separating them serve? And when I am writing a new layer, what exactly should be put in LayerSetUp and what goes into Reshape?

I tried reading the source code for guidance. Here is the documentation for LayerSetUp:

This method should do one-time layer specific setup. This includes reading and processing relevant parameters from the layer_param_. Setting up the shapes of top blobs and internal buffers should be done in Reshape, which will be called before the forward pass to adjust the top blob sizes.
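In other words, as I understand the intended contract (a minimal sketch with a hypothetical MyLayer; the exact signatures are my assumption, taken from the current layer interface):

template <typename Dtype>
class MyLayer : public Layer<Dtype> {
 public:
  explicit MyLayer(const LayerParameter& param) : Layer<Dtype>(param) {}
  // One-time setup: read and validate parameters from layer_param_.
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
                          const vector<Blob<Dtype>*>& top) {
    scale_ = this->layer_param_.power_param().scale();  // example parameter
  }
  // Called before every forward pass: size the tops (and any internal
  // buffers) from the current bottom shapes.
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
                       const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }
 protected:
  Dtype scale_;
};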

And here is the implementation of caffe::SoftmaxWithLossLayer< Dtype >::LayerSetUp:

LossLayer<Dtype>::LayerSetUp(bottom, top);
softmax_bottom_vec_.clear();
softmax_bottom_vec_.push_back(bottom[0]);
softmax_top_vec_.clear();
softmax_top_vec_.push_back(&prob_);
softmax_layer_->SetUp(softmax_bottom_vec_, softmax_top_vec_);

Contrary to the documentation, caffe::SoftmaxWithLossLayer<Dtype>::LayerSetUp does set up the internal "buffers" (here, another layer embedded within), and it calls SetUp, which in turn calls Reshape, so the top blobs do have their shapes changed in this function.

I am completely baffled.

@bhack
Contributor

bhack commented Oct 31, 2014

Have you seen https://github.com/BVLC/caffe/wiki/Development?

@netheril96
Contributor Author

@bhack That wiki page only reiterates what has already been stated in the code comments. It does not answer why such a distinction is needed in the first place, nor does it clarify the seemingly contradictory implementation in SoftmaxWithLossLayer.

@bhack
Contributor

bhack commented Oct 31, 2014

The wiki page tells you that setup is optional and is called only one time. Reshape is not one-time.

@netheril96
Contributor Author

@bhack Is it guaranteed that the same vectors of blobs bottom and top will be passed every time?

@bhack
Contributor

bhack commented Oct 31, 2014

Reshape is called whenever the network needs a reshape.

@longjon
Contributor

longjon commented Nov 1, 2014

  • Reshape is called before every forward pass; LayerSetUp is only called once at initialization. This allows networks to change their blob shapes while running.
  • The inclusion of a SoftmaxLayer within SoftmaxWithLossLayer is not a buffer per se; other layers (e.g., ConvolutionLayer) have buffers whose shapes depend on the shapes of their bottoms, and these are the buffers that need to be resized in Reshape (see the sketch after this list).
  • The call to SetUp instead of LayerSetUp when including SoftmaxLayer does involve a redundant call to Reshape, but it also includes some checks that SetUp performs but LayerSetUp doesn't. Reshape is idempotent; calling it an extra time is not an error (though a cleaner refactoring that doesn't do that might be possible).
  • For the gory details and history, see On-the-fly net resizing, without reallocation (where possible) #594.
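For concreteness, the per-pass counterpart of the LayerSetUp quoted above looks roughly like this (a simplified sketch of SoftmaxWithLossLayer's Reshape, not the verbatim source):

template <typename Dtype>
void SoftmaxWithLossLayer<Dtype>::Reshape(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  LossLayer<Dtype>::Reshape(bottom, top);
  // The wiring of softmax_bottom_vec_ and softmax_top_vec_ done once in
  // LayerSetUp is reused; only shapes are refreshed, so prob_ tracks the
  // current shape of bottom[0].
  softmax_layer_->Reshape(softmax_bottom_vec_, softmax_top_vec_);
}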

@netheril96
Contributor Author

@longjon It is nice to hear an explanation from the person who originally proposed the change. But I still have some questions. In caffe::SoftmaxWithLossLayer<Dtype>::LayerSetUp there is this line:

softmax_bottom_vec_.push_back(bottom[0]);

But what if the next call (or a later call) to Reshape is made with a different bottom, so that bottom[0] points to another blob? In that case the embedded layer will still compute the loss from the old blob. Won't that be a potential bug?

If instead it is actually a precondition of Reshape that it be called with the same arguments as LayerSetUp, it would be better to state that explicitly in the code comments.

I ask because I am writing a layer similar to SoftmaxWithLossLayer in that it also has an embedded layer. I need to know whether I can count on the vector of bottom blobs not changing, even if the shapes of the individual blobs may change.

@longjon
Contributor

longjon commented Nov 2, 2014

Yes, the layers do in general assume that the bottom and top blobs will not change between calls, and the net implementation guarantees this. We should probably either document this fact or tweak the layer interface to make it more obvious; we could have the layers store the top and bottom passed into setup, making this assumption impossible to violate.
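Such a tweak might look roughly like this (a hypothetical sketch, not the actual Layer base class):

template <typename Dtype>
class Layer {
 public:
  void SetUp(const vector<Blob<Dtype>*>& bottom,
             const vector<Blob<Dtype>*>& top) {
    bottom_ = bottom;  // cached once; never re-supplied by the net
    top_ = top;
    LayerSetUp(bottom_, top_);
    Reshape(bottom_, top_);
  }
  // Argument-free Reshape: it always operates on the cached blobs, so
  // the "same bottom/top on every call" assumption cannot be violated.
  void Reshape() { Reshape(bottom_, top_); }
 protected:
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
                          const vector<Blob<Dtype>*>& top) {}
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
                       const vector<Blob<Dtype>*>& top) = 0;
  vector<Blob<Dtype>*> bottom_, top_;
};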

@netheril96
Contributor Author

@longjon Thanks. That clarifies my last bit of confusion.
