I am working with a very deep fully convolutional architecture that currently takes ~12 GB of memory for a single 600 x 800 image during inference. To reduce memory, I modified net.cpp to allow duplicate top blobs during inference (gist here) and rewrote the prototxt for inference to use a minimal set of unique activations. This works and lowers memory usage (the net can now handle a 1280 x 720 image), but the reduction is far less than I anticipated.
For instance, consider a simple feed-forward network without any branches: A1->A2->A3->A4->....->An
Ex: Conv->BN->ReLU->Conv->BN->ReLU->...
If all activations from A1 to An are the same size, we should be able to run inference while storing only 2 activations in memory, juggling computation between the two, like X->Y->X->Y->X->...
In practice, however, the memory taken by this network is much greater than the size of 2 activations.
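To make the idea concrete, here is a minimal sketch (plain Python, not Caffe code) of the ping-pong scheme: a chain of n same-shaped layers runs with only two activation buffers, X and Y, whose roles swap after every layer. The elementwise "layers" here are hypothetical stand-ins for Conv/BN/ReLU.

```python
def pingpong_forward(x, layers):
    """Run a layer chain while holding only two activation buffers."""
    buffers = [list(x), [0.0] * len(x)]  # X and Y, same size
    src, dst = 0, 1
    for layer in layers:
        for i, v in enumerate(buffers[src]):
            buffers[dst][i] = layer(v)   # write dst from src
        src, dst = dst, src              # swap: X->Y->X->Y->...
    return buffers[src]

# Hypothetical elementwise layers standing in for Conv->BN->ReLU blocks.
layers = [lambda v: 2 * v, lambda v: v + 1.0, lambda v: max(v, 0.0)]
out = pingpong_forward([1.0, -2.0], layers)
print(out)  # -> [3.0, 0.0]; peak activation memory stayed at 2 buffers
```

The point is that peak activation memory is independent of the chain depth n, which is why I expected a much larger saving.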
I suspect this is because of unique internal buffers used in each layer and/or a separate workspace being allocated for each cuDNN convolution layer. In vanilla Caffe, there was a trick to make the internal buffers static (here), but it doesn't work with cuDNN. Is there something similar we can do here with internal buffers? Also, is it possible to use a global workspace for all convolution layers, as in MXNet?
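What I have in mind is something like the following sketch (plain Python, names hypothetical): a single shared scratch buffer that grows to the largest request seen, so total workspace memory equals the maximum per-layer requirement rather than the sum over all conv layers.

```python
class SharedWorkspace:
    """Sketch of a global scratch buffer shared by all conv layers,
    analogous to MXNet's global workspace (illustrative, not real API)."""

    def __init__(self):
        self.buf = bytearray(0)

    def request(self, nbytes):
        # Grow to the largest request seen so far; every layer then
        # borrows this one buffer instead of owning a private workspace.
        if nbytes > len(self.buf):
            self.buf = bytearray(nbytes)
        return memoryview(self.buf)[:nbytes]

ws = SharedWorkspace()
ws.request(1 << 20)   # layer 1 needs 1 MiB
ws.request(4 << 20)   # layer 2 needs 4 MiB -> buffer grows once
ws.request(2 << 20)   # layer 3 reuses the existing buffer
print(len(ws.buf))    # -> 4194304: max requirement, not the sum
```

With per-layer workspaces the same three layers would hold 7 MiB of scratch memory; the shared scheme holds 4 MiB.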
Thanks