Algorithm selection returns wrong result: requests 140632G of workspace. #353

Closed
CFAndy opened this issue Jun 12, 2017 · 2 comments

Comments

@CFAndy
CFAndy commented Jun 12, 2017

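Caffe 0.16, running ResNet with FP16 math and FP16 data (FP16math + FP16Data).
In the log below, every device except [0] reports a requested workspace of 140632G against a limit of about 2.97G, and training then aborts with CUDNN_STATUS_BAD_PARAM: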
I0611 08:14:37.694576 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_2_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.696094 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.96G, req 0.06G)
I0611 08:14:37.749905 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.760154 80385 cudnn_conv_layer.cpp:834] [5] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.787997 80382 cudnn_conv_layer.cpp:834] [2] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.788635 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.96G, req 0.06G)
I0611 08:14:37.795770 80383 cudnn_conv_layer.cpp:834] [3] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.803072 80386 cudnn_conv_layer.cpp:834] [6] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1 1 1 (limit 2.96G, req 140632G)
I0611 08:14:37.823436 80381 cudnn_conv_layer.cpp:834] [1] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.835868 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.841663 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.854826 80385 cudnn_conv_layer.cpp:834] [5] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.881242 80382 cudnn_conv_layer.cpp:834] [2] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.886878 80383 cudnn_conv_layer.cpp:834] [3] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.894661 80386 cudnn_conv_layer.cpp:834] [6] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.96G, req 140632G)
I0611 08:14:37.916431 80381 cudnn_conv_layer.cpp:834] [1] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
I0611 08:14:37.929443 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv3' with space 5.28G/1 1 1 1 (limit 2.96G, req 0.06G)
I0611 08:14:37.929920 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_3_conv2' with space 5.28G/1 7 5 5 (limit 2.97G, req 140632G)
F0611 08:14:37.944598 80380 cudnn_conv_layer.cu:129] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
@ 0x7fe7720a3daa (unknown)
@ 0x7fe7720a3ce4 (unknown)
@ 0x7fe7720a36e6 (unknown)
@ 0x7fe7720a6687 (unknown)
@ 0x7fe772c8f2ad caffe::CuDNNConvolutionLayer<>::Backward_gpu()
@ 0x7fe7728d525d caffe::Layer<>::Backward()
@ 0x7fe772b65279 caffe::Net::BackwardFromToAu()
@ 0x7fe772b65585 caffe::Net::Backward()
@ 0x7fe772b65801 caffe::Net::ForwardBackward()
@ 0x7fe772b94278 caffe::Solver::Step()
@ 0x7fe772b94fb0 caffe::Solver::Solve()
@ 0x7fe772bdb71b caffe::P2PSync::InternalThreadEntry()
@ 0x7fe772b9b472 caffe::InternalThread::entry()
@ 0x7fe772b9c0d4 boost::detail::thread_data<>::run()
@ 0x7fe7688eea4a (unknown)
I0611 08:14:37.980826 80387 cudnn_conv_layer.cpp:834] [7] Conv Algos (F,BD,BF): 'layer_512_3_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)
@ 0x7fe7604e8184 start_thread
@ 0x7fe77117a37d (unknown)
@ (nil) (unknown)
Aborted
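
For anyone triaging a similar log, the implausible requests can be picked out mechanically. The following is only a rough sketch: the regular expression is inferred from the 'Conv Algos' lines above rather than from any documented log format, and `find_suspicious_requests` is a hypothetical helper name, not part of Caffe.

```python
import re

# Pattern inferred from the 'Conv Algos' lines in this issue; this is an
# assumption about the log layout, not an official format.
CONV_ALGOS_RE = re.compile(
    r"\[(?P<device>\d+)\] Conv Algos \(F,BD,BF\): '(?P<layer>[^']+)'"
    r".*\(limit (?P<limit>[\d.]+)G, req (?P<req>[\d.]+)G\)"
)


def find_suspicious_requests(log_text):
    """Return (device, layer, limit_gb, req_gb) for lines where req exceeds limit."""
    hits = []
    for line in log_text.splitlines():
        match = CONV_ALGOS_RE.search(line)
        if match is None:
            continue  # not a 'Conv Algos' line (e.g. a stack-trace frame)
        limit_gb = float(match.group("limit"))
        req_gb = float(match.group("req"))
        if req_gb > limit_gb:
            hits.append(
                (int(match.group("device")), match.group("layer"), limit_gb, req_gb)
            )
    return hits


# Two lines copied from the log above: device 4 is flagged, device 0 is not.
sample = """\
I0611 08:14:37.694576 80384 cudnn_conv_layer.cpp:834] [4] Conv Algos (F,BD,BF): 'layer_512_2_conv3' with space 5.28G/1 1 1 1 (limit 2.97G, req 140632G)
I0611 08:14:37.696094 80380 cudnn_conv_layer.cpp:834] [0] Conv Algos (F,BD,BF): 'layer_512_3_conv1' with space 5.28G/1 1p 1 1 (limit 2.96G, req 0.06G)
"""

for hit in find_suspicious_requests(sample):
    print(hit)
```

Run against the full log, this would list every device and layer whose requested workspace exceeds the reported limit, which here is every device other than [0].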

@drnikolaev
Hi @ChenFengAndy, this has already been fixed, but we need some time to deliver a new release. Thank you for reporting this.

@drnikolaev
Fixed in v0.16.2
