It seems that, for the most part, Caffe does not use CUDA streams in its GPU implementation, which means all operations are serialized on the default CUDA stream.
That is a reasonable implementation if concurrency is not a concern, but it is not optimal when the GPU is shared among host threads. For example, if we run inference on two different Caffe models from two different host threads, they constantly block each other on the default CUDA stream, even though the two workloads are completely independent.
Is this a known and accepted limitation of Caffe? Or is there a plan to move computation onto non-default CUDA streams so that one network's work does not block the entire GPU?
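For reference, here is a minimal standalone sketch of the non-default-stream pattern being suggested. This is not Caffe code; `dummy_forward` is a placeholder kernel standing in for a model's forward pass, and the point is only that work queued on separate streams can overlap on the device instead of serializing, and each host thread can synchronize its own stream without blocking the other.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Placeholder kernel standing in for one model's forward pass.
__global__ void dummy_forward(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *a, *b;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    // One stream per "model": kernels launched on different
    // non-default streams may execute concurrently on the device.
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    dim3 block(256), grid((n + block.x - 1) / block.x);
    dummy_forward<<<grid, block, 0, s1>>>(a, n);  // model 1's work
    dummy_forward<<<grid, block, 0, s2>>>(b, n);  // model 2's work

    // Each stream is synchronized independently; waiting on s1
    // does not force completion of the work queued on s2.
    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);

    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
    cudaFree(a);
    cudaFree(b);
    return 0;
}
```

With the default stream, the two launches above would instead be implicitly ordered behind each other, which is exactly the cross-thread blocking described here.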
Thanks for your thoughts!