OpenCL Backend #2195
Conversation
commit build system changes and header files
The OpenCL build target is configured in Makefile and Makefile.config.OpenCL. clBLAS is used as the backend for all BLAS functions that require it. All GPU tests successfully complete using AMD or nVidia OpenCL.
Have you looked at #610?
Hi, I am interested in your OpenCL implementation of Caffe. Have you trained datasets like MNIST, CIFAR, or ImageNet using your code? I encountered some problems when using your project to train the cifar10 network. I would like to know whether you have run into the same issues. Could you please give me some instructions? Thanks.
Hello,
I have run the GPU unit tests that are provided with Caffe, and I am certain that the tests pass. Can you please provide some more details that will help me to understand the problem?
Robert
Hi @lunochod,
He is referring to several de facto standard datasets that are often used by researchers when reporting performance of their models, and that are included as examples with Caffe. If you look here and scroll down to "Examples", you'll see links to MNIST, CIFAR, and ImageNet tutorials that provide step-by-step instructions for running those example models. So the expectation is that they would perform the same under OpenCL as under CUDA or on the CPU. Sorry if that was already obvious to you; maybe you were just asking him to provide details of what's going wrong in his case. Thanks for this contribution! I too am interested in being able to run on OpenCL devices.
To put a finer point on bhack's allusion to PR #610: the direction they were going was to have an abstract device interface. By contrast, it looks like the approach of this PR is to use compile-time switches. Among the good things about the approach of #610 is that it fixes the situation where there are separate CPU and GPU versions of everything.
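Roughly, and purely as a hypothetical illustration (the class and method names below are invented, not taken from #610 or this PR), the abstract-interface idea has layers dispatch through a runtime device object instead of through compile-time #ifdef branches:

    // Hypothetical illustration only; names are invented and are not the
    // actual #610 or Caffe APIs.
    class Device {
     public:
      virtual ~Device() {}
      // One virtual entry point per math routine, e.g. a GEMM wrapper.
      virtual void Gemm() = 0;
    };

    class CpuDevice : public Device {
     public:
      void Gemm() { /* call the CPU BLAS here */ }
    };

    class OpenCLDevice : public Device {
     public:
      void Gemm() { /* call clBLAS here */ }
    };

    // A layer holds a Device pointer chosen at runtime, instead of selecting
    // the backend with #ifdef at compile time.
    void Forward(Device* dev) { dev->Gemm(); }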
# uncomment one
CPU_ONLY := 1
In the various Makefile.config.* files, CPU_ONLY := 1 is shown. This is confusing; what is the intended effect if CPU_ONLY is 1 in some and 0 in others? I think there should be a single authoritative source where the decision is made. I'm not sure how all this hangs together, but I'm thinking that since you introduced USE_OPENCL and USE_CUDA, CPU_ONLY should be removed. If neither USE_OPENCL nor USE_CUDA is specified, then what was called CPU_ONLY is what you are left with. I.e. one always gets the CPU implementation. USE_OPENCL and USE_CUDA will determine whether those implementations are available. But maybe you had something else in mind.
The hand-crafted makefiles are confusing. My intention was to have three device-dependent Makefiles:
Makefile.config.CPU
Makefile.config.CUDA
Makefile.config.OpenCL
and allow variables in Makefile.config to select which file should be
included:
CPU_ONLY := 1 // build CPU only version
USE_CUDA := 1 // build CUDA version
USE_OPENCL := 1 // build OpenCL version
The OpenCL version still requires the CPU_ONLY flag to be set in order
to prevent running into CUDA code. The reason is that Caffe branches
between the CPU code and the CUDA code using the preprocessor:
#ifdef CPU_ONLY
#else
#endif
and I haven't fixed that everywhere yet.
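For example, the dispatch typically looks roughly like this (a simplified sketch, not the actual Caffe source; Forward_cpu and Forward_gpu stand in for a layer's CPU and GPU implementations):

    // Simplified sketch, not the actual Caffe source.
    #include <cstdio>

    static void Forward_cpu() { std::printf("CPU path\n"); }
    #ifndef CPU_ONLY
    static void Forward_gpu() { std::printf("GPU path\n"); }
    #endif

    void Forward() {
    #ifdef CPU_ONLY
      Forward_cpu();   // CPU-only build: no GPU code is compiled at all
    #else
      Forward_gpu();   // GPU build: CUDA today, OpenCL with this backend
    #endif
    }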
In the meantime I am adding support to build the OpenCL backend using cmake, which works better and compiles faster.
Robert
Hi @lunochod, I spent some time hacking on this code to get the tests running and (mostly) passing. It looks to me like you did a lot of brilliant work, and it was exciting to see the Caffe tests running on the Radeon GPU. There were a number of changes I needed to make to get there, see:
TEST_OBJS := $(TEST_CXX_OBJS)
TEST_BINS := $(TEST_CXX_BINS)
ALL_WARNS := $(ALL_CXX_WARNS)
TEST_FILTER := --gtest_filter="*GPU*"
I don't think this gtest filter is right. The vast majority of tests that run on the GPU don't have the string "GPU" in their name. With this filter, only about 30 tests were being run. To get them all to run I commented this out.
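For reference, here is a minimal sketch of why that happens (the test names are hypothetical except TestGPUWrite, mentioned below): the filter string is matched against the full Suite.Test name, so "*GPU*" only selects tests whose names literally contain "GPU".

    // Sketch of gtest filtering with hypothetical test names.
    #include <gtest/gtest.h>

    TEST(ConvolutionLayerTest, TestForward) {}   // skipped by --gtest_filter="*GPU*"
    TEST(MathFunctionsTest, TestGPUWrite) {}     // selected by --gtest_filter="*GPU*"

    int main(int argc, char** argv) {
      ::testing::InitGoogleTest(&argc, argv);  // parses --gtest_filter among other flags
      return RUN_ALL_TESTS();
    }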
A couple more things. TestGPUWrite failed, and I haven't tried looking into that yet. CCC shows the Catalyst "Driver Packaging Version" as 13.35.1005.
I can commit some time to helping bring this forward in whatever form the Brewers would find acceptable.
By the way, there's a ton of lint; run make lint. The enforced style is restrictive, a pain, and I don't like it, but we live with it. There are also lots of tabs; they need to be two spaces instead.
make sure cifar10 runs
performance improvements
I like this new collaborative mood.
@lunochod Whether the device is actually available should probably be checked before compiling for an OpenCL device that was detected by the OpenCL implementation.
@nirvik-d has provided the command-line output of clinfo. It lists one platform (CL_PLATFORM_NAME: AMD Accelerated Parallel Processing) and two devices. The source code for clinfo is available on GitHub: https://github.com/Oblomov/clinfo/blob/master/src/clinfo.c It shows that the device listing doesn't exclude devices that are not available. The information provided by clinfo is the same as what my implementation finds when OpenCL gets initialized at startup, which means there is no third, unavailable device as you suspect. But check CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE here: https://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clGetDeviceInfo.html This property is reported by clinfo to be zero for the GPU on the user's platform, which means the GPU does not have the native double support required by the application, and that is the reason for the error. Thanks for trying to help, Fabian.
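As a rough sketch of that kind of check (plain OpenCL C API usage, not code from this branch; the helper name is made up):

    // Rough sketch: verify a detected device is available and reports native
    // double support before building kernels for it. DeviceIsUsable is a
    // made-up name, not part of this branch.
    #include <CL/cl.h>

    bool DeviceIsUsable(cl_device_id device) {
      cl_bool available = CL_FALSE;
      clGetDeviceInfo(device, CL_DEVICE_AVAILABLE,
                      sizeof(available), &available, NULL);

      cl_uint double_width = 0;
      clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE,
                      sizeof(double_width), &double_width, NULL);

      // A preferred double vector width of 0 means no native double support.
      return available == CL_TRUE && double_width > 0;
    }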
Hi @robert, just a quick observation: you are working in what is now a very crowded space. It seems like the Caffe OpenCL ports are now largely feature-complete, but Theano and Chainer are still CUDA-specific.
For all the followers of this PR, a nice performance thread on benchmark results was started at soumith/convnet-benchmarks#56
Hi, I found a problem in this method:

    for (ClPlatformsIter it = cl_platforms.begin(); it != cl_platforms.end();
         ++it) {
      // Wrap each detected platform in a shared_ptr and query it.
      std::tr1::shared_ptr<OpenCLPlatform> pp =
          std::tr1::shared_ptr<OpenCLPlatform>(new OpenCLPlatform(*it));
      if (!pp->Query()) {
        LOG(ERROR) << "failed to query platform.";
        return false;
      }
      platforms_.push_back(pp);
    }

I fixed this method by myself and it works. I thought it was just a minor bug in other people's work, so I didn't make branches, pull requests, or any of the complex things. Thanks.
@sungbin0105 Good catch. Thanks, I'll fix that. Robert
Can caffe process the 'device_query' command?
(personal observation: this is such a crazy-long PR :-P Maybe worth creating as a separate fork, with its own issue tracker? Just an idea :-) )
I got an error with both ViennaCL 1.6.2 and 1.7.0. The Caffe examples also fail.
@lunochod thanks for your continued efforts on this! I hope that all the OpenCL branches out there can now be coordinated. Happy hacking.
Hey, I'm on Kubuntu 14.04 with a Radeon 290X GPU. I can run the cifar test just fine, but the unit tests keep causing a system-wide failure at a certain point. Typescripting the process reveals which test is the culprit. I'm unsure why this particular test would fail. Any insight?
Hi, I get this build error: ./include/caffe/util/OpenCL/OpenCLMemory.hpp:44:3: error: ‘caffe::OpenCLMemory::OpenCLMemory(const caffe::OpenCLMemory&)’ is private. Which old commit should I use? Could someone help?
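(A private copy constructor is the usual pre-C++11 way of making a class noncopyable, so an error like this generally means an OpenCLMemory object is being copied, e.g. passed or returned by value, somewhere in the build. A generic illustration, not the actual Caffe code:)

    // Generic illustration (not the actual Caffe code): a class whose copy
    // constructor is private cannot be passed or returned by value.
    class NonCopyable {
     public:
      NonCopyable() {}
     private:
      NonCopyable(const NonCopyable&);             // private: copying disallowed
      NonCopyable& operator=(const NonCopyable&);
    };

    void TakesByReference(const NonCopyable& m) {}   // OK: no copy is made
    // void TakesByValue(NonCopyable m) {}           // would not compile:
    //                                               // copy constructor is private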
AMD just released Caffe built with HIP and HCC (https://bitbucket.org/multicoreware/hccaffe/overview). I'm just wondering how many people who have been helping out with this project would support the new effort from AMD.
@naibaf7 What is the policy now? I think it does not make sense to have OpenCL PRs opened against master now that we have an official OpenCL branch.
@briansp2020 I would be interested in running Caffe on AMD hardware. Do you have any performance numbers for HcCaffe in comparison with the OpenCL branch? I do not expect it to be as fast as Caffe with cuDNN, but it would be nice if it matched the CUDA implementation without cuDNN...
Pull requests on OpenCL should now be made against https://github.com/BVLC/caffe/tree/opencl
Robert, how can I cite your OpenCL Caffe fork?
(need your surname too, probably)
This is you? https://www.linkedin.com/in/robert-engel-131b87107
That is Robert.
Can I get this to work on FreeBSD, which has no ROCm support? I have an AMD RX 580 GPU and would like to run Caffe with OpenCL on an AMD GPU on FreeBSD 13.1. I am also getting an OpenCL error. Thanks.
About
The proposed changes add OpenCL support to Caffe. All GPU functions can be executed using AMD GPUs with OpenCL 1.2 or 2.0, as well as nVidia GPUs with OpenCL 1.1.
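A quick way to check which OpenCL version a device advertises (generic OpenCL C API usage, not part of these changes):

    // Generic OpenCL C API sketch (not part of these changes): print the
    // OpenCL version string advertised by each device of the first platform.
    #include <CL/cl.h>
    #include <cstdio>

    int main() {
      cl_platform_id platform;
      cl_uint num_platforms = 0;
      if (clGetPlatformIDs(1, &platform, &num_platforms) != CL_SUCCESS
          || num_platforms == 0) {
        return 1;  // no OpenCL platform found
      }

      cl_device_id devices[8];
      cl_uint num_devices = 0;
      clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 8, devices, &num_devices);

      for (cl_uint i = 0; i < num_devices; ++i) {
        char version[256] = {0};
        clGetDeviceInfo(devices[i], CL_DEVICE_VERSION,
                        sizeof(version), version, NULL);
        std::printf("device %u: %s\n", i, version);  // e.g. "OpenCL 1.2 AMD-APP ..."
      }
      return 0;
    }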
Build Instructions
https://github.com/lunochod/caffe/wiki/OpenCL-Backend
OpenCL Tests
All GPU tests successfully complete using this OpenCL version of Caffe.
Performance and Stability
The main goal was to provide an OpenCL port to the Caffe community. As such it is not yet optimized for performance or stability.
Help Wanted
Let's make it better and faster together.