[OpenCL] Refactor cl_program generation #7834

Merged
merged 6 commits into from
May 1, 2021

csullivan (Contributor)

I have encountered a few pathological bugs in the OpenCL compiler provided on the Snapdragon Android platform (e.g. the compiler hanging for 5+ hours in a call to clBuildProgram, and non-deterministic emission of cl_a6x_cmdbuf_mgr_submit_ibs). I've isolated them into a minimal reproducible example and found that they occur only when all kernels are created from a single cl_program. If a cl_program is instead created for each kernel, these issues are avoided.

This PR proposes adding a kernel primitive delimiter to the OpenCL code generation, and having the OpenCL module runtime use this delimiter to build and cache a separate cl_program for each generated kernel source.
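To make the delimiter idea concrete, here is a minimal sketch of how generated source could be split into per-kernel sources. It is not the code from this PR: the delimiter string, the function name `SplitKernelsSketch`, and the `max_kernels` bound are all hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical delimiter; the marker actually emitted by the code generator may differ.
const std::string kDelimiter = "// Function: ";

// Split `source` into a map from kernel name to that kernel's source text.
// Assumes each kernel starts with a line of the form "<delimiter><name>\n".
std::unordered_map<std::string, std::string> SplitKernelsSketch(
    const std::string& source, std::size_t max_kernels = 10000) {
  std::unordered_map<std::string, std::string> kernels;
  std::size_t begin = source.find(kDelimiter);
  // Bound the number of iterations so malformed input cannot loop indefinitely.
  for (std::size_t n = 0; begin != std::string::npos && n < max_kernels; ++n) {
    std::size_t name_start = begin + kDelimiter.size();
    std::size_t name_end = source.find('\n', name_start);
    if (name_end == std::string::npos) break;  // delimiter line with no body
    std::string name = source.substr(name_start, name_end - name_start);
    // The kernel body runs from the line after the name to the next delimiter.
    std::size_t body_start = name_end + 1;
    std::size_t next = source.find(kDelimiter, name_end);
    std::size_t body_len =
        (next == std::string::npos) ? std::string::npos : next - body_start;
    kernels[name] = source.substr(body_start, body_len);
    begin = next;
  }
  return kernels;
}
```

Each entry of the returned map could then be handed to its own program build, so that a compiler failure on one kernel does not involve the source of the others.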

Refactor OpenCL runtime module to build separate cl_programs
for each kernel. This can avoid pathological bugs in the
vendor specific OpenCL compiler that may be triggered
with large programs.
src/runtime/opencl/opencl_module.cc (review thread, outdated and resolved)
tqchen (Member) left a comment

minor nits, other parts lgtm

src/runtime/opencl/opencl_common.h (review thread, resolved)
tqchen merged commit 2215d73 into apache:main on May 1, 2021
tqchen (Member) commented May 1, 2021

Thanks @csullivan, this is merged.

umangyadav pushed a commit to umangyadav/tvm that referenced this pull request May 5, 2021
* Refactor OpenCL runtime module to build separate cl_programs
for each kernel. This can avoid pathological bugs in the
vendor specific OpenCL compiler that may be triggered
with large programs.

* clang-format

* Remove check on program size when deconstructing.

* Refactor into SplitKernels method.

* Limit number of loops for kernel parsing

* Add return doc for SplitKernels per CR.
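The first bullet above, building a separate program for each kernel, might be sketched as a small cache keyed by kernel name. This is an illustrative assumption rather than the merged implementation: `PerKernelProgramCache`, `ProgramHandle`, and the build-on-first-use policy are hypothetical, and the build callback stands in for the real OpenCL calls (`clCreateProgramWithSource` followed by `clBuildProgram`) so the example stays self-contained.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for a built program handle (a cl_program in real OpenCL code).
using ProgramHandle = int;

// Hypothetical cache that builds one program per kernel, on first use.
class PerKernelProgramCache {
 public:
  using BuildFn = std::function<ProgramHandle(const std::string& source)>;

  PerKernelProgramCache(std::unordered_map<std::string, std::string> sources,
                        BuildFn build)
      : sources_(std::move(sources)), build_(std::move(build)) {}

  // Return the program for `kernel_name`, building it only the first time.
  // Throws std::out_of_range if no source exists for that kernel name.
  ProgramHandle Get(const std::string& kernel_name) {
    auto it = programs_.find(kernel_name);
    if (it != programs_.end()) return it->second;  // already built
    ProgramHandle program = build_(sources_.at(kernel_name));
    programs_.emplace(kernel_name, program);
    return program;
  }

 private:
  std::unordered_map<std::string, std::string> sources_;     // per-kernel source
  BuildFn build_;                                            // compiles one source
  std::unordered_map<std::string, ProgramHandle> programs_;  // built programs
};
```

With this shape, each compiler invocation sees only one kernel's source, which is the property the PR description relies on to avoid the vendor compiler bugs.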
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021 (same commit message as above)
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021 (same commit message as above)