Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIR] Return error code from kernels in SplitHostDevice #15241

Merged
merged 7 commits into from
Jul 18, 2023

Conversation

Lunderberg
Copy link
Contributor

Some codegen types delegate to CodeGenCPU for their compute kernels, as they may delegate work to packed functions. Because CodeGenCPU assumes that it can return an error code at any point (e.g. when launching a parallel for loop), the compute kernel should return an error code.

This PR resolves a potential bug introduced in #15127, which removed the hard-coded override of return type.

The unit tests in this PR rely on changes made in #15239, to allow TVMScript to represent the call into a compute kernel with non-void return type.

Prior to this commit, the return type of all internal function calls
was hard-coded as `"void"`.  After this commit, the `GlobalVar`
representing the internal function has type annotation based on the
callee's signature, which is then used as the return type of the
internal call.
@tvm-bot
Copy link
Collaborator

tvm-bot commented Jul 5, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

Some codegen types delegate to `CodeGenCPU` for their compute kernels,
as they may delegate work to packed functions.  Because `CodeGenCPU`
assumes that it can return an error code at any point (e.g. when
launching a parallel for loop), the compute kernel should return an
error code.
@Lunderberg Lunderberg force-pushed the split_host_device_with_error_code branch from 032ced1 to af722ad Compare July 5, 2023 20:49
@tqchen
Copy link
Member

tqchen commented Jul 7, 2023

@Lunderberg can you clarify an example that leverages CodegenCPU for kernel. Just want to make sure that our existing path of GPU kernel separated codegen continues to function as error there are propagated by call packed mechanism

@Lunderberg
Copy link
Contributor Author

can you clarify an example that leverages CodegenCPU for kernel.

@tqchen Certainly. This mainly comes up in a few edge cases found when debugging a single-module lowering flow (#14985), used for #14862. The issue arose when a kDLExtDev target or a custom TIRToRuntime hook was implemented by subclassing CodeGenCPU or CodeGenCHost (e.g. CodeGenCMSISNN). In those cases, the base class assumes that it is safe to return an error code (e.g. in CodeGenCHost::PrintGetFuncFromBackend, even if that occurs within a portion that has been separated into an independent function.

These cases are mostly suppressed by the fix in #15102, but can still happen if there's an explicit T.target("my_custom_extension", host="llvm"). In those cases, the compute kernels occur within a function generated by "my_custom_extension", with the DLTensor-unpacking should still be handled by the usual LLVM codegen.

Just want to make sure that our existing path of GPU kernel separated codegen continues to function as error there are propagated by call packed mechanism

Definitely agreed. I updated the original PR to limit the int32_t return type to targets that may be executed on the CPU, so that the separated GPU kernels are unaffected. This is sufficient for the functionality in #14862, while avoiding changes to the GPU path.

@tqchen tqchen merged commit 2eca9f0 into apache:main Jul 18, 2023
6 checks passed
junrushao pushed a commit to junrushao/tvm that referenced this pull request Jul 24, 2023
* [TVMScript] Handle parsing of PrimFunc calls with non-void return

Prior to this commit, the return type of all internal function calls
was hard-coded as `"void"`.  After this commit, the `GlobalVar`
representing the internal function has type annotation based on the
callee's signature, which is then used as the return type of the
internal call.

* Update CallNode return type in MakeUnpackedAPI

* [TIR] Return error code from kernels in SplitHostDevice

Some codegen types delegate to `CodeGenCPU` for their compute kernels,
as they may delegate work to packed functions.  Because `CodeGenCPU`
assumes that it can return an error code at any point (e.g. when
launching a parallel for loop), the compute kernel should return an
error code.

* [TIR] Remove builtin::ret(0) from device-side kernel

* Restrict the int32 return type to targets that need to propagate errors

* Updated unit tests for CPU-specific checks
junrushao pushed a commit to junrushao/tvm that referenced this pull request Jul 27, 2023
* [TVMScript] Handle parsing of PrimFunc calls with non-void return

Prior to this commit, the return type of all internal function calls
was hard-coded as `"void"`.  After this commit, the `GlobalVar`
representing the internal function has type annotation based on the
callee's signature, which is then used as the return type of the
internal call.

* Update CallNode return type in MakeUnpackedAPI

* [TIR] Return error code from kernels in SplitHostDevice

Some codegen types delegate to `CodeGenCPU` for their compute kernels,
as they may delegate work to packed functions.  Because `CodeGenCPU`
assumes that it can return an error code at any point (e.g. when
launching a parallel for loop), the compute kernel should return an
error code.

* [TIR] Remove builtin::ret(0) from device-side kernel

* Restrict the int32 return type to targets that need to propagate errors

* Updated unit tests for CPU-specific checks
junrushao pushed a commit to junrushao/tvm that referenced this pull request Jul 30, 2023
* [TVMScript] Handle parsing of PrimFunc calls with non-void return

Prior to this commit, the return type of all internal function calls
was hard-coded as `"void"`.  After this commit, the `GlobalVar`
representing the internal function has type annotation based on the
callee's signature, which is then used as the return type of the
internal call.

* Update CallNode return type in MakeUnpackedAPI

* [TIR] Return error code from kernels in SplitHostDevice

Some codegen types delegate to `CodeGenCPU` for their compute kernels,
as they may delegate work to packed functions.  Because `CodeGenCPU`
assumes that it can return an error code at any point (e.g. when
launching a parallel for loop), the compute kernel should return an
error code.

* [TIR] Remove builtin::ret(0) from device-side kernel

* Restrict the int32 return type to targets that need to propagate errors

* Updated unit tests for CPU-specific checks
@Lunderberg Lunderberg deleted the split_host_device_with_error_code branch July 31, 2023 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants