[TIR] Return error code from kernels in SplitHostDevice #15241

Lunderberg · 2023-07-05T16:35:33Z

Some codegen types delegate to `CodeGenCPU` for their compute kernels, as they may delegate work to packed functions. Because `CodeGenCPU` assumes that it can return an error code at any point (e.g. when launching a parallel for loop), the compute kernel should return an error code.
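As a rough illustration of why the kernel needs a non-void return type, here is a minimal plain-Python model of the control flow. It is not TVM code: `launch_parallel_for`, `compute_kernel`, `host_function`, and the negative-input failure condition are all hypothetical stand-ins chosen for this sketch.

```python
# Plain-Python model of error-code propagation. Not TVM code; every name
# here is a hypothetical stand-in for generated code or runtime helpers.

def launch_parallel_for(body, num_tasks):
    """Stand-in for a runtime helper that can fail partway through."""
    for task_id in range(num_tasks):
        err = body(task_id)
        if err != 0:
            # Early exit, mirroring a generated `return err;` statement.
            return err
    return 0


def compute_kernel(data, out):
    """Kernel with an integer return so inner failures can surface."""
    def body(i):
        if data[i] < 0:
            return -1  # hypothetical failure condition for this sketch
        out[i] = data[i] * 2
        return 0

    # If the kernel were void, this error code would have nowhere to go.
    return launch_parallel_for(body, len(data))


def host_function(data):
    """Host side: call the kernel and forward any non-zero error code."""
    out = [0] * len(data)
    err = compute_kernel(data, out)
    if err != 0:
        return err, None
    return 0, out
```

In this model, `host_function([1, 2, 3])` succeeds with error code 0, while an input containing a negative value stops early and forwards the kernel's error code to the caller.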

This PR resolves a potential bug introduced in #15127, which removed the hard-coded override of the return type.

The unit tests in this PR rely on changes made in #15239, to allow TVMScript to represent the call into a compute kernel with non-void return type.

Prior to this commit, the return type of all internal function calls was hard-coded as `"void"`. After this commit, the `GlobalVar` representing the internal function has type annotation based on the callee's signature, which is then used as the return type of the internal call.
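The change in how the call's return type is derived can be sketched with a few toy classes. This is only a model under stated assumptions: `ToyPrimFunc`, `ToyGlobalVar`, `ToyCall`, `make_global_var`, and `make_call` are hypothetical names for this example and are not TVM's classes or API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyPrimFunc:
    """Toy callee: only the pieces of the signature that matter here."""
    params: Tuple[str, ...]
    ret_type: str


@dataclass(frozen=True)
class ToyGlobalVar:
    """Toy handle to an internal function, annotated with its return type."""
    name: str
    ret_type: str


@dataclass(frozen=True)
class ToyCall:
    """Toy call node; `dtype` is the type of the call expression."""
    callee: ToyGlobalVar
    args: Tuple[str, ...]
    dtype: str


def make_global_var(name: str, func: ToyPrimFunc) -> ToyGlobalVar:
    # The annotation comes from the callee's signature, not a constant.
    return ToyGlobalVar(name, func.ret_type)


def make_call(gvar: ToyGlobalVar, *args: str) -> ToyCall:
    # Before the change this would have been hard-coded to "void".
    return ToyCall(gvar, args, gvar.ret_type)
```

With this model, a call to a kernel declared as returning `"int32"` is itself typed `"int32"`, and a call to a `"void"` function stays `"void"`.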

tvm-bot · 2023-07-05T16:35:36Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @Hzfengsy, @junrushao, @quic-sanirudh, @shingjan (See #10317 for details)

Generated by tvm-bot


tqchen · 2023-07-07T18:32:24Z

@Lunderberg can you give an example that uses `CodeGenCPU` for a kernel? I just want to make sure that our existing path, where GPU kernels go through a separate codegen, continues to function, since errors there are propagated by the call-packed mechanism.

Lunderberg · 2023-07-07T19:33:08Z

> can you clarify an example that leverages CodegenCPU for kernel.

@tqchen Certainly. This mainly comes up in a few edge cases found when debugging a single-module lowering flow (#14985), used for #14862. The issue arose when a `kDLExtDev` target or a custom `TIRToRuntime` hook was implemented by subclassing `CodeGenCPU` or `CodeGenCHost` (e.g. `CodeGenCMSISNN`). In those cases, the base class assumes that it is safe to return an error code (e.g. in `CodeGenCHost::PrintGetFuncFromBackend`), even if that occurs within a portion that has been separated into an independent function.

These cases are mostly suppressed by the fix in #15102, but can still happen if there's an explicit `T.target("my_custom_extension", host="llvm")`. In those cases, the compute kernels occur within a function generated by `"my_custom_extension"`, while the `DLTensor` unpacking should still be handled by the usual LLVM codegen.

> Just want to make sure that our existing path of GPU kernel separated codegen continues to function as error there are propagated by call packed mechanism

Definitely agreed. I updated the original PR to limit the int32_t return type to targets that may be executed on the CPU, so that the separated GPU kernels are unaffected. This is sufficient for the functionality in #14862, while avoiding changes to the GPU path.
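The restriction can be pictured as a small decision on the device type. The sketch below is plain Python and only an illustration: `kernel_return_type` is a hypothetical helper, and the exact set of device types treated as CPU-executable is an assumption made for this example (only `kDLExtDev` and CPU execution are mentioned in this thread).

```python
# Hypothetical model of the return-type choice; not the actual pass code.
# ASSUMPTION: the set below is illustrative. The real pass may check a
# different or larger set of device types.
CPU_EXECUTABLE_DEVICE_TYPES = {"kDLCPU", "kDLExtDev"}


def kernel_return_type(device_type: str) -> str:
    """Pick the return type of a kernel split out for `device_type`."""
    if device_type in CPU_EXECUTABLE_DEVICE_TYPES:
        # The kernel may run through a CPU-style codegen, so it needs a
        # way to hand an error code back to the host function.
        return "int32"
    # Separated GPU kernels keep their void signature; errors there are
    # reported through the packed-call mechanism instead.
    return "void"
```

Under these assumptions, `kernel_return_type("kDLCUDA")` stays `"void"`, so the existing GPU path sees no change in kernel signatures.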

* [TVMScript] Handle parsing of PrimFunc calls with non-void return

  Prior to this commit, the return type of all internal function calls was hard-coded as `"void"`. After this commit, the `GlobalVar` representing the internal function has type annotation based on the callee's signature, which is then used as the return type of the internal call.

* Update CallNode return type in MakeUnpackedAPI

* [TIR] Return error code from kernels in SplitHostDevice

  Some codegen types delegate to `CodeGenCPU` for their compute kernels, as they may delegate work to packed functions. Because `CodeGenCPU` assumes that it can return an error code at any point (e.g. when launching a parallel for loop), the compute kernel should return an error code.

* [TIR] Remove builtin::ret(0) from device-side kernel

* Restrict the int32 return type to targets that need to propagate errors

* Updated unit tests for CPU-specific checks

Lunderberg added 5 commits July 5, 2023 15:48

Update CallNode return type in MakeUnpackedAPI

692b174

Merge branch 'tvmscript_subroutine_call_returning_nonvoid_pr_15239' into HEAD

698a8e5

[TIR] Remove builtin::ret(0) from device-side kernel

1851d60

Restrict the int32 return type to targets that need to propagate errors

af722ad

Lunderberg force-pushed the split_host_device_with_error_code branch from 032ced1 to af722ad Compare July 5, 2023 20:49

Updated unit tests for CPU-specific checks

904d1dc

tqchen approved these changes Jul 18, 2023


tqchen merged commit 2eca9f0 into apache:main Jul 18, 2023
6 checks passed

Lunderberg deleted the split_host_device_with_error_code branch July 31, 2023 17:03

ysh329 mentioned this pull request Oct 18, 2023

[Release] v0.14.0 Release Candidate Notes #15948

Closed
