[Bug] ContinueStmt failed to trigger on CUDA backend #8057

Closed
jim19930609 opened this issue May 22, 2023 · 0 comments · Fixed by #8113
jim19930609 commented May 22, 2023

Issue from forum: https://forum.taichi-lang.cn/t/topic/4342

Reproduce

import taichi as ti

ti.init(arch=ti.gpu)
img = ti.field(ti.i32, (2, 2))

@ti.kernel
def K():
    for i, j in img:
        img[i, j] = 0
        # Every element except (0, 0) should skip the final store and stay 0.
        if i > 0 or j > 0:
            continue
        img[i, j] = 1

    for i, j in img:
        print(i, j, img[i, j])

img.fill(2)
K()

Output

[CUDA - incorrect]

[Taichi] version 1.7.0, llvm 15.0.4, commit 5712f644, linux, python 3.10.6
[Taichi] Starting on arch=cuda
0 0 1
0 1 2
1 0 2
1 1 2

[CPU - reference]

[Taichi] version 1.7.0, llvm 15.0.4, commit 5712f644, linux, python 3.10.6
[Taichi] Starting on arch=x64
0 0 1
0 1 0
1 0 0
1 1 0
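
For comparison, the expected values can be derived without Taichi by running the same loop body serially in plain Python. This is only a sketch: `expected_img` is a hypothetical helper, not part of Taichi, and it models the field as a nested list. It reproduces the CPU reference output. Note that in the CUDA output the cells other than (0, 0) still hold the fill value 2, so the store of 0 that precedes `continue` appears to be lost as well.

```python
def expected_img(shape=(2, 2), initial=2):
    """Plain-Python stand-in for kernel K: same loop body, run serially."""
    rows, cols = shape
    img = [[initial] * cols for _ in range(rows)]  # mirrors img.fill(2)
    for i in range(rows):
        for j in range(cols):
            img[i][j] = 0
            if i > 0 or j > 0:
                continue  # skip the final store for every cell except (0, 0)
            img[i][j] = 1
    return img


if __name__ == "__main__":
    for i, row in enumerate(expected_img()):
        for j, value in enumerate(row):
            print(i, j, value)
```
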
github-project-automation bot moved this to Untriaged in Taichi Lang May 22, 2023
jim19930609 self-assigned this May 26, 2023
jim19930609 moved this from Untriaged to Todo in Taichi Lang May 26, 2023
github-project-automation bot moved this from Todo to Done in Taichi Lang Jun 2, 2023
jim19930609 added a commit that referenced this issue Jun 2, 2023
…floadedStmts on LLVM backend (#8113)

Issue: fix #8057

### Brief Summary

### <samp>🤖 Generated by Copilot at 33e01dc</samp>

This pull request refactors the codegen modules and the IR cloning
mechanism to support the new IR structure that wraps offloaded tasks in
`Block` nodes. It updates the `compile_task` functions of different
backends to take an `IRNode` pointer instead of an `OffloadedStmt`
pointer. It also adds a new `clone` function to
`taichi/analysis/clone.cpp` and a new `special` field to `IRNode` to
facilitate the cloning process.

### Walkthrough

### <samp>🤖 Generated by Copilot at 33e01dc</samp>

* Add a new function `clone` to deep copy statements
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-4d1cd675a2ba91408b0e3fe7187f44bb1eaaf9ce1c316a7de3f4a044558ac2eaR137-R144),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-f6bc75768d2e24c782fefa45a7232d0e2b2bae091e697040e7f442a77d80ad45R80))
* Modify `compile_task` and its per-backend overrides to take an
`IRNode` pointer instead of an `OffloadedStmt` pointer as the last
argument
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-f45ad00d5bebbc900cf84ab5b4f0b634fe8ca7a99770788c2cb5b055612c1e85L475-R477),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-2dca4a15a3c5e43017b458e9a19d8edabade6e4489c719aea88819e9dc34c285L25-R25),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-75d426482e598b420f6f3bed213844bbfcca397be6d3a3371c9ff8128275a6fdL66-R66),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-b6e86fbdf536db46b321f67942f66d809c213a4142ceb9f5f81d016684c2d5c8L237-R239),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-9bc17077ecc24636db6f9c92d14d55bdcdb87c18379d72a06a9e84486d20b060L27-R27),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-50537ad5ea3b900c0d55a088f3cc285986340ad68c9b96fea481187c4dce49eaL762-R764),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-15ab2a7729595aeafa4c5a77a6c97e9a7e9d43ffd49210b86d940b6394e58475L25-R25),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-b26bcd66b9334fe882df039d1c619f16f0f91f6c43c3872b6cbb2688b8e7b749L275-R279),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-d4ff49adc577edb7d11d8e69b2fb1e97162aaf01ef896e588a35cc8a20e061b1L32-R32))
* Add a new field `special` to class `IRNode` to indicate whether the
node is a special node that should not be cloned or modified by certain
passes
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-61484fa2a50e309478017fb2a436198aa4b0afdf72a4039bf574fc4f2aedbe4eR228-R229))
* Set the `special` field of the root node of the IR to `true` before
cloning the offloaded statements and to `false` after cloning them in
the function `offload_to_executable` in
`taichi/transforms/compile_to_offloads.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bR255-R256),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL271-R273))
* Wrap the cloned offloaded statements in a `Block` node before passing
them to `compile_task` in the functions `compile_kernel_to_module` in
`taichi/codegen/codegen.cpp` and `compile` in
`taichi/codegen/dx12/codegen_dx12.cpp`
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-54b0d16247300543741692be7cec4b05993efa661e71ad3d69d5da85fcbc7782L89-R92),
[link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-b26bcd66b9334fe882df039d1c619f16f0f91f6c43c3872b6cbb2688b8e7b749L254-R265))
* Use the `task_name` method of the `IRNode` to get the name of the
offloaded task in the function `compile` in
`taichi/codegen/dx12/codegen_dx12.cpp`, instead of assuming it is an
`OffloadedStmt`
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-b26bcd66b9334fe882df039d1c619f16f0f91f6c43c3872b6cbb2688b8e7b749L254-R265))
* Remove the calls to the `scalarize` and `full_simplify` passes in the
function `offload_to_executable` in
`taichi/transforms/compile_to_offloads.cpp`, which are not needed after
cloning the offloaded statements
([link](https://github.com/taichi-dev/taichi/pull/8113/files?diff=unified&w=0#diff-8fde186587db97b3bbc8a856e59bc4467b30257335b0fad064b4eebd521a912bL271-R273))
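
The clone-then-wrap flow described in the walkthrough can be pictured with a toy model. The sketch below is plain Python, not Taichi's C++ code: the classes, the `task_name` attribute, the task name `range_for_0`, and the bodies of `clone` and `compile_task` are all assumptions made for illustration. It only shows the shape of the idea, which is to deep-copy an offloaded statement without its parent, place the copy in a fresh `Block`, and hand that block to a `compile_task` that accepts any IR node.

```python
import copy


class IRNode:
    """Toy stand-in for an IR node; not Taichi's real class."""

    def __init__(self):
        self.parent = None


class OffloadedStmt(IRNode):
    def __init__(self, task_name, body):
        super().__init__()
        self.task_name = task_name  # hypothetical attribute
        self.body = list(body)


class Block(IRNode):
    def __init__(self):
        super().__init__()
        self.statements = []

    def insert(self, stmt):
        stmt.parent = self
        self.statements.append(stmt)


def clone(stmt):
    """Deep-copy one statement without copying its parent block."""
    saved_parent, stmt.parent = stmt.parent, None
    try:
        return copy.deepcopy(stmt)
    finally:
        stmt.parent = saved_parent  # leave the original tree untouched


def compile_task(node):
    """Accept any IRNode; this toy version only reports the task names."""
    if isinstance(node, Block):
        return [s.task_name for s in node.statements]
    return [node.task_name]


# Original IR: a root block holding one offloaded task.
root = Block()
root.insert(OffloadedStmt("range_for_0", ["store 0", "continue", "store 1"]))

# Clone the task, wrap the clone in its own Block, then compile the wrapper.
wrapper = Block()
wrapper.insert(clone(root.statements[0]))
compiled = compile_task(wrapper)
```

In this model the clone always has a `Block` parent of its own while the original statement stays attached to the root.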