Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Struct field misbehavior on gfx backends #8133

Open
listerily opened this issue Jun 5, 2023 · 5 comments
Open

Struct field misbehavior on gfx backends #8133

listerily opened this issue Jun 5, 2023 · 5 comments
Assignees

Comments

@listerily
Copy link
Contributor

listerily commented Jun 5, 2023

Describe the bug
Struct field misbehavior on gfx backends.

To Reproduce
function k() is expected to return 0, but it returns 1 instead.

import taichi as ti

ti.init(ti.vulkan)

@ti.dataclass
class S:
    i: ti.i32
    b: ti.i16

sf = S.field(shape=(10,  2))

sf[0, 0].b = 1

@ti.kernel
def k() -> int:
    return sf[0, 0].i

print(k())

Log/Screenshots

[Taichi] version 1.6.0, llvm 15.0.4, commit f1c6fbbd, linux, python 3.11.3
[Taichi] Starting on arch=vulkan
1

Process finished with exit code 0

Additional comments
There may be some problems related to field layout. If we replace the type of b with ti.i32, such error would not appear.

@github-project-automation github-project-automation bot moved this to Untriaged in Taichi Lang Jun 5, 2023
@lin-hitonami lin-hitonami moved this from Untriaged to Todo in Taichi Lang Jun 9, 2023
@jim19930609
Copy link
Contributor

It's not an issue with Struct Field only, but a more generic error with placing i32 and i16 on the same Dense SNode:

import taichi as ti

ti.init(ti.vulkan)

i = ti.field(ti.i32)
b = ti.field(ti.i16)

dense = ti.root.dense(ti.ij,(10, 2))
dense.place(i)
dense.place(b)

@ti.kernel
def test2() -> ti.i32:
    b[0, 0] = 2
    return i[0, 0]

x = test2()
print(x)

@jim19930609
Copy link
Contributor

jim19930609 commented Jun 19, 2023

Verified that the compiled SNodeTree looks fine, which is also consistent with access instructions in the Kernel:

Compiled SNodeTree:

[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@124] SNodeDescriptor
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@125] * snode=2
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@127] * type=S2 (is_place=true)
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@128] * cell_stride=4
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@130] * num_cells_per_container=1
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@131] * container_stride=4
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@133] * total_num_cells_from_root=20
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@134] 
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@124] SNodeDescriptor
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@125] * snode=3
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@127] * type=S3 (is_place=true)
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@128] * cell_stride=2
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@130] * num_cells_per_container=1
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@131] * container_stride=2
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@133] * total_num_cells_from_root=20
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@134] 
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@124] SNodeDescriptor
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@125] * snode=1
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@127] * type=S1 (is_place=false)
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@128] * cell_stride=6
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@130] * num_cells_per_container=20
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@131] * container_stride=120
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@133] * total_num_cells_from_root=20
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@134] 
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@124] SNodeDescriptor
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@125] * snode=0
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@127] * type=S0 (is_place=false)
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@128] * cell_stride=120
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@130] * num_cells_per_container=1
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@131] * container_stride=120
[T 06/19/23 10:45:32.821 510446] [snode_struct_compiler.cpp:compute_snode_size@133] * total_num_cells_from_root=1

Kernel:

[I 06/19/23 10:45:32.835 510446] [ir_printer.cpp:operator()@892] [test2_c76_0] Simplified IV:
kernel {
  $0 = offloaded  
  body {
    <i16> $1 = const 2
    <*gen> $2 = get root [S0root][root]
    <i32> $3 = const 0
    <*gen> $4 = [S0root][root]::lookup($2, $3) activate = false
    <*gen> $5 = get child [S0root->S1dense] $4
    <*gen> $6 = [S1dense][dense]::lookup($5, $3) activate = false
    <*i16> $7 = get child [S1dense->S3place<i16>] $6
    $8 : global store [$7 <- $1]
    <*i32> $9 = get child [S1dense->S2place<i32>] $6
    <i32> $10 = global load $9
    $11 : return tmp10
  }
}

@jim19930609
Copy link
Contributor

Generated SpirV code also looks fine:

SpirV crossed to glsl:

layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;

layout(binding = 2, std430) buffer root_buffer_0_u16_ptr
{
    uint16_t _m0[];
} root_buffer_0_u16;

layout(binding = 3, std430) buffer root_buffer_0_u32_ptr
{
    uint _m0[];
} root_buffer_0_u32;

layout(binding = 1, std430) buffer rets_t
{
    int _m0;
} rets;

void main()
{
    if (gl_GlobalInvocationID.x == 0u)
    {
        root_buffer_0_u16._m0[0u] = 2us;
        rets._m0 = int(root_buffer_0_u32._m0[0u]);
    }
}

@jim19930609
Copy link
Contributor

jim19930609 commented Jun 19, 2023

Now the top-1 suspect is that we made a mistake in binding ti.i32 (aka root_buffer_0_u32_ptr) and ti.i16 (aka root_buffer_0_u16_ptr) Place SNodes,

@jim19930609
Copy link
Contributor

jim19930609 commented Jun 19, 2023

Okay I don't think it's easy to support mixed dtypes of variant number of bits. GFX backend used to support mixed dtypes with the same number of bits, say ti.f32 and ti.i32, by storing them into an array of ti.u32 and then cast the loaded value or the value to store accordingly:

SpirVField drawio

Now things are different when it comes to ti.i16 and ti.i32:

SpirVField drawio2

To store them into a single root buffer, we'll have to make an array of ti.u16:

SpirVField drawio3

But instead of casting the loaded value, this time we'll have to cast the pointer before OpLoad. Such pointer cast, to my knowledge, isn't supported in SpirV.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

No branches or pull requests

2 participants