The method from_numpy() not working on Mac when using Vulkan backend #6295

Closed
Linyou opened this issue Oct 11, 2022 · 4 comments

@Linyou
Contributor

Linyou commented Oct 11, 2022

Describe the bug
In the latest nightly build, the method from_numpy() still does not work on macOS with the Vulkan backend, although it works on Metal.

To Reproduce

import taichi as ti
import numpy as np

ti.init(arch=ti.vulkan)

@ti.kernel
def from_numpy(dst: ti.template(), src: ti.types.ndarray()):
    for I in ti.grouped(src):
        dst[I] = src[I]

def from_numpy_python(dst, src):
    for i in range(dst.shape[0]):
        dst[i] = src[i]

def print_array(name, arr, n):
    print(f"{name}: [", end="")
    for i in range(n):
        if i < n - 1:
            print(arr[i], end=", ")
        else:
            print(arr[i], end="]\n")

a = ti.field(dtype=ti.f32, shape=(10,))
float32 = np.ones((10,)).astype(np.float32)
print("float32: ", float32)

a.from_numpy(float32)
print_array('taichi-numpy', a, 10)

from_numpy(a, float32)
print_array('kernel-numpy', a, 10)

from_numpy_python(a, float32)
print_array('python-numpy', a, 10)

Log/Screenshots
Using Vulkan backend

$ python taichi_test.py
[Taichi] version 1.1.4, llvm 10.0.0, commit a3eb8d17, osx, python 3.9.13
[mvk-info] MoltenVK version 1.1.11, supporting Vulkan version 1.1.224.

float32:  [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
taichi-numpy: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kernel-numpy: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
python-numpy: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

Using Metal backend

[Taichi] version 1.1.4, llvm 10.0.0, commit a3eb8d17, osx, python 3.9.13
[Taichi] Starting on arch=metal
float32:  [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
taichi-numpy: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
kernel-numpy: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
python-numpy: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

Additional comments
The code even fails when I use a kernel to transfer the data from NumPy to Taichi; the only way to get a correct result is a Python for-loop. Also, I found that the method stopped working after the taichi-nightly-1.1.3.post20220828 release.
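The comparison in the reproduction script (three copy paths, one expected result) can be automated so that a failing path is reported by name instead of being read off the printed arrays. The following is only a minimal sketch: `find_broken_paths` and the two stand-in copy functions are hypothetical names introduced for illustration, and the stand-ins merely mimic the behaviour seen in the Vulkan log rather than calling Taichi.

```python
from typing import Callable, Dict, List, Sequence

CopyFn = Callable[[Sequence[float]], Sequence[float]]


def find_broken_paths(src: Sequence[float], copy_paths: Dict[str, CopyFn]) -> List[str]:
    """Return the names of the copy paths whose output differs from src."""
    expected = list(src)
    return [name for name, copy in copy_paths.items() if list(copy(src)) != expected]


# Hypothetical stand-ins: one mimics the all-zero result from the Vulkan log,
# the other mimics the element-wise Python loop that produced correct values.
def zeroing_copy(src: Sequence[float]) -> List[float]:
    return [0.0 for _ in src]


def loop_copy(src: Sequence[float]) -> List[float]:
    return [value for value in src]


source = [1.0] * 10
broken = find_broken_paths(source, {"zeroing": zeroing_copy, "loop": loop_copy})
print(broken)  # ['zeroing']
```

In a real check, each entry would presumably wrap one of the paths from the script (a.from_numpy, the kernel, the Python loop) and return the field contents afterwards.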

@Linyou Linyou added the potential bug Something that looks like a bug but not yet confirmed label Oct 11, 2022
@taichi-gardener taichi-gardener moved this to Untriaged in Taichi Lang Oct 11, 2022
@strongoier strongoier self-assigned this Oct 14, 2022
@strongoier strongoier moved this from Untriaged to Todo in Taichi Lang Oct 14, 2022
@strongoier
Contributor

Also, I found that the method stopped working after the taichi-nightly-1.1.3.post20220828 release.

Thanks for sharing this! I would also like to share my findings based on this info.

taichi-nightly-1.1.3.post20220828 is at commit 2374362 and contains molten-vk 1.1.10, while taichi-nightly-1.1.3.post20220829 is at commit fb62f1c and contains molten-vk 1.1.11. On macOS 12.3 both wheels work fine, so I think the problem only happens on newer macOS versions. I tried a few combinations of environment setup on macOS 12.5 and 12.6:

Taichi commit    molten-vk 1.1.10    molten-vk 1.1.11
2374362          correct             wrong
fb62f1c          correct             wrong

It seems that the problem only appears with molten-vk >= 1.1.11 on macOS >= 12.5 (or 12.4, I don't have a machine for that lol). @bobcao3 Do you have any idea about what is happening?
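The observations above can be condensed into a small predicate, which may be handy when triaging similar reports. This is only a sketch of what was observed in this thread, not a confirmed rule: `parse_version` and `looks_affected` are hypothetical helpers, and macOS 12.4 was not tested, so the 12.5 threshold is an assumption.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.1.11' into (1, 1, 11) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def looks_affected(moltenvk: str, macos: str) -> bool:
    """Match the combinations reported as 'wrong' in this thread.

    Thresholds are taken from the observations only; macOS 12.4 is untested.
    """
    return parse_version(moltenvk) >= (1, 1, 11) and parse_version(macos) >= (12, 5)


for moltenvk, macos in [("1.1.10", "12.5"), ("1.1.11", "12.5"), ("1.1.11", "12.3")]:
    print(moltenvk, macos, looks_affected(moltenvk, macos))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "1.1.9" sorting after "1.1.11".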

@bobcao3
Collaborator

bobcao3 commented Oct 21, 2022

Maybe macOS relaxed synchronization or something weird. I think the only way to figure this out is to debug it on newer macOS versions.

(We should start narrowing down the problem by adding a bunch of ti.sync() calls and checking whether that fixes things.)

@strongoier
Contributor

strongoier commented Oct 21, 2022

I'm testing with the following minimal script, and it seems that ti.sync() doesn't help:

import numpy as np
import taichi as ti

ti.init(ti.vulkan, offline_cache=False)

@ti.kernel
def foo(a: ti.types.ndarray()) -> ti.i32:
    # adding ti.sync() here doesn't help
    return a[1]
    # there's an automatic sync for kernels with return values

x = np.array([4, 6])
print(foo(x))  # 0 (wrong)
print(x)  # [4, 6]

I'm now thinking about whether there could be an address alignment issue.
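One way to probe the alignment hypothesis is to inspect the address of the host buffer directly. The sketch below is an assumption-laden illustration: `is_aligned` is a hypothetical helper, the ctypes array is only a stand-in for the NumPy array in the script, and nothing in this thread confirms that alignment is actually involved.

```python
import ctypes


def is_aligned(address: int, alignment: int) -> bool:
    """Check whether address is a multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address % alignment == 0


# Stand-in for the two-element integer array used in the script above.
buffer = (ctypes.c_int64 * 2)(4, 6)
address = ctypes.addressof(buffer)

for alignment in (4, 8, 16, 64):
    print(f"{alignment}-byte aligned: {is_aligned(address, alignment)}")
```

For a NumPy array x, the address would come from x.ctypes.data instead of ctypes.addressof.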

@feisuzhu feisuzhu moved this from Todo to In Progress in Taichi Lang Oct 21, 2022
@strongoier
Contributor

After bisecting MoltenVK commits, I found that the problem was introduced in KhronosGroup/MoltenVK#1638, which enabled support for VK_KHR_buffer_device_address and VK_EXT_buffer_device_address on macOS 12.5. To make a healthy release, I'll submit a PR to stop using the feature on macOS. We can diagnose later whether the problem comes from the feature itself or from how Taichi uses it. cc: @bobcao3 @ailzhang
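The shape of that workaround, as described here, is a platform gate in front of an optional device feature. The actual change lives in Taichi's Vulkan backend and is not shown in this thread, so the following Python is only an illustration of the decision logic; `should_use_buffer_device_address` and its parameters are hypothetical.

```python
import sys


def should_use_buffer_device_address(device_supports_it: bool,
                                     platform: str = sys.platform) -> bool:
    """Decide whether to enable the buffer device address feature.

    Hypothetical gate: the feature is skipped on macOS ("darwin") regardless
    of what the driver reports, and follows the driver's report elsewhere.
    """
    if platform == "darwin":
        return False
    return device_supports_it


print(should_use_buffer_device_address(True, platform="darwin"))  # False
print(should_use_buffer_device_address(True, platform="linux"))   # True
```

Gating on the platform rather than on a MoltenVK version keeps the check simple, at the cost of also disabling the feature on setups where it might work.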

strongoier added a commit that referenced this issue Oct 24, 2022
…6415)

See
#6295 (comment).

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
jim19930609 pushed a commit to jim19930609/taichi that referenced this issue Oct 25, 2022
…aichi-dev#6415)

See
taichi-dev#6295 (comment).

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Repository owner moved this from In Progress to Done in Taichi Lang Oct 29, 2022