Fill unexpectedly crashes with CUDA backend #6953

Closed
turbo0628 opened this issue Dec 21, 2022 · 2 comments · Fixed by #6992
Labels
bug We've confirmed that this is an BUG

Comments

turbo0628 commented Dec 21, 2022

import taichi as ti

ti.init(arch=ti.cuda, device_memory_GB=2) # crashes with 2 GB; works with device_memory_GB=2.1

arr = ti.ndarray(ti.f32, 128 * 1024 * 1024) # 128Mi f32 elements = 512 MiB, well below the 2 GB pool
arr.fill(1.2345)

This fails with the CUDA backend. Error message:

[E 12/21/22 22:03:04.230 65462] [cuda_driver.h:operator()@88] CUDA Error CUDA_ERROR_INVALID_VALUE: invalid argument while calling memsetd32 (cuMemsetD32_v2)

The regression happened between v1.0.2 and v1.0.3


@turbo0628 turbo0628 added the bug We've confirmed that this is an BUG label Dec 21, 2022
@taichi-gardener taichi-gardener moved this to Untriaged in Taichi Lang Dec 21, 2022
turbo0628 commented

Device memory has to be about 4x larger than the ndarray for fill to work. Minimal reproduction:

import taichi as ti

ti.init(arch=ti.cuda, print_ir=True, device_memory_GB=1.1) # crashes until we raise device memory to 4.1GB

n = 1024 * 256 * 1024 # 1GB

arr = ti.ndarray(ti.f32, n)

arr.fill(0.125)
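
The buffer sizes in the two reproductions can be checked with plain arithmetic. The sketch below does not use Taichi; `ndarray_bytes` and the constants are hypothetical names introduced only for illustration, and it assumes an f32 element occupies 4 bytes.

```python
F32_BYTES = 4          # size of one f32 element in bytes
MIB = 1024 ** 2
GIB = 1024 ** 3


def ndarray_bytes(num_elements, element_bytes=F32_BYTES):
    """Bytes needed to store a dense 1D array of fixed-size elements."""
    return num_elements * element_bytes


# First reproduction: 128 * 1024 * 1024 f32 elements.
first = ndarray_bytes(128 * 1024 * 1024)
# Second reproduction: 1024 * 256 * 1024 f32 elements.
second = ndarray_bytes(1024 * 256 * 1024)

print(first // MIB, "MiB")   # size of the first array
print(second // GIB, "GiB")  # size of the second array
```

Under these assumptions the first array occupies 512 MiB and the second 1 GiB, so the reported working thresholds (2.1 GB and 4.1 GB) are each roughly four times the array size.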

turbo0628 commented

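Root cause: the fill bug comes from the change in how ndarray element sizes are defined.

The size argument passed to the fill is expected to be an element count for 32-bit (i32) addressing, but it now counts bytes. In the example above, the size passed is 1024 * 1024 * 1024 (bytes), whereas the expected value is 1024 * 256 * 1024 (elements). The memset therefore tries to cover four times the array's memory, which is why device memory had to be about 4x the array size.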
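
The unit mismatch can be modeled in a few lines of plain Python. This is only a sketch of the arithmetic, not Taichi's actual code or the fix that was merged; `bytes_touched` and `element_count` are hypothetical helpers. It relies on the fact that a 32-bit memset such as `cuMemsetD32` takes a count of 32-bit values, not a byte count.

```python
ELEMENT_BYTES = 4  # a 32-bit memset writes 4 bytes per element
GIB = 1024 ** 3


def bytes_touched(count):
    """Bytes written by a 32-bit memset that is given `count` elements."""
    return count * ELEMENT_BYTES


def element_count(size_in_bytes):
    """Convert a byte size into the number of 32-bit elements it holds."""
    return size_in_bytes // ELEMENT_BYTES


n = 1024 * 256 * 1024              # elements in the minimal reproduction
size_in_bytes = n * ELEMENT_BYTES  # 1 GiB buffer

# Mismatch: the byte size is passed where an element count is expected.
buggy_span = bytes_touched(size_in_bytes)
# Expected behavior: convert bytes to elements first.
expected_span = bytes_touched(element_count(size_in_bytes))

print(buggy_span // GIB, "GiB touched when bytes are passed as the count")
print(expected_span // GIB, "GiB touched with the element count")
```

With the byte size passed as the count, the modeled memset spans 4 GiB for a 1 GiB buffer, which is consistent with the crash disappearing only once device memory reaches about 4.1 GB.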

@feisuzhu feisuzhu moved this from Untriaged to In Progress in Taichi Lang Dec 30, 2022
@github-project-automation github-project-automation bot moved this from In Progress to Done in Taichi Lang Dec 31, 2022
turbo0628 added a commit that referenced this issue Dec 31, 2022
Fixes #6953
