Nested Operations on "large" fields fails on Vulkan and OpenGL, works on Cuda #3544
Comments
Addendum:
I do notice with my "working properly" Vulkan targets in Taichi that if I run something complicated with low FPS, the window system (on both Windows and Linux) seems to "lock up" while kernels are running in between "frames" of compute. This is also true for OpenGL targets, but not for CUDA. Perhaps this is relevant?
Okay, I will check it later
> Hmm, @bobcao3 @g1n0st could you help take a look at this? I don't expect a plain loop with 512 iterations to be a problem in either SPIR-V or GLSL. So maybe something to do with our codegen (and Vulkan resource management)?
Sorry for the late reply... I cannot reproduce this on my 2060; the Vulkan backend just works fine.
Maybe related to resource management? @bobcao3
Since lower numbers are needed on "weaker" GPUs to avoid the problem, and the 2060 is much faster than my 1070, I wonder if you would see this if you increased the field size.
Confirming the above: on Linux with NVIDIA cards, I did not see the "GPU hung" message (either from the CLI or in dmesg) that I observed on the i915 machine.
Hi, this seems to be an issue with the driver's compiler: it tries to unroll the loop, and it's simply too much for it to handle. I don't have an immediate fix in mind; without proper compiler support for big loops there's not much we can do. Maybe try splitting it into multiple kernels? From the code snippet, it looks like you can do atomic accumulation into the resulting buffer.
For my particular use case, I was able to tile the inner loop using atomic accumulation and call the parameterized kernel multiple times from the Python context. Thanks for the suggestion.
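For readers landing here, a minimal sketch of that tiling approach (hypothetical names and sizes; the full repro posted further down uses the same pattern in its compute_chunked kernel): the inner y range becomes a kernel parameter, and the Python side launches the kernel once per chunk, accumulating into the output field.

```python
import taichi as ti

ti.init(arch=ti.vulkan)

W, H = 1024, 768
in_field = ti.field(ti.f32, shape=(W, H))
out_field = ti.field(ti.f32, shape=(W, H))


@ti.kernel
def compute_tile(y0: ti.i32, y1: ti.i32):
    # Same body as the original kernel, but the inner y range is a parameter,
    # so each launch only covers one slice of the loop.
    for px, py in in_field:
        F = 0.
        for x in range(out_field.shape[0]):
            for y in range(y0, y1):
                F += in_field[x, y]    # the original uses a periodic sampler here
        out_field[px, py] += F         # accumulate this tile's contribution


chunk = 32
for i in range(0, H, chunk):
    compute_tile(i, min(H, i + chunk))
```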
#3791 is in the latest release; can you try to update and test it out on the Vulkan backend?
🎉 One note: I now get a whole bunch of these with the Vulkan backend:
However...
I can't replicate the crash in my simple test code, but am seeing something very odd there nonetheless. Will post when I have something worth looking at.
Whew, OK. The above crash is a red herring; it turned out to be a different new bug, reported as #3857.

What I am running into that is still relevant to this issue is a situation I have not exhaustively tested, but definitely see on my resource-constrained i915 target (described more fully in the …).

Here is some repro information for the scheduling/allocation issue; note how OpenGL behaves differently. The timing in square brackets helps explain what seems to be going on. This code:

```python
import sys
import math
import taichi as ti
import numpy as np
from time import monotonic

#ti.init(arch=ti.opengl)  # fine
ti.init(arch=ti.vulkan)   # over-schedules kernels unless ti.sync() is done

fieldWidth = 1024
fieldHeight = 688
field_chunk = 32


@ti.func
def samplePeriodic(field: ti.template(), u, v):
    P = ti.Vector([int(u), int(v)])
    shape = ti.Vector(field.shape)
    P = ti.mod(P, shape)
    return field[int(P)]


@ti.kernel
def initialize():
    for x in range(in_field.shape[0]):
        for y in range(in_field.shape[1]):
            in_field[x, y] = ti.sin(x/10 * math.pi) * ti.sin(y/5 * math.pi)


@ti.kernel
def compute_chunked(yi: ti.i32, yn: ti.i32):
    for px, py in in_field:
        F = 0.
        for x in range(out_field.shape[0]):
            for y in range(yi, yn):
                Q = samplePeriodic(in_field, x, y)
                F += Q
        out_field[px, py] += F


in_field = ti.field(ti.f32, shape=(fieldWidth, fieldHeight))
out_field = ti.field(ti.f32, shape=(fieldWidth, fieldHeight))

initialize()

print("Wait...", end="")
sys.stdout.flush()
out_field.fill(0.)
numy = int(ti.ceil(out_field.shape[1]/field_chunk))
last = monotonic()
for i in range(0, numy):
    now = monotonic()
    print("{}/{}[{:#.2f}]...".format(i+1, numy, now - last), end="")
    last = now
    sys.stdout.flush()
    compute_chunked(i*field_chunk,
                    min(out_field.shape[1], (i+1)*field_chunk))
    #ti.sync()  # Vulkan fails without this
print()
ti.imshow(out_field.to_numpy())
```

When run against the OpenGL backend without ti.sync(): …

When run against the Vulkan backend with ti.sync(): …

When run against the Vulkan backend without ti.sync(): …
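One possible refinement of the workaround above (a sketch, not something tested in this thread; sync_every is an arbitrary value I am introducing for illustration): sync only every few chunks instead of after every launch, which still bounds how much work the Vulkan backend can queue while paying the synchronization cost less often.

```python
sync_every = 4  # arbitrary: how many kernel launches to allow in flight
for i in range(0, numy):
    compute_chunked(i * field_chunk,
                    min(out_field.shape[1], (i + 1) * field_chunk))
    if (i + 1) % sync_every == 0:
        ti.sync()  # wait for the queued kernels before scheduling more
ti.sync()  # final sync before reading out_field back
```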
Describe the bug
I have a set of kernels which perform nested operations over fields. When the field operations involve computed field access (in my case, periodic access) and the inner loop is "too big", Vulkan crashes with "failed to submit command buffer". OpenGL just exits with no warnings. CUDA and CPU are fine.
Here is a minimal repro case; "compute()" is what fails, with symptoms showing up right when it would have finished, seemingly in the next operations.
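The repro snippet itself is not preserved in this capture. Based on the chunked variant in the comments above and the field sizes mentioned under "To Reproduce" below, the failing kernel presumably looked roughly like this (a sketch, not the author's exact code):

```python
import taichi as ti

ti.init(arch=ti.vulkan)  # also fails on ti.opengl; ti.gpu (CUDA) and ti.cpu are fine

fieldWidth = 1024        # assumed; the chunked repro above uses 1024
fieldHeight = 768        # per "To Reproduce" below, 768 fails and 512 works

in_field = ti.field(ti.f32, shape=(fieldWidth, fieldHeight))
out_field = ti.field(ti.f32, shape=(fieldWidth, fieldHeight))


@ti.func
def samplePeriodic(field: ti.template(), u, v):
    # computed (periodic) field access; a straight field[u, v] access works everywhere
    P = ti.Vector([int(u), int(v)])
    shape = ti.Vector(field.shape)
    P = ti.mod(P, shape)
    return field[int(P)]


@ti.kernel
def compute():
    for px, py in in_field:
        F = 0.
        for x in range(out_field.shape[0]):       # the "too big" nested inner loops
            for y in range(out_field.shape[1]):
                F += samplePeriodic(in_field, x, y)
        out_field[px, py] = F
```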
To Reproduce
On Vulkan and OpenGL this fails (on an NVIDIA GTX 1070), while CUDA ("ti.gpu") works fine.
With fieldHeight changed from 768 to 512, it succeeds on all three.
Replacing samplePeriodic() with a straight field access also runs on all targets. While that would be suitable for this trivial repro case, I have an application where it is not, hence this report.

Log/Screenshots
00vulkan-fail.txt
00opengl-fail.txt
Additional comments
PS - I 🖤 Taichi