why taichi gpu jit compiler complain warning only at first time? #7753

Closed
lgyStoic opened this issue Apr 7, 2023 · 5 comments
Assignees
Labels
question Question on using Taichi

Comments

@lgyStoic
Contributor

lgyStoic commented Apr 7, 2023

image

In my tests I use Taichi to build an image pyramid, and I found that the compile warning message only appears the first time.

I expected Taichi to compile every time I run my script, but no warning message is printed on later runs.
Does the Taichi GPU backend cache the compiled binary for subsequent launches?

But if the backend is CPU, the compile warning message always appears.

Is this behavior expected or not?

@jim19930609
Contributor

Hi lgyStoic,
This is because Taichi compiles each kernel only once and caches the compiled kernel for successive runs, which is why the compilation warning only appears once as well.
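The compile-once behavior described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Taichi's implementation: `compile_kernel`, `count_warnings`, the cache layout, and the warning text are all hypothetical names invented for illustration. It only shows why a warning that is emitted during compilation disappears once a cached artifact is reused.

```python
import hashlib
import tempfile
import warnings
from pathlib import Path


def compile_kernel(source: str, cache_dir: Path) -> str:
    """Return a 'binary' for source, compiling only on a cache miss (toy model)."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = cache_dir / f"{key}.bin"
    if cached.exists():
        # Cache hit: the compiler is never invoked, so it cannot warn.
        return cached.read_text()
    # Cache miss: this is the only path on which a compile warning can appear.
    warnings.warn(f"compile warning for kernel {key[:8]}")
    binary = f"compiled({source})"
    cached.write_text(binary)
    return binary


def count_warnings(source: str, cache_dir: Path) -> int:
    """Simulate one script launch and count the warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile_kernel(source, cache_dir)
    return len(caught)


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    first_run = count_warnings("pyrdown", cache)   # compiles and warns
    second_run = count_warnings("pyrdown", cache)  # served from the cache
    print(first_run, second_run)  # prints: 1 0
```

Taichi's documentation describes an `offline_cache` option for `ti.init` that controls on-disk caching of compiled kernels; check the documentation for your version before relying on it. Whether that cache explains the CPU/GPU difference reported in this thread is not established here.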

@lgyStoic
Contributor Author

lgyStoic commented Apr 7, 2023

image
But it does complain every time when the arch is CPU. The behavior differs between backends.


@jim19930609
Contributor

That is weird... Can you provide example code to reproduce that?

@lgyStoic
Contributor Author

lgyStoic commented Apr 8, 2023

OK, I found that the core dump was caused by an invalid access index.
But there are still some things that confuse me.

Concisely,

@ti.kernel
def img_with_pyrdown(img: img_gray_2d, img_blur: img_gray_2d, img_down: img_gray_2d):
    h, w = img.shape
    weight[0] = 1
    weight[1] = 4
    weight[2] = 6
    weight[3] = 4
    weight[4] = 1
    radius = 2
    total_weight = 16
    for i, j in ti.ndrange(h, w):
        l_begin, l_end = ti.max(0, i - radius), ti.min(h, i + radius + 1)
        total = 0.0
        for l in range(l_begin, l_end):
            wi = weight[l - i]
            total += img[l, j] * wi
        total /= total_weight
        img_blur[i, j] = ti.cast(total, ti.u8)
    
    for i, j in ti.ndrange(h, w):
        l_begin, l_end = ti.max(0, j - radius), ti.min(w, j + radius + 1)
        total = 0.0
        for l in range(l_begin, l_end):
            wi = weight[l - j]
            total += img_blur[i, l] * wi
        total /= total_weight
        img_blur[i, j] = ti.cast(total, ti.u8)

    for i, j in ti.ndrange(h // 2, w // 2):
        res = (img[i * 2, j * 2] + img[i * 2 + 1, j * 2] + img[i * 2, j * 2 + 1] + img[i * 2 + 1, j * 2 + 1]) / 4.0
        img_down[i, j] = ti.cast(res, ti.u8)

In the function above, I want to apply a Gaussian filter to the image and then resize it to half size.
At first, I wrote the wrong range for the loop that fills the img_down ndarray and forgot the cast, as below:

for i, j in ti.ndrange(h, w):
    res = (img[i * 2, j * 2] + img[i * 2 + 1, j * 2] + img[i * 2, j * 2 + 1] + img[i * 2 + 1, j * 2 + 1]) / 4.0
    img_down[i, j] = res
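The out-of-range access in the snippet above follows from plain index arithmetic. The sketch below is ordinary Python rather than Taichi, and the height `h = 8` and the helper `max_read_index` are hypothetical, chosen only for illustration. It compares the largest row index read by `img[i * 2 + 1, ...]` under the wrong loop extent `h` and the corrected extent `h // 2`.

```python
def max_read_index(loop_extent: int) -> int:
    """Largest index read along one axis by img[i * 2 + 1, ...] for i in range(loop_extent)."""
    return (loop_extent - 1) * 2 + 1


h = 8  # hypothetical image height, for illustration only

# Wrong loop: i runs over range(h), so reads reach index 2 * h - 1.
wrong = max_read_index(h)
# Corrected loop: i runs over range(h // 2), so reads stay below h.
fixed = max_read_index(h // 2)

print(wrong, wrong < h)  # prints: 15 False
print(fixed, fixed < h)  # prints: 7 True
```

The same reasoning applies to the column index `j` and the width `w`. Assuming `img_down` has shape `(h // 2, w // 2)`, the wrong loop also writes `img_down[i, j]` outside that array.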

I forgot to halve the height and width. With this wrong code, the behavior is quite strange.
On GPU this operation does not raise any exception at runtime, even though i, j go outside the ndarray's range.
On CPU the program core dumps; python-dbg shows that the thread crashed in type_check.
I followed the steps to build a debug environment, but it only shows a few pieces of debug info and not even line numbers. Could you share some tips for building a Taichi debug environment?
Moreover, something weird happens. If I run the function repeatedly, I expect it to crash every time
and produce no result, but sometimes, after I run the script several times on CPU, it does produce a result.
image

As in the screenshot above, the script printed "done" the first time and wrote an image to file (never mind the image's correctness).
The second time, it did not print "done" and did not write an image either.
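One more index in the kernel quoted earlier in this comment can be checked the same way. The blur loops read `weight[l - i]` with `l` ranging from `i - radius` to `i + radius`, so the offset `l - i` spans `-radius` to `radius`. The sketch below is plain Python, with a hypothetical height `h = 6` and a hypothetical helper `tap_offsets`. It assumes `weight` has valid indices 0 to 4 and that negative indices are not wrapped the way Python lists wrap them; the thread does not say whether this is the invalid index mentioned at the top of the comment.

```python
radius = 2
weight = [1, 4, 6, 4, 1]  # the five taps assigned in the kernel
h = 6                     # hypothetical image height, for illustration only


def tap_offsets(i: int, shift: int) -> list:
    """Weight indices used for output row i, given the shift added to l - i."""
    l_begin, l_end = max(0, i - radius), min(h, i + radius + 1)
    return [l - i + shift for l in range(l_begin, l_end)]


as_written = tap_offsets(3, shift=0)    # weight[l - i]
shifted = tap_offsets(3, shift=radius)  # weight[l - i + radius]

print(as_written)  # prints: [-2, -1, 0, 1, 2]
print(shifted)     # prints: [0, 1, 2, 3, 4]
```

Under those assumptions, indexing with `l - i + radius` keeps every tap inside `weight`.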


@jim19930609
Contributor

In general, out-of-bound access is undefined behavior and the consequences are unpredictable. You can check for out-of-bound access with ti.init(arch=ti.cpu, debug=True, check_out_of_bound=True), and you'll notice something like:

Screenshot 2023-04-10 09-21-32
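What a bounds check buys you can be sketched without Taichi. The class below is a hypothetical plain-Python stand-in, not Taichi's debug mode: `CheckedImage` and its error message are invented for illustration. It turns an out-of-range index into an immediate, descriptive error instead of an unpredictable read.

```python
class CheckedImage:
    """A 2D array wrapper that rejects any index outside [0, h) x [0, w)."""

    def __init__(self, h: int, w: int):
        self.h, self.w = h, w
        self.data = [[0] * w for _ in range(h)]

    def _check(self, i: int, j: int) -> None:
        # Negative indices are rejected too, rather than wrapping as Python lists do.
        if not (0 <= i < self.h and 0 <= j < self.w):
            raise IndexError(
                f"index ({i}, {j}) out of bounds for shape ({self.h}, {self.w})"
            )

    def __getitem__(self, idx):
        i, j = idx
        self._check(i, j)
        return self.data[i][j]

    def __setitem__(self, idx, value):
        i, j = idx
        self._check(i, j)
        self.data[i][j] = value


img = CheckedImage(4, 4)
try:
    for i in range(4):  # wrong extent: should be range(4 // 2)
        _ = img[i * 2 + 1, 0]
    error = None
except IndexError as exc:
    error = str(exc)

print(error)  # prints: index (5, 0) out of bounds for shape (4, 4)
```

The first failing iteration is reported with its exact index, which is the kind of information an unchecked run cannot give.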
