Passing struct vs dereferencing fields in struct performance #4802
Comments
As a good first step, we can enable
Note that you might also want to exclude the timing for the first run, because that counts JIT time as well.
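The advice about excluding the first run can be sketched as a small timing helper. This is a minimal, backend-agnostic illustration in plain Python, not code from the thread: `time_after_warmup` and `FakeKernel` are hypothetical names, and `FakeKernel` only simulates a kernel whose first call pays a one-time compilation cost.

```python
import time


def time_after_warmup(fn, repeats=5):
    """Return mean wall-clock seconds per call, excluding the first call."""
    fn()  # warm-up call: absorbs one-time costs such as JIT compilation
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


class FakeKernel:
    """Hypothetical stand-in for a JIT-compiled kernel: slow on the first call only."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls == 1:
            time.sleep(0.05)  # pretend this is compilation time


kernel = FakeKernel()
mean_seconds = time_after_warmup(kernel, repeats=3)
print(f"mean time per call after warm-up: {mean_seconds:.6f} s")
```

On an asynchronous GPU backend the queued work would also need to be flushed before reading the clock (in Taichi, `ti.sync()`), otherwise the timer may only measure kernel launch overhead.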
@k-ye thanks for that. With a simple change to exclude the JIT time (running each kernel once before timing), I get: Vulkan CPU Metal: So something is still producing different code, and the previous timings seem to have reflected more that the compilation / JIT time is different for each kernel (because different code is being generated). Side question: Is there a way to avoid recompiling kernels on every test run? I.e. if I run
Yup, @PGZXB's working on an offline cache system (#4401), starting with the LLVM backend. We are using the AST as the cache key now. @PGZXB only has one day or so per week to work on Taichi, so the feature is a bit slow to release. But we are moving in that direction. Thanks for your suggestion :-)
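To illustrate the general idea of using an AST as a cache key, here is a rough sketch built on Python's standard `ast` module. It is not Taichi's implementation (Taichi caches on its own kernel AST, and an offline cache would persist entries to disk rather than to a dict); `ast_cache_key`, `compile_cached`, and the `compile_fn` parameter are hypothetical names for this example.

```python
import ast
import hashlib


def ast_cache_key(source: str) -> str:
    """Hash the parsed AST, so comments and whitespace do not change the key."""
    tree = ast.parse(source)
    canonical = ast.dump(tree)  # line/column attributes are omitted by default
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


_cache = {}  # in-memory only; an offline cache would store entries on disk


def compile_cached(source: str, compile_fn):
    """Call compile_fn(source) only if no entry exists for this AST."""
    key = ast_cache_key(source)
    if key not in _cache:
        _cache[key] = compile_fn(source)
    return _cache[key]


src_a = "def f(x):\n    return x + 1\n"
src_b = "def f(x):\n    # same logic, different formatting\n    return x  +  1\n"
src_c = "def f(x):\n    return x + 2\n"

compile_count = []


def fake_compile(source):
    # Hypothetical compiler: records that it ran and returns a dummy artifact.
    compile_count.append(1)
    return len(source)


compile_cached(src_a, fake_compile)
compile_cached(src_b, fake_compile)  # cache hit: same AST as src_a
compile_cached(src_c, fake_compile)  # cache miss: different constant
```

Keying on the AST rather than the raw source text means that edits which do not change the program structure, such as reformatting or comments, do not trigger a recompile.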
That's quite fascinating. I'd imagine the CHI-IR generated should be quite similar, and it's weird that only Vulkan shows a big regression. (And I'm assuming this is on a Mac, where SPIR-V is actually translated to MSL by MoltenVK?) We should check three things:
Something else that can be quite helpful is to run the two SPIR-V modules through Radeon Graphics Analyzer as well, to get the raw assembly along with the instruction count and cycle latency readings.
In short, I see a significant performance decrease when passing a struct to a function versus dereferencing the struct fields and passing the values.
Here I show two ways to intersect a ray with a bunch of spheres: one passes the sphere struct, and the other reads the values out of the sphere struct and passes those.
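The two calling conventions being compared can be sketched as follows. This is a plain-Python illustration of the shape of the two variants, not the original Taichi kernels from the post; `Sphere`, `hit_struct`, `hit_fields`, and the `nearest_hit_*` helpers are hypothetical names, and the ray direction is assumed to be unit length.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    center: tuple
    radius: float


def hit_fields(origin, direction, center, radius):
    """Nearest positive hit distance along a unit-length ray, or -1.0 on a miss."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return -1.0
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else -1.0


def hit_struct(origin, direction, sphere):
    """Same test, but the callee receives the whole struct and reads its fields."""
    return hit_fields(origin, direction, sphere.center, sphere.radius)


def nearest_hit_struct(origin, direction, spheres):
    # Variant 1: pass each sphere struct to the intersection function.
    hits = [hit_struct(origin, direction, s) for s in spheres]
    return min((t for t in hits if t > 0.0), default=-1.0)


def nearest_hit_fields(origin, direction, spheres):
    # Variant 2: dereference the fields at the call site and pass plain values.
    hits = [hit_fields(origin, direction, s.center, s.radius) for s in spheres]
    return min((t for t in hits if t > 0.0), default=-1.0)


spheres = [Sphere((0.0, 0.0, 5.0), 1.0), Sphere((0.0, 0.0, 10.0), 1.0)]
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
t_struct = nearest_hit_struct(origin, direction, spheres)
t_fields = nearest_hit_fields(origin, direction, spheres)
```

Both variants compute the same result; the question raised in this issue is only whether a compiler backend generates equally fast code for each.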
Also note that the choice of backend (Metal vs Vulkan) affects the results quite a bit:
Metal
Passing reference 0.23202180862426758
Passing decomposed 0.09195494651794434
Vulkan
Passing reference 0.07849979400634766
Passing decomposed 0.05303597450256348
CPU
Passing reference 0.08902120590209961
Passing decomposed 0.06803393363952637
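As a quick way to read these numbers, the ratio of the two timings per backend can be computed directly. The snippet below is only arithmetic on the figures reported above (`timings` and `slowdown` are names chosen for this example); it gives roughly 2.5x on Metal, 1.5x on Vulkan, and 1.3x on CPU. These measurements may include first-run JIT compilation time.

```python
# (passing reference, passing decomposed) timings as reported in the post
timings = {
    "Metal": (0.23202180862426758, 0.09195494651794434),
    "Vulkan": (0.07849979400634766, 0.05303597450256348),
    "CPU": (0.08902120590209961, 0.06803393363952637),
}

# Ratio > 1 means passing the struct took longer than passing the fields.
slowdown = {backend: ref / dec for backend, (ref, dec) in timings.items()}

for backend, ratio in slowdown.items():
    print(f"{backend}: passing the struct takes {ratio:.2f}x as long")
```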
Originally posted by @bsavery in #4784 (reply in thread)