More register used when multiple target regions are compiled together #24
Comments
This is good information. I would like to know what optimization level, if any, you requested. Can you attach your source and command line? Thank you.
reproducer
src/QMCWaveFunctions/einspline_spo_omp.cpp
Now just comment out the #pragma omp at lines 149, 238, and 405, but leave the one at line 311.
NumVGPRs goes down and there is no spill. As another test, if I add right before line 311
The newly added kernel has
All the numbers are significantly larger than the numbers given when the empty offload region is compiled standalone.
-v doesn't produce this output anymore. A potentially useful alternative is
Alternatively, that information is available by reading the msgpack data from the shared library (ELF) containing the device code.
Hi Ye, this one is over 3 years old, so I am closing it. If it is still an issue, please reopen it or open a new issue.
The source code I'm using has multiple offload regions in different member functions of a class.
If I enable an individual target region and comment out the other target pragma
Kernel 1 only
Kernel 2 only
If I enable both offload regions
Kernel 1
Kernel 2
The number of vector registers needed, plus spills, is larger than when each kernel is compiled individually.
Both kernels are compiled from independent target regions. This behaviour seems very strange.
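For readers without the original source, the setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual einspline_spo_omp.cpp code: the class name `TwoKernels` and its members `scale` and `sum` are hypothetical, and the kernels are far simpler than the real ones. It only shows the shape of the problem, two independent target regions living in different member functions of one class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a class with several offload regions.
// Names and kernel bodies are assumptions made for illustration only.
struct TwoKernels
{
  std::vector<float> data;

  explicit TwoKernels(std::size_t n) : data(n, 1.0f) {}

  // Kernel 1: scales every element in its own target region.
  void scale(float a)
  {
    float* p = data.data();
    const std::size_t n = data.size();
    #pragma omp target teams distribute parallel for map(tofrom: p[0:n])
    for (std::size_t i = 0; i < n; ++i)
      p[i] *= a;
  }

  // Kernel 2: an independent target region that sums the elements.
  float sum() const
  {
    const float* p = data.data();
    const std::size_t n = data.size();
    float s = 0.0f;
    #pragma omp target teams distribute parallel for \
        map(to: p[0:n]) map(tofrom: s) reduction(+: s)
    for (std::size_t i = 0; i < n; ++i)
      s += p[i];
    return s;
  }
};
```

Commenting out one of the two pragmas corresponds to the "Kernel 1 only" and "Kernel 2 only" configurations; leaving both in corresponds to the combined build. Without OpenMP enabled the pragmas are ignored and the loops run on the host, so the sketch still compiles and gives the same results.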