-
Notifications
You must be signed in to change notification settings - Fork 734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Warn when number of kernel args exceeds maximum available on GPU #2361
Conversation
Emit a warning when number of resulting kernel arguments exceeds 2k - maximum available number of kernel arguments on GPU device. Emit a warning only in GPU AOT mode since other devices don't have such limitation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for small nit about diagnostic message. I think it is a little confusing but I am also not sure how to re-word it. Is it better putting the numbers inside ( )?
'resulting number of kernel arguments (2001) is greater than maximum supported on GPU (2000).'
If you're not using brackets, maybe a comma after 2001 is helpful.
'resulting number of kernel arguments 2001, is greater than maximum supported on GPU - 2000'
How about: "after decomposition, there are 2001 kernel arguments, which exceed the maximum supported on GPU - 2000"? |
I'm not sure that we need to mention decomposition since it is a low level implementation detail. Not sure that it is clear what is "decomposition" for user. |
Done |
Yes, but after @elizabethandrews pointed it out, I feel saying "resulting" leaves it a bit vague. "Resulting" from what? What could the user do to fix it? But I don't have a better suggestion for wording. :-) |
I agree with @Fznamznon. I don't think we can say decomposition. From a user perspective it makes no sense. I have no idea what the right diagnostic is though. Maybe we can be more descriptive? "Kernel argument count (2001) exceeds supported maximum of 2000 on GPU. Please note, array elements and fields of a class/struct may be counted separately" Is that any better? I said 'may' because after Erich's current refactor, I will be working on patch to not decompose unless required. |
Resulting from OpenCL kernel building. We can say something like "number of resulting OpenCL kernel arguments (n) exceeds bla..." of "number of resulting kernel function arguments (n) exceeds..." but I'm not sure if it is better.
I think SYCL spec points that all fields of kernel object become parameters for some "device kernel". I think our OpenCL kernel that we generate in SemaSYCL is "device kernel" from the spec. User should know how to fix it - number of kernel object fields should be reduced.
Looks good, but Isn't it too long? I'm asking because clang internals manual suggests keep it short https://clang.llvm.org/docs/InternalsManual.html#the-format-string . |
It is long. But I'm not sure how else to to convey the warning in a user friendly manner. "Kernel argument count (2001) exceeds supported maximum of 2000 on GPU" might be confusing if number of fields is less than 2000 but warning still shows up because of decomposition. Either we should have 'number of resulting...' like in your initial diagnostic or the note. I'm leaning towards the latter because it is more clear but I do not know if diagnostic length matters a lot. @erichkeane is it alright giving longer diagnostics? |
The important part of the diagnostic length 'limit' is to simply make sure that you only give as much information as the user needs. I'd say that everything after 'GPU' likely should be somewhere else: |
Okay, I've added a note. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 small nits, otherwise LGTM.
Note that this is exactly how the Handlers are supposed to be, light, small, and easy to understand. We should attempt 'new' handlers any time it seems reasonable.
A side note: At one point, I wonder if we should separate the visitor and checkers into their own file structure. This SemaSYCL.cpp file has way too many responsibilities...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as Erich and Prem about NumParams but otherwise LGTM
493e64b
The SPIR-V Specification allows `OpConstantNull` types to be scalar or vector booleans, integers, or floats. Update an assert for this and add a SPIR-V -> LLVM IR test. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9ec969c1c379bde
Emit a warning when number of resulting kernel arguments exceeds 2k -
maximum available number of kernel arguments on GPU device. Emit a
warning only in GPU AOT mode since other devices don't have such
limitation.