-
Notifications
You must be signed in to change notification settings - Fork 733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Add kernel properties for three function attributes #14448
[SYCL] Add kernel properties for three function attributes #14448
Conversation
This patch adds kernel properties for three existing SYCL function attributes: 'max_work_group_size', 'min_work_groups_per_cu', and 'max_work_groups_per_mp'. The semantics of the properties are the same as for their respective function attributes. Each of these attributes was originally earmarked as a SYCL equivalent to CUDA's '__launch_bounds__' qualifier, hence the focus on lowering for NVPTX. It has since been identified that we will need an additional property for CUDA's 'maxThreadsPerBlock' which isn't always expressable with 'max_work_group_size'. That property will come in a follow-up patch. This patch also wires up the 'max_work_group_size' property to the equivalent SPIR-V execution mode, which should hopefully improve certain use cases. It lowers the other two properties despite them not currently having a SPIR-V equivalent.
} else if (auto Attr = F->getFnAttribute("sycl-max-work-group-size"); | ||
Attr.isValid()) { | ||
// Split values in the comma-separated list integers. | ||
SmallVector<StringRef, 3> ValStrs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of code overlap between NVPTX lowering of kernel properties and CompileTimePropertiesPass
lowering of kernel properties. We might want to think of a shared utility to work with kernel properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hiding the fact that they're all represented as strings in LLVM IR as an "implementation detail" and providing a std::optional<T> KernelPropertyAttr::getAsInteger
or getAsIntegerList
might be a good step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the addition, but we should have an experimental extension document for it as well. See sycl/doc/extensions/template.asciidoc or maybe it could be made as an expansion of sycl_ext_oneapi_kernel_arg_properties.
Tag @gmlueck for the comment above about extensions. |
(sorry for lightning response - I was already writing this to no-one when your feedback came in) Yeah sorry, I couldn't find any specifications for any of the existing kernel properties. Are you suggesting I add an extension for all properties that already exist, or just these ones? Also I couldn't pinpoint where (if anywhere) kernel properties are verified with respect to each other, in the same way that attributes are. We'd ideally like to check that various work-group properties make sense as a holistic whole, wouldn't we? |
Are you referring to the Clang attributes with the same names? I am not sure we originally saw them as "SYCL extensions", even though they only work on SYCL kernels currently, from what I know. I suppose they are documented in the Clang attributes documentation though. For the kernel properties, I believe most are documented either in sycl_ext_oneapi_kernel_arg_properties or in sycl_ext_intel_fpga_kernel_interface_properties.
Just these new ones. Since they are somewhat similar to the ones in sycl_ext_oneapi_kernel_arg_properties, I wouldn't personally be opposed to adding them in there. Do they have an effect on non-CUDA targets? If not, then we'd want to note that in the extension.
I think the We also have a check in SemaDeclAttr that might be relevant to expand here. The check issues a warning if the user is using properties on kernels that have a corresponding property. It's not actually checking if there is a conflict, hence it is just a warning. It is not my favorite part of the properties-related code, but we try to make the Clang frontend as oblivious to these properties as possible. |
I'm not sure these belong in Sorry, for some reason I couldn't find that
Ah excellent, thanks - I'll check that out. It might not be essential for this PR, and this PR (and its subsequent follow-up) is time sensitive as we want it in before the cut-off.
I hadn't spotted that either, thank you. Yeah, I've been made vaguely aware that properties are the way forward, as opposed to attributes. I can see why. |
Sorry, that was the extension I meant! I may have been a tad too quick when looking for it. I think these fit in there, but I would like to hear @gmlueck's thoughts. |
Are there already C++ attributes with these names that can be used to decorate a kernel? Are they documented someplace? |
Yep, they were added late last year as function attributes. max_work_group_size |
What is the behavior if a kernel with these attributes/properties is compiled for a non-CUDA device? |
The The idea is that the behaviour of My hunch is that |
To be specific, I think we can say something to the effect that the two CUDA attributes/properties "have no effect and can be safely ignored when compiled for non-CUDA devices". |
Note, I've now renamed two of the CUDA-specific properties as we found that the original function attributes' names were misleading, with Note also that I intend to add another target-agnostic property to help with I suspect that it might be best if this PR splits into two: one for the two CUDA-specific properties, and another for the two target-agnostic properties. Thoughts? |
Will you add a specification for these CUDA properties also? If they will have no effect on non-CUDA targets, I think it would be OK to document them in sycl_ext_oneapi_kernel_properties. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add FE tests
Yes I'll add some kind of specification. I'll probably continue it here as a separate PR to #14518 as those have a higher priority.
Where abouts exactly, sorry? I had assumed that the tests I'd added to |
We usually require frontend tests for any frontend changes. It looks like this PR adds new metadata. Tests checking for it should suffice. |
The tests I've added in |
No its my bad sorry. I did not realize you were doing IR checks in sycl/test. We usually do those in CodeGenSYCL in clang/tests without invoking the driver. Are metadata checks for other properties tested in sycl/test and not in clang/tests? |
Yeah exactly, codegen for the other properties is being testing in the same place. |
My concern is that it is very common to run just check-clang when making pure FE changes and these tests will not be run in that case. My understanding with llvm.org policy is that clang changes need to be accompanied by tests in the clang folder, and we follow that same guideline. So I would personally prefer having these tests in CodeGenSYCL and moving other tests over at some point. @premanandrao thoughts? |
I agree with @elizabethandrews. |
I agree with you both in principle but I don't believe this is truly a "clang front-end" change, despite it touching code in I think what's going on here is that the code that's in clang should really be in LLVM. This is not really a front-end change but is for historical reasons. @steffenlarsen and I have already discussed in #14502 about moving this code at some point. For the absence of doubt, let me walk through how these kernel properties end up in LLVM IR: The properties are defined in the SYCL headers. They each specialise The SYCL headers for APIs like Now that the functions have their attributes, they need to be "lowered" for each target. That is, the function attributes are transformed into LLVM IR that the toolchains expect. For the SPIR-V path, note how this happens in llvm/lib/SYCLLowerIR/CompileTimePropertiesPass.cpp for the SPIR-V path - this isn't using clang at all. If we weren't wanting any CUDA-specific behaviour we wouldn't be having this discussion. For CUDA, yes, this happens |
Why do we push conceptually wrong code, which is supposed to be refactored someday? Why can't we move the code logic from |
Closed in favour of #14518. If the two other properties are desired in the end, we can open a new PR. |
This patch adds three kernel properties,
max_work_group_size
,min_work_groups_per_multiprocessor
, andmax_work_groups_per_cluster
for three existing SYCL function attributes:max_work_group_size
,min_work_groups_per_cu
, andmax_work_groups_per_mp
, respectively. The semantics of the properties are the same as for their respective function attributes. Note the change of name fromcu
tomultiprocessor
andmp
tocluster
. This attempts to fix some of the ambiguous naming of the attributes by more closely matching the original CUDA__launch_bounds__
names.Each of these attributes was originally earmarked as a SYCL equivalent to CUDA's
__launch_bounds__
qualifier, hence the focus on lowering for NVPTX. It has since been identified that we will need an additional property for CUDA'smaxThreadsPerBlock
which isn't always expressible with the three-argumentmax_work_group_size
. That new property will come in a follow-up patch.This patch also wires up the
max_work_group_size
property to the equivalent SPIR-V execution mode, which should hopefully improve certain use cases. It lowers the other two properties despite them not currently having a SPIR-V equivalent.