-
Notifications
You must be signed in to change notification settings - Fork 1
[SYCL] Builtin support + missing changes #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl-upstream-fe
Are you sure you want to change the base?
[SYCL] Builtin support + missing changes #42
Conversation
In SYCL device compilation, SYCL kernel entry point functions trigger the generation of SYCL kernel caller functions. This is done when handling deferred emissions.
The builtin takes the kernel name type as it's argument and returns the mangled name for the kernel caller function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @elizabethandrews! The changes look good. Most of my suggestions are for changes to comments or for additional tests. I did not a couple of technical concerns as well, but overall this looks right.
// SYCL | ||
def SYCLKernelName : LangBuiltin<"SYCL_LANG"> { | ||
let Spellings = ["__builtin_sycl_kernel_name"]; | ||
let Attributes = [NoThrow, CustomTypeChecking]; | ||
let Prototype = "char const*(...)"; | ||
} | ||
|
||
def SYCLKernelParamCount : LangBuiltin<"SYCL_LANG"> { | ||
let Spellings = ["__builtin_sycl_kernel_param_count"]; | ||
let Attributes = [NoThrow, CustomTypeChecking]; | ||
let Prototype = "int(...)"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add the Const
and Constexpr
attributes here if doing so doesn't cause significant issues. I can't think of any reason that these builtins shouldn't work during constant evaluation. We'll need to ensure that these builtins are always evaluated at compile-time and that constant arguments are passed for those that accept arguments (it is possible this is already accounted for and I just haven't read the code far enough yet).
int test5 = __builtin_sycl_kernel_param_count("str"); // expected-error {{invalid argument; argument must be a class or struct type with a member type alias named 'type'}} | ||
int test6 = __builtin_sycl_kernel_param_count(kernel_id_2(), kernel_id_2()); // expected-error {{builtin requires exactly 1 argument}} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a test to ensure that the err_builtin_requires_language
(e.g., "'__builtin_sycl_kernel_name' is only available in SYCL") diagnostic (or a similar one) is issued if one of these builtins is called from a non-SYCL program. See clang/test/SemaSYCL/sycl-kernel-entry-point-attr-ignored.cpp
and clang/test/SemaSYCL/kernel-attribute-on-non-sycl.cpp
as existing example tests.
@@ -12199,6 +12199,9 @@ def err_sycl_kernel_name_type : Error< | |||
def err_sycl_kernel_name_conflict : Error< | |||
"'sycl_kernel_entry_point' kernel name argument conflicts with a previous" | |||
" declaration">; | |||
def err_builtin_invalid_argument_count : Error<"builtin requires exactly 1 argument">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that a generic diagnostic for this doesn't already exist, but I went looking and didn't find one.
I suggest we model this after err_attribute_wrong_number_arguments
to make it more reusable.
def err_builtin_invalid_argument_count : Error<"builtin requires exactly 1 argument">; | |
def err_builtin_invalid_argument_count : Error< | |
"%0 %plural{0:takes no arguments|1:takes one argument|" | |
":requires exactly %1 arguments}1">; |
Also, how about colocating this with the other generic builtin diagnostics? Search for "/// Built-in functions." in the same file.
def err_builtin_invalid_argument : Error<"invalid argument; argument must be a class or struct type" | ||
" with a member type alias named 'type'">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at some other diagnostics for wording inspiration. These included err_pragma_loop_invalid_argument_type
, err_typecheck_unary_expr
, err_hip_invalid_args_builtin_mangled_name
, err_opencl_builtin_pipe_invalid_arg
, err_opencl_builtin_to_addr_invalid_arg
, err_attribute_vecreturn_only_vector_member
, and err_typename_nested_not_found_enable_if
. There isn't strong consistency across these, but the following is a little more consistent.
The name of the diagnostic should be changed; err_builtin_invalid_argument
is a pretty generic name for a rather specific diagnostic.
def err_builtin_invalid_argument : Error<"invalid argument; argument must be a class or struct type" | |
" with a member type alias named 'type'">; | |
def err_sycl_kernel_name_argument_invalid : Error< | |
"invalid argument of type %0; expected a class or structure " | |
"with a member typedef or type alias named 'type'">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took inspiration from some existing diagnostics with "argument must be a xyz" wording. I personally prefer that to "expected a...". I have no strong feelings though and so I have made this modification.
I did not specify the type via %0 in the diagnostic because we have the ^ pointing to the problematic type when we emit the diagnostic. Way back, someone told me to not include unnecessary information in diagnostics to keep it short, if we are already pointing to the problematic type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the "argument must be a xyz" wording; that is consistent with other diagnostics too.
My motivation for including the type in the diagnostic is because the carat won't actually point to the type; I expect it will point to the expression that produces an object of the type. Given this:
template<typename KernelName>
struct kernel_id_t {
using type = KernelName;
};
template<typename KernelName>
void f() {
__builtin_sycl_kernel_name(kernel_id_t<KernelName>());
}
I would expect a diagnostic to look something like:
invalid argument; argument must be a class or structure with a member typedef or type alias named 'type'
__builtin_sycl_kernel_name(kernel_id_t<KernelName>());
^^^^^^^^^^^^^^^^^^^^^^^^^
Our expectation is that the expression will readily identify the type in expected usage as in the above case, but that isn't guaranteed. For example, the expression could call some function (I don't know of any good reason to do so, but style preferences could lead someone to do so).
// FIXME: Existing tests fail if we limit emission to the kernel caller | ||
// function and functions called from it. Once the sycl_device attribute | ||
// is implemented, modify this check (and tests) to include it and | ||
// return false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since sycl_device
doesn't exist upstream yet, I think we should refrain from mentioning it here. The following suggested change defers to the SYCL specification for expected future behavior.
I suspect more changes will be needed in this function. For example, should we be returning true
for declarations attributed with AliasAttr
or UsedAttr
? Or for global variables with external linkage?
(I tend to prefix "FIXME" on every line of a FIXME comment so that a grep for FIXME yields the entire comment. It also helps to separate FIXME content from other content. Not everyone does this of course).
// FIXME: Existing tests fail if we limit emission to the kernel caller | |
// function and functions called from it. Once the sycl_device attribute | |
// is implemented, modify this check (and tests) to include it and | |
// return false. | |
// FIXME: The only functions that must be emitted during device compilation | |
// FIXME: are the kernel entry points and functions declared with SYCL_EXTERNAL. | |
// FIXME: However, some existing tests fail if the set of emitted functions is | |
// FIXME: limited to those functions and the ones they call; further investigation | |
// FIXME: is needed to determine how to address those test failures. |
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
// If the Decl corresponds to a SYCL kernel entry point function, generate | ||
// and emit the corresponding SYCL kernel caller function, i.e the | ||
// offload kernel. Otherwise, emit the definition and move on to the next | ||
// one. | ||
const FunctionDecl *FD = nullptr; | ||
if (LangOpts.SYCLIsDevice && | ||
(FD = D.getDecl()->getAsFunction()) != nullptr && | ||
FD->hasAttr<SYCLKernelEntryPointAttr>() && | ||
FD->isDefined()) { | ||
// Generate and emit the offload kernel | ||
EmitSYCLKernelCaller(FD, getContext()); | ||
} else { | ||
EmitGlobalDefinition(D, GV); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this code requires restoring the original code I had replaced. The else
branch was there to preserve existing functionality. I'm surprised this didn't cause any tests to fail!
// Otherwise, emit the definition and move on to the next one.
EmitGlobalDefinition(D, GV);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think I accidently removed that. It is there in my fork. I'll put it back!
clang/lib/CodeGen/CodeGenModule.cpp
Outdated
// If the Decl corresponds to a SYCL kernel entry point function, generate | ||
// and emit the corresponding the SYCL kernel caller function i.e the | ||
// offload kernel. | ||
if (const auto *FD = D.getDecl()->getAsFunction()) { | ||
if (LangOpts.SYCLIsDevice && FD->hasAttr<SYCLKernelEntryPointAttr>() && | ||
FD->isDefined()) { | ||
// Generate and emit the offload kernel | ||
EmitSYCLKernelCaller(FD, getContext()); | ||
// The offload kernel invokes the operator method of the SYCL kernel | ||
// object i.e. the SYCL kernel function is invoked. Emit this function. | ||
EmitDeferred(); | ||
// Do not emit the SYCL kernel entry point function. | ||
continue; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My motivation for moving this chunk of code later in the function was to avoid the recursive call to EmitDeferred()
here since that recursion will already happen further below (along with an assertion that all deferred declarations were actually emitted; that is missing here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is GetAddrOfGlobal below will result in an emit of the entry-point function. We need this block in the beginning to avoid emitting anything related the the entry-point function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't realize that. That makes sense. Perhaps we should add a comment to that effect. And update tests to validate that the entry point function isn't emitted on the device side.
clang/lib/Sema/SemaChecking.cpp
Outdated
IdentifierTable &IdentTable = S.Context.Idents; | ||
auto Name = DeclarationName(&(IdentTable.get("type"))); | ||
DeclContext::lookup_result Lookup = RD->lookup(Name); | ||
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypeAliasDecl>(Lookup.front())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypeAliasDecl>(Lookup.front())) | |
if (Lookup.empty() || !Lookup.isSingleResult() || !isa<TypedefNameDecl>(Lookup.front())) |
const char* test5 = __builtin_sycl_kernel_name("str"); // expected-error {{invalid argument; argument must be a class or struct type with a member type alias named 'type'}} | ||
const char* test6 = __builtin_sycl_kernel_name(kernel_id_2(), kernel_id_2()); // expected-error {{builtin requires exactly 1 argument}} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional test suggestions:
Test invocation in constant evaluation:
struct constexpr_kernel_name1;
void test_constexpr() {
kernel_single_task<constexpr_kernel_name1>([]{});
constexpr const char* test1 = __builtin_sycl_kernel_name(kernel_id_1<constexpr_kernel_name1>()); // Valid
}
Test that a non-constant argument produces an error.
struct nonconst_kernel_name1;
struct nonconst_kernel_name2;
void test() {
kernel_single_task<nonconst_kernel_name1>([]{});
kernel_single_task<nonconst_kernel_name2>([]{});
constexpr kernel_id_1<nonconst_kernel_name1> kn1;
(void)__builtin_sycl_kernel_name(kn1); // Valid
kernel_id_1<nonconst_kernel_name2> kn2;
(void)__builtin_sycl_kernel_name(kn2); // expected-error {{FIXME}}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional cases to exercise:
- Calling the builtin in a constexpr context before a call to
kernel_single_task()
; ensure an error is produced. - Calling the builtin before a call to
kernel_single_task()
; this should work by accessing data in a symbol declaration for which a symbol definition is emitted sometime after the call tokernel_single_task()
is observed (perhaps at the end of the translation unit contingent on use of the symbol). - Calling the builtin with no corresponding call to
kernel_single_task()
; ensure an error is produced at the end of the translation unit. This case is similar to the previous case; the error occurs because the symbol definition is never emitted.
Implemented support for __builtin_sycl_kernel_param_kind, __builtin_sycl_kernel_param_size and __builtin_sycl_kernel_param_offset Changes cause ast-dump-sycl-kernel-call-stmt.cpp to fail. The failure occurs when we try to obtain the size of a parameter with an instantiated type. The failure is because the definition of the type is not instantiated when attempting to obtain it's size. It is unclear why this is happening. Maybe KernelInfo is being collected too early? Further debugging is required.
This is a dummy implementation so that the library will compile. Actual implementation will be done after SYCL special types are supported.
The old design required the SYCL kernel object to be the first parameter of the kernel entry point function. In the new design, all parameters of the kernel entry point function are treated in a similar manner. We do not have a designated spot for the kernel object.
Getting the size of parameter requires it to be a complete type. When the parameter sizes were previously stored in SYCLKernelInfo, the compiler hit an assert in cases where the definition of the type was not yet instantiated by the compiler. This PR moves the code calculating size to the point of invocation of the builtin (instead of kernel invocation), hoping the type is fully instantiated by the compiler by then. This needs to be verified by adding new builtin tests. The existing crashing test now passes because it doesn't invoke the builtin.
The following builtins were implemented to point to the location of the SYCL kernel object. In the new design, the SYCL kernel object does not having 'special' designation since all members of the entry point function will be treated in a similar manner. It is unclear what these builtins should now return or whether they should even exist. This PR is a dummy implementation to make library compile while we investigate. __builtin_sycl_kernel_file_name(kernel-name) __builtin_sycl_kernel_function_name(kernel-name) __builtin_sycl_kernel_line_number(kernel-name) __builtin_sycl_kernel_column_number(kernel-name)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I requested some minor changes; nothing fancy.
I still need to re-review the tests, but ran out of time for today. I'll look at those tomorrow.
enum kernel_param_kind_t { | ||
kind_first, | ||
kind_accessor = kind_first, | ||
kind_std_layout, | ||
kind_sampler, | ||
kind_pointer, | ||
kind_specialization_constants_buffer, | ||
kind_stream, | ||
kind_last = kind_stream | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation for defining kernel_param_kind_t
as a member of SYCLKernelInfo
? I think it makes sense for KernelParamDesc
to be a member since it is just implementation detail, but since the enumeration is exposed as part of the interface, I think it would be more ergonomic to define it at namespace scope.
The enumeration name and enumerators should be renamed to follow LLVM style. Perhaps SYCLKernelParamKind
with the enumerators names SKPK_*
(see PrimitiveCopyKind
and DestructionKind
as precedents). Alternatively, use of enum class
would allow omitting the SKPK_
prefix (see NonConstantStorageReason
and AutoTypeKeyword
as precedents).
Since this enumeration is expected to match a definition in the SYCL run-time, it would be good to define an explicit underlying type for it; perhaps just int8_t
.
If we don't have a use for kind_first
and kind_last
, they can be omitted for now.
private: | ||
// Kernel caller function parameter descriptor. | ||
struct KernelParamDesc { | ||
kernel_param_kind_t Kind = kind_last; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind_last
(equivalent to kind_stream
here) makes for an odd default. If a default is needed, then I recommend introducing a new enumerator (e.g., kind_invalid
) for that purpose. I'm generally inclined not to introduce such "invalid" values unless absolutely required though. In this case, it looks like a default can be avoided by adding a constructor that requires the kind to be provided (and thus eliding the default constructor; assuming SmallVector
doesn't complain about the type not being default constructible; which it might).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have a good answer here. I just kept the implementation of the enum as it is in syclos.
def err_sycl_kernel_name_invalid_arg : Error<"invalid argument; expected a class " | ||
"or structure with a member typedef " | ||
"or type alias alias named 'type'">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def err_sycl_kernel_name_invalid_arg : Error<"invalid argument; expected a class " | |
"or structure with a member typedef " | |
"or type alias alias named 'type'">; | |
def err_sycl_kernel_name_invalid_arg : Error<"invalid argument; expected a class " | |
"or structure with a member typedef " | |
"or type alias named 'type'">; |
// FIXME: Improve the comment. | ||
RecordDecl *RD = E->getArg(0)->getType()->castAs<RecordType>()->getDecl(); | ||
IdentifierTable &IdentTable = Ctx.Idents; | ||
auto Name = DeclarationName(&(IdentTable.get("type"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IdentifierTable::get()
will create and return an IdentifierInfo
reference even if no such identifier as been encountered yet. Perhaps it would be better to use find()
here and to assert the identifier is found.
auto Name = DeclarationName(&(IdentTable.get("type"))); | |
auto TypeIdentInfo = IdentTable.find("type"); | |
assert(TypeIdentInfo != IdentTable.end(); | |
auto Name = DeclarationName(*TypeIdentInfo); |
return RValue::get( | ||
llvm::ConstantInt::get(Int32Ty, KernelInfo->GetParamCount())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The builtin signature specifies that it returns int
. I think this should match.
return RValue::get( | |
llvm::ConstantInt::get(Int32Ty, KernelInfo->GetParamCount())); | |
return RValue::get( | |
llvm::ConstantInt::get(IntTy, KernelInfo->GetParamCount())); |
// If the Decl corresponds to a SYCL kernel entry point function, generate | ||
// and emit the corresponding the SYCL kernel caller function i.e the | ||
// offload kernel. The generation of the offload kernel needs to happen | ||
// first in this loop, in order to avoid generating IR for the SYCL kernel | ||
// entry point function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// If the Decl corresponds to a SYCL kernel entry point function, generate | |
// and emit the corresponding the SYCL kernel caller function i.e the | |
// offload kernel. The generation of the offload kernel needs to happen | |
// first in this loop, in order to avoid generating IR for the SYCL kernel | |
// entry point function. | |
// If the Decl corresponds to a SYCL kernel entry point function, generate | |
// and emit the corresponding SYCL kernel caller function i.e the | |
// offload kernel. The generation of the offload kernel needs to happen | |
// first in this loop, in order to avoid generating IR for the SYCL kernel | |
// entry point function. |
// The argument must be a class or struct with a member | ||
// named type. | ||
static bool CheckBuiltinSyclKernelName(Sema &S, CallExpr *TheCall) { | ||
QualType ArgTy = TheCall->getArg(0)->getType(); | ||
const auto *RT = ArgTy->getAs<RecordType>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary intent of this function appears to be to validate that the type that identifies the kernel meets specific requirements. This is useful and is separable from the fact that the type is obtained by querying the type of the first argument in a call expression; I think it would be clearer to colocate the retrieval of the type with the code that checks for the number of arguments below.
// The argument must be a class or struct with a member | |
// named type. | |
static bool CheckBuiltinSyclKernelName(Sema &S, CallExpr *TheCall) { | |
QualType ArgTy = TheCall->getArg(0)->getType(); | |
const auto *RT = ArgTy->getAs<RecordType>(); | |
// The SYCL kernel builtin functions identify the kernel to operate on | |
// by the type of one of the arguments. That type is required to be a | |
// class or structure type with a member typedef or type alias named | |
// "type". | |
static bool CheckBuiltinSyclKernelIDType(Sema &S, QualType T) { | |
const auto *RT = T->getAs<RecordType>(); |
|
||
RecordDecl *RD = RT->getDecl(); | ||
IdentifierTable &IdentTable = S.Context.Idents; | ||
auto Name = DeclarationName(&(IdentTable.get("type"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I should be worried about this, but IdentifierTable::get()
will create an IdentifierInfo
object if one doesn't already exist. That makes me nervous. There is a find()
member function to do a lookup without creating a new object, but it isn't particularly ergonomic to use. The lack of a getExisting()
or similar function that can return null is a little surprising to me though, so maybe I'm being overly paranoid. Maybe it is worth using find()
here to avoid unlikely potential problems?
return ExprError(); | ||
} | ||
|
||
if (CheckBuiltinSyclKernelName(*this, TheCall)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per prior comment:
if (CheckBuiltinSyclKernelName(*this, TheCall)) { | |
if (CheckBuiltinSyclKernelName(*this, TheCall->getArg(0)->getType())) { |
case Builtin::BI__builtin_sycl_kernel_param_offset: | ||
case Builtin::BI__builtin_sycl_kernel_param_size: | ||
case Builtin::BI__builtin_sycl_kernel_param_access_target: { | ||
// Builtin takes 1 argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Builtin takes 1 argument | |
// Builtin takes 2 arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple of comments regarding the parameter index argument potentially being a run-time supplied value.
// FIXME: This is a dummy value. This builtin will be implemented when | ||
// support for special sycl types is implemented. | ||
return RValue::get(llvm::ConstantInt::get(Int32Ty, 0)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the above builtins that take a parameter index as an argument, we'll need to support the parameter index argument being a non-constant; see how Sergey implemented getKernelParamDescs()
in intel/llvm#15070. Code consuming the parameter information will look something like:
int param_count = __builtin_sycl_kernel_param_count(KernelIdentity<KernelNameType>());
for (int i = 0; i < param_count; ++i) {
int param_size = __builtin_sycl_kernel_param_size(KernelIdentity<KernelNameType>(), i);
...
}
There are a few implications of this.
- When the parameter index argument can be evaluated at compile-time, IR can be emitted to just provide the parameter info value as the code does now. However, we need to issue a diagnostic if the parameter index is not in range.
- When the parameter index argument can't be evaluated at compile-time, IR will have to be emitted to retrieve the value from a stored array of elements. This will presumably require emitting something like a static local constexpr variable (something with internal linkage) that the IR will index into. We can't issue a diagnostic for an out of range parameter index in this case of course, so the IR should include a comparison and branch to an
unreachable
instruction.
// CHECK: store i32 1, ptr %test1, align 4 | ||
// CHECK: store i32 1, ptr %test2, align 4 | ||
// CHECK: store i32 1, ptr %test3, align 4 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the comment about the parameter index argument not necessarily being a constant, we should have a test that looks something like this:
class kernel_name_3;
void test2() {
SYCLKernel Obj;
kernel_single_task<kernel_name_3>(Obj, 1);
int kn3_param_count = __builtin_sycl_kernel_param_count(kernel_id_t<kernel_name_3>());
for (int i = 0; i < kn3_param_count; ++i) {
int param_kind = __builtin_sycl_kernel_param_kind(kernel_id_t<kernel_name_3>(), i);
}
}
I think it would be reasonable to have a single test that covers all of the builtins that take a parameter index argument.
Commit 1 - Fixes a bug to generate kernel caller function correctly when kernel entry point attribute is not a template
Commits 2 and 3 - Builtin support
Commit - 4 - Fixes discrepancies between Tom and Elizabeth forks