Skip to content

Commit

Permalink
[SYCL][Doc] Add "unnamed lambda" and "kernel restrict all" extension …
Browse files Browse the repository at this point in the history
…drafts (#817)

Signed-off-by: Michael Kinsner <michael.kinsner@intel.com>
  • Loading branch information
Mike Kinsner authored and bader committed Nov 14, 2019
1 parent ec31e82 commit 47c4c71
Show file tree
Hide file tree
Showing 4 changed files with 323 additions and 0 deletions.
4 changes: 4 additions & 0 deletions sycl/doc/extensions/KernelRestrictAll/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# SYCL_INTEL_kernel_restrict_all

Optimization hint attribute asserting that no kernel arguments will alias with each other.

Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
= SYCL_INTEL_kernel_restrict_all
:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en

:blank: pass:[ +]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}

== Introduction
IMPORTANT: This specification is a draft.

NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.

NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons.

This document describes an extension that adds a function type attribute which has the same effect as adding the C99 `restrict` attribute to all pointer arguments when applied to a kernel function.


== Name Strings

+SYCL_INTEL_kernel_restrict_all+

== Notice

Copyright (c) 2019 Intel Corporation. All rights reserved.

== Status

Working Draft

This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.

Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products.

== Version

Built On: {docdate} +
Revision: 1

== Contact
Michael Kinsner, Intel (michael 'dot' kinsner 'at' intel 'dot' com)

== Dependencies

This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-5.

== Overview

This extension adds a function type attribute that has the same effect as adding the C99 (or equivalently OpenCL C kernel language) `restrict` attribute to all pointers and the pointer member of any accessors, that are function arguments, lambda captures, or functor members, of the callable to which the attribute was applied. The attribute can be applied to kernel lambdas, function call operators of a functor, or arbitrary functions (the effect on arbitrary functions, if any, is implementaton defined), to provide the compiler with additional information for optimization.

A new attribute is added by this extension because there is no clear location on which to manually annotate C99 `restrict` on lambda captures, functor members, or accessors.

== Add new paragraphs to end of section 6.7 (Attributes)

The attribute `intel::kernel_args_restrict` is legal on device functions, and can be ignored on non-device functions. When applied to a lambda or function call operator (of a functor) that defines a kernel, the attribute is a hint to the compiler equivalent to specifying the C99 `restrict` attribute on all pointer arguments or the pointer member of any accessors, which are a function argument, lambda capture, or functor member, of the callable to which the attribute was applied. This effect is equivalent to annotating `restrict` on *all* kernel pointer arguments in an OpenCL or SPIR-V kernel, if the callable is a kernel. If `intel::kernel_args_restrict` is applied to a function called from a device kernel, the effect is implementation defined. The attribute forms an unchecked assertion, in that implementations do not need to check/confirm the pre-condition in any way. If a user applies `intel::kernel_args_restrict` to a kernel, but there is in fact aliasing between kernel pointer arguments at runtime, the behavior is undefined.

The attribute-token `intel::kernel_args_restrict` shall appear at most once in each _attribute-list_ and no _attribute-argument-clause_ shall be present. The attribute may be applied to the _function-type_ in a function declaration. The first declaration of a function shall specify the `intel::kernel_args_restrict` attribute if any declaration of that function specifies the `intel::kernel_args_restrict` attribute. If a function is declared with the `intel::kernel_args_restrict` attribute in one translation unit and the same function is declared without the `intel::kernel_args_restrict` attribute in another translation unit, the program is ill-formed and no diagnostic is required.

The `intel::kernel_args_restrict` attribute has an effect when applied to a function, and no effect otherwise.

== Example uses

[source,c++,Restrict on lambda,linenums]
----
Q.submit([&](handler &cgh) {
auto acc1 = out_buf_1.get_access<access::mode::write>(cgh);
auto acc2 = out_buf_2.get_access<access::mode::write>(cgh);
cgh.parallel_for<class lambda_foo>(
range<1>(N), [=](id<1> wiid) [[intel::kernel_args_restrict]] {
int id = wiid[0];
acc1[id]=id;
acc2[id]=id*2;
});
});
----

[source,c++,Restrict on functor,linenums]
----
class functor_foo {
...
void operator()(item<1> item) [[intel::kernel_args_restrict]]
{
int id = item[0];
buf1_m[id]=id;
buf2_m[id]=id*2;
}
};
----



== Issues

None.

//. Title
//+
//--
//*RESOLUTION*: Description
//--
== Revision History
[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Author|Changes
|1|2019-11-11|Michael Kinsner|*Initial public working draft*
|========================================
//************************************************************************
//Other formatting suggestions:
//
//* Use *bold* text for host APIs, or [source] syntax highlighting.
//* Use +mono+ text for device APIs, or [source] syntax highlighting.
//* Use +mono+ text for extension names, types, or enum values.
//* Use _italics_ for parameters.
//************************************************************************
4 changes: 4 additions & 0 deletions sycl/doc/extensions/UnnamedKernelLambda/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# SYCL_INTEL_unnamed_kernel_lambda

Make kernel name class for lambda-defined kernels optional. Simplifies invocation interface.

Original file line number Diff line number Diff line change
@@ -0,0 +1,184 @@
= SYCL_INTEL_unnamed_kernel_lambda
:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en

:blank: pass:[ +]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}

== Introduction
IMPORTANT: This specification is a draft.

NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.

NOTE: This document is better viewed when rendered as html with asciidoctor. GitHub does not render image icons.

This document describes an extension that makes it optional, instead of required, to name (through template parameter of invocation function) kernels that are defined as lambdas.


== Name Strings

+SYCL_INTEL_unnamed_kernel_lambda+

== Notice

Copyright (c) 2019 Intel Corporation. All rights reserved.

== Status

Working Draft

This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.

Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products.

== Version

Built On: {docdate} +
Revision: 1

== Contact
Michael Kinsner, Intel (michael 'dot' kinsner 'at' intel 'dot' com)

== Dependencies

This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-5.

== Overview

SYCL allows device kernels to be defined in a variety of ways, one of which is as a lambda. Separate compilation means that the host-side invocation of a kernel lambda must be matched to the device-compiled variant of the lambda from a potentially different compilation. Lambda internal compiler naming is an implementation detail and not required to match across different compilers. Differences in macro expansion and other details between host and device compilations can further lead to divergence of compiled lambda name, for the same lambda in the original source code.

SYCL 1.2.1 requires that a kernel name be specified as a template parameter of the invocation function when a kernel has been defined as a lambda. This kernel name enables correlation of the lambda name between multiple compilations.

The extension described by this document makes naming of lambdas at invocation time optional in SYCL 1.2.1. Launch of a kernel is simplified from, for example:

[source,c++,LambdaNamed,linenums]
----
...
cgh.parallel_for<class my_kernel_name>(range<1>{ 1024 }, [=](id<1> idx) {
writeResult[idx] = idx[0];
});
...
----

To:

[source,c++,NoName,linenums]
----
...
cgh.parallel_for(range<1>{ 1024 }, [=](id<1> idx) {
writeResult[idx] = idx[0];
});
...
----

In this example, `<class my_kernel_name>` is optional and does not need to be included.

== Enabling the extension

In the Intel prototype implementation of this extension at https://github.com/intel/llvm[GitHub link], the extension is enabled through the compiler flag `-fsycl-unnamed-lambda`. When enabled, the extension defines the macro `$$__SYCL_INTEL_UNNAMED_LAMBDA__$$`.

== Modifications of SYCL 1.2.1 specification

=== Modify paragraph in Section 3.2

*Change from:*

The `parallel_for` function is templated with a class, in this case called `class simple_test`. This class is used only as a name to enable the kernel (compiled with a device compiler) and the host code (possibly compiled with a different host compiler) to be linked. This is required because C++ lambda functions have no name that a linker could use to link the kernel to the host code.

*To:*

The `parallel_for` function is templated with a class, in this case called `class simple_test`. This class is used to manually name the kernel when desired, such as to avoid a compiler-generated name when debugging a kernel defined through a lambda.

=== Remove sentences from Section 3.4.1.1

[line-through]#In SYCL, all kernels must have a kernel name, which must be a globally-accessible {cpp} type name. This is required to enable kernels compiled with one compiler to be linked to host code compiled with a different compiler.#

[line-through]#For named function objects, the type name of the function object is sufficient as the kernel name, but for {cpp}11 lambda functions, the user must provide a user-defined type name as the kernel name.#

=== Modify part of Section 3.9.2

For a lambda function there is no globally-visible name, so the user may optionally provide one for debugging or code style reasons. In SYCL, this optional name is provided as a template parameter to the kernel invocation function, such as `<class kernelName>` in `parallel_for<class kernelName>(...`

=== Modify part of Section 4.8.5

*Change from:*

Each function takes a kernel name template parameter. The kernel name must be a datatype that is unique for each kernel invocation. If a kernel is a named function object, and its type is globally visible, then the kernel's function object type will be automatically used as the kernel name and so the user does not need to supply a name. If the kernel function is a {cpp}11 lambda function, then the user must manually provide a kernel name to enable linking between host and device code to occur.

*To:*

Each function takes an optional kernel name template parameter. The user may optionally provide a kernel name, otherwise an implementation defined name will be generated for the kernel.


=== Modify Table 4.78

*Replace each instance of:*

If it is a named function object and the function object type is globally visible there is no need for the developer to provide a kernel name (`typename KernelName`) for it, as described in 4.8.5.

*With:*

Specification of a kernel name (`typename KernelName`), as described in 4.8.5, is optional.

=== Modify part of Section 4.8.9.2

*Change from:*

We allow lambda functions to define kernels in SYCL, but we have an extra requirement to name lambda functions in order to enable the linking of the SYCL device kernels with the host code to invoke them. The name of a lambda function in SYCL is a {cpp} class. If the lambda function relies on template arguments, then the name of the lambda function must contain those template arguments. The class used for the name of a lambda function is only used for naming purposes and is not required to be defined. For details on restrictions for kernel naming, please refer to 6.2.

To invoke a {cpp}11 lambda, the kernel name must be included explicitly by the user as a template parameter to the kernel invoke function.

*To:*

Kernels may be defined as lambda functions in SYCL. The name of a lambda function in SYCL may optionally be specified by passing it as a template parameter to the invoking method, and in that case, the lambda name is a {cpp} class. If the lambda function relies on template arguments, then if specified, the name of the lambda function must contain those template arguments. The class used for the name of a lambda function is only used for naming purposes and is not required to be defined. For details on restrictions for kernel naming, please refer to 6.2.


=== Modify part of Section 6.2

Lambdas do not have a globally-visible or unique name. A typename can optionally be provided to the kernel invoking interface, as described in 4.8.5, so that the developer can control the kernel name for purposes such as debugging.


== Proof of concept implementation references

. https://github.com/intel/llvm/pull/387/files
. https://github.com/intel/llvm/pull/250/files

== Issues

None.

//. asd
//+
//--
//*RESOLUTION*: Not resolved.
//--
== Revision History
[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Author|Changes
|1|2019-11-11|Michael Kinsner|*Initial public working draft*
|========================================
//************************************************************************
//Other formatting suggestions:
//
//* Use *bold* text for host APIs, or [source] syntax highlighting.
//* Use +mono+ text for device APIs, or [source] syntax highlighting.
//* Use +mono+ text for extension names, types, or enum values.
//* Use _italics_ for parameters.
//************************************************************************

0 comments on commit 47c4c71

Please sign in to comment.