diff --git a/OpenCL_C.txt b/OpenCL_C.txt index dce47180..2ac4d9d8 100644 --- a/OpenCL_C.txt +++ b/OpenCL_C.txt @@ -34,6 +34,9 @@ Khronos{R} OpenCL Working Group // Various special / math symbols. This is easier to edit with than Unicode. include::config/attribs.txt[] +// Feature Dictionary +include::c/feature-dictionary.asciidoc[] + // External Footnotes include::c/footnotes.asciidoc[] @@ -139,78 +142,78 @@ Feature macro identifiers are used as names of features in this document. | *Feature Macro/Name* | *Brief Description* -| `+__opencl_c_3d_image_writes+` +| {opencl_c_3d_image_writes} | The OpenCL C compiler supports built-in functions for writing to 3D image objects. -OpenCL C compilers that define the feature macro `+__opencl_c_3d_image_writes+` -must also define the feature macro `+__opencl_c_images+`. +OpenCL C compilers that define the feature macro {opencl_c_3d_image_writes} +must also define the feature macro {opencl_c_images}. -| `+__opencl_c_atomic_order_acq_rel+` +| {opencl_c_atomic_order_acq_rel} | The OpenCL C compiler supports enumerations and built-in functions for atomic operations with acquire and release memory consistency orders. -| `+__opencl_c_atomic_order_seq_cst+` +| {opencl_c_atomic_order_seq_cst} | The OpenCL C compiler supports enumerations and built-in functions for atomic operations and fences with sequentially consistent memory consistency order. -| `+__opencl_c_atomic_scope_device+` +| {opencl_c_atomic_scope_device} | The OpenCL C compiler supports enumerations and built-in functions for atomic operations and fences with device memory scope. -| `+__opencl_c_atomic_scope_all_devices+` +| {opencl_c_atomic_scope_all_devices} | The OpenCL C compiler supports enumerations and built-in functions for atomic operations and fences with all with memory scope across all devices that can share SVM memory with each other and the host process. 
-| `+__opencl_c_device_enqueue+` +| {opencl_c_device_enqueue} | The OpenCL C compiler supports built-in functions to enqueue additional work from the device. -OpenCL C compilers that define the feature macro `+__opencl_c_device_enqueue+` -must also define the feature macro `+__opencl_c_generic_address_space+`. +OpenCL C compilers that define the feature macro {opencl_c_device_enqueue} +must also define the feature macro {opencl_c_generic_address_space}. -| `+__opencl_c_generic_address_space+` +| {opencl_c_generic_address_space} | The OpenCL C compiler supports the unnamed generic address space. -| `+__opencl_c_fp64+` +| {opencl_c_fp64} | The OpenCL C compiler supports types and built-in functions with 64-bit floating point types. -| `+__opencl_c_images+` +| {opencl_c_images} | The OpenCL C compiler supports types and built-in functions for images. -| `+__opencl_c_int64+` +| {opencl_c_int64} | The OpenCL C compiler supports types and built-in functions with 64-bit integers. OpenCL C compilers for FULL profile devices or devices with 64-bit pointers -must always define the `+__opencl_c_int64+` feature macro. +must always define the {opencl_c_int64} feature macro. -| `+__opencl_c_pipes+` +| {opencl_c_pipes} | The OpenCL C compiler supports the pipe modifier and built-in functions to read and write from a pipe. -OpenCL C compilers that define the feature macro `+__opencl_c_pipes+` must -also define the feature macro `+__opencl_c_generic_address_space+`. +OpenCL C compilers that define the feature macro {opencl_c_pipes} must +also define the feature macro {opencl_c_generic_address_space}. -| `+__opencl_c_program_scope_global_variables+` +| {opencl_c_program_scope_global_variables} | The OpenCL C compiler supports program scope variables in the global address space. -| `+__opencl_c_read_write_images+` +| {opencl_c_read_write_images} | The OpenCL C compiler supports reading from and writing to the same image object in a kernel. 
OpenCL C compilers that define the feature macro -`+__opencl_c_read_write_images+` must also define the feature macro -`+__opencl_c_images+`. +{opencl_c_read_write_images} must also define the feature macro +{opencl_c_images}. -| `+__opencl_c_subgroups+` +| {opencl_c_subgroups} | The OpenCL C compiler supports built-in functions operating on sub-groupings of work-items. -| `+__opencl_c_work_group_collective_functions+` +| {opencl_c_work_group_collective_functions} | The OpenCL C compiler supports built-in functions that perform collective operations across a work-group. @@ -545,25 +548,25 @@ OpenCL. on the device. <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_device_enqueue+` feature. + newer and the {opencl_c_device_enqueue} feature. | `ndrange_t` | The N-dimensional range over which a kernel executes. <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_device_enqueue+` feature. + newer and the {opencl_c_device_enqueue} feature. | `clk_event_t` | A device side event that identifies a command enqueue to a device command queue. <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_device_enqueue+` feature. + newer and the {opencl_c_device_enqueue} feature. | `reserve_id_t` | A reservation ID. This opaque type is used to identify the reservation for <>. <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_pipes+` feature. + newer and the {opencl_c_pipes} feature. | `event_t` | An event. This can be used to identify <> from @@ -588,7 +591,7 @@ The `image2d_t`, `image3d_t`, `image2d_array_t`, `image1d_t`, supports images, i.e. the value of the <>) is `CL_TRUE`. If this is the case then an OpenCL C 3.0 or newer compiler must also define -the `+__opencl_c_images+` feature macro. +the {opencl_c_images} feature macro. 
==== The C99 derived types (arrays, structs, unions, functions, and pointers), @@ -865,7 +868,7 @@ The numeric indices that can be used are given in the table below: [[table-vector-indices]] .Numeric indices for built-in vector data types -[cols=",",] +[width="100%",cols="<34%,<66%",options="header"] |==== | *Vector Components* | *Numeric indices that can be used* | 2-component | 0, 1 @@ -989,7 +992,7 @@ float2 low = vf.lo; // (1.0f, 2.0f); float2 high = vf.hi; // (3.0f, _undefined_); ---------- -It is an error to take the address of a vector element and will result in a +It is illegal to take the address of a vector element and will result in a compilation error. For example: @@ -2047,17 +2050,17 @@ types. [open,refpage='addressSpaceQualifiers',desc='Address Space Qualifiers',type='freeform',spec='clang',anchor='address-space-qualifiers',xrefs='constant genericAddressSpace global local private'] -- -OpenCL has hierarchical memory architecture represented by address spaces -defined in <>. It -extends C syntax to allow an address space name as a valid type qualifier -(<>). -OpenCL implements the following disjoint named address spaces with the spelling: +OpenCL C has a hierarchical memory architecture represented by address spaces, as +defined in section 5 of <>. It +extends the C syntax to allow an address space name as a valid type qualifier +(section 5.1.2 of <>). +OpenCL implements disjoint named address spaces with the spelling `+__global+`, `+__local+`, `+__constant+` and `+__private+`. The address space qualifier may be used in variable declarations to specify the region where objects are to be allocated. If the type of an object is qualified by an address space name, the object is allocated in the -specified address space. Similarly in pointers a type pointed to can be qualified -by an address space signaling the address space the object pointed to is located. +specified address space. 
Similarly, for pointers, the type pointed to can be qualified +by an address space signaling the address space where the object pointed to is located. The address space name spelling without the `+__+` prefix, i.e. `global`, `local`, `constant` and `private`, are valid and may be substituted for the @@ -2079,14 +2082,14 @@ void foo (...) } ---------- -For OpenCL C 2.0, or OpenCL C 3.0 with the `+__opencl_c_generic_address_space+` +For OpenCL C 2.0, or OpenCL C 3.0 with the {opencl_c_generic_address_space} feature macro, there is an additional unnamed generic address space. -Most of the restrictions from <> apply in OpenCL C i.e. address spaces can not -be used with a return type, a function parameter, or a function type; multiple -address space qualifiers are not allowed. However, in OpenCL C it is allowed to -qualify local variables with an address space qualifier. +Most of the restrictions from section 5.1.2 and section 5.3 of the +<> apply in OpenCL C, e.g. address +spaces can not be used with a return type, a function parameter, or a function +type, and multiple address space qualifiers are not allowed. However, in OpenCL +C it is allowed to qualify local variables with an address space qualifier. Examples: @@ -2095,22 +2098,20 @@ Examples: // OK. int f() { ... } -// Error. Address space qualifier cannot be used with non-pointer return type. +// Error. Address space qualifier cannot be used with a non-pointer return type. private int f() { ... } -// OK. Address space qualifier can be used with pointer return type. +// OK. Address space qualifier can be used with a pointer return type. local int *f() { ... } -// Error. Multiple address spaces specified per type. +// Error. Multiple address spaces specified for a type. private local int i; -// Ok. The first address space qualifies an object pointer to and the second +// OK. The first address space qualifies the object pointed to and the second // qualifies the pointer. 
private int *local ptr; - ---------- - The `+__global+`, `+__constant+`, `+__local+`, `+__private+`, `global`, `constant`, `local`, and `private` names are reserved for use as address space qualifiers and shall not be used otherwise. @@ -2122,7 +2123,6 @@ The size of pointers to different address spaces may differ. It is not correct to assume that, for example, `+sizeof(__global int *)+` always equals `+sizeof(__local int *)+`. ==== - -- [[global-or-global]] @@ -2138,7 +2138,7 @@ A buffer memory object can be declared as a pointer to a scalar, vector or user-defined struct. This allows the kernel to read and/or write any location in the buffer. -The actual size of the array memory object is determined when the memory +The actual size of the memory object is determined when the memory object is allocated via appropriate API calls in the host code. Examples: @@ -2152,7 +2152,7 @@ typedef struct { int b[2]; } foo_t; -global foo_t *my_info; // An array of foo_t elements. +global foo_t *my_info; // An array of foo_t elements ---------- As image objects are always allocated from the `global` address space, the @@ -2160,12 +2160,12 @@ As image objects are always allocated from the `global` address space, the The elements of an image object cannot be directly accessed. Built-in functions to read from and write to an image object are provided. -Variables at program scope or `static`/`extern` variables inside functions -can be declared in global address space if -`__opencl_c_program_scope_global_variables` feature is supported. These +Variables at program scope or `static` or `extern` variables inside functions +can be declared in global address space if the +{opencl_c_program_scope_global_variables} feature is supported. These variables in the `global` address space have the same lifetime as the program, and their values persist between calls to any of the kernels in the program. -These variables are not shared across devices. They have distinct storage. 
+They are not shared across devices and have distinct storage. -- @@ -2175,9 +2175,8 @@ These variables are not shared across devices. They have distinct storage. [open,refpage='local',desc='local Address Space Qualifiers',type='freeform',spec='clang',anchor='local-or-local',xrefs='addressSpaceQualifiers constant genericAddressSpace global private'] -- -The `+__local+` or `local` address space name is used to describe variables -that need to be allocated in local memory and are shared by all work-items -of a work-group. +The `+__local+` or `local` address space name is used to describe variables that +are allocated in local memory and shared by all work-items in a work-group. Examples: @@ -2186,10 +2185,10 @@ Examples: kernel void my_func(...) { local float a; // A single float allocated - // in local address space + // in the local address space local float b[10]; // An array of 10 floats - // allocated in local address space. + // allocated in the local address space } ---------- [NOTE] @@ -2208,10 +2207,10 @@ only for the lifetime of the work-group executing the kernel. -- The `+__constant+` or `constant` address space name is used to describe -variables accessible globally (declared in program scope or inside functions -with `static`/`extern` storage class specifier) as read-only variables. -These read-only variables can be accessed by all work-items of different -kernels during their execution. +read-only variables that are accessible globally. They may +be declared in program scope or in the outermost kernel scope or inside + functions with a `static` or `extern` storage class specifier. Such variables + can be accessed by all work-items or by different kernels during the program execution. [NOTE] ==== @@ -2221,32 +2220,31 @@ defined as the value of the <>. ==== -Writing to such a variable results in a compile-time error. +It is illegal to write to a variable in the constant address space and will +result in a compilation error. 
Example: [source,c] ---------- -constant int a = 3; // int in constant address space initialized with a constant value. +constant int a = 3; // int allocated in the constant address space kernel void k1(global int *buf) { - buf[a] = ...; // allowed. All work items access element with index 3; + buf[a] = ...; // OK. All work items access element with index 3. } kernel void k2(global int *buf) { - *buf = a; // allowed. All work items stored value 3; - a = 42; // error. a is in constant memory. + *buf = a; // OK. All work items store value 3. + a = 42; // Error. a is in constant memory. } ---------- - Implementations are not required to aggregate these declarations into the fewest number of constant arguments. This behavior is implementation defined. Thus portable code must conservatively assume that each variable declared inside a function or in program scope allocated in the `+__constant+` address space counts as a separate constant argument. - -- [[private-or-private]] @@ -2254,10 +2252,10 @@ address space counts as a separate constant argument. [open,refpage='private',desc='private Address Space Qualifiers',type='freeform',spec='clang',anchor='private-or-private',xrefs='addressSpaceQualifiers constant genericAddressSpace global local'] -- -Private address space is a memory segment that can only be accessed by one work -item. Variables that are not shareable among work items are allocated in private -and it is the default address space for most of variables in particular variables -with automatic storage duration. +The private address space is a memory segment that can only be accessed by one +work item. Variables that are not shareable among work items are allocated in +the private address space, and it is the default address space for most +variables, in particular variables with automatic storage duration. Example: @@ -2265,8 +2263,7 @@ Example: ---------- kernel void foo(...) 
{ - private int i; - + private int i; } ---------- -- @@ -2277,8 +2274,8 @@ kernel void foo(...) [open,refpage='genericAddressSpace',desc='The Generic Address Space',type='freeform',spec='clang',anchor='the-generic-address-space',xrefs='addressSpaceQualifiers constant global local private'] -- -Generic address space requires support for OpenCL C 2.0 or OpenCL C 3.0 with -the `+__opencl_c_generic_address_space+` feature. It can be used with pointer +The generic address space requires support for OpenCL C 2.0 or OpenCL C 3.0 with +the {opencl_c_generic_address_space} feature. It can be used with pointer types and it represents a placeholder for any of the named address spaces - `global`, `local` or `private`. It signals that a pointer points to an object in one of these concrete named address spaces. The exact address space @@ -2288,10 +2285,9 @@ resolution can occur dynamically during the kernel execution. ---------- kernel void foo(int a) { - private int b; - local int c; - int* p = a ? &b : &c; // p points to an object in local or private address space. - + private int b; + local int c; + int* p = a ? &b : &c; // p points to the local or private address space. } ---------- @@ -2302,11 +2298,11 @@ kernel void foo(int a) This section describes use of address space qualifiers with respect to declaration scopes or variable types. -Local variables inside functions can be qualified by private address space +Local variables inside functions can be qualified by the private address space qualifier. Variables declared in the outermost compound statement inside the body of the -kernel function can be qualified by local or constant address spaces. +kernel function can be qualified by the local or constant address spaces. Examples: @@ -2314,25 +2310,25 @@ Examples: ---------- kernel void my_func(...) { - private float a; // allowed. - local float b; // allowed. + private float a; // OK. + local float b; // OK. if (...) 
{ - // example of variable in __local address space but not + // Example of variable in __local address space but not // declared at __kernel function scope. - local float c; // not allowed. + local float c; // Error. } } ---------- -Program scope variables or variables with `extern`/`static` storage class +Program scope variables or variables with an `extern` or `static` storage class specifier: * Must be qualified by `__constant` in OpenCL C prior to 2.0 or OpenCL C 3.0 - without `+__opencl_c_program_scope_global_variables+` feature. + without the {opencl_c_program_scope_global_variables} feature. * Can be qualified by either `__constant` or `__global` for OpenCL C 2.0 or - OpenCL C 3.0 with `+__opencl_c_program_scope_global_variables+` feature. + OpenCL C 3.0 with the {opencl_c_program_scope_global_variables} feature. Examples: @@ -2340,6 +2336,7 @@ Examples: ---------- // Note: these examples assume OpenCL C 2.0 or the // __opencl_c_program_scope_global_variables feature macro. + constant int foo; // OK. global int baz; // OK. global uchar buf[512]; // OK. @@ -2348,21 +2345,17 @@ static global int bat; // OK. Internal linkage. extern constant int foo; // OK. - void func(...) { - - constant static int foo = 1; // OK. - global extern int foo; // OK. + constant static int foo = 1; // OK. + global extern int foo; // OK. } -global int * global ptr; // OK. -constant int *global ptr=&baz; // Error, baz is in the global address - // space. -global int * constant ptr = &baz; // OK. +global int *global ptr; // OK. +constant int *global ptr = &baz; // Error, baz is in the global address space. +global int *constant ptr = &baz; // OK. ---------- - Kernel function arguments declared to be a pointer or an array of a type must point to one of the named address spaces `+__global+`, `+__local+` or `+__constant+`. @@ -2371,21 +2364,22 @@ Examples: [source,c] ---------- -kernel void my_kernel(global int *ptr) // OK + // OK. +kernel void my_kernel(global int *ptr) { - ... + ...
} -kernel void my_kernel(int *ptr) // Error, ptr must point to either global, local or constant int + // Error, ptr must point to the global, local, or constant address space. +kernel void my_kernel(int *ptr) { - ... + ... } ---------- - -- === Initialization -- -Program scope and `static` variables in the `global` address space are zero +Program scope and `static` variables in the `+__global+` address space are zero initialized by default. A constant expression may be given as an initializer. Variables allocated in the `+__local+` address space inside a kernel function @@ -2394,7 +2388,7 @@ cannot be initialized. Variables allocated in the +__constant+ address space are required to be initialized and the values used to initialize these variables must be a compile time constant. -Private address space objects are not initialized by default, any initializer is +Private address space objects are not initialized by default; any initializer is allowed to be given. Examples: @@ -2422,36 +2416,35 @@ kernel void my_func(...) === Inference -- - Address space qualifiers are not required in many cases. If they are not specified explicitly the default address space will be inferred depending on the declaration scope and the object type. -There is no syntax to provide address space in the source for some situation, -therefore only default address space is applicable. +There is no syntax to provide address space in the source for some situations, +therefore only the default address space is applicable. -For OpenCL C 2.0 or with the `+__opencl_c_program_scope_global_variables+` +For OpenCL C 2.0 or with the {opencl_c_program_scope_global_variables} feature, the address space for a variable at program scope or a `static` -or `extern` variable inside a function are inferred to `+__global+`. +or `extern` variable inside a function are inferred to be `+__global+`. If the generic address space is supported i.e. 
for OpenCL C 2.0 or OpenCL C 3.0 -with `__opencl_c_generic_address+space` feature, pointers that are declared +with {opencl_c_generic_address_space} feature, pointers that are declared without pointing to a named address space point to the generic address space. All string literal storage shall be in the `+__constant+` address space. For all other cases that are not listed above the address space is inferred to -private. This includes: +`+__private+`. This includes: - * All function arguments as well as return values are in private address + * All function arguments as well as return values are in the private address space. * Pointers that are declared without pointing to a named address space point - to the `+__private+` address space if generic address space is not + to the `+__private+` address space if the generic address space is not supported. * Variables inside a function not declared with an address space qualifier - are inferred to private address space. + are inferred to be in the private address space. Examples: @@ -2460,17 +2453,18 @@ Examples: // Note: these examples assume OpenCL C 2.0 or the // __opencl_c_program_scope_global_variables feature macro. -int foo; // Declared in the global address space. +int foo; // Inferred to be in the global address space. -static int foo; // Declared in the global address space. +static int foo; // Inferred to be in the global address space. -int *ptr; // ptr is allocated in the global address space. +int *ptr; // ptr is inferred to be in the global address space. // ptr points to a location in (1) the generic address // space for OpenCL C 2.0 or OpenCL C 3.0 with // __opencl_c_generic_address_space feature or // in (2) the private address space otherwise. -int * global ptr; // ptr points to an location in (1) the generic address +int *global ptr; // ptr is declared to be in the global address space. 
+ // ptr points to a location in (1) the generic address // space for OpenCL C 2.0 or OpenCL C 3.0 with // __opencl_c_generic_address_space feature or // in (2) the private address space otherwise. @@ -2496,78 +2490,80 @@ void func(int param) // param is allocated in the private address space. Qualifiers must be explicitly specified for: * Program scope variables or variables inside functions with - `static`/`extern` type specifier for OpenCL C prior to version 2.0 or - OpenCL C 3.0 without `+__opencl_c_program_scope_global_variables+` feature, + a `static` or `extern` type specifier for OpenCL C prior to version 2.0 or + OpenCL C 3.0 without {opencl_c_program_scope_global_variables} feature, - * In pointers used as arguments to the kernel function (address space pointed - to must be speficied explicitly). + * Pointers used as arguments to kernel functions (the address space pointed + to must be specified explicitly). ==== [[table-addr-spaces-summary]] .Address space behavior -[%header,cols=4*] +[width="100%",cols="1,2,2,2",options="header"] |==== | *Address Space* -| *Scope/Type* -| *Initialization* -| *Inference* + | *Supported Usage* + | *Initialization* + | *Inference* | `+__global+` -a| * Program scope variable, + | Program scope variables, for OpenCL C 2.0 or + OpenCL C 3.0 with the {opencl_c_program_scope_global_variables} feature, - * `static`/`extern` local variable for OpenCL C 2.0 or - OpenCL C 3.0 with `+__opencl_c_program_scope_global_variables+` feature, + `static` or `extern` local variables, for OpenCL C 2.0 or + OpenCL C 3.0 with the {opencl_c_program_scope_global_variables} feature, - * everywhere in pointers. -a| * Optional constant initializers, + Pointers. + | Optional constant initializers, 0-initialized by default. - * 0-initialized by default. -a| * Program scope variable, + | Program scope variables, for OpenCL C 2.0 or + OpenCL C 3.0 with the {opencl_c_program_scope_global_variables} feature.
- * `static`/`extern` local variable for OpenCL C 2.0 or - OpenCL C 3.0 with `+__opencl_c_program_scope_global_variables+` feature. + `static` or `extern` local variables, for OpenCL C 2.0 or + OpenCL C 3.0 with the {opencl_c_program_scope_global_variables} feature. | `+__private+` -a| * Local scope variables, + | Local scope variables, - * function arguments and return types, + Function arguments and return types, - * everywhere in pointers. -a| * Optional initializers, + Pointers. - * no default initialization. -a| * Local scope variables, + | Optional initializers, otherwise no default initialization. + | Local scope variables, - * function arguments and return types, + Function arguments and return types, - * for pointers in which address space they point to is not given explicitly - (for OpenCL prior to version 2.0 or OpenCL C 3.0 without - `+__opencl_c_generic_address_space+` feature. + Pointers in which the address space they point to is not given explicitly, + for OpenCL C prior to version 2.0 or OpenCL C 3.0 without the + {opencl_c_generic_address_space} feature. | `+__constant+` -a| * Program scope variables, + | Program scope variables, - * kernel scope variables, + Kernel scope variables, - * everywhere for string literal, + String literals, - * everywhere in pointers. -| Mandatory initialization with compile time constant. -| For all string literals. + Pointers. + | Mandatory initialization with a compile time constant. + | String literals. | `+__local+` -a| * Kernel scope variables, - - * everywhere in pointers. -| No initializers. -| None. - -| Generic (for OpenCL C 2.0 or OpenCL C 3.0 with `+__opencl_c_generic_address_space+` feature) -| All pointers in which address space they point to is not given explicitly. -| Not applicable. -| All pointers in which address space they point to is not given explicitly. + | Kernel scope variables, + + Pointers. + | Not supported. + | Not supported. 
+ +| Generic + | Pointers, for OpenCL C 2.0 or OpenCL C 3.0 with the + {opencl_c_generic_address_space} feature + | Not applicable. + | Pointers in which the address space they point to is not given explicitly, + for OpenCL C 2.0 or OpenCL C 3.0 with the {opencl_c_generic_address_space} + feature. |==== - -- [[addr-spaces-conversions]] @@ -2588,21 +2584,23 @@ OpenCL implements the address space nesting model for pointers from [NOTE] ==== -OpenCL definition of the generic address space is different to the definition in -<>. In OpenCL no objects can be -allocated in this address space. It can only be used with pointer types, where a -pointer pointing to a location in the generic address space can be used for -objects allocated in any of the following concrete named address spaces: -`private`, `local`, or `global`. +The OpenCL definition of the generic address space is different than the +definition in section 5 of the <>. In +OpenCL C, no objects can be allocated in this address space. It can only be used +with pointer types, where a pointer pointing to a location in the generic +address space can be used for objects allocated in any of the concrete named +address spaces `private`, `local`, or `global`. ==== -Following <>, it is only allowed to -convert pointers implicitly i.e. in assignments, function parameters, operations, -if the original pointer points to an object qualified by an address space -enclosed into the address space pointed by the destination pointer. +Following section 5.3 of the <>, it +is only allowed to convert pointers implicitly, i.e. in assignments, function +parameters, operations, if the original pointer points to an object qualified by +an address space enclosed into the address space pointed by the destination +pointer. -In contrast to Embedded C, explicitly converting i.e. casting between pointers to -non-overlapping address spaces is illegal in OpenCL. +In contrast to the <>, explicitly +converting i.e. 
casting between pointers to non-overlapping address spaces is +illegal in OpenCL. Considering the above, the following applies to conversions of pointers pointing to different address spaces: @@ -2620,7 +2618,7 @@ Examples: This is the canonical example. In this example, function `foo` is declared with an argument that is a -pointer with unnamed generic address space address space qualifier. +pointer with the unnamed generic address space address space qualifier. [source,c] ---------- @@ -2716,20 +2714,21 @@ private int *pp; constant int *cp; int *p; -p = gp; // legal -p = lp; // legal -p = pp; // legal -p = cp; // illegal +p = gp; // OK. +p = lp; // OK. +p = pp; // OK. +p = cp; // Error. // it is illegal to convert from a generic pointer // to an explicit address space pointer without a cast: -gp = p; // compile-time error -lp = p; // compile-time error -pp = p; // compile-time error -cp = p; // compile-time error +gp = p; // Error. +lp = p; // Error. +pp = p; // Error. +cp = p; // Error. ---------- -Example below illustrates the implicit conversion between named address spaces. +The example below illustrates the implicit conversion between named address +spaces. [source,c] ---------- @@ -2741,24 +2740,24 @@ constant int *cp; // it is illegal to convert pointers pointing to different // named address spaces. -gp = lp; // compile-time error -gp = pp; // compile-time error -gp = cp; // compile-time error +gp = lp; // Error. +gp = pp; // Error. +gp = cp; // Error. -lp = gp; // compile-time error -lp = pp; // compile-time error -lp = cp; // compile-time error +lp = gp; // Error. +lp = pp; // Error. +lp = cp; // Error. -pp = lp; // compile-time error -pp = gp; // compile-time error -pp = cp; // compile-time error +pp = lp; // Error. +pp = gp; // Error. +pp = cp; // Error. -cp = lp; // compile-time error -cp = pp; // compile-time error -cp = gp; // compile-time error +cp = lp; // Error. +cp = pp; // Error. +cp = gp; // Error. 
---------- -Example below demonstrates explicit conversions for pointers pointing to +The example below demonstrates explicit conversions for pointers pointing to different address spaces. [source,c] @@ -2772,14 +2771,14 @@ private int *pp; constant int *cp; int *p; -gp = (global int *)lp; // illegal to cast between named address spaces -p = (int *)lp; // legal to cast from global to generic -gp = (global int*)p; // legal to cast from generic to global +gp = (global int *)lp; // illegal to cast between named address spaces +p = (int *)lp; // legal to cast from local to generic +gp = (global int*)p; // legal to cast from generic to global ---------- -In nested pointers implicit conversions between address spaces are disallowed. +For nested pointers, implicit conversions between address spaces are disallowed. Explicitly casting between different address spaces in nested pointers is -allowed but the use of such pointers can lead to incorrect behavior i.e. +allowed but the use of such pointers can lead to incorrect behavior such as accessing invalid memory locations. [source,c] @@ -2789,20 +2788,28 @@ accessing invalid memory locations. kernel void mykernel(...) { -local int *local * ll; -global int *local * gl; -int *local * nl; - -ll = gl; // illegal to convert address spaces implicitly - // in nested pointers. -ll = nl; // illegal to convert address spaces implicitly - // in nested pointers. -ll = (local int* local*)gl; // legal to convert explicitly, - // but uses of 'll' can result in - // in ill-formed program. -ll = (local int* local*)nl; // legal to convert explicitly, - // but uses of 'll' can result in - // in ill-formed program.
+ // ll is a pointer to a pointer in the local address space, + // which points to an integer in the local address space + local int *local *ll; + + // gl is a pointer to a pointer in the local address space, + // which points to an integer in the global address space + global int *local *gl; + + // nl is a pointer to a pointer in the local address space, + // which points to an integer via the unnamed generic address space + int *local * nl; + + ll = gl; // Error, cannot convert address spaces implicitly + // for nested pointers. + ll = nl; // Error, cannot convert address spaces implicitly + // for nested pointers. + ll = (local int* local*)gl; // OK to convert explicitly, + // but uses of 'll' can result in + // an ill-formed program. + ll = (local int* local*)nl; // OK to convert explicitly, + // but uses of 'll' can result in + // an ill-formed program. } ---------- @@ -3021,7 +3028,7 @@ space and therefore can be qualified with `private`/`__private`. Image objects specified as arguments to a kernel can be declared to be read-only or write-only. -For OpenCL C 2.0, or with the `+__opencl_c_read_write_images+` feature, +For OpenCL C 2.0, or with the {opencl_c_read_write_images} feature, image objects specified as arguments to a kernel can additionally be declared to be read-write. @@ -3374,13 +3381,13 @@ do_proc (__global char *pA, short b, except for those in <>. Such program scope variables may be of any user-defined type, or a pointer to a user-defined type. - - In the presence of shared virtual memory, these pointers or pointer - members should work as expected as long as they are shared virtual memory - pointers and the referenced storage has been mapped appropriately. - Program scope varibales can be declared with `+__constant+` address space - qualifiers or if `__opencl_c_program_scope_global_variables` feature is - supported with `+__global+` address space qualifier.
++ +In the presence of shared virtual memory, these pointers or pointer +members should work as expected as long as they are shared virtual memory +pointers and the referenced storage has been mapped appropriately. +Program scope variables can be declared with `+__constant+` address space +qualifiers or if {opencl_c_program_scope_global_variables} feature is +supported with `+__global+` address space qualifier. -- @@ -3398,12 +3405,12 @@ The *#pragma* directive is described as: * *#pragma* _pp-tokens~opt~_ _new-line_ A *#pragma* directive where the preprocessing token `OPENCL` (used instead -of *`STDC`*) does not immediately follow *pragma* in the directive (prior to +of *`STDC`*) does not immediately follow *#pragma* in the directive (prior to any macro replacement) causes the implementation to behave in an implementation-defined manner. The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. -Any such *pragma* that is not recognized by the implementation is ignored. +Any such *#pragma* that is not recognized by the implementation is ignored. If the preprocessing token `OPENCL` does immediately follow *#pragma* in the directive (prior to any macro replacement), then no macro replacement is performed on the directive, and the directive shall have one of the @@ -3494,7 +3501,7 @@ __kernel __attribute__((work_group_size_hint(X, 1, 1))) \ This is an integer constant of 1 if images are supported and is undefined otherwise. Also refer to the value of the <> and the `+__opencl_c_images+` + `CL_DEVICE_IMAGE_SUPPORT` device query>> and the {opencl_c_images} feature. `+__FAST_RELAXED_MATH__+` :: @@ -4011,7 +4018,7 @@ information provided is in some sense correct. -- NOTE: The functionality described in this section <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the -`+__opencl_c_device_enqueue+` feature. +{opencl_c_device_enqueue} feature.
This section describes the clang block syntax footnote:[{fn-clang-block-syntax}]. @@ -4483,7 +4490,7 @@ identifier of each work-item when this kernel is being executed on a device. |==== NOTE: The functionality described in the following table <> support for OpenCL C 3.0 or newer and the `+__opencl_c_subgroups+` +requires>> support for OpenCL C 3.0 or newer and the {opencl_c_subgroups} feature. The following table describes the list of built-in work-item functions that @@ -4664,7 +4671,7 @@ all arguments and the return type, unless otherwise specified. gentype *fract*(gentype _x_, {private} gentype _*iptr_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + gentype *fract*(gentype _x_, gentype _*iptr_) | Returns *fmin*(_x_ - *floor*(_x_), `0x1.fffffep-1f`). @@ -4680,7 +4687,7 @@ all arguments and the return type, unless otherwise specified. float **frexp**(float _x_, {private} int *exp) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float__n__ **frexp**(float__n__ _x_, int__n__ *exp) + float **frexp**(float _x_, int *exp) @@ -4698,7 +4705,7 @@ all arguments and the return type, unless otherwise specified. double **frexp**(double _x_, {private} int *exp) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + double__n__ **frexp**(double__n__ _x_, int__n__ *exp) + double **frexp**(double _x_, int *exp) @@ -4739,7 +4746,7 @@ all arguments and the return type, unless otherwise specified. 
double **lgamma_r**(double _x_, {private} int *_signp_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float__n__ **lgamma_r**(float__n__ _x_, int__n__ *_signp_) + float **lgamma_r**(float _x_, int *_signp_) + @@ -4783,7 +4790,7 @@ all arguments and the return type, unless otherwise specified. gentype *modf*(gentype _x_, {private} gentype _*iptr_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + gentype *modf*(gentype _x_, gentype _*iptr_) | Decompose a floating-point number. @@ -4826,7 +4833,7 @@ all arguments and the return type, unless otherwise specified. float **remquo**(float _x_, float _y_, {private} int _*quo_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float__n__ **remquo**(float__n__ _x_, float__n__ _y_, int__n__ _*quo_) + float **remquo**(float _x_, float _y_, int _*quo_) @@ -4849,7 +4856,7 @@ all arguments and the return type, unless otherwise specified. double **remquo**(double _x_, double _y_, {private} int _*quo_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + double__n__ **remquo**(double__n__ _x_, double__n__ _y_, int__n__ _*quo_) + double **remquo**(double _x_, double _y_, int _*quo_) @@ -4883,7 +4890,7 @@ all arguments and the return type, unless otherwise specified. gentype *sincos*(gentype _x_, {private} gentype _*cosval_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + gentype *sincos*(gentype _x_, gentype _*cosval_) | Compute sine and cosine of x. @@ -5052,7 +5059,7 @@ single precision floating-point number. 
|==== If double precision is supported by the device, e.g. for OpenCL C 3.0 or newer -the `+__opencl_c_fp64+` feature macro is present, the following symbolic +the {opencl_c_fp64} feature macro is present, the following symbolic constants will also be available: [cols=",",] @@ -5166,7 +5173,7 @@ They are of type `float` and are accurate within the precision of the |==== If double precision is supported by the device, e.g. for OpenCL C 3.0 or newer -the `+__opencl_c_fp64+` feature macro is present, then the following macros +the {opencl_c_fp64} feature macro is present, then the following macros and constants are also available: The `FP_FAST_FMA` macro indicates whether the *fma*() family of functions @@ -5245,10 +5252,10 @@ The vector versions of the integer functions operate component-wise. The description is per-component. We use the generic type name `gentype` to indicate that the function can take -`char`, `char{2|3|4|8|16}`, `uchar`, `uchar{2|3|4|8|16}`, `short`, -`short{2|3|4|8|16}`, `ushort`, `ushort{2|3|4|8|16}`, `int`, `int{2|3|4|8|16}`, -`uint`, `uint{2|3|4|8|16}`, `long` footnote:[{fn-int64-supported}], -`long{2|3|4|8|16}`, `ulong`, or `ulong{2|3|4|8|16}` as the type for the +`char`, `char__n__`, `uchar`, `uchar__n__`, `short`, +`short__n__`, `ushort`, `ushort__n__`, `int`, `int__n__`, +`uint`, `uint__n__`, `long` footnote:[{fn-int64-supported}], +`long__n__`, `ulong`, or `ulong__n__` as the type for the arguments. We use the generic type name `ugentype` to refer to unsigned versions of `gentype`. @@ -5260,9 +5267,10 @@ For built-in integer functions that take `gentype` and `sgentype` arguments, the `gentype` argument must be a vector or scalar version of the `sgentype` argument. For example, if `sgentype` is `uchar`, `gentype` must be `uchar` or -`uchar{2|3|4|8|16}`. +`uchar__n__`. For vector versions, `sgentype` is implicitly widened to `gentype` as described for <>. +_n_ is 2, 3, 4, 8, or 16. 
For any specific use of a function, the actual type has to be the same for all arguments and the return type unless otherwise specified. @@ -5803,7 +5811,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, gentype__n__ **vload__n__**(size_t _offset_, const {private} gentype *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + gentype__n__ **vload__n__**(size_t _offset_, const gentype *_p_) | Return `sizeof(gentype__n__)` bytes of data, where the first `(__n__ * @@ -5818,7 +5826,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstore__n__**(gentype__n__ _data_, size_t _offset_, {private} gentype *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstore__n__**(gentype__n__ _data_, size_t _offset_, gentype *_p_) | Write `_n_ * sizeof(gentype)` bytes given by _data_ to the address @@ -5833,7 +5841,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, float **vload_half**(size_t _offset_, const {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float **vload_half**(size_t _offset_, const half *_p_) | Read `sizeof(half)` bytes of data from the address computed as `(_p_ @@ -5848,7 +5856,7 @@ The suffix _n_ is also used in the function names (i.e. 
*vload__n__*, float__n__ **vload_half__n__**(size_t _offset_, const {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float__n__ **vload_half__n__**(size_t _offset_, const half *_p_) | Read `(_n_ * sizeof(half))` bytes of data from the address computed as @@ -5876,7 +5884,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstore_half{rtn}**(float _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstore_half**(float _data_, size_t _offset_, half *_p_) + void **vstore_half{rte}**(float _data_, size_t _offset_, half *_p_) + @@ -5910,7 +5918,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstore_half__n__{rtn}**(float__n__ _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstore_half__n__**(float__n__ _data_, size_t _offset_, half *_p_) + void **vstore_half__n__{rte}**(float__n__ _data_, size_t _offset_, half *_p_) + @@ -5945,7 +5953,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstore_half{rtn}**(double _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstore_half**(double _data_, size_t _offset_, half *_p_) + void **vstore_half{rte}**(double _data_, size_t _offset_, half *_p_) + @@ -5979,7 +5987,7 @@ The suffix _n_ is also used in the function names (i.e. 
*vload__n__*, void **vstore_half__n__{rtn}**(double__n__ _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstore_half__n__**(double__n__ _data_, size_t _offset_, half *_p_) + void **vstore_half__n__{rte}**(double__n__ _data_, size_t _offset_, half *_p_) + @@ -6000,7 +6008,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, float__n__ **vloada_half__n__**(size_t _offset_, const {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + float__n__ **vloada_half__n__**(size_t _offset_, const half *_p_) | For n = 2, 4, 8 and 16, read `sizeof(half__n__)` bytes of data from @@ -6033,7 +6041,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstorea_half__n__{rtn}**(float__n__ _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstorea_half__n__**(float__n__ _data_, size_t _offset_, half *_p_) + void **vstorea_half__n__{rte}**(float__n__ _data_, size_t _offset_, half *_p_) + @@ -6072,7 +6080,7 @@ The suffix _n_ is also used in the function names (i.e. *vload__n__*, void **vstorea_half__n__{rtn}**(double__n__ _data_, size_t _offset_, {private} half *_p_) + For OpenCL C 2.0, or OpenCL C 3.0 or newer with the - `+__opencl_c_generic_address_space+` feature: + + {opencl_c_generic_address_space} feature: + void **vstorea_half__n__**(double__n__ _data_, size_t _offset_, half *_p_) + void **vstorea_half__n__{rte}**(double__n__ _data_, size_t _offset_, half *_p_) + @@ -6186,7 +6194,7 @@ in a work-group. 
-- NOTE: The functionality described in the following table <> support for OpenCL 3.0 or newer and the `+__opencl_c_subgroups+` +requires>> support for OpenCL 3.0 or newer and the {opencl_c_subgroups} feature. The following table describes built-in functions to synchronize the work-items @@ -6320,7 +6328,7 @@ The OpenCL C programming language implements the following explicit memory fence NOTE: The functionality described in this section <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the -`+__opencl_c_generic_address_space+` feature. +{opencl_c_generic_address_space} feature. This section describes built-in functions to safely convert from pointers to the generic address space to pointers to named address spaces, and to @@ -6365,13 +6373,14 @@ The OpenCL C programming language implements the <> that provide asynchronous copies between `global` and local memory and a prefetch from `global` memory. -We use the generic type name `gentype` to indicate the built-in data types char, -`char{2|3|4|8|16}`, `uchar`, `uchar{2|3|4|8|16}`, `short`, `short{2|3|4|8|16}`, -`ushort`, `ushort{2|3|4|8|16}`, `int`, `int{2|3|4|8|16}`, `uint`, -`uint{2|3|4|8|16}`, `long` footnote:[{fn-int64-supported}], `long{2|3|4|8|16}`, -`ulong`, `ulong{2|3|4|8|16}`, `float`, `float{2|3|4|8|16}`, or `double` -footnote:[{fn-double-supported}], `double{2|3|4|8|16}` as the type for -the arguments unless otherwise stated footnote:[{fn-vec3-async-copy}]. +We use the generic type name `gentype` to indicate the built-in data types `char`, +`char__n__`, `uchar`, `uchar__n__`, `short`, `short__n__`, +`ushort`, `ushort__n__`, `int`, `int__n__`, `uint`, +`uint__n__`, `long` footnote:[{fn-int64-supported}], `long__n__`, +`ulong`, `ulong__n__`, `float`, `float__n__`, `double` +footnote:[{fn-double-supported}], and `double__n__` as the type for +the arguments unless otherwise stated. +_n_ is 2, 3 footnote:[{fn-vec3-async-copy}], 4, 8, or 16. 
[[table-builtin-async-copy]] .Built-in Async Copy and Prefetch Functions @@ -6641,7 +6650,7 @@ work_group_barrier(CLK_LOCAL_MEM_FENCE); NOTE: The function variant that uses the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -6667,19 +6676,19 @@ The following table lists the enumeration constants: | <> support for OpenCL C 2.0 or newer. | `memory_order_acquire` | <> support for OpenCL C 2.0, but in OpenCL C 3.0 - or newer some uses require the `+__opencl_c_atomic_order_acq_rel+` + or newer some uses require the {opencl_c_atomic_order_acq_rel} feature. | `memory_order_release` | <> support for OpenCL C 2.0, but in OpenCL C 3.0 - or newer some uses require the `+__opencl_c_atomic_order_acq_rel+` + or newer some uses require the {opencl_c_atomic_order_acq_rel} feature. | `memory_order_acq_rel` | <> support for OpenCL C 2.0, but in OpenCL C 3.0 - or newer some uses require the `+__opencl_c_atomic_order_acq_rel+` + or newer some uses require the {opencl_c_atomic_order_acq_rel} feature. | `memory_order_seq_cst` | <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_atomic_order_seq_cst+` feature. + newer and the {opencl_c_atomic_order_seq_cst} feature. |==== The `memory_order` can be used when performing atomic operations to `global` @@ -6710,19 +6719,19 @@ The following table lists the enumeration constants: <> support for OpenCL C 2.0 or newer. | `memory_scope_sub_group` | <> support for OpenCL C 3.0 or newer and the - `+__opencl_c_subgroups+` feature. + {opencl_c_subgroups} feature. | `memory_scope_work_group` | <> support for OpenCL C 2.0 or newer. | `memory_scope_device` | <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_atomic_scope_device+` feature. + newer and the {opencl_c_atomic_scope_device} feature. 
| `memory_scope_all_svm_devices` | <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_atomic_scope_all_devices+` feature. + newer and the {opencl_c_atomic_scope_all_devices} feature. | `memory_scope_all_devices` | An alias for `memory_scope_all_svm_devices`. <> support for OpenCL C 3.0 or newer and the - `+__opencl_c_atomic_scope_all_devices+` feature. + {opencl_c_atomic_scope_all_devices} feature. |==== // This is no longer correct given `memory_scope_sub_group`. @@ -6882,14 +6891,14 @@ Memory is affected according to the value of _order_. NOTE: The non-explicit `atomic_store` function <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -6946,14 +6955,14 @@ Atomically returns the value pointed to by _object_. NOTE: The non-explicit `atomic_load` function <> support for OpenCL C 2.0 or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7018,14 +7027,14 @@ effects. 
NOTE: The non-explicit `atomic_exchange` function <> support for OpenCL C 2.0 or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7104,7 +7113,7 @@ bool atomic_compare_exchange_strong_explicit( memory_order failure) // Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the -// __opencl_c_generic_address_space feature. +// opencl_c_generic_address_space feature. bool atomic_compare_exchange_strong_explicit( volatile A *object, C *expected, @@ -7233,7 +7242,7 @@ bool atomic_compare_exchange_weak_explicit( memory_order failure) // Requires OpenCL C 2.0, or OpenCL C 3.0 or newer and the -// __opencl_c_generic_address_space feature. +// opencl_c_generic_address_space feature. bool atomic_compare_exchange_weak_explicit( volatile A *object, C *expected, @@ -7334,14 +7343,14 @@ These generic functions return the result of the comparison. NOTE: The non-explicit `atomic_compare_exchange_strong` and `atomic_compare_exchange_weak` functions <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. 
no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7437,14 +7446,14 @@ effects. NOTE: The non-explicit `atomic_fetch_key` functions <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7539,14 +7548,14 @@ Returns atomically the value of the `object` immediately before the effects. NOTE: The non-explicit `atomic_flag_test_and_set` function <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7609,14 +7618,14 @@ Memory is affected according to the value of order. 
NOTE: The non-explicit `atomic_flag_clear` function <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and both the -`+__opencl_c_atomic_order_seq_cst+` and `+__opencl_c_atomic_scope_device+` +{opencl_c_atomic_order_seq_cst} and {opencl_c_atomic_scope_device} features. For the explicit variants, memory order and scope enumerations must respect the <>. NOTE: The function variants that use the generic address space, i.e. no explicit address space is listed, <> support for OpenCL -C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_generic_address_space+` +C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_generic_address_space} feature. -- @@ -7832,30 +7841,30 @@ semantics of the minimum requirements. undefined. * Using `memory_order_acquire` with any built-in atomic function except `atomic_work_item_fence` <> support for OpenCL C - 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_atomic_order_acq_rel+` + 2.0, or OpenCL C 3.0 or newer and the {opencl_c_atomic_order_acq_rel} feature. * Using `memory_order_release` with any built-in atomic function except `atomic_work_item_fence` <> support for OpenCL C - 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_atomic_order_acq_rel+` + 2.0, or OpenCL C 3.0 or newer and the {opencl_c_atomic_order_acq_rel} feature. * Using `memory_order_acq_rel` with any built-in atomic function except `atomic_work_item_fence` <> support for OpenCL C - 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_atomic_order_acq_rel+` + 2.0, or OpenCL C 3.0 or newer and the {opencl_c_atomic_order_acq_rel} feature. * Using `memory_order_seq_cst` with any built-in atomic function <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_atomic_order_seq_cst+` feature. + newer and the {opencl_c_atomic_order_seq_cst} feature. * Using `memory_scope_sub_group` with any built-in atomic function <> support for OpenCL C 3.0 or newer and the - `+__opencl_c_subgroups+` feature. + {opencl_c_subgroups} feature. 
* Using `memory_scope_device` <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the - `+__opencl_c_atomic_scope_device+` feature. + {opencl_c_atomic_scope_device} feature. * Using `memory_scope_all_svm_devices` <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_atomic_scope_all_devices+` feature. + newer and the {opencl_c_atomic_scope_all_devices} feature. * Using `memory_scope_all_devices` <> support for OpenCL - C 3.0 or newer and the `+__opencl_c_atomic_scope_all_devices+` feature. + C 3.0 or newer and the {opencl_c_atomic_scope_all_devices} feature. -- @@ -7868,14 +7877,15 @@ semantics of the minimum requirements. The OpenCL C programming language implements the following additional built-in vector functions. We use the generic type name `gentype__n__` (or `gentype__m__`) to indicate the -built-in data types `char{2|4|8|16}`, `uchar{2|4|8|16}`, `short{2|4|8|16}`, -`ushort{2|4|8|16}`, `half{2|4|8|16}` footnote:[{fn-half-supported}], -`int{2|4|8|16}`, `uint{2|4|8|16}`, `long{2|4|8|16}` -footnote:[{fn-int64-supported}], `ulong{2|4|8|16}`, `float{2|4|8|16}`, or -`double{2|4|8|16}` footnote:[{fn-double-supported}] as the type for +built-in data types `char__n__`, `uchar__n__`, `short__n__`, +`ushort__n__`, +`int__n__`, `uint__n__`, `long__n__` +footnote:[{fn-int64-supported}], `ulong__n__`, `half__n__` footnote:[{fn-half-supported}], `float__n__`, or +`double__n__` footnote:[{fn-double-supported}] as the type for the arguments unless otherwise stated. We use the generic name `ugentype__n__` to indicate the built-in unsigned integer data types. +_n_ is 2, 4, 8, or 16. [[table-misc-vector]] .Built-in Miscellaneous Vector Functions @@ -8275,7 +8285,7 @@ specifier. ==== The conversion specifiers *e,E,g,G,a,A* convert a `float` or `half` argument that is a scalar type to a `double` only if the `double` data type is -supported, e.g. for OpenCL C 3.0 or newer the `+__opencl_c_fp64+` feature +supported, e.g. 
for OpenCL C 3.0 or newer the {opencl_c_fp64} feature macro is present. If the `double` data type is not supported, the argument will be a `float` instead of a `double` and the `half` type will be converted to a `float`. @@ -8415,7 +8425,7 @@ If a device supports images then the value of the <>) is `CL_TRUE` and the OpenCL C compiler for that device must define the `+__IMAGE_SUPPORT__+` macro. A compiler for OpenCL C 3.0 or newer for that device must also support the -`+__opencl_c_images+` feature. +{opencl_c_images} feature. Image memory objects that are being read by a kernel should be declared with the `read_only` qualifier. @@ -9657,7 +9667,7 @@ For write functions this may be `write_only` or `read_write`. image depth-1], respectively, is undefined. <> support for OpenCL C 2.0, or OpenCL C 3.0 or - newer and the `+__opencl_c_3d_image_writes+` feature, or the + newer and the {opencl_c_3d_image_writes} feature, or the `cl_khr_3d_image_writes` extension. |==== -- @@ -9928,7 +9938,7 @@ support will result in a `CL_OUT_OF_RESOURCES` error being returned. NOTE: The functionality described in this section <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the -`+__opencl_c_work_group_collective_functions+` feature. +{opencl_c_work_group_collective_functions} feature. This section decribes built-in functions that perform collective options across a work-group. @@ -10038,7 +10048,7 @@ given work-group. === Pipe Functions NOTE: The functionality described in this section <> -support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the `+__opencl_c_pipes+` feature. +support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the {opencl_c_pipes} feature. A pipe is identified by specifying the `pipe` keyword with a type. The data type specifies the size of each packet in the pipe. 
@@ -10308,7 +10318,7 @@ The following behavior is undefined: -- NOTE: The functionality described in this section <> support for OpenCL C 2.0, or OpenCL C 3.0 or newer and the -`+__opencl_c_device_enqueue+` feature. +{opencl_c_device_enqueue} feature. This section describes built-in functions that allow a kernel to enqueue additional work to the same device, without host interaction. @@ -10984,7 +10994,7 @@ foo(queue_t q, ...) -- NOTE: The functionality described in this section <> -support for OpenCL C 3.0 or newer and the `+__opencl_c_subgroups+` feature. +support for OpenCL C 3.0 or newer and the {opencl_c_subgroups} feature. The table below describes OpenCL C programming language built-in functions that operate on a subgroup level. These built-in functions must be encountered by all work-items in the subgroup executing the kernel. @@ -11061,8 +11071,8 @@ The order of these floating-point operations is also non-deterministic for a giv ==== NOTE: The functionality described in the following table <> support for OpenCL C 3.0 or newer and the `+__opencl_c_subgroups+` -and `+__opencl_c_pipes+` features. +requires>> support for OpenCL C 3.0 or newer and the {opencl_c_subgroups} +and {opencl_c_pipes} features. The following table describes built-in pipe functions that operate at a subgroup level. @@ -11114,8 +11124,8 @@ The order of subgroup based reservations that belong to different work groups is implementation defined. NOTE: The functionality described in the following table <> support for OpenCL C 3.0 or newer and the `+__opencl_c_subgroups+` -and `+__opencl_c_device_enqueue+` features. +requires>> support for OpenCL C 3.0 or newer and the {opencl_c_subgroups} +and {opencl_c_device_enqueue} features. The following table describes built-in functions to query subgroup information for a block to be enqueued. 
diff --git a/c/feature-dictionary.asciidoc b/c/feature-dictionary.asciidoc
new file mode 100644
index 00000000..172431e8
--- /dev/null
+++ b/c/feature-dictionary.asciidoc
@@ -0,0 +1,123 @@
+// Copyright 2017-2020 The Khronos Group. This work is licensed under a
+// Creative Commons Attribution 4.0 International License; see
+// http://creativecommons.org/licenses/by/4.0/
+
+// opencl_c_3d_image_writes
+ifdef::backend-html5[]
+:opencl_c_3d_image_writes: pass:q[`\__opencl_c_3d_image_writes`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_3d_image_writes: pass:q[`\__opencl_c_​3d_​image_​writes`]
+endif::[]
+
+// opencl_c_atomic_order_acq_rel
+ifdef::backend-html5[]
+:opencl_c_atomic_order_acq_rel: pass:q[`\__opencl_c_atomic_order_acq_rel`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_atomic_order_acq_rel: pass:q[`\__opencl_c_​atomic_​order_​acq_​rel`]
+endif::[]
+
+// opencl_c_atomic_order_seq_cst
+ifdef::backend-html5[]
+:opencl_c_atomic_order_seq_cst: pass:q[`\__opencl_c_atomic_order_seq_cst`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_atomic_order_seq_cst: pass:q[`\__opencl_c_​atomic_​order_​seq_​cst`]
+endif::[]
+
+// opencl_c_atomic_scope_device
+ifdef::backend-html5[]
+:opencl_c_atomic_scope_device: pass:q[`\__opencl_c_atomic_scope_device`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_atomic_scope_device: pass:q[`\__opencl_c_​atomic_​scope_​device`]
+endif::[]
+
+// opencl_c_atomic_scope_all_devices
+ifdef::backend-html5[]
+:opencl_c_atomic_scope_all_devices: pass:q[`\__opencl_c_atomic_scope_all_devices`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_atomic_scope_all_devices: pass:q[`\__opencl_c_​atomic_​scope_​all_​devices`]
+endif::[]
+
+// opencl_c_device_enqueue
+ifdef::backend-html5[]
+:opencl_c_device_enqueue: pass:q[`\__opencl_c_device_enqueue`]
+endif::[]
+ifndef::backend-html5[]
+:opencl_c_device_enqueue: pass:q[`\__opencl_c_​device_​enqueue`]
+endif::[]
+
+// opencl_c_generic_address_space
+ifdef::backend-html5[]
+:opencl_c_generic_address_space:
pass:q[`\__opencl_c_generic_address_space`] +endif::[] +ifndef::backend-html5[] +:opencl_c_generic_address_space: pass:q[`\__opencl_c_​generic_​address_​space`] +endif::[] + +// opencl_c_fp64 +ifdef::backend-html5[] +:opencl_c_fp64: pass:q[`\__opencl_c_fp64`] +endif::[] +ifndef::backend-html5[] +:opencl_c_fp64: pass:q[`\__opencl_c_​fp64`] +endif::[] + +// opencl_c_images +ifdef::backend-html5[] +:opencl_c_images: pass:q[`\__opencl_c_images`] +endif::[] +ifndef::backend-html5[] +:opencl_c_images: pass:q[`\__opencl_c_​images`] +endif::[] + +// opencl_c_int64 +ifdef::backend-html5[] +:opencl_c_int64: pass:q[`\__opencl_c_int64`] +endif::[] +ifndef::backend-html5[] +:opencl_c_int64: pass:q[`\__opencl_c_​int64`] +endif::[] + +// opencl_c_pipes +ifdef::backend-html5[] +:opencl_c_pipes: pass:q[`\__opencl_c_pipes`] +endif::[] +ifndef::backend-html5[] +:opencl_c_pipes: pass:q[`\__opencl_c_​pipes`] +endif::[] + +// opencl_c_program_scope_global_variables +ifdef::backend-html5[] +:opencl_c_program_scope_global_variables: pass:q[`\__opencl_c_program_scope_global_variables`] +endif::[] +ifndef::backend-html5[] +:opencl_c_program_scope_global_variables: pass:q[`\__opencl_c_​program_​scope_​global_​variables`] +endif::[] + +// opencl_c_read_write_images +ifdef::backend-html5[] +:opencl_c_read_write_images: pass:q[`\__opencl_c_read_write_images`] +endif::[] +ifndef::backend-html5[] +:opencl_c_read_write_images: pass:q[`\__opencl_c_​read_​write_​images`] +endif::[] + +// opencl_c_subgroups +ifdef::backend-html5[] +:opencl_c_subgroups: pass:q[`\__opencl_c_subgroups`] +endif::[] +ifndef::backend-html5[] +:opencl_c_subgroups: pass:q[`\__opencl_c_​subgroups`] +endif::[] + +// opencl_c_work_group_collective_functions +ifdef::backend-html5[] +:opencl_c_work_group_collective_functions: pass:q[`\__opencl_c_work_group_collective_functions`] +endif::[] +ifndef::backend-html5[] +:opencl_c_work_group_collective_functions: pass:q[`\__opencl_c_​work_​group_​collective_​functions`] +endif::[] diff 
--git a/scripts/clconventions.py b/scripts/clconventions.py index 867c08be..9053de03 100644 --- a/scripts/clconventions.py +++ b/scripts/clconventions.py @@ -213,6 +213,7 @@ def extra_refpage_headers(self): return 'include::../config/attribs.txt[]\n' + \ 'include::../api/footnotes.asciidoc[]\n' + \ 'include::../c/footnotes.asciidoc[]\n' + \ + 'include::../c/feature-dictionary.asciidoc[]\n' + \ 'include::{generated}/api/api-dictionary-no-links.asciidoc[]' @property