From 62b11fb888b418b89d0d36237581df6c28507b66 Mon Sep 17 00:00:00 2001 From: Ewan Crawford Date: Fri, 13 Dec 2024 13:56:09 +0000 Subject: [PATCH 1/3] Refactor command-buffer queue compatibility As proposed in https://github.com/KhronosGroup/OpenCL-Docs/issues/1142 this PR changes the semantics of the command-queue parameters used for command-buffer creation and enqueue. The queues used on command-buffer creation now only inform the device and dependencies of commands, rather than restricting the properties set on the queues used for command-buffer enqueue. This is based on top of the change in https://github.com/KhronosGroup/OpenCL-Docs/pull/850 to add supported queue property semantics. --- api/cl_khr_command_buffer.asciidoc | 40 +++++---- ...l_khr_command_buffer_multi_device.asciidoc | 5 +- api/opencl_platform_layer.asciidoc | 3 +- api/opencl_runtime_layer.asciidoc | 87 +++++++++++-------- xml/cl.xml | 2 +- 5 files changed, 76 insertions(+), 61 deletions(-) diff --git a/api/cl_khr_command_buffer.asciidoc b/api/cl_khr_command_buffer.asciidoc index 97d706e8a..e94a578ac 100644 --- a/api/cl_khr_command_buffer.asciidoc +++ b/api/cl_khr_command_buffer.asciidoc @@ -12,7 +12,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer.txt[] === Other Extension Metadata *Last Modified Date*:: - 2024-10-02 + 2024-12-13 *IP Status*:: No known IP claims. *Contributors*:: @@ -43,11 +43,6 @@ Command-buffers enable a reduction in overhead when enqueuing the same workload multiple times. By separating the command-queue setup from dispatch, the ability to replay a set of previously created commands is introduced. -The command-queues a command-buffer will be executed on can be set on replay via -parameters to {clEnqueueCommandBufferKHR}, provided they are -<> with the command-queues used on command-buffer -recording. 
- ==== Background On embedded devices where building a command stream accounts for a significant @@ -74,7 +69,7 @@ or writes memory objects; or enqueues a native kernel, is not available for command-buffer recording. Finally commands recorded into a command buffer do not wait for or return event objects, these are instead replaced with device-side synchronization-point identifiers which enable out-of-order -execution when enqueued on <> command-queues. +execution of the command-buffer commands. Adding new entry-points for individual commands, rather than recording existing command-queue APIs with begin/end markers was a design decision made for the @@ -102,16 +97,22 @@ following reasons: ==== Command Synchronization -Device-side {cl_sync_point_khr_TYPE} synchronization-points can be used within -command-buffers to define command dependencies. This allows the commands of a -command-buffer to execute out-of-order on a single <> -command-queue. The command-buffer itself has no inherent in-order/out-of-order -property, this ordering is inferred from the command-queue used on command -recording. {clEnqueueCommandBufferKHR} submissions to an out-of-order queue -have the same execution semantics are other operations enqueued to an -out-of-order queue, such as {clEnqueueFillBuffer}, where execution between -enqueued operations may happen concurrently unless dependencies between the -operations are expressed with events. +The command-buffer object has no in-order/out-of-order property set on creation, +it is out-of-order, and command ordering is defined by the dependencies set when +commands are created. Command dependencies can be define in 3 ways: + +1. Device-side {cl_sync_point_khr_TYPE} synchronization-points, providing an + explicit list of the commands to depend on. +2. Appending a {clCommandBarrierWithWaitListKHR} barrier command. +3. 
Passing an in-order queue when creating the command, creating an implicit + dependency on the any previous command created in the command-buffer using + the same queue. + +{clEnqueueCommandBufferKHR} submissions to an out-of-order queue have the same +execution semantics are other operations enqueued to an out-of-order queue, +such as {clEnqueueFillBuffer}, where execution between enqueued operations may +happen concurrently unless dependencies between the operations are expressed +with events. The {cl_sync_point_khr_TYPE} type is defined as a `cl_uint`, giving a hard upper limit on the number of commands a command-buffer can hold as @@ -464,5 +465,6 @@ features: * 0.9.5, 2024-07-24 ** Add a properties parameter to all command recording entry-points (provisional). - * 0.9.6, 2024-10-02 - ** Add device query for supported queue properties (provisional). + * 0.9.6, 2024-12-13 + ** Refactor queue compatability between command-buffer creation and enqueue + (provisional). diff --git a/api/cl_khr_command_buffer_multi_device.asciidoc b/api/cl_khr_command_buffer_multi_device.asciidoc index 8a595a5b3..4329f7fa0 100644 --- a/api/cl_khr_command_buffer_multi_device.asciidoc +++ b/api/cl_khr_command_buffer_multi_device.asciidoc @@ -6,7 +6,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer_multi_device.txt[] === Other Extension Metadata *Last Modified Date*:: - 2023-04-30 + 2024-12-13 *IP Status*:: No known IP claims. *Contributors*:: @@ -312,3 +312,6 @@ require it. * Revision 0.9.1, 2023-04-30 ** Added clCommandSVMMemcpyKHR and clCommandSVMMemFillKHR as affected functions (provisional). + * Revision 0.9.2, 2024-12-13 + ** Update clRemapCommandBufferKHR behavior to match cl_khr_command_buffer + version 0.9.6 (provisional). 
diff --git a/api/opencl_platform_layer.asciidoc b/api/opencl_platform_layer.asciidoc index 6377aca6b..97cec2d8c 100644 --- a/api/opencl_platform_layer.asciidoc +++ b/api/opencl_platform_layer.asciidoc @@ -240,8 +240,7 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_UNIVERSAL_SYNC {CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR_anchor} - Platform supports the ability to create a deep copy of an existing - command-buffer with the commands explicitly remapped to different, - potentially <>, queues. + command-buffer with the commands explicitly remapped to different queues. include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR.asciidoc[] diff --git a/api/opencl_runtime_layer.asciidoc b/api/opencl_runtime_layer.asciidoc index 22978b83e..cc4f9c99c 100644 --- a/api/opencl_runtime_layer.asciidoc +++ b/api/opencl_runtime_layer.asciidoc @@ -14104,16 +14104,33 @@ of 0 or 1. The simultaneous use capability removes this restriction and allows command-buffers to have a <> greater than 1. -[[compatible]] Command-buffers are created using an ordered list of command-queues that -commands are recorded to and execute on by default. -These command-queues can be replaced on command-buffer enqueue with -different command-queues, provided for each element in the replacement list -the substitute command-queue is compatible with the command-queue used on -command-buffer creation. -A _compatible_ command-queue is defined as a command-queue with -identical properties targeting the same device and in the same OpenCL -context. +commands are recorded to and execute on by default. All these queue objects +must share the same context, but may be associated with different devices when +the {cl_khr_command_buffer_multi_device_EXT} extension is supported. 
+ +When constructing a command-buffer by appending commands, the queue parameter +passed for the command being created is used to set the device with which the +command will be associated with, and also inform the scheduling of the command. +If the queue is an in-order queue, then an additional dependency is created on the +last command appended to the command-buffer using the same queue parameter. If +the queue is an out-of-order queue, then no extra dependencies on previous +commands using the same queue are created. All queue properties other than +{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} are ignored for the purposes of command +creation, with the exception of any vendor extension defined queue properties +that explcitly define semantics for this purpose. + +The command-queues used on command-buffer creation must be replaced on +command-buffer enqueue with the command-queues to execute the command-buffer +on. These may be different command-queues, provided for each element the +substitute command-queue matches the device and context of the command-queue +used on command-buffer creation. Each command-queue in the enqueue list must +also have the minimum properties defined by +{CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR} and no properties +which are not reported by +{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}. These queue +properties have the same execution semantics for {clEnqueueCommandBufferKHR} +as other operations enqueued to the queue. While constructing a command-buffer it is valid for the user to interleave calls to the same queue which create commands, such as @@ -14177,7 +14194,7 @@ target the same device. Commands recorded to different command-queues in the same command-buffer may be executed concurrently to each other unless synchronized explicitly with -sync-points. +sync-points, barrier commands, or in-order queue implicit dependencies. 
Ordering of other commands submitted to the same command-queues as used to enqueue a command-buffer is the responsibility of the programmer. A command-buffer enqueue spanning multiple queues can return an event to use @@ -14378,12 +14395,6 @@ returned in _errcode_ret_: * {CL_INVALID_COMMAND_QUEUE} if any command-queue in _queues_ is not a valid command-queue. - * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue - in _queues_ contains a property not specified by - {CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}. - * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any - command-queue in _queues_ does not contain the minimum properties - specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}. * {CL_INVALID_CONTEXT} if all the command-queues in _queues_ do not have the same OpenCL context. * {CL_INVALID_VALUE} if the {cl_khr_command_buffer_multi_device_EXT} @@ -14516,10 +14527,10 @@ include::{generated}/api/protos/clEnqueueCommandBufferKHR.txt[] include::{generated}/api/version-notes/clEnqueueCommandBufferKHR.asciidoc[] * _num_queues_ is the number of command-queues listed in _queues_. - * _queues_ is a pointer to an ordered list of command-queues <> with the command-queues used on recording. - _queues_ can be `NULL`, in which case the default command-queues used on - command-buffer creation are used and _num_queues_ must be 0. + * _queues_ is a pointer to an ordered list of command-queues to execute the + command-buffer on. _queues_ can be `NULL`, in which case the default + command-queues used on command-buffer creation are used and _num_queues_ + must be 0. * _command_buffer_ refers to a valid command-buffer object. * _event_wait_list_, _num_events_in_wait_list_ specify events that need to complete before this particular command can be executed. @@ -14564,9 +14575,15 @@ execution was successfully queued, or one of the errors below: _num_queues_ set on _command_buffer_ creation. 
* {CL_INVALID_COMMAND_QUEUE} if any element of _queues_ is not a valid command-queue. - * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if any element of _queues_ is not - <> with the command-queue set on - _command_buffer_ creation at the same list index. + * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue + in _queues_ contains a property not specified by + {CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}. + * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any + command-queue in _queues_ does not contain the minimum properties + specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}. + * {CL_INVALID_DEVICE} if any element of _queues_ does not have the same + device as the command-queue set on _command_buffer_ creation at the + same list index. * {CL_INVALID_CONTEXT} if any element of _queues_ does not have the same context as the command-queue set on _command_buffer_ creation at the same list index. @@ -15950,22 +15967,18 @@ ifdef::cl_khr_command_buffer_multi_device[] If the {cl_khr_command_buffer_multi_device_EXT} extension is supported, platforms reporting the {CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR} capability support generating a deep copy of a command-buffer with its -commands remapped to a list of command-queues that are potentially -<> with the queues used to create the -command-buffer. -That is, the remapped command-buffer can execute on queues that differ in -terms of properties and/or associated device from the original +commands remapped to different devices than the devices used to create the +commands. That is, the remapped command-buffer can execute on queues that +differ in terms of properties and/or associated device from the original command-buffer queues. This functionality is invoked through a new synchronous entry-point {clRemapCommandBufferKHR} which takes a list of queues to which the commands -should now target. 
-It then returns a command-buffer containing the same commands as the -original, with the same command dependencies, but targeting different -queues. -A list of command handles may also be passed to the entry-point, which -allows handles to the equivalent commands in the remapped command-buffer to -be returned by an output parameter. +should now target the associated devices of. It then returns a command-buffer +containing the same commands as the original, with the same command +dependencies, but targeting different devices. A list of command handles may +also be passed to the entry-point, which allows handles to the equivalent +commands in the remapped command-buffer to be returned by an output parameter. Device properties restrict remapping possibilities, as existing commands can have a configuration which is not supported by another device, and so @@ -15988,7 +16001,7 @@ appear and disappear during runtime. [open,refpage='clRemapCommandBufferKHR',desc='Create copy of a command-buffer remapped to specified command-queues',type='protos'] -- To create a deep copy of the input command-buffer with the copied commands -remapped to target the passed command-queues, call the function +remapped to target devices of the passed command-queues, call the function include::{generated}/api/protos/clRemapCommandBufferKHR.txt[] include::{generated}/api/version-notes/clRemapCommandBufferKHR.asciidoc[] @@ -16047,8 +16060,6 @@ one of the following error values returned in _errcode_ret_: * {CL_INVALID_OPERATION} if the platform does not support the {CL_COMMAND_BUFFER_PLATFORM_AUTOMATIC_REMAP_KHR} flag and _automatic_ is {CL_TRUE}. - * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if such an error would be returned - by passing _queues_ to {clCreateCommandBufferKHR}. * Any error relating to device support that can be returned by a command recording entry-point may also be returned. 
As a command in _command_buffer_ can have a configuration that is not diff --git a/xml/cl.xml b/xml/cl.xml index e147f9770..c93e19666 100644 --- a/xml/cl.xml +++ b/xml/cl.xml @@ -7410,7 +7410,7 @@ server's OpenCL/api-docs repository. - + From ffd3647f49214bf00373f3cfed04489933a549c8 Mon Sep 17 00:00:00 2001 From: Ewan Crawford Date: Tue, 14 Jan 2025 16:20:31 +0000 Subject: [PATCH 2/3] Address review feedback Clarify wording around default list of command-queues used for command-buffer enqueue. --- api/cl_khr_command_buffer.asciidoc | 10 ++++++---- api/cl_khr_command_buffer_multi_device.asciidoc | 2 +- api/opencl_runtime_layer.asciidoc | 9 +++++---- xml/cl.xml | 2 +- 4 files changed, 13 insertions(+), 10 deletions(-) diff --git a/api/cl_khr_command_buffer.asciidoc b/api/cl_khr_command_buffer.asciidoc index e94a578ac..6da252866 100644 --- a/api/cl_khr_command_buffer.asciidoc +++ b/api/cl_khr_command_buffer.asciidoc @@ -99,17 +99,17 @@ following reasons: The command-buffer object has no in-order/out-of-order property set on creation, it is out-of-order, and command ordering is defined by the dependencies set when -commands are created. Command dependencies can be define in 3 ways: +commands are created. Command dependencies can be defined in 3 ways: 1. Device-side {cl_sync_point_khr_TYPE} synchronization-points, providing an explicit list of the commands to depend on. 2. Appending a {clCommandBarrierWithWaitListKHR} barrier command. 3. Passing an in-order queue when creating the command, creating an implicit - dependency on the any previous command created in the command-buffer using + dependency on the previous command created in the command-buffer using the same queue. 
{clEnqueueCommandBufferKHR} submissions to an out-of-order queue have the same -execution semantics are other operations enqueued to an out-of-order queue, +execution semantics as other operations enqueued to an out-of-order queue, such as {clEnqueueFillBuffer}, where execution between enqueued operations may happen concurrently unless dependencies between the operations are expressed with events. @@ -465,6 +465,8 @@ features: * 0.9.5, 2024-07-24 ** Add a properties parameter to all command recording entry-points (provisional). - * 0.9.6, 2024-12-13 + * 0.9.6, 2024-10-02 + ** Add device query for supported queue properties (provisional). + * 0.9.7, 2024-12-13 ** Refactor queue compatability between command-buffer creation and enqueue (provisional). diff --git a/api/cl_khr_command_buffer_multi_device.asciidoc b/api/cl_khr_command_buffer_multi_device.asciidoc index 4329f7fa0..fa3f3047b 100644 --- a/api/cl_khr_command_buffer_multi_device.asciidoc +++ b/api/cl_khr_command_buffer_multi_device.asciidoc @@ -314,4 +314,4 @@ require it. functions (provisional). * Revision 0.9.2, 2024-12-13 ** Update clRemapCommandBufferKHR behavior to match cl_khr_command_buffer - version 0.9.6 (provisional). + version 0.9.7 (provisional). diff --git a/api/opencl_runtime_layer.asciidoc b/api/opencl_runtime_layer.asciidoc index cc4f9c99c..46e428540 100644 --- a/api/opencl_runtime_layer.asciidoc +++ b/api/opencl_runtime_layer.asciidoc @@ -14118,11 +14118,12 @@ the queue is an out-of-order queue, then no extra dependencies on previous commands using the same queue are created. All queue properties other than {CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} are ignored for the purposes of command creation, with the exception of any vendor extension defined queue properties -that explcitly define semantics for this purpose. +that explicitly define semantics for this purpose. 
-The command-queues used on command-buffer creation must be replaced on -command-buffer enqueue with the command-queues to execute the command-buffer -on. These may be different command-queues, provided for each element the +When enqueuing a command-buffer, a list of command-queues to execute the +command-buffer on can be passed by the user, otherwise the command-queues set +on command-buffer creation are used by default for execution. A user passed +list may contain different command-queues, provided for each element the substitute command-queue matches the device and context of the command-queue used on command-buffer creation. Each command-queue in the enqueue list must also have the minimum properties defined by diff --git a/xml/cl.xml b/xml/cl.xml index c93e19666..86a977c61 100644 --- a/xml/cl.xml +++ b/xml/cl.xml @@ -7410,7 +7410,7 @@ server's OpenCL/api-docs repository. - + From b6c9059aaedd03f85660c88f20aca345f0f97f73 Mon Sep 17 00:00:00 2001 From: Ewan Crawford Date: Tue, 14 Jan 2025 17:39:49 +0000 Subject: [PATCH 3/3] Update XML version --- xml/cl.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xml/cl.xml b/xml/cl.xml index 86a977c61..b01922db8 100644 --- a/xml/cl.xml +++ b/xml/cl.xml @@ -7191,7 +7191,7 @@ server's OpenCL/api-docs repository. - +
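Reviewer note: the enqueue-time queue rules this series adds to opencl_runtime_layer.asciidoc can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not OpenCL API code: `model_queue`, `model_status`, the `MODEL_*` constants and `check_enqueue_queue` are hypothetical names invented here, and the order in which the checks run is an assumption, since the spec text lists the error conditions without ranking them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a command-queue: only the three attributes the
 * new wording cares about. These are NOT the real cl_command_queue types. */
typedef struct {
    int      context_id;  /* which context the queue belongs to */
    int      device_id;   /* which device the queue targets */
    uint64_t properties;  /* bitfield of queue properties */
} model_queue;

/* Hypothetical property bits used only by this model. */
#define MODEL_QUEUE_OUT_OF_ORDER ((uint64_t)1 << 0)
#define MODEL_QUEUE_PROFILING    ((uint64_t)1 << 1)

/* Model results, named after the errors the patch lists for
 * clEnqueueCommandBufferKHR. */
typedef enum {
    MODEL_SUCCESS = 0,
    MODEL_INVALID_CONTEXT,            /* models CL_INVALID_CONTEXT */
    MODEL_INVALID_DEVICE,             /* models CL_INVALID_DEVICE */
    MODEL_INCOMPATIBLE_COMMAND_QUEUE  /* models CL_INCOMPATIBLE_COMMAND_QUEUE_KHR */
} model_status;

/* Check one element of the enqueue-time queue list against the queue used at
 * the same index on command-buffer creation.
 *
 * `required` and `supported` model the values a device would report for
 * CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR and
 * CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR.
 *
 * Note that the properties of `created` are never consulted: after this
 * series they only influence command dependencies at recording time. */
static model_status check_enqueue_queue(const model_queue *created,
                                        const model_queue *replacement,
                                        uint64_t required,
                                        uint64_t supported)
{
    if (replacement->context_id != created->context_id) {
        return MODEL_INVALID_CONTEXT;
    }
    if (replacement->device_id != created->device_id) {
        return MODEL_INVALID_DEVICE;
    }
    /* No property may fall outside the supported set. */
    if ((replacement->properties & ~supported) != 0) {
        return MODEL_INCOMPATIBLE_COMMAND_QUEUE;
    }
    /* Every required property must be present. */
    if ((replacement->properties & required) != required) {
        return MODEL_INCOMPATIBLE_COMMAND_QUEUE;
    }
    return MODEL_SUCCESS;
}
```

The point the model tries to make visible is that a command-buffer recorded with an in-order queue may be enqueued on an out-of-order queue of the same device and context, provided the device's required and supported property sets allow it. Under the previous "identical properties" wording that substitution would have been rejected.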