Refactor command-buffer queue compatibility #1292

Draft
wants to merge 1 commit into
base: EwanC/tmp/command-buffer_supported_queue_props
40 changes: 21 additions & 19 deletions api/cl_khr_command_buffer.asciidoc
@@ -12,7 +12,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer.txt[]
=== Other Extension Metadata

*Last Modified Date*::
2024-10-02
2024-12-13
*IP Status*::
No known IP claims.
*Contributors*::
@@ -43,11 +43,6 @@ Command-buffers enable a reduction in overhead when enqueuing the same
workload multiple times. By separating the command-queue setup from dispatch,
the ability to replay a set of previously created commands is introduced.

The command-queues a command-buffer will be executed on can be set on replay via
parameters to {clEnqueueCommandBufferKHR}, provided they are
<<compatible, compatible>> with the command-queues used on command-buffer
recording.

==== Background

On embedded devices where building a command stream accounts for a significant
@@ -74,7 +69,7 @@ or writes memory objects; or enqueues a native kernel, is not available for
command-buffer recording. Finally commands recorded into a command buffer do
not wait for or return event objects, these are instead replaced with
device-side synchronization-point identifiers which enable out-of-order
execution when enqueued on <<compatible, compatible>> command-queues.
execution of the command-buffer commands.

Adding new entry-points for individual commands, rather than recording existing
command-queue APIs with begin/end markers was a design decision made for the
@@ -102,16 +97,22 @@ following reasons:

==== Command Synchronization

Device-side {cl_sync_point_khr_TYPE} synchronization-points can be used within
command-buffers to define command dependencies. This allows the commands of a
command-buffer to execute out-of-order on a single <<compatible, compatible>>
command-queue. The command-buffer itself has no inherent in-order/out-of-order
property, this ordering is inferred from the command-queue used on command
recording. {clEnqueueCommandBufferKHR} submissions to an out-of-order queue
have the same execution semantics are other operations enqueued to an
out-of-order queue, such as {clEnqueueFillBuffer}, where execution between
enqueued operations may happen concurrently unless dependencies between the
operations are expressed with events.
The command-buffer object has no in-order/out-of-order property set on creation.
It is inherently out-of-order, and command ordering is defined by the
dependencies set when commands are created. Command dependencies can be defined
in three ways:

1. Device-side {cl_sync_point_khr_TYPE} synchronization-points, providing an
explicit list of the commands to depend on.
2. Appending a {clCommandBarrierWithWaitListKHR} barrier command.
3. Passing an in-order queue when creating the command, which creates an
implicit dependency on any previous command created in the command-buffer
using the same queue.

{clEnqueueCommandBufferKHR} submissions to an out-of-order queue have the same
execution semantics as other operations enqueued to an out-of-order queue,
such as {clEnqueueFillBuffer}, where execution between enqueued operations may
happen concurrently unless dependencies between the operations are expressed
with events.

The {cl_sync_point_khr_TYPE} type is defined as a `cl_uint`, giving a hard
upper limit on the number of commands a command-buffer can hold as
@@ -464,5 +465,6 @@ features:
* 0.9.5, 2024-07-24
** Add a properties parameter to all command recording entry-points
(provisional).
* 0.9.6, 2024-10-02
** Add device query for supported queue properties (provisional).
* 0.9.6, 2024-12-13
** Refactor queue compatibility between command-buffer creation and enqueue
(provisional).
5 changes: 4 additions & 1 deletion api/cl_khr_command_buffer_multi_device.asciidoc
@@ -6,7 +6,7 @@ include::{generated}/meta/{refprefix}cl_khr_command_buffer_multi_device.txt[]
=== Other Extension Metadata

*Last Modified Date*::
2023-04-30
2024-12-13
*IP Status*::
No known IP claims.
*Contributors*::
@@ -312,3 +312,6 @@ require it.
* Revision 0.9.1, 2023-04-30
** Added clCommandSVMMemcpyKHR and clCommandSVMMemFillKHR as affected
functions (provisional).
* Revision 0.9.2, 2024-12-13
** Update clRemapCommandBufferKHR behavior to match cl_khr_command_buffer
version 0.9.6 (provisional).
3 changes: 1 addition & 2 deletions api/opencl_platform_layer.asciidoc
@@ -240,8 +240,7 @@ include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_UNIVERSAL_SYNC

{CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR_anchor} - Platform
supports the ability to create a deep copy of an existing
command-buffer with the commands explicitly remapped to different,
potentially <<compatible, incompatible>>, queues.
command-buffer with the commands explicitly remapped to different queues.

include::{generated}/api/version-notes/CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR.asciidoc[]

87 changes: 49 additions & 38 deletions api/opencl_runtime_layer.asciidoc
@@ -13915,16 +13915,33 @@ of 0 or 1.
The simultaneous use capability removes this restriction and allows
command-buffers to have a <<pending_count, Pending Count>> greater than 1.

[[compatible]]
Command-buffers are created using an ordered list of command-queues that
commands are recorded to and execute on by default.
These command-queues can be replaced on command-buffer enqueue with
different command-queues, provided for each element in the replacement list
the substitute command-queue is compatible with the command-queue used on
command-buffer creation.
A _compatible_ command-queue is defined as a command-queue with
identical properties targeting the same device and in the same OpenCL
context.
commands are recorded to and execute on by default. All these queue objects
must share the same context, but may be associated with different devices when
the {cl_khr_command_buffer_multi_device_EXT} extension is supported.

When constructing a command-buffer by appending commands, the queue parameter
passed for the command being created sets the device with which the command is
associated, and also informs the scheduling of the command.
If the queue is an in-order queue, then an additional dependency is created on the
last command appended to the command-buffer using the same queue parameter. If
the queue is an out-of-order queue, then no extra dependencies on previous
commands using the same queue are created. All queue properties other than
{CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE} are ignored for the purposes of command
creation, with the exception of any vendor extension defined queue properties
that explicitly define semantics for this purpose.
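
The implicit-dependency rule above can be illustrated with a small host-side model. This is only a sketch, not OpenCL API code: `command_buffer_model`, `model_init`, `model_append`, and `NO_DEPENDENCY` are hypothetical names invented for illustration, and queues are identified by their index in the creation list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 8
#define NO_DEPENDENCY (-1)

/* Hypothetical model of a command-buffer under construction (not an OpenCL type). */
typedef struct {
    int queue_is_in_order[MAX_QUEUES]; /* 1 if the queue lacks the out-of-order property */
    int last_command[MAX_QUEUES];      /* id of the last command appended via each queue */
    int num_commands;
} command_buffer_model;

static void model_init(command_buffer_model *cb, const int *in_order, size_t num_queues)
{
    assert(cb != NULL && in_order != NULL && num_queues <= MAX_QUEUES);
    for (size_t i = 0; i < MAX_QUEUES; ++i) {
        cb->queue_is_in_order[i] = (i < num_queues) ? in_order[i] : 0;
        cb->last_command[i] = NO_DEPENDENCY;
    }
    cb->num_commands = 0;
}

/* Appends a command through the queue at queue_index and returns its id.
 * *implicit_dep receives the id of the command it implicitly depends on,
 * or NO_DEPENDENCY when the queue is out-of-order or has no earlier command. */
static int model_append(command_buffer_model *cb, size_t queue_index, int *implicit_dep)
{
    assert(cb != NULL && implicit_dep != NULL && queue_index < MAX_QUEUES);
    int id = cb->num_commands++;
    *implicit_dep = cb->queue_is_in_order[queue_index]
                        ? cb->last_command[queue_index]
                        : NO_DEPENDENCY;
    cb->last_command[queue_index] = id;
    return id;
}
```

In this model, two commands appended through the same in-order queue form a chain, while commands appended through an out-of-order queue pick up no extra dependency and must rely on sync-points or barriers.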

The command-queues used on command-buffer creation must be replaced on
command-buffer enqueue with the command-queues to execute the command-buffer
on. These may be different command-queues, provided that, for each element, the
substitute command-queue matches the device and context of the command-queue
used on command-buffer creation. Each command-queue in the enqueue list must
also have the minimum properties defined by
{CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR} and no properties
which are not reported by
{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}. These queue
properties have the same execution semantics for {clEnqueueCommandBufferKHR}
as other operations enqueued to the queue.
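
The property rule above amounts to two bitmask tests: the queue must carry every required bit and no bit outside the supported set. The sketch below is illustrative only; `queue_props`, `QUEUE_OUT_OF_ORDER`, `QUEUE_PROFILING`, and `queue_props_allowed` are hypothetical stand-ins for the OpenCL bitfield type, its property bits, and an implementation's internal check.

```c
#include <stdint.h>

/* Stand-in for a queue-properties bitfield (assumption: a 64-bit mask). */
typedef uint64_t queue_props;

/* Hypothetical property bits used only for this illustration. */
#define QUEUE_OUT_OF_ORDER (UINT64_C(1) << 0)
#define QUEUE_PROFILING    (UINT64_C(1) << 1)

/* Returns 1 when props contains every bit in required and no bit that is
 * missing from supported; returns 0 otherwise. */
static int queue_props_allowed(queue_props props,
                               queue_props required,
                               queue_props supported)
{
    int has_required = (props & required) == required;
    int only_supported = (props & ~supported) == 0;
    return has_required && only_supported;
}
```

For example, with nothing required and only out-of-order execution supported, a profiling-enabled queue would be rejected while a plain in-order queue would be accepted.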

While constructing a command-buffer it is valid for the user to interleave
calls to the same queue which create commands, such as
@@ -13988,7 +14005,7 @@ target the same device.

Commands recorded to different command-queues in the same command-buffer may
be executed concurrently to each other unless synchronized explicitly with
sync-points.
sync-points, barrier commands, or in-order queue implicit dependencies.
Ordering of other commands submitted to the same command-queues as used to
enqueue a command-buffer is the responsibility of the programmer.
A command-buffer enqueue spanning multiple queues can return an event to use
@@ -14189,12 +14206,6 @@ returned in _errcode_ret_:

* {CL_INVALID_COMMAND_QUEUE} if any command-queue in _queues_ is not a
valid command-queue.
* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue
in _queues_ contains a property not specified by
{CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}.
* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any
command-queue in _queues_ does not contain the minimum properties
specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}.
* {CL_INVALID_CONTEXT} if all the command-queues in _queues_ do not have
the same OpenCL context.
* {CL_INVALID_VALUE} if the {cl_khr_command_buffer_multi_device_EXT}
@@ -14327,10 +14338,10 @@ include::{generated}/api/protos/clEnqueueCommandBufferKHR.txt[]
include::{generated}/api/version-notes/clEnqueueCommandBufferKHR.asciidoc[]

* _num_queues_ is the number of command-queues listed in _queues_.
* _queues_ is a pointer to an ordered list of command-queues <<compatible,
compatible>> with the command-queues used on recording.
_queues_ can be `NULL`, in which case the default command-queues used on
command-buffer creation are used and _num_queues_ must be 0.
* _queues_ is a pointer to an ordered list of command-queues to execute the
command-buffer on. _queues_ can be `NULL`, in which case the default
command-queues used on command-buffer creation are used and _num_queues_
must be 0.
* _command_buffer_ refers to a valid command-buffer object.
* _event_wait_list_, _num_events_in_wait_list_ specify events that need to
complete before this particular command can be executed.
@@ -14375,9 +14386,15 @@ execution was successfully queued, or one of the errors below:
_num_queues_ set on _command_buffer_ creation.
* {CL_INVALID_COMMAND_QUEUE} if any element of _queues_ is not a valid
command-queue.
* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if any element of _queues_ is not
<<compatible, compatible>> with the command-queue set on
_command_buffer_ creation at the same list index.
  * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any command-queue
    in _queues_ contain a property not specified by
    {CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR}.
  * {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if the properties of any
    command-queue in _queues_ do not contain the minimum properties
    specified by {CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR}.
* {CL_INVALID_DEVICE} if any element of _queues_ does not have the same
device as the command-queue set on _command_buffer_ creation at the
same list index.
* {CL_INVALID_CONTEXT} if any element of _queues_ does not have the same
context as the command-queue set on _command_buffer_ creation at the
same list index.
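
The per-index device and context checks in the error list above could be modelled roughly as follows. This is a sketch under stated assumptions: `queue_model`, `match_result`, and `match_queues` are hypothetical names, devices and contexts are reduced to integer ids, the order in which the checks run is an arbitrary choice (the text does not define one), and `MATCH_BAD_COUNT` is a placeholder for the count-mismatch error whose code is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a command-queue: only its device and context ids. */
typedef struct {
    int device;
    int context;
} queue_model;

typedef enum {
    MATCH_OK,
    MATCH_BAD_COUNT,       /* placeholder: list length differs from creation */
    MATCH_INVALID_DEVICE,  /* device differs at the same list index */
    MATCH_INVALID_CONTEXT  /* context differs at the same list index */
} match_result;

/* Compares the replacement queues passed on enqueue against the queues
 * recorded on command-buffer creation, index by index. */
static match_result match_queues(const queue_model *recorded, size_t num_recorded,
                                 const queue_model *replacement, size_t num_replacement)
{
    assert(recorded != NULL);
    if (replacement == NULL) {
        /* A NULL list means "use the creation queues"; the count must be 0. */
        return num_replacement == 0 ? MATCH_OK : MATCH_BAD_COUNT;
    }
    if (num_replacement != num_recorded) {
        return MATCH_BAD_COUNT;
    }
    for (size_t i = 0; i < num_recorded; ++i) {
        if (replacement[i].device != recorded[i].device) {
            return MATCH_INVALID_DEVICE;
        }
        if (replacement[i].context != recorded[i].context) {
            return MATCH_INVALID_CONTEXT;
        }
    }
    return MATCH_OK;
}
```

The point of the model is that, after this change, a substitute queue only has to agree with the creation queue on device and context; its properties are checked against the device queries rather than against the creation queue.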
@@ -15761,22 +15778,18 @@ ifdef::cl_khr_command_buffer_multi_device[]
If the {cl_khr_command_buffer_multi_device_EXT} extension is supported,
platforms reporting the {CL_COMMAND_BUFFER_PLATFORM_REMAP_QUEUES_KHR}
capability support generating a deep copy of a command-buffer with its
commands remapped to a list of command-queues that are potentially
<<compatible, incompatible>> with the queues used to create the
command-buffer.
That is, the remapped command-buffer can execute on queues that differ in
terms of properties and/or associated device from the original
commands remapped to different devices than the devices used to create the
commands. That is, the remapped command-buffer can execute on queues that
differ in terms of properties and/or associated device from the original
command-buffer queues.

This functionality is invoked through a new synchronous entry-point
{clRemapCommandBufferKHR} which takes a list of queues to which the commands
should now target.
It then returns a command-buffer containing the same commands as the
original, with the same command dependencies, but targeting different
queues.
A list of command handles may also be passed to the entry-point, which
allows handles to the equivalent commands in the remapped command-buffer to
be returned by an output parameter.
should now be remapped, targeting their associated devices. It then returns a command-buffer
containing the same commands as the original, with the same command
dependencies, but targeting different devices. A list of command handles may
also be passed to the entry-point, which allows handles to the equivalent
commands in the remapped command-buffer to be returned by an output parameter.
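
As a rough illustration of remapping, the sketch below deep-copies a list of commands, keeps each command's dependency list, and retargets it to the device of the new queue at the same list index. Everything here is hypothetical: `command_model`, `remap_commands`, and `MAX_DEPS` are invented for the example, the one-to-one queue index mapping is an assumption, and the device-capability checks a real implementation would perform are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical model of a recorded command (not an OpenCL type). */
typedef struct {
    size_t queue_index;    /* index into the queue list used on creation */
    int device;            /* id of the device the command targets */
    size_t num_deps;       /* number of valid entries in deps */
    size_t deps[MAX_DEPS]; /* indices of the commands this one depends on */
} command_model;

/* Copies src into dst, preserving dependencies, and retargets every command
 * to the device of the new queue found at the same queue index. */
static void remap_commands(const command_model *src, command_model *dst,
                           size_t num_commands,
                           const int *new_queue_devices, size_t num_queues)
{
    assert(src != NULL && dst != NULL && new_queue_devices != NULL);
    for (size_t i = 0; i < num_commands; ++i) {
        assert(src[i].queue_index < num_queues);
        dst[i] = src[i]; /* struct copy, including the dependency list */
        dst[i].device = new_queue_devices[src[i].queue_index];
    }
}
```

The original commands are left untouched, which mirrors the deep-copy behavior described above: the remapped command-buffer is a separate object with the same dependency graph.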

Device properties restrict remapping possibilities, as existing commands can
have a configuration which is not supported by another device, and so
@@ -15799,7 +15812,7 @@ appear and disappear during runtime.
[open,refpage='clRemapCommandBufferKHR',desc='Create copy of a command-buffer remapped to specified command-queues',type='protos']
--
To create a deep copy of the input command-buffer with the copied commands
remapped to target the passed command-queues, call the function
remapped to target devices of the passed command-queues, call the function

include::{generated}/api/protos/clRemapCommandBufferKHR.txt[]
include::{generated}/api/version-notes/clRemapCommandBufferKHR.asciidoc[]
@@ -15858,8 +15871,6 @@ one of the following error values returned in _errcode_ret_:
* {CL_INVALID_OPERATION} if the platform does not support the
{CL_COMMAND_BUFFER_PLATFORM_AUTOMATIC_REMAP_KHR} flag and _automatic_ is
{CL_TRUE}.
* {CL_INCOMPATIBLE_COMMAND_QUEUE_KHR} if such an error would be returned
by passing _queues_ to {clCreateCommandBufferKHR}.
* Any error relating to device support that can be returned by a command
recording entry-point may also be returned.
As a command in _command_buffer_ can have a configuration that is not
2 changes: 1 addition & 1 deletion xml/cl.xml
@@ -7406,7 +7406,7 @@ server's OpenCL/api-docs repository.
<enum name="CL_MEM_DEVICE_ID_INTEL"/>
</require>
</extension>
<extension name="cl_khr_command_buffer_multi_device" revision="0.9.1" supported="opencl" depends="cl_khr_command_buffer" ratified="opencl" provisional="true" comment="in sync with version 0.9.1; requires cl_khr_command_buffer 0.9.3 or later">
<extension name="cl_khr_command_buffer_multi_device" revision="0.9.2" supported="opencl" depends="cl_khr_command_buffer" ratified="opencl" provisional="true" comment="requires cl_khr_command_buffer 0.9.6 or later">
<require>
<type name="CL/cl.h"/>
</require>