Skip to content

Commit

Permalink
[SYCL][Graph] command_graph queue constructor
Browse files Browse the repository at this point in the history
Feedback from Jack Kirk (@JackAKirk) that SYCL classes with a sycl::context
and sycl::device constructor normally have a matching constructor
taking a sycl::queue which replaces the context and device.

Being able to use a queue constructor makes it easier to port code from
CUDA/HIP where there is not a concept analogous to SYCL contexts.

Our tests/examples also commonly use the pattern
```cpp
queue Queue;
command_graph Graph {Queue.get_context(), Queue.get_device()}
```

So being able to use a queue constructor is also a more concise way
to call the constructor in a lot of cases, regardless of platform.
  • Loading branch information
EwanC committed Jan 8, 2024
1 parent ec7fb7c commit 7b2101f
Show file tree
Hide file tree
Showing 11 changed files with 225 additions and 2 deletions.
34 changes: 32 additions & 2 deletions sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ Erik Tomusk, Codeplay +
Bjoern Knafla, Codeplay +
Lukas Sommer, Codeplay +
Maxime France-Pillois, Codeplay +
Jack Kirk, Codeplay +
Ronan Keryell, AMD +
Andrey Alekseenko, KTH Royal Institute of Technology +

Expand Down Expand Up @@ -372,6 +373,9 @@ public:
command_graph(const context& syclContext, const device& syclDevice,
const property_list& propList = {});
command_graph(const queue& syclQueue,
const property_list& propList = {});
command_graph<graph_state::executable>
finalize(const property_list& propList = {}) const;
Expand Down Expand Up @@ -611,8 +615,34 @@ associated with `syclContext`.
* Throws synchronously with error code `invalid` if `syclDevice`
<<device-info-query, reports this extension as unsupported>>.

* Throws synchronously with error code `invalid` if the backend associated
with `syclDevice` is not supported.
|
[source,c++]
----
command_graph(const queue& syclQueue,
const property_list& propList = {});
----
|Simplified constructor form where `syclQueue` provides the device and context.
Zero or more properties can be provided to the constructed SYCL `command_graph`
via an instance of `property_list`.

Preconditions:

* This constructor is only available when the `command_graph` state is
`graph_state::modifiable`.

Parameters:

* `syclQueue` - Queue which provides the SYCL device and context for the graph,
which are immutable characteristics of the graph. All other properties of the
queue are ignored for the purposes of graph creation.

* `propList` - Optional parameter for passing properties. Valid `command_graph`
constructor properties are listed in Section <<graph-properties, Graph Properties>>.

Exceptions:

* Throws synchronously with error code `invalid` if the device associated with
`syclQueue` <<device-info-query, reports this extension as unsupported>>.

|===

Expand Down
12 changes: 12 additions & 0 deletions sycl/include/sycl/ext/oneapi/experimental/graph.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,12 @@ class __SYCL_EXPORT modifiable_command_graph {
modifiable_command_graph(const context &SyclContext, const device &SyclDevice,
const property_list &PropList = {});

/// Constructor.
/// @param SyclQueue Queue to use for the graph device and context.
/// @param PropList Optional list of properties to pass.
modifiable_command_graph(const queue &SyclQueue,
const property_list &PropList = {});

/// Add an empty node to the graph.
/// @param PropList Property list used to pass [0..n] predecessor nodes.
/// @return Constructed empty node which has been added to the graph.
Expand Down Expand Up @@ -325,6 +331,12 @@ class command_graph : public detail::modifiable_command_graph {
const property_list &PropList = {})
: modifiable_command_graph(SyclContext, SyclDevice, PropList) {}

/// Constructor.
/// @param SyclQueue Queue to use for the graph device and context.
/// @param PropList Optional list of properties to pass.
command_graph(const queue &SyclQueue, const property_list &PropList = {})
: modifiable_command_graph(SyclQueue, PropList) {}

private:
/// Constructor used internally by the runtime.
/// @param Impl Detail implementation class to construct object with.
Expand Down
5 changes: 5 additions & 0 deletions sycl/source/detail/graph_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -919,6 +919,11 @@ modifiable_command_graph::modifiable_command_graph(
: impl(std::make_shared<detail::graph_impl>(SyclContext, SyclDevice,
PropList)) {}

modifiable_command_graph::modifiable_command_graph(
const sycl::queue &SyclQueue, const sycl::property_list &PropList)
: impl(std::make_shared<detail::graph_impl>(
SyclQueue.get_context(), SyclQueue.get_device(), PropList)) {}

node modifiable_command_graph::addImpl(const std::vector<node> &Deps) {
impl->throwIfGraphRecordingQueue("Explicit API \"Add()\" function");
std::vector<std::shared_ptr<detail::node_impl>> DepImpls;
Expand Down
13 changes: 13 additions & 0 deletions sycl/test-e2e/Graph/Explicit/queue_constructor_buffer.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
// Extra run to check for leaks in Level Zero using UR_L0_LEAKS_DEBUG
// RUN: %if level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
//
// CHECK-NOT: LEAK
//
// TODO enable cuda once buffer issue investigated and fixed
// UNSUPPORTED: cuda

#define GRAPH_E2E_EXPLICIT

#include "../Inputs/queue_constructor_buffer.cpp"
10 changes: 10 additions & 0 deletions sycl/test-e2e/Graph/Explicit/queue_constructor_usm.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
// Extra run to check for leaks in Level Zero using UR_L0_LEAKS_DEBUG
// RUN: %if level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
//
// CHECK-NOT: LEAK

#define GRAPH_E2E_EXPLICIT

#include "../Inputs/queue_constructor_usm.cpp"
59 changes: 59 additions & 0 deletions sycl/test-e2e/Graph/Inputs/queue_constructor_buffer.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
// Tests a graph created with the simplified sycl::queue constructor works
// as expected.

#include "../graph_common.hpp"

int main() {
queue Queue{{sycl::ext::intel::property::queue::no_immediate_command_list{}}};

if (!are_graphs_supported(Queue)) {
return 0;
}

using T = unsigned short;

std::vector<T> DataA(Size), DataB(Size), DataC(Size);

std::iota(DataA.begin(), DataA.end(), 1);
std::iota(DataB.begin(), DataB.end(), 10);
std::iota(DataC.begin(), DataC.end(), 1000);

std::vector<T> ReferenceA(DataA), ReferenceB(DataB), ReferenceC(DataC);
calculate_reference_data(Iterations, Size, ReferenceA, ReferenceB,
ReferenceC);

buffer<T> BufferA{DataA.data(), range<1>{DataA.size()}};
BufferA.set_write_back(false);
buffer<T> BufferB{DataB.data(), range<1>{DataB.size()}};
BufferB.set_write_back(false);
buffer<T> BufferC{DataC.data(), range<1>{DataC.size()}};
BufferC.set_write_back(false);
{
exp_ext::command_graph Graph{
Queue, {exp_ext::property::graph::assume_buffer_outlives_graph{}}};

// Add commands to graph
add_nodes(Graph, Queue, Size, BufferA, BufferB, BufferC);

auto GraphExec = Graph.finalize();

event Event;
for (unsigned n = 0; n < Iterations; n++) {
Event =
Queue.submit([&](handler &CGH) { CGH.ext_oneapi_graph(GraphExec); });
}
Queue.wait_and_throw();
}

host_accessor HostAccA(BufferA);
host_accessor HostAccB(BufferB);
host_accessor HostAccC(BufferC);

for (size_t i = 0; i < Size; i++) {
assert(check_value(i, ReferenceA[i], HostAccA[i], "HostAccA"));
assert(check_value(i, ReferenceB[i], HostAccB[i], "HostAccB"));
assert(check_value(i, ReferenceC[i], HostAccC[i], "HostAccC"));
}

return 0;
}
68 changes: 68 additions & 0 deletions sycl/test-e2e/Graph/Inputs/queue_constructor_usm.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
// Tests a graph created with the simplified sycl::queue constructor works
// as expected.

#include "../graph_common.hpp"

int main() {
queue Queue{{sycl::ext::intel::property::queue::no_immediate_command_list{}}};

if (!are_graphs_supported(Queue)) {
return 0;
}

using T = int;

std::vector<T> DataA(Size), DataB(Size), DataC(Size);
std::iota(DataA.begin(), DataA.end(), 1);
std::iota(DataB.begin(), DataB.end(), 10);
std::iota(DataC.begin(), DataC.end(), 1000);

std::vector<T> ReferenceA(DataA), ReferenceB(DataB), ReferenceC(DataC);
calculate_reference_data(Iterations, Size, ReferenceA, ReferenceB,
ReferenceC);

exp_ext::command_graph Graph{Queue};

T *PtrA = malloc_device<T>(Size, Queue);
T *PtrB = malloc_device<T>(Size, Queue);
T *PtrC = malloc_device<T>(Size, Queue);

Queue.copy(DataA.data(), PtrA, Size);
Queue.copy(DataB.data(), PtrB, Size);
Queue.copy(DataC.data(), PtrC, Size);
Queue.wait_and_throw();

// Add commands to graph
add_nodes(Graph, Queue, Size, PtrA, PtrB, PtrC);

auto GraphExec = Graph.finalize();

auto SubmitGraph = [&]() {
Queue.submit([&](handler &CGH) { CGH.ext_oneapi_graph(GraphExec); });
};

event Event;
for (unsigned n = 0; n < Iterations; n++) {
Event =
Queue.submit([&](handler &CGH) { CGH.ext_oneapi_graph(GraphExec); });
}

Queue.wait_and_throw();

Queue.copy(PtrA, DataA.data(), Size);
Queue.copy(PtrB, DataB.data(), Size);
Queue.copy(PtrC, DataC.data(), Size);
Queue.wait_and_throw();

free(PtrA, Queue);
free(PtrB, Queue);
free(PtrC, Queue);

for (size_t i = 0; i < Size; i++) {
assert(check_value(i, ReferenceA[i], DataA[i], "DataA"));
assert(check_value(i, ReferenceB[i], DataB[i], "DataB"));
assert(check_value(i, ReferenceC[i], DataC[i], "DataC"));
}

return 0;
}
13 changes: 13 additions & 0 deletions sycl/test-e2e/Graph/RecordReplay/queue_constructor_buffer.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
// Extra run to check for leaks in Level Zero using UR_L0_LEAKS_DEBUG
// RUN: %if level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
//
// CHECK-NOT: LEAK
//
// TODO enable cuda once buffer issue investigated and fixed
// UNSUPPORTED: cuda

#define GRAPH_E2E_RECORD_REPLAY

#include "../Inputs/queue_constructor_buffer.cpp"
10 changes: 10 additions & 0 deletions sycl/test-e2e/Graph/RecordReplay/queue_constructor_usm.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
// Extra run to check for leaks in Level Zero using UR_L0_LEAKS_DEBUG
// RUN: %if level_zero %{env UR_L0_LEAKS_DEBUG=1 %{run} %t.out 2>&1 | FileCheck %s %}
//
// CHECK-NOT: LEAK

#define GRAPH_E2E_RECORD_REPLAY

#include "../Inputs/queue_constructor_usm.cpp"
2 changes: 2 additions & 0 deletions sycl/test/abi/sycl_symbols_linux.dump
Original file line number Diff line number Diff line change
Expand Up @@ -3733,7 +3733,9 @@ _ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graph24addGraph
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graph7addImplERKSt6vectorINS3_4nodeESaIS7_EE
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graph7addImplESt8functionIFvRNS0_7handlerEEERKSt6vectorINS3_4nodeESaISC_EE
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graph9make_edgeERNS3_4nodeES7_
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graphC1ERKNS0_5queueERKNS0_13property_listE
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graphC1ERKNS0_7contextERKNS0_6deviceERKNS0_13property_listE
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graphC2ERKNS0_5queueERKNS0_13property_listE
_ZN4sycl3_V13ext6oneapi12experimental6detail24modifiable_command_graphC2ERKNS0_7contextERKNS0_6deviceERKNS0_13property_listE
_ZN4sycl3_V13ext6oneapi12experimental9image_memC1ERKNS3_16image_descriptorERKNS0_5queueE
_ZN4sycl3_V13ext6oneapi12experimental9image_memC1ERKNS3_16image_descriptorERKNS0_6deviceERKNS0_7contextE
Expand Down
1 change: 1 addition & 0 deletions sycl/test/abi/sycl_symbols_windows.dump
Original file line number Diff line number Diff line change
Expand Up @@ -600,6 +600,7 @@
??0modifiable_command_graph@detail@experimental@oneapi@ext@_V1@sycl@@QEAA@$$QEAV0123456@@Z
??0modifiable_command_graph@detail@experimental@oneapi@ext@_V1@sycl@@QEAA@AEBV0123456@@Z
??0modifiable_command_graph@detail@experimental@oneapi@ext@_V1@sycl@@QEAA@AEBVcontext@56@AEBVdevice@56@AEBVproperty_list@56@@Z
??0modifiable_command_graph@detail@experimental@oneapi@ext@_V1@sycl@@QEAA@AEBVqueue@56@AEBVproperty_list@56@@Z
??0node@experimental@oneapi@ext@_V1@sycl@@AEAA@AEBV?$shared_ptr@Vnode_impl@detail@experimental@oneapi@ext@_V1@sycl@@@std@@@Z
??0node@experimental@oneapi@ext@_V1@sycl@@QEAA@$$QEAV012345@@Z
??0node@experimental@oneapi@ext@_V1@sycl@@QEAA@AEBV012345@@Z
Expand Down

0 comments on commit 7b2101f

Please sign in to comment.