Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[L0] Support updating kernel commands in command buffers #1353

Merged

Conversation

againull
Copy link
Contributor

@againull againull commented Feb 16, 2024

Initial support for updating command buffers.

Also applied temporary fixes for conformance tests:

  • Execution range update is not supported by the L0 driver right now, so skipping those tests for L0 backend.

  • There is a synchronization issue with immediate submission when used for command buffers. It is reproducible even without changes of this PR, so should be fixed separately. For now use batched submission for command buffer update tests.

intel/llvm PR: intel/llvm#12897

@againull againull force-pushed the againull/l0_adapter_update_cmd_buffer branch 2 times, most recently from 7c0b644 to ddc3675 Compare March 1, 2024 08:12
@againull againull marked this pull request as ready for review March 1, 2024 08:13
@againull againull requested review from a team as code owners March 1, 2024 08:13
@againull againull force-pushed the againull/l0_adapter_update_cmd_buffer branch from ddc3675 to ac72e87 Compare March 1, 2024 08:20
source/adapters/level_zero/device.cpp Outdated Show resolved Hide resolved
source/adapters/level_zero/command_buffer.cpp Outdated Show resolved Hide resolved
test/conformance/exp_command_buffer/fixtures.h Outdated Show resolved Hide resolved
source/adapters/level_zero/command_buffer.cpp Show resolved Hide resolved
@againull againull requested a review from EwanC March 4, 2024 20:50
Copy link
Contributor

@EwanC EwanC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Ideally we could test this with the DPC++ PR in intel/llvm#12840 on a setup with this driver feature and see if the new tests pass (apart from range update) when enabled for L0. But with the CTS tests I wouldn't say this is required and we could fix up any issues in a follow-up PR.

@kbenzie kbenzie changed the title [L0 Adapter] Support updating kernel commands in command buffers [L0] Support updating kernel commands in command buffers Mar 5, 2024
@againull againull force-pushed the againull/l0_adapter_update_cmd_buffer branch from 3059fa1 to 5f8a62c Compare March 5, 2024 17:57
Copy link
Contributor

@nrspruit nrspruit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for Level Zero

@nrspruit nrspruit added level-zero L0 adapter specific issues ready to merge Added to PR's which are ready to merge labels Mar 5, 2024
@kbenzie kbenzie added the v0.9.x Include in the v0.9.x release label Mar 11, 2024
@kbenzie
Copy link
Contributor

kbenzie commented Mar 11, 2024

Please pull in the main branch to have up to date testing, also update the tag in the intel/llvm PR.

@kbenzie kbenzie force-pushed the againull/l0_adapter_update_cmd_buffer branch from 5f8a62c to 0b355e1 Compare March 13, 2024 13:39
@codecov-commenter
Copy link

codecov-commenter commented Mar 13, 2024

Codecov Report

Attention: Patch coverage is 0% with 25 lines in your changes are missing coverage. Please review.

Project coverage is 12.48%. Comparing base (78ef1ca) to head (c9be1e2).
Report is 146 commits behind head on main.

Files Patch % Lines
test/conformance/exp_command_buffer/fixtures.h 0.00% 11 Missing ⚠️
...e/exp_command_buffer/buffer_fill_kernel_update.cpp 0.00% 9 Missing ⚠️
...ance/exp_command_buffer/usm_fill_kernel_update.cpp 0.00% 3 Missing ⚠️
.../conformance/exp_command_buffer/ndrange_update.cpp 0.00% 2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1353      +/-   ##
==========================================
- Coverage   14.82%   12.48%   -2.34%     
==========================================
  Files         250      239      -11     
  Lines       36220    36022     -198     
  Branches     4094     4092       -2     
==========================================
- Hits         5369     4498     -871     
- Misses      30800    31520     +720     
+ Partials       51        4      -47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kbenzie kbenzie mentioned this pull request Mar 13, 2024
* Execution range update is not supported by the L0 driver right now.
Currently it supports only kernel arguments update.

* There is a synchronization issue with immediate submission when used
for command buffers. It is reproducible even without changes of this
PR, so should be fixed separately. For now use batched submission for
command buffer update tests.
* Remove unnecesary env variable setting
* Check support of the feature on device
* Check sub-feature support before usage at urCommandBufferUpdateKernelLaunchExp
* Remove redundant code
@againull againull force-pushed the againull/l0_adapter_update_cmd_buffer branch from 0b355e1 to c9be1e2 Compare March 14, 2024 17:18
@againull
Copy link
Contributor Author

Please pull in the main branch to have up to date testing, also update the tag in the intel/llvm PR.

Done.

EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 14, 2024
Implement the API for updating the kernel commands in a command-buffer
defined by oneapi-src#1089 for
the OpenCL adapter.

However, the following changes to the UR kernel update API have been
made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation
   prototype](intel/llvm#12840) shows this isn't
   needed.
3. Forbid changing the local work size from user to impl defined and
   vice-versa. See discussion in [L0 implementation
PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the
[cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch)
extension.

Tested on Intel GPU/CPUs OpenCL implementations with the
[command-buffer emulation
layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
@againull
Copy link
Contributor Author

Hi @kbenzie @nrspruit, Is it possible to merge this PR?

@kbenzie
Copy link
Contributor

kbenzie commented Mar 19, 2024

Hi @kbenzie @nrspruit, Is it possible to merge this PR?

@againull yes, I'll be getting to this soon. It's on our v0.9.x merge list.

@kbenzie kbenzie merged commit ca5c342 into oneapi-src:main Mar 19, 2024
50 checks passed
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 20, 2024
Fixes Coverity defect report from L0 command-buffer update
code merged in oneapi-src#1353

```
This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "CommandDesc->newWorkDim >= 0U".
```
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 21, 2024
Fixes Coverity defect report from L0 command-buffer update
code merged in oneapi-src#1353

```
This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "CommandDesc->newWorkDim >= 0U".
```
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 22, 2024
Fixes Coverity defect report from L0 command-buffer update
code merged in oneapi-src#1353

```
This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "CommandDesc->newWorkDim >= 0U".
```
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 22, 2024
Fixes Coverity defect report from L0 command-buffer update
code merged in oneapi-src#1353

```
This greater-than-or-equal-to-zero comparison of an unsigned value is always true. "CommandDesc->newWorkDim >= 0U".
```
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Mar 25, 2024
Implement the API for updating the kernel commands in a command-buffer
defined by oneapi-src#1089 for
the OpenCL adapter.

However, the following changes to the UR kernel update API have been
made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation
   prototype](intel/llvm#12840) shows this isn't
   needed.
3. Forbid changing the local work size from user to impl defined and
   vice-versa. See discussion in [L0 implementation
PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the
[cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch)
extension.

Tested on Intel GPU/CPUs OpenCL implementations with the
[command-buffer emulation
layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Apr 4, 2024
Implement the API for updating the kernel commands in a command-buffer
defined by oneapi-src#1089 for
the OpenCL adapter.

However, the following changes to the UR kernel update API have been
made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation
   prototype](intel/llvm#12840) shows this isn't
   needed.
3. Forbid changing the local work size from user to impl defined and
   vice-versa. See discussion in [L0 implementation
PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the
[cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch)
extension.

Tested on Intel GPU/CPUs OpenCL implementations with the
[command-buffer emulation
layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Apr 5, 2024
Implement the API for updating the kernel commands in a command-buffer
defined by oneapi-src#1089 for
the OpenCL adapter.

However, the following changes to the UR kernel update API have been
made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation
   prototype](intel/llvm#12840) shows this isn't
   needed.
3. Forbid changing the local work size from user to impl defined and
   vice-versa. See discussion in [L0 implementation
PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the
[cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch)
extension.

Tested on Intel GPU/CPUs OpenCL implementations with the
[command-buffer emulation
layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
EwanC added a commit to Bensuo/unified-runtime that referenced this pull request Apr 22, 2024
Implement the API for updating the kernel commands in a command-buffer
defined by oneapi-src#1089 for
the OpenCL adapter.

However, the following changes to the UR kernel update API have been
made based on implementation experience:

1. Forbid updating the work-dim of the kernel, see KhronosGroup/OpenCL-Docs#1057
2. Remove struct fields to update exec info, after [DPC++ implementation
   prototype](intel/llvm#12840) shows this isn't
   needed.
3. Forbid changing the local work size from user to impl defined and
   vice-versa. See discussion in [L0 implementation
PR](oneapi-src#1353 (comment)).

This adapter implementation depends on support for the
[cl_khr_command_buffer_mutable_dispatch](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_Ext.html#cl_khr_command_buffer_mutable_dispatch)
extension.

Tested on Intel GPU/CPUs OpenCL implementations with the
[command-buffer emulation
layer](https://github.com/bashbaug/SimpleOpenCLSamples/tree/main/layers/10_cmdbufemu).

```bash
$ OPENCL_LAYERS=<path/to/SimpleOpenCLSamples/build/layers/10_cmdbufemu/libCmdBufEmu.so> ./bin/test-exp_command_buffer --platform="Intel(R) OpenCL Graphics"
```

DPC++ PR intel/llvm#12724
@againull againull deleted the againull/l0_adapter_update_cmd_buffer branch July 9, 2024 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
level-zero L0 adapter specific issues ready to merge Added to PR's which are ready to merge v0.9.x Include in the v0.9.x release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants