[SYCL] Make queue fill use native functions #12702

konradkusiak97 · 2024-02-13T10:33:54Z

This PR changes the queue.fill() implementation to use the native fill functions of each backend. It also unifies that implementation with the one for memset, since memset is just the special case of fill with an 8-bit pattern.
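To illustrate why memset can share the fill implementation, here is a minimal host-only sketch. The names `referenceFill` and `referenceMemset` are hypothetical and are not part of the SYCL runtime or the Unified Runtime; the sketch only shows the semantics, assuming a fill repeats a fixed-size pattern `count` times.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side reference, not the runtime's code:
// writes `count` consecutive copies of a `patternSize`-byte pattern to `dst`.
void referenceFill(void *dst, const void *pattern, std::size_t patternSize,
                   std::size_t count) {
  assert(patternSize > 0 && "a fill pattern must be at least one byte");
  auto *out = static_cast<unsigned char *>(dst);
  for (std::size_t i = 0; i < count; ++i)
    std::memcpy(out + i * patternSize, pattern, patternSize);
}

// memset expressed as a fill with a one-byte pattern.
void referenceMemset(void *dst, int value, std::size_t numBytes) {
  const unsigned char byte = static_cast<unsigned char>(value);
  referenceFill(dst, &byte, 1, numBytes);
}
```

Calling `referenceMemset(buf, 0xAB, n)` produces the same bytes as `std::memset(buf, 0xAB, n)`, which is the sense in which memset is a subset of fill.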

In the CUDA case, both memset and fill now call urEnqueueUSMFill, which, depending on the size of the fill pattern, calls cuMemsetD8Async, cuMemsetD16Async, cuMemsetD32Async or commonMemSetLargePattern. Before this patch, memset already took this path, but it always set patternSize to 1 byte beforehand, which resulted in a call to cuMemsetD8Async. The behaviour in other backends is analogous.
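The dispatch described above can be sketched as a plain C++ function. This is an illustration only: `selectFillPath` is a hypothetical helper that returns the name of the routine that would be chosen, it does not call the CUDA driver API, and the exact conditions used by the CUDA adapter are an assumption based on the 8-, 16- and 32-bit widths of the three memset functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the Unified Runtime: maps a fill pattern
// size in bytes to the name of the routine the description above says is used.
std::string selectFillPath(std::size_t patternSize) {
  assert(patternSize > 0 && "a fill pattern must be at least one byte");
  switch (patternSize) {
  case 1:
    return "cuMemsetD8Async"; // 8-bit pattern, also the memset case
  case 2:
    return "cuMemsetD16Async"; // 16-bit pattern
  case 4:
    return "cuMemsetD32Async"; // 32-bit pattern
  default:
    return "commonMemSetLargePattern"; // every other pattern size
  }
}
```

With this mapping, a memset (patternSize fixed to 1) always selects `cuMemsetD8Async`, while filling with a 4-byte value selects `cuMemsetD32Async`.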

Before this patch, the fill method just invoked a parallel_for to fill the memory with the pattern, which made the operation quite slow.

This PR depends on:

sycl/source/detail/memory_manager.hpp

konradkusiak97 · 2024-02-22T12:38:28Z

Looks like there is a bug in ROCm versions prior to 6.0.0 that causes hipMemset2D to behave incorrectly on host-pinned memory. I'm working on a workaround for this.

sycl/plugins/unified_runtime/CMakeLists.txt

EwanC

Graph related changes LGTM

konradkusiak97 · 2024-05-02T14:59:49Z

The patches in UR that this PR depends on are all merged now, so this PR is ready for review again. Friendly ping to @steffenlarsen and @intel/unified-runtime-reviewers

steffenlarsen

Changes LGTM!

kbenzie

UR LGTM

konradkusiak97 · 2024-05-02T20:30:02Z

@intel/llvm-gatekeepers this is ready to be merged

steffenlarsen · 2024-05-03T12:37:48Z

@konradkusiak97 - It looks like this may have caused https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/out_of_order_queue_status.cpp to fail on Gen12. Could you please have a look?

konradkusiak97 · 2024-05-03T14:16:26Z

> @konradkusiak97 - It looks like this may have caused https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/out_of_order_queue_status.cpp to fail on Gen12. Could you please have a look?

I can't reproduce this locally, even with a debug build but I can see the post-commit failures, will investigate further.

aelovikov-intel · 2024-05-06T22:50:35Z

> > @konradkusiak97 - It looks like this may have caused https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/out_of_order_queue_status.cpp to fail on Gen12. Could you please have a look?
>
> I can't reproduce this locally, even with a debug build but I can see the post-commit failures, will investigate further.

Any updates on this? Should we just revert this PR?

konradkusiak97 · 2024-05-06T22:54:24Z

> > > @konradkusiak97 - It looks like this may have caused https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/out_of_order_queue_status.cpp to fail on Gen12. Could you please have a look?
> >
> > I can't reproduce this locally, even with a debug build but I can see the post-commit failures, will investigate further.
>
> Any updates on this? Should we just revert this PR?

I'm still working on the fix so let's revert it for now.

konradkusiak97 requested review from a team as code owners February 13, 2024 10:33

konradkusiak97 requested a review from steffenlarsen February 13, 2024 10:33

konradkusiak97 marked this pull request as draft February 14, 2024 11:01

steffenlarsen reviewed Feb 14, 2024

sycl/source/detail/memory_manager.hpp (resolved)

konradkusiak97 mentioned this pull request Feb 28, 2024

[HIP] Implement workaround for hipMemset2D oneapi-src/unified-runtime#1395

Merged

konradkusiak97 mentioned this pull request Mar 5, 2024

[L0][OpenCL] Emulate Fill with copy when patternSize is not a power of 2 oneapi-src/unified-runtime#1412

Merged

konradkusiak97 commented Mar 5, 2024

sycl/plugins/unified_runtime/CMakeLists.txt (outdated, resolved)

konradkusiak97 marked this pull request as ready for review March 5, 2024 18:37

konradkusiak97 added 2 commits April 3, 2024 10:26

Updated graph usm fill tests and Command Graph docs

d98d82e

Merge branch 'sycl' into improvedQueueFill

ea6a3b4

konradkusiak97 mentioned this pull request Apr 3, 2024

[NATIVECPU] Extended usm fill to bigger patterns than 1 byte oneapi-src/unified-runtime#1489

Merged

EwanC approved these changes Apr 3, 2024

konradkusiak97 mentioned this pull request Apr 5, 2024

[SYCL][COMPAT] Memset API updated to support 2-byte and 4-byte memsets #11340

Closed

Merge branch 'sycl' into improvedQueueFill

ef43753

Merge branch 'sycl' into improvedQueueFill

c832525

Merge branch 'sycl' into improvedQueueFill

37e4cf9

steffenlarsen approved these changes May 2, 2024

kbenzie approved these changes May 2, 2024

sarnex merged commit 46e49ec into intel:sycl May 2, 2024
12 checks passed

aelovikov-intel added a commit that referenced this pull request May 6, 2024

Revert "[SYCL] Make queue fill use native functions (#12702)"

123fc50

This reverts commit 46e49ec.

aelovikov-intel mentioned this pull request May 6, 2024

Revert "[SYCL] Make queue fill use native functions" #13675

Merged

aelovikov-intel added a commit that referenced this pull request May 7, 2024

Revert "[SYCL] Make queue fill use native functions" (#13675)

ae90070

Reverts #12702. See #12702 (comment).

This was referenced May 10, 2024

[SYCL] Fix post-commit failure #13657

Closed

Q.fill() improvements fail on gen12 #13787

Closed

[SYCL][ABI-Break] Improve Queue fill #13788

Merged
