-
Notifications
You must be signed in to change notification settings - Fork 798
[SYCL][UR] Make v2 L0 the default (L0) adapter for BMG and newer #19333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm in general. But I think this patch should also consolidate building v1/v2 under the same CMake flag. When this is merged, both .so files will be required for the runtime to function properly on all systems.
This is needed so that we can enable V2 adapter by default on certain platforms: #19333 The reason is that we need to load both adapters (legacy and v2) to check the device version. However, loading v2 adapter causes L0 loader to emit logs for all API calls (if ZE_DEBUG=1 is set). Since the legacy adapter used different logic for printing API calls, this would result in printing the same logs twice. This patch fixes that.
Whenever we detect ANY device newer than BMG on the platform we will use V2, otherwise we will use V1. The default behavior can still be overwritten by setting SYCL_UR_USE_LEVEL_ZERO_V2=1 to use V2 adapter or SYCL_UR_USE_LEVEL_ZERO_V2=0 to use V1 adapter.
That's a good point, but I we'll need to modify the logic for UR conformance tests as they rely on the cmake var to decide which adapter to use. Ideally, we could specify that during runtime or just have the tests be run for all adapters (both legacy and v2). I think I would prefer to do that in a separate PR. |
Unless we just remove the option to select the adapter for testing entirely and just always use the default one for a given adapter. One problem with this though is that we still expect to run v2 on older hardware if there are multiple devices on the platform. |
This is needed so that we can enable V2 adapter by default on certain platforms: intel/llvm#19333 The reason is that we need to load both adapters (legacy and v2) to check the device version. However, loading v2 adapter causes L0 loader to emit logs for all API calls (if ZE_DEBUG=1 is set). Since the legacy adapter used different logic for printing API calls, this would result in printing the same logs twice. This patch fixes that.
I decided to do the adapter filtering on urAdapterGet. The logic now looks like this: The loader will iterate over all adapters and call urAdapterGet() on them. The adapters will return numAdapters, which will always be 1 for CUDA, HIP, OpenCL and NativeCPU and either 1 or 0 for L0. If we are on BMG or newer, the v2 adapter will return 1, otherwise, the legacy will return 1. @nrspruit could you take a look at changes in level_zero/adapter.cpp? I removed the PlatformsCache and now, I initialize all platforms during adapter init (in urAdapterGet). This is needed so that we can check the device id and decide whether we should use V1 or V2. |
Regression/reduction_resource_leak_dw.cpp started passing after my changes. I removed the XFAIL. As far as I can tell this is due to the adapter being initialized earlier, so the construction/destruction order changed and now the context is destroyed before the leak checker is triggered.
|
This is needed so that we can enable V2 adapter by default on certain platforms: intel/llvm#19333 The reason is that we need to load both adapters (legacy and v2) to check the device version. However, loading v2 adapter causes L0 loader to emit logs for all API calls (if ZE_DEBUG=1 is set). Since the legacy adapter used different logic for printing API calls, this would result in printing the same logs twice. This patch fixes that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, the split makes sense and the shared init looks clean, thanks for the changes!
@intel/sycl-graphs-reviewers could you please take a look? There are no graph-sepcific changes in this PR, but I do modify the logic for e2e tests and adapter loading logic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I assume this means that v2 functionality is on par with v1 on BMG or newer. Also, probably it makes sense to do some testing downstream to see if there are any issues, if it hasn't been done yet.
@againull Yes, we've already done testing on BMG and addressed most of the issues. We do have functional parity. |
The loader will iterate over all adapters and call urAdapterGet() on them. The adapters will return numAdapters, which will always be 1 for CUDA, HIP, OpenCL and NativeCPU and either 1 or 0 for L0. If we are on BMG or newer, the v2 adapter will return 1, otherwise, the legacy will return 1.
llvm-lit still respects sycl_devices: level_zero:gpu means legacy adapter and level_zero_v2:gpu means v2 adapter.
This can be changed in later PRs.