initial test plan for cl_khr_unified_svm #2150

193 changes: 193 additions & 0 deletions cl_khr_unified_svm-test-plan.md
@@ -0,0 +1,193 @@
# cl_khr_unified_svm

This document describes the test plan for the [`cl_khr_unified_svm`](https://github.com/KhronosGroup/OpenCL-Docs/pull/1282) extension, which may currently be found here:

https://github.com/KhronosGroup/OpenCL-Docs/pull/1282

## Prior Work

Existing OpenCL 2.0 SVM CTS tests may be found here:

https://github.com/KhronosGroup/OpenCL-CTS/tree/main/test_conformance/SVM

Some prior tests for the Intel USM extension [`cl_intel_unified_shared_memory`](https://registry.khronos.org/OpenCL/extensions/intel/cl_intel_unified_shared_memory.html) may be found here:

https://github.com/intel/compute-samples/tree/master/compute_samples/tests/test_cl_unified_shared_memory

## Test Plan

### Consistency Check

As an initial test, perform a consistency check to ensure that the platform and the test device enumerate a consistent set of SVM capabilities:

* [ ] For each device in the platform, check that the platform and device report the same number of SVM capability combinations.
* [ ] For each SVM capability combination reported by the platform, check that the reported platform capabilities at an index are the intersection of all non-zero device capabilities at the same index.
* [ ] For each SVM capability combination reported by the test device, check that the device SVM capabilities are either a super-set of the platform SVM capabilities or are zero, indicating that this SVM type is not supported.
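
The three checks above reduce to bitmask comparisons once the capability lists have been queried. The following is a minimal sketch of that logic only, assuming the lists are already available as plain integer vectors. `SvmCaps` and `svmCapsConsistent` are hypothetical names, not part of the extension, and skipping the intersection check when no device supports a type is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the extension's capability bitfield type.
using SvmCaps = std::uint64_t;

// Returns true when the platform capability list is consistent with the
// per-device lists. `testDevice` indexes into `devices`.
bool svmCapsConsistent(const std::vector<SvmCaps>& platform,
                       const std::vector<std::vector<SvmCaps>>& devices,
                       std::size_t testDevice)
{
    if (testDevice >= devices.size()) return false;

    // Check 1: every device reports as many combinations as the platform.
    for (const auto& dev : devices)
        if (dev.size() != platform.size()) return false;

    for (std::size_t i = 0; i < platform.size(); ++i) {
        // Check 2: platform caps equal the intersection of all non-zero
        // device caps at the same index.
        SvmCaps intersection = ~SvmCaps{0};
        bool anySupported = false;
        for (const auto& dev : devices) {
            if (dev[i] != 0) {
                intersection &= dev[i];
                anySupported = true;
            }
        }
        // Assumption: if no device supports this type, there is nothing to
        // intersect, so the platform value is not constrained here.
        if (anySupported && platform[i] != intersection) return false;

        // Check 3: the test device is either a super-set of the platform
        // caps or reports zero (type not supported).
        const SvmCaps caps = devices[testDevice][i];
        if (caps != 0 && (caps & platform[i]) != platform[i]) return false;
    }
    return true;
}
```

In a real test the vectors would be filled from the platform and device queries defined by the extension; the comparison logic stays the same.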

### Testing SVM Capabilities

* [ ] `CL_SVM_CAPABILITY_SINGLE_ADDRESS_SPACE_KHR`
* Testing options:
1. Pass a pointer-to-a-pointer as a kernel argument.
Read the pointer from the kernel argument and write to it.
Ensure the correct value was written on the host.
2. Pass the pointer as a kernel argument.
Write the kernel argument to another allocation, which could even be an OpenCL buffer memory object.
Ensure the value written on the device matches the value on the host.
* [ ] `CL_SVM_CAPABILITY_SYSTEM_ALLOCATED_KHR`
* [ ] When allocating memory to test, use the system `malloc` rather than `clSVMAllocWithPropertiesKHR`.
* [ ] `CL_SVM_CAPABILITY_DEVICE_OWNED_KHR`
* TBD
* [ ] `CL_SVM_CAPABILITY_DEVICE_UNASSOCIATED_KHR`
* [ ] When allocating memory to test, do not pass the test device via a `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` property.
* [ ] Include at least one targeted test that passes the `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` property anyhow.
* [ ] `CL_SVM_CAPABILITY_CONTEXT_ACCESS_KHR`
* TBD: Create a multi-device context, use the allocation on all of the devices in the context?
* [ ] `CL_SVM_CAPABILITY_HOST_OWNED_KHR`
* TBD
* [ ] `CL_SVM_CAPABILITY_HOST_READ_KHR`
* [ ] When verifying test results, read from the allocation directly on the host, without mapping or copying explicitly.
* [ ] For devices that also support `CL_SVM_CAPABILITY_DEVICE_WRITE_KHR`, also include a targeted test that writes on the device and reads the results on the host without mapping or copying explicitly.
* [ ] `CL_SVM_CAPABILITY_HOST_WRITE_KHR`
* [ ] When initializing test data, write to the allocation directly from the host, without mapping or copying explicitly.
* [ ] For devices that also support `CL_SVM_CAPABILITY_DEVICE_READ_KHR`, also include a targeted test that writes on the host without mapping or copying explicitly, then reads the results on the device and writes them to an OpenCL buffer memory object.
* [ ] `CL_SVM_CAPABILITY_HOST_MAP_KHR`
* [ ] When initializing test data or verifying test results, map the allocation for access from the host, rather than copying explicitly.
* [ ] For devices that also support `CL_SVM_CAPABILITY_DEVICE_WRITE_KHR`, also include a targeted test that writes on the device and reads the results on the host by mapping.
* [ ] For devices that also support `CL_SVM_CAPABILITY_DEVICE_READ_KHR`, also include a targeted test that writes on the host by mapping, then reads the results on the device and writes them to an OpenCL buffer memory object.
* [ ] `CL_SVM_CAPABILITY_DEVICE_READ_KHR`
* [ ] Populate an allocation via direct access from the host, via mapping, or via device memcpy, depending on supported capabilities.
Then, read the value on the device and write it to an OpenCL buffer memory object.
* Mechanisms to read from the allocation on the device are:
* [ ] Via a kernel that reads from the allocation as a kernel argument.
* [ ] Via `clEnqueueSVMMemcpy`.
* [ ] `CL_SVM_CAPABILITY_DEVICE_WRITE_KHR`
* [ ] Populate an OpenCL buffer memory object with values.
Read the values from the OpenCL buffer memory object on the device and write them to the memory allocation.
Verify that the values were written correctly via direct access from the host, via mapping, or via memcpy, depending on supported capabilities.
* Mechanisms to write to the allocation on the device are:
* [ ] Via a kernel that writes to the allocation as a kernel argument.
* [ ] Via `clEnqueueSVMMemcpy`.
* [ ] Via `clEnqueueSVMMemFill`.
* [ ] `CL_SVM_CAPABILITY_DEVICE_ATOMIC_ACCESS_KHR`
* [ ] Initialize a memory allocation with zero.
Atomically increment the memory allocation from the device.
Verify that the correct updates were made via direct access from the host, via mapping, or via memcpy, depending on supported capabilities.
* [ ] `CL_SVM_CAPABILITY_CONCURRENT_ACCESS_KHR`
* Note, as described, these tests will require `CL_SVM_CAPABILITY_HOST_READ_KHR`, `CL_SVM_CAPABILITY_HOST_WRITE_KHR`, `CL_SVM_CAPABILITY_DEVICE_READ_KHR`, and `CL_SVM_CAPABILITY_DEVICE_WRITE_KHR` also.
* Details TBD, but the rough idea will be:
1. Allocate a small amount of memory, some of which will be accessed via the host (or eventually, another device?), and some of which will be accessed via the device.
2. From the host, perform some number of read-modify-write accesses to the memory.
If just one host thread is accessing memory, these read-modify-write accesses may be done non-atomically.
3. From the device, perform some number of read-modify-write accesses to the memory.
If just one work-item is accessing each memory location (but multiple work-items are accessing the allocation) these accesses may also be done non-atomically.
4. Verify that the correct updates were made via direct access from the host, via mapping, or via memcpy, depending on supported capabilities.
* May want to test multiple iterations, or to take other steps to increase the likelihood that accesses are made concurrently.
* Are there any best practices we can borrow from fine-grain SVM testing?
* TODO: See document issue 1, regarding the minimum supported granularity for concurrent access.
* [ ] `CL_SVM_CAPABILITY_CONCURRENT_ATOMIC_ACCESS_KHR`
* Note, as described, these tests will also require `CL_SVM_CAPABILITY_HOST_READ_KHR` and `CL_SVM_CAPABILITY_HOST_WRITE_KHR`.
* [ ] Initialize a memory allocation with zero.
Atomically increment the memory allocation from the device and the host (relaxed atomics, all SVM devices scope).
Verify that the correct updates were made via direct access from the host, via mapping, or via memcpy, depending on supported capabilities.
* Note, requires `CL_DEVICE_ATOMIC_ORDER_RELAXED` and `CL_DEVICE_ATOMIC_SCOPE_ALL_DEVICES` capabilities to be included in `CL_DEVICE_ATOMIC_MEMORY_CAPABILITIES`.
* [ ] Initialize a memory allocation.
Write values non-atomically to part of the allocation from the device, then write a flag to memory using a store-release atomic with all SVM devices scope.
On the host, poll the flag value using a load-acquire atomic.
When the updated flag value is seen, read and verify the values that were written non-atomically.
Write updated values non-atomically to part of the allocation from the host, then write a flag to memory using a store-release atomic.
On the device, poll for the updated flag value using a load-acquire atomic with all SVM devices scope.
When the updated flag value is seen, read and verify the values that were written non-atomically.
Repeat as needed.
* May want to test multiple iterations, or to take other steps to increase the likelihood that accesses are made concurrently.
* Are there any best practices we can borrow from fine-grain SVM testing?
* [ ] `CL_SVM_CAPABILITY_INDIRECT_ACCESS_KHR`
* [ ] For devices that support `CL_SVM_CAPABILITY_DEVICE_READ_KHR`, initialize a memory allocation with a known value.
On the host, embed the pointer to the allocation into an OpenCL buffer memory object.
On the device, read the pointer out of the OpenCL buffer memory object, then read a value from the pointer.
Store the value read to another OpenCL buffer memory object.
Back on the host, verify the known value was read.
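
Several items above say to initialize or verify data "via direct access from the host, via mapping, or via memcpy, depending on supported capabilities". One way a shared test helper might pick the mechanism is sketched below. The bit values are placeholders chosen for illustration and are not the values from the extension header, and `HostAccess`, `chooseVerifyPath`, and `chooseInitPath` are hypothetical names. The preference order (direct, then map, then memcpy) is also an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-in for the extension's capability bitfield type.
using SvmCaps = std::uint64_t;

// Placeholder bits for illustration only. A real test would use the
// CL_SVM_CAPABILITY_*_KHR values from the extension header instead.
constexpr SvmCaps kHostRead    = SvmCaps{1} << 0;
constexpr SvmCaps kHostWrite   = SvmCaps{1} << 1;
constexpr SvmCaps kHostMap     = SvmCaps{1} << 2;
constexpr SvmCaps kDeviceRead  = SvmCaps{1} << 3;
constexpr SvmCaps kDeviceWrite = SvmCaps{1} << 4;

enum class HostAccess { Direct, Map, Memcpy, Unavailable };

// Picks how the host should read results back from an allocation.
// A memcpy out of the allocation reads it on the device, so that path
// needs the device-read capability.
HostAccess chooseVerifyPath(SvmCaps caps)
{
    if (caps & kHostRead) return HostAccess::Direct;
    if (caps & kHostMap) return HostAccess::Map;
    if (caps & kDeviceRead) return HostAccess::Memcpy;
    return HostAccess::Unavailable;
}

// Picks how the host should populate an allocation with test data.
// A memcpy into the allocation writes it on the device, so that path
// needs the device-write capability.
HostAccess chooseInitPath(SvmCaps caps)
{
    if (caps & kHostWrite) return HostAccess::Direct;
    if (caps & kHostMap) return HostAccess::Map;
    if (caps & kDeviceWrite) return HostAccess::Memcpy;
    return HostAccess::Unavailable;
}
```

Targeted tests for a specific capability, such as the `CL_SVM_CAPABILITY_HOST_MAP_KHR` items, would bypass this helper and force the mechanism under test.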

### Testing New SVM APIs

* [ ] `clSVMAllocWithPropertiesKHR`
* [ ] Test without a `CL_SVM_ALLOC_ALIGNMENT_KHR` property.
* [ ] Test without a `CL_SVM_ALLOC_ACCESS_FLAGS_KHR` property.
* [ ] For SVM types supporting `CL_SVM_CAPABILITY_DEVICE_UNASSOCIATED_KHR`, test without a `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` device.
* [ ] Test varying values of the `CL_SVM_ALLOC_ALIGNMENT_KHR` property - all powers of two from 1 to 128 inclusive?
* [ ] Test varying values of the `CL_SVM_ALLOC_ACCESS_FLAGS_KHR` property - all combinations?
* [ ] Include at least one test with all properties: `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` plus `CL_SVM_ALLOC_ALIGNMENT_KHR` plus `CL_SVM_ALLOC_ACCESS_FLAGS_KHR`.
* [ ] TODO: Test zero-byte allocation?
* [ ] `clSVMFreeWithPropertiesKHR`
* TBD - depends on blocking free behavior.
* [ ] `clGetSVMPointerInfoKHR`
* After allocating, perform each of the queries, both with and without an explicit `device` parameter, for the base pointer returned by `clSVMAllocWithPropertiesKHR` and for a pointer computed from the base pointer.
* [ ] `CL_SVM_INFO_TYPE_INDEX_KHR` - must match the index passed during allocation.
* [ ] `CL_SVM_INFO_CAPABILITIES_KHR` - must match the device capabilities for the explicit `device` parameter, or be a super-set of the platform capabilities otherwise.
* [ ] `CL_SVM_INFO_PROPERTIES_KHR` - must match the properties passed during allocation, unless the properties during allocation were `NULL`.
* [ ] `CL_SVM_INFO_ACCESS_FLAGS_KHR` - must match the access flags passed during allocation, or be zero.
* [ ] `CL_SVM_INFO_BASE_PTR_KHR` - must match the base of the allocation.
* [ ] `CL_SVM_INFO_SIZE_KHR` - must match the size passed during allocation.
* [ ] `CL_SVM_INFO_ASSOCIATED_DEVICE_HANDLE_KHR` - must match the associated device, or be `NULL`.
* Test each of the queries for a bogus pointer (both with and without an explicit `device` parameter?).
* [ ] `CL_SVM_INFO_TYPE_INDEX_KHR` - must return `CL_UINT_MAX`.
* [ ] `CL_SVM_INFO_CAPABILITIES_KHR` - must return `0`.
* [ ] `CL_SVM_INFO_PROPERTIES_KHR` - must return size equal to `0`.
* [ ] `CL_SVM_INFO_ACCESS_FLAGS_KHR` - must return `0`? See doc issue.
* [ ] `CL_SVM_INFO_BASE_PTR_KHR` - must return `NULL`.
* [ ] `CL_SVM_INFO_SIZE_KHR` - must return `0`.
* [ ] `CL_SVM_INFO_ASSOCIATED_DEVICE_HANDLE_KHR` - must return `NULL`.
* [ ] `clGetSVMSuggestedTypeIndexKHR`
* [ ] Pass each of the supported device capabilities as `required_capabilities` and verify that the capabilities at `suggested_type_index` satisfy the required capabilities.
* [ ] Test without a `CL_SVM_ALLOC_ALIGNMENT_KHR` property.
* [ ] Test without a `CL_SVM_ALLOC_ACCESS_FLAGS_KHR` property.
* [ ] For SVM types supporting `CL_SVM_CAPABILITY_DEVICE_UNASSOCIATED_KHR`, test without a `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` device.
* [ ] Test varying values of the `CL_SVM_ALLOC_ALIGNMENT_KHR` property - all powers of two from 1 to 128?
* [ ] Test varying values of the `CL_SVM_ALLOC_ACCESS_FLAGS_KHR` property - all combinations?
* [ ] Include at least one test with all properties: `CL_SVM_ALLOC_ASSOCIATED_DEVICE_HANDLE_KHR` plus `CL_SVM_ALLOC_ALIGNMENT_KHR` plus `CL_SVM_ALLOC_ACCESS_FLAGS_KHR`.
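
The property variations listed for `clSVMAllocWithPropertiesKHR` and `clGetSVMSuggestedTypeIndexKHR` could share one generator for zero-terminated property lists. The sketch below is one possible shape, assuming properties are key/value pairs terminated by a zero. The key constants are placeholders rather than the real `CL_SVM_ALLOC_*_KHR` values, and `Property` and `buildPropertyLists` are hypothetical names. It uses a single access-flags value rather than all flag combinations, since that item is still an open question in the plan.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the extension's property type.
using Property = std::uint64_t;

// Placeholder keys for illustration only. A real test would use the
// CL_SVM_ALLOC_*_KHR values from the extension header instead.
constexpr Property kPropDevice      = 1;
constexpr Property kPropAlignment   = 2;
constexpr Property kPropAccessFlags = 3;

// Builds every property list in the matrix: device handle present or absent,
// access flags present or absent, and alignment absent or a power of two
// from 1 to 128 inclusive. Lists without a device handle are generated only
// when `allowNoDevice` is true (the SVM type supports being unassociated).
std::vector<std::vector<Property>> buildPropertyLists(Property deviceHandle,
                                                      Property accessFlags,
                                                      bool allowNoDevice)
{
    std::vector<std::vector<Property>> lists;
    for (int withDevice = allowNoDevice ? 0 : 1; withDevice <= 1; ++withDevice) {
        for (int withFlags = 0; withFlags <= 1; ++withFlags) {
            // align == 0 means "omit the alignment property".
            for (Property align = 0; align <= 128; align = align ? align * 2 : 1) {
                std::vector<Property> props;
                if (withDevice) {
                    props.push_back(kPropDevice);
                    props.push_back(deviceHandle);
                }
                if (align) {
                    props.push_back(kPropAlignment);
                    props.push_back(align);
                }
                if (withFlags) {
                    props.push_back(kPropAccessFlags);
                    props.push_back(accessFlags);
                }
                props.push_back(0); // list terminator
                lists.push_back(props);
            }
        }
    }
    return lists;
}
```

With `allowNoDevice` set, the last list generated carries all three properties, which covers the "at least one test with all properties" item.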

### Testing Existing SVM APIs

* [ ] `clSetKernelExecInfo(CL_KERNEL_EXEC_INFO_SVM_PTRS)`
* [ ] Follow similar methodology as `CL_SVM_CAPABILITY_INDIRECT_ACCESS_KHR`, except set the indirectly accessed allocation explicitly.
* [ ] Test a pointer offset from the base pointer and verify that the entire allocation may be accessed indirectly.
* [ ] TODO: Include a bogus pointer and verify that this is not an error?
* [ ] `clSetKernelArgSVMPointer`
* Generally do not need a targeted test, because this API will be exercised by any tests using SVM pointers in kernels.
* [ ] Ensure at least one test for each SVM type passes a pointer offset from the base pointer and accesses it in a kernel.
* [ ] TODO: Pass a bogus pointer, execute the kernel, and verify that no error occurs as long as the pointer is not dereferenced?
* [ ] `clEnqueueSVMFree` (see: `test_enqueue_api.cpp`)
* [ ] Include an event on the command and verify that the event type is `CL_COMMAND_SVM_FREE`.
* [ ] Allocate memory for each type and verify that it can be freed asynchronously.
* [ ] `clEnqueueSVMMemcpy` (see: `test_enqueue_api.cpp`)
* [ ] Include an event on the command and verify that the event type is `CL_COMMAND_SVM_MEMCPY`.
* [ ] Test all combinations of SVM pointer and host pointer sources and destinations.
* [ ] Test a pointer offset from the base pointer as a memcpy source and destination.
* [ ] TODO: Check document issue 10 and consider copying to other devices or contexts.
* [ ] TODO: Pass a `NULL` pointer and `size` equal to zero and verify that this is not an error?
* [ ] TODO: Pass a bogus non-`NULL` pointer and `size` equal to zero and verify that this is not an error?
* [ ] `clEnqueueSVMMemFill` (see: `test_enqueue_api.cpp`)
* [ ] Include an event on the command and verify that the event type is `CL_COMMAND_SVM_MEMFILL`.
* [ ] Test multiple fill pattern sizes - all powers of two from 1 to 128 inclusive?
* [ ] Test a pointer offset from the base pointer as a fill destination.
* [ ] TODO: Check document issue 9 and consider filling allocations for other devices or contexts.
* [ ] TODO: Pass a `NULL` pointer and `size` equal to zero and verify that this is not an error?
* [ ] TODO: Pass a bogus non-`NULL` pointer and `size` equal to zero and verify that this is not an error?
* [ ] `clEnqueueSVMMap` / `clEnqueueSVMUnmap`
* [ ] Include an event on the command and verify that the event type is `CL_COMMAND_SVM_MAP` / `CL_COMMAND_SVM_UNMAP`.
* [ ] Test all combinations of map flags: `CL_MAP_READ`, `CL_MAP_WRITE`, `CL_MAP_WRITE_INVALIDATE_REGION`, `CL_MAP_READ | CL_MAP_WRITE`.
* [ ] `clEnqueueSVMMigrateMem`
* [ ] Include an event on the command and verify that the event type is `CL_COMMAND_SVM_MIGRATE_MEM`.
* [ ] Test all combinations of migration flags: `0`, `CL_MIGRATE_MEM_OBJECT_HOST`, `CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED`, `CL_MIGRATE_MEM_OBJECT_HOST | CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED`.
* [ ] Migrate an entire SVM allocation.
* [ ] Migrate a subset of an SVM allocation, starting from the base pointer.
* [ ] Migrate a subset of an SVM allocation, starting from a pointer offset from the base pointer.
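
The `clEnqueueSVMMigrateMem` items combine four flag settings with three ranges of the allocation. A small table generator, sketched below, is one way to enumerate them. The two flag constants mirror the `CL_MIGRATE_MEM_OBJECT_HOST` and `CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED` values from `cl.h`; a real test would include that header rather than redefine them. `MigrateCase` and `buildMigrateCases` are hypothetical names, and the half and quarter split points are arbitrary choices for this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirror the cl.h migration flag bits so the sketch compiles without OpenCL.
constexpr std::uint64_t kMigrateHost             = 1u << 0;
constexpr std::uint64_t kMigrateContentUndefined = 1u << 1;

// One migration request: flags plus a byte range within the allocation.
struct MigrateCase {
    std::uint64_t flags;
    std::size_t offset;
    std::size_t size;
};

// Enumerates flag and range combinations for an allocation of `allocSize`
// bytes. Assumes allocSize >= 4 so that every range is non-empty.
std::vector<MigrateCase> buildMigrateCases(std::size_t allocSize)
{
    const std::size_t half = allocSize / 2;
    const std::size_t quarter = allocSize / 4;
    const std::uint64_t flagSets[] = {
        0,
        kMigrateHost,
        kMigrateContentUndefined,
        kMigrateHost | kMigrateContentUndefined,
    };

    std::vector<MigrateCase> cases;
    for (std::uint64_t flags : flagSets) {
        cases.push_back({flags, 0, allocSize});  // entire allocation
        cases.push_back({flags, 0, half});       // subset from the base pointer
        cases.push_back({flags, quarter, half}); // subset from an offset pointer
    }
    return cases;
}
```

Each case would then translate into one enqueue call with the pointer `base + offset` and the given size, followed by the event type check.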

### Non-Conventional Uses

Depending on how these issues are resolved, they may create additional test items:

* [ ] TODO: `clCreateBuffer(CL_MEM_USE_HOST_PTR)`: Check document issue 20.
* [ ] TODO: `clCreateBuffer(CL_MEM_COPY_HOST_PTR)`: Check document issue 21.
* [ ] TODO: `clEnqueueReadBuffer` and `clEnqueueWriteBuffer` sources and destinations: Check document issue 22.
* [ ] TODO: `clEnqueueSVMMemFill`, etc. patterns: Check document issue 23.