Update README.md #8

pbalcer · 2024-05-20T10:10:38Z

No description provided.

This patch adds a script for running compute-benchmarks (https://github.com/intel/compute-benchmarks/) and a corresponding GH Actions workflow that runs those benchmarks when prompted with a comment, like so:

/benchmarks-level-zero --env UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Additional arguments can be appended to the end of the line. After the build is finished, the results are posted as a comment. For now, this runs only a single scenario, api_overhead_benchmark_sycl with the SubmitKernel test, but coverage will expand over time.
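As a rough illustration of how a trigger comment like the one above could be turned into options, here is a minimal Python sketch. It is not the script from this patch: the function name `parse_benchmark_comment` and the returned dictionary layout are hypothetical, and only the `--env`, `--save`, and `--compare` flags are taken from the comments in this thread.

```python
import argparse
import shlex

# Trigger phrase as it appears in the PR comments.
TRIGGER = "/benchmarks-level-zero"


def parse_benchmark_comment(comment):
    """Return parsed options for a benchmark request, or None for other comments.

    Hypothetical helper; the real workflow may parse the comment differently.
    """
    tokens = shlex.split(comment.strip())
    if not tokens or tokens[0] != TRIGGER:
        return None

    parser = argparse.ArgumentParser(prog=TRIGGER, add_help=False)
    parser.add_argument("--env", action="append", default=[], metavar="KEY=VALUE")
    parser.add_argument("--save")
    parser.add_argument("--compare")
    # Unknown arguments are kept so they can be forwarded unchanged.
    opts, extra = parser.parse_known_args(tokens[1:])

    env = {}
    for item in opts.env:
        key, _, value = item.partition("=")
        env[key] = value

    return {"env": env, "save": opts.save, "compare": opts.compare, "extra": extra}
```

Calling it on `/benchmarks-level-zero --env UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1 --save baseline` would yield one environment variable and a save name of `baseline`; any comment that does not start with the trigger returns `None`.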

…ation functions" This reverts commit bbb04b6.

pbalcer · 2024-05-20T10:11:12Z

/benchmarks-level-zero --env UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1 --save baseline

github-actions · 2024-05-20T10:11:31Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157096738

github-actions · 2024-05-20T10:16:48Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157096738
Job status: success. Test status: success.

Benchmark Results

Chart

xychart-beta
title "api_overhead_benchmark_sycl (lower is better)"
x-axis ["Batched In Order", "Batched Out Of Order", "Immediate In Order", "Immediate Out Of Order"]
y-axis "mean execution time per 10 kernels (in μs)" 0 --> 100.0
bar [52.319, 34.515, 53.157, 34.171]

Comparison

Comparison data not found. No comparison performed.

Details

Batched In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),52.319,52.192,1.53%,51.105,81.569,[CPU],[us]
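Note that the result row carries eight comma-separated fields while the header names seven, so the trailing `[CPU]` and `[us]` tags are easiest to treat as extras. A minimal sketch of reading such a row in Python, assuming the field order shown in the header above (the function name `parse_result_row` and the output keys are hypothetical):

```python
import csv
import io


def parse_result_row(line):
    """Split one --csv --noHeaders result row into typed fields.

    Assumes the order TestCase,Mean,Median,StdDev,Min,Max followed by tags.
    """
    fields = next(csv.reader(io.StringIO(line)))
    name, mean, median, stddev, lowest, highest = fields[:6]
    return {
        "test_case": name,
        "mean": float(mean),
        "median": float(median),
        # StdDev is reported as a percentage string such as "1.53%".
        "stddev_pct": float(stddev.rstrip("%")),
        "min": float(lowest),
        "max": float(highest),
        # Whatever follows Max, e.g. ["[CPU]", "[us]"].
        "tags": fields[6:],
    }
```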

Batched Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),34.515,34.253,3.05%,32.832,55.529,[CPU],[us]

Immediate In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),53.157,52.438,3.31%,51.132,67.065,[CPU],[us]

Immediate Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),34.171,34.074,3.92%,32.980,150.076,[CPU],[us]

pbalcer · 2024-05-20T10:18:44Z

/benchmarks-level-zero --env UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1 --compare baseline

github-actions · 2024-05-20T10:19:03Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157189347

github-actions · 2024-05-20T10:24:26Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157189347
Job status: success. Test status: success.

Benchmark Results

Chart

xychart-beta
title "api_overhead_benchmark_sycl (lower is better)"
x-axis ["Batched In Order", "Batched Out Of Order", "Immediate In Order", "Immediate Out Of Order"]
y-axis "mean execution time per 10 kernels (in μs)" 0 --> 100.0
bar [49.456, 33.692, 27.009, 26.39]
line [52.319, 34.515, 53.157, 34.171]

Comparison

Comparison with previous data:

Batched In Order: -5.47%
Batched Out Of Order: -2.38%
Immediate In Order: -49.19%
Immediate Out Of Order: -22.77%
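The percentages above are consistent with a plain relative change of the current mean against the saved baseline mean. A minimal sketch that reproduces them from the `bar` (current) and `line` (baseline) values in the chart; the function names are hypothetical and the formula is inferred from the numbers, not taken from the script:

```python
def percent_change(current, baseline):
    """Relative change of current vs. baseline, in percent (negative is faster)."""
    return (current - baseline) / baseline * 100.0


def format_comparison(labels, current, baseline):
    """Build one 'Label: +x.xx%' line per scenario."""
    return [
        f"{label}: {percent_change(cur, base):+.2f}%"
        for label, cur, base in zip(labels, current, baseline)
    ]


labels = [
    "Batched In Order",
    "Batched Out Of Order",
    "Immediate In Order",
    "Immediate Out Of Order",
]
current = [49.456, 33.692, 27.009, 26.39]     # bar values from this run
baseline = [52.319, 34.515, 53.157, 34.171]   # line values from the saved baseline

for line in format_comparison(labels, current, baseline):
    print(line)
```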

Details

Batched In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),49.456,49.354,1.48%,48.032,61.412,[CPU],[us]

Batched Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),33.692,33.370,3.30%,32.019,51.200,[CPU],[us]

Immediate In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),27.009,25.353,12.38%,23.875,46.167,[CPU],[us]

Immediate Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_l0 --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),26.390,25.451,11.08%,22.776,38.362,[CPU],[us]

pbalcer · 2024-05-20T10:25:12Z

/benchmarks-level-zero --save baseline

github-actions · 2024-05-20T10:25:31Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157272368

github-actions · 2024-05-20T10:26:03Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157272368
Job status: cancelled. Test status: skipped.

pbalcer · 2024-05-20T10:26:54Z

/benchmarks-level-zero --save baseline

github-actions · 2024-05-20T10:27:06Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157294453

github-actions · 2024-05-20T10:32:28Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157294453
Job status: success. Test status: success.

Benchmark Results

Chart

xychart-beta
title "api_overhead_benchmark_sycl (lower is better)"
x-axis ["Batched In Order", "Batched Out Of Order", "Immediate In Order", "Immediate Out Of Order"]
y-axis "mean execution time per 10 kernels (in μs)" 0 --> 100.0
bar [26.188, 44.666, 30.889, 48.461]
line [26.188, 44.666, 30.889, 48.461]

Comparison

Comparison with previous data:

Batched In Order: +0.00%
Batched Out Of Order: +0.00%
Immediate In Order: +0.00%
Immediate Out Of Order: +0.00%

Details

Batched In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),26.188,24.938,13.79%,24.203,190.529,[CPU],[us]

Batched Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),44.666,44.762,5.45%,27.310,215.884,[CPU],[us]

Immediate In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),30.889,28.060,16.98%,26.813,64.287,[CPU],[us]

Immediate Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.461,48.397,1.26%,46.630,65.479,[CPU],[us]

pbalcer · 2024-05-20T10:33:22Z

/benchmarks-level-zero --env UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

github-actions · 2024-05-20T10:33:39Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157375928

github-actions · 2024-05-20T10:39:03Z

Compute Benchmarks L0 run:
https://github.com/pbalcer/unified-runtime/actions/runs/9157375928
Job status: success. Test status: success.

Benchmark Results

Chart

xychart-beta
title "api_overhead_benchmark_sycl (lower is better)"
x-axis ["Batched In Order", "Batched Out Of Order", "Immediate In Order", "Immediate Out Of Order"]
y-axis "mean execution time per 10 kernels (in μs)" 0 --> 100.0
bar [46.693, 44.905, 53.893, 49.958]
line [26.188, 44.666, 30.889, 48.461]

Comparison

Comparison with previous data:

Batched In Order: +78.30%
Batched Out Of Order: +0.54%
Immediate In Order: +74.47%
Immediate Out Of Order: +3.09%

Details

Batched In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),46.693,46.724,4.88%,28.176,210.150,[CPU],[us]

Batched Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=0
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),44.905,44.325,4.82%,43.274,211.954,[CPU],[us]

Immediate In Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),53.893,53.138,5.09%,51.210,112.757,[CPU],[us]

Immediate Out Of Order

Environment Variables:

UR_L0_USE_IMMEDIATE_COMMANDLISTS=1
UR_L0_IMMEDIATE_COMMANDLISTS_BATCH_EVENT_COMPLETIONS=1

Command:

/home/pmdk/actions-runner/_work/unified-runtime/unified-runtime/compute-benchmarks-build/bin//api_overhead_benchmark_sycl --test=SubmitKernel --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=10000 --Profiling=0 --NumKernels=10 --KernelExecTime=1 --csv --noHeaders

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),49.958,49.890,1.80%,48.124,101.896,[CPU],[us]

pbalcer added 5 commits May 16, 2024 15:34

Merge branch 'compute-benchmarks-workflow'

fca16fa

save and compare functionality

ac96ceb

Revert "[Bindless][Exp] Remove phMem argument from bindless image cre…

aa4659c

…ation functions" This reverts commit bbb04b6.

Update README.md

b83d86d

pbalcer mentioned this pull request May 20, 2024

add compute benchmark workflow for L0 oneapi-src/unified-runtime#1616

Merged

pbalcer force-pushed the main branch 2 times, most recently from efa690a to 138f7f9 Compare May 22, 2024 12:08

pbalcer force-pushed the main branch from 138f7f9 to 49b9899 Compare June 6, 2024 10:29

pbalcer force-pushed the main branch 6 times, most recently from 9eeeead to 62234f1 Compare July 26, 2024 13:20

pbalcer force-pushed the main branch 3 times, most recently from 6ca52a5 to 844c209 Compare July 29, 2024 11:21
