Add b200 policies for device.select.if,flagged,unique #3545

Merged

bernhardmgruber
Contributor

No description provided.

Contributor

🟨 CI finished in 6h 14m: Pass: 98%/90 | Total: 2d 14h | Avg: 41m 52s | Max: 1h 13m | Hits: 293%/10928
  • 🟨 thrust: Pass: 97%/43 | Total: 23h 27m | Avg: 32m 44s | Max: 1h 02m | Hits: 261%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total: 22h 31m | Avg: 32m 57s | Max:  1h 02m | Hits: 261%/7376  
      🟩 arm64              Pass: 100%/2   | Total: 56m 50s | Avg: 28m 25s | Max: 30m 02s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 10m | Avg: 38m 04s | Max: 59m 06s | Hits: 260%/1844  
      🟩 12.5               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 53s | Max: 55m 59s
      🔍 12.6               Pass:  97%/36  | Total: 18h 29m | Avg: 30m 49s | Max:  1h 02m | Hits: 261%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 53m 26s | Avg: 26m 43s | Max: 28m 20s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 10m | Avg: 38m 04s | Max: 59m 06s | Hits: 260%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 47m | Avg: 53m 53s | Max: 55m 59s
      🔍 nvcc12.6           Pass:  97%/34  | Total: 17h 36m | Avg: 31m 04s | Max:  1h 02m | Hits: 261%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 53m 26s | Avg: 26m 43s | Max: 28m 20s
      🔍 nvcc               Pass:  97%/41  | Total: 22h 34m | Avg: 33m 02s | Max:  1h 02m | Hits: 261%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 04m | Avg: 31m 12s | Max: 33m 15s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 08s | Max: 31m 33s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 38s | Max: 33m 05s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 25s | Max: 32m 20s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 54m | Avg: 24m 56s | Max: 31m 57s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 59s | Max: 33m 42s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 05s | Avg: 32m 05s | Max: 32m 05s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 49s | Max: 33m 17s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 41s | Max: 30m 42s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 53s | Max: 34m 18s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 29s | Max: 33m 42s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 05m | Avg: 23m 08s | Max: 35m 51s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 58m | Avg: 59m 06s | Max: 59m 07s | Hits: 261%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  2h 33m | Avg: 51m 03s | Max:  1h 02m | Hits: 260%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 53s | Max: 55m 59s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  8h 07m | Avg: 28m 41s | Max: 33m 15s
      🟩 GCC                Pass: 100%/19  | Total:  9h 00m | Avg: 28m 28s | Max: 35m 51s
      🔍 MSVC               Pass:  80%/5   | Total:  4h 31m | Avg: 54m 16s | Max:  1h 02m | Hits: 261%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 53s | Max: 55m 59s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total: 21h 38m | Avg: 35m 06s | Max:  1h 02m | Hits: 261%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 51m 30s | Avg: 17m 10s | Max: 35m 19s
      🟩 TestGPU            Pass: 100%/3   | Total: 57m 31s | Avg: 19m 10s | Max: 25m 04s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 12h 17m | Avg: 36m 52s | Max: 59m 07s | Hits: 261%/5532  
      🔍 20                 Pass:  95%/21  | Total: 10h 31m | Avg: 30m 04s | Max:  1h 02m | Hits: 260%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total: 23h 27m | Avg: 32m 44s | Max:  1h 02m | Hits: 261%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 38m 50s | Avg: 19m 25s | Max: 26m 56s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 18m 33s | Avg: 18m 33s | Max: 18m 33s
    
  • 🟩 cub: Pass: 100%/44 | Total: 1d 14h | Avg: 52m 22s | Max: 1h 13m | Hits: 361%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 12h | Avg: 52m 01s | Max:  1h 13m | Hits: 361%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 31s | Max:  1h 00m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 16m | Avg:  1h 03m | Max:  1h 13m | Hits: 361%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 06m
      🟩 12.6               Pass: 100%/37  | Total:  1d 06h | Avg: 50m 06s | Max:  1h 12m | Hits: 361%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 16m | Avg:  1h 03m | Max:  1h 13m | Hits: 361%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 06m
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1d 04h | Avg: 49m 32s | Max:  1h 12m | Hits: 361%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 12h | Avg: 51m 59s | Max:  1h 13m | Hits: 361%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 43m | Avg: 55m 59s | Max:  1h 00m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 56m | Avg: 58m 14s | Max:  1h 00m
      🟩 Clang16            Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 08m
      🟩 Clang17            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 03s | Max: 56m 40s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 40m | Avg: 48m 40s | Max:  1h 01m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 59m | Avg: 59m 47s | Max: 59m 57s
      🟩 GCC8               Pass: 100%/1   | Total: 56m 24s | Avg: 56m 24s | Max: 56m 24s
      🟩 GCC9               Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 13m
      🟩 GCC10              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 02s | Max: 57m 27s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 58m | Avg: 59m 10s | Max:  1h 01m
      🟩 GCC12              Pass: 100%/4   | Total:  2h 38m | Avg: 39m 31s | Max: 57m 00s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 44m | Avg: 35m 36s | Max: 58m 57s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m | Hits: 361%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 23m | Avg:  1h 11m | Max:  1h 12m | Hits: 361%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 06m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 16m | Avg: 53m 54s | Max:  1h 08m
      🟩 GCC                Pass: 100%/21  | Total: 16h 17m | Avg: 46m 32s | Max:  1h 13m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 37m | Avg:  1h 09m | Max:  1h 12m | Hits: 361%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 06m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 44m 45s | Avg: 22m 22s | Max: 25m 15s
      🟩 v100               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 47s | Max:  1h 13m | Hits: 361%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 58m 08s | Max:  1h 13m | Hits: 361%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 22m 38s | Avg: 22m 38s | Max: 22m 38s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 38s | Avg: 14m 38s | Max: 14m 38s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 01m | Avg: 20m 35s | Max: 21m 14s
      🟩 TestGPU            Pass: 100%/2   | Total: 54m 08s | Avg: 27m 04s | Max: 28m 32s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 44m 45s | Avg: 22m 22s | Max: 25m 15s
      🟩 90a                Pass: 100%/1   | Total: 25m 52s | Avg: 25m 52s | Max: 25m 52s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 07m | Avg:  1h 00m | Max:  1h 13m | Hits: 362%/2664  
      🟩 20                 Pass: 100%/24  | Total: 18h 16m | Avg: 45m 42s | Max:  1h 10m | Hits: 360%/888   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 18s | Avg: 4m 39s | Max: 7m 15s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 18s | Avg:  4m 39s | Max:  7m 15s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 03s | Avg:  2m 03s | Max:  2m 03s
      🟩 Test               Pass: 100%/1   | Total:  7m 15s | Avg:  7m 15s | Max:  7m 15s
    
  • 🟩 python: Pass: 100%/1 | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 46m 59s | Avg: 46m 59s | Max: 46m 59s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

bernhardmgruber force-pushed the tune_select_if_flag_unique branch 2 times, most recently from dfdd5d1 to 0cebb2c on January 28, 2025 08:29
Contributor

🟨 CI finished in 3h 59m: Pass: 96%/90 | Total: 2d 14h | Avg: 41m 22s | Max: 1h 22m | Hits: 295%/10928
  • 🟨 cub: Pass: 95%/44 | Total: 1d 13h | Avg: 51m 18s | Max: 1h 22m | Hits: 363%/3552

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/42  | Total:  1d 11h | Avg: 51m 01s | Max:  1h 22m | Hits: 363%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 17s | Max: 57m 18s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 51m | Avg: 58m 21s | Max:  1h 07m | Hits: 363%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
      🔍 12.6               Pass:  94%/37  | Total:  1d 06h | Avg: 49m 28s | Max:  1h 22m | Hits: 362%/2664  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 55m | Avg: 57m 35s | Max: 57m 50s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 51m | Avg: 58m 21s | Max:  1h 07m | Hits: 363%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
      🔍 nvcc12.6           Pass:  94%/35  | Total:  1d 04h | Avg: 49m 00s | Max:  1h 22m | Hits: 362%/2664  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 35s | Max: 57m 50s
      🔍 nvcc               Pass:  95%/42  | Total:  1d 11h | Avg: 51m 00s | Max:  1h 22m | Hits: 363%/3552  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 42m 37s | Avg: 21m 18s | Max: 23m 11s
      🔍 v100               Pass:  95%/42  | Total:  1d 12h | Avg: 52m 44s | Max:  1h 22m | Hits: 363%/3552  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 19h 55m | Avg: 59m 47s | Max:  1h 22m | Hits: 363%/2664  
      🔍 20                 Pass:  91%/24  | Total: 17h 41m | Avg: 44m 14s | Max:  1h 11m | Hits: 361%/888   
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 43m | Avg: 55m 46s | Max: 56m 32s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 57m | Avg: 58m 47s | Max: 58m 56s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 53m | Avg: 56m 58s | Max: 59m 40s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 56m | Avg: 58m 27s | Max: 59m 39s
      🟨 Clang18            Pass:  85%/7   | Total:  5h 18m | Avg: 45m 30s | Max: 57m 50s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 53m | Avg: 56m 46s | Max: 58m 51s
      🟩 GCC8               Pass: 100%/1   | Total: 54m 54s | Avg: 54m 54s | Max: 54m 54s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 59s | Max: 56m 25s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 57s | Max: 58m 25s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 59m | Avg: 59m 48s | Max:  1h 00m
      🟩 GCC12              Pass: 100%/4   | Total:  2h 38m | Avg: 39m 34s | Max:  1h 01m
      🟨 GCC13              Pass:  87%/8   | Total:  4h 30m | Avg: 33m 50s | Max: 57m 48s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits: 363%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 33m | Avg:  1h 16m | Max:  1h 22m | Hits: 362%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 14h 50m | Avg: 52m 21s | Max: 59m 40s
      🟨 GCC                Pass:  95%/21  | Total: 15h 42m | Avg: 44m 53s | Max:  1h 01m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 49m | Avg:  1h 12m | Max:  1h 22m | Hits: 363%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
    🟨 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 30s | Max:  1h 22m | Hits: 363%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 25m 24s | Avg: 25m 24s | Max: 25m 24s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 20s | Avg: 17m 20s | Max: 17m 20s
      🟨 HostLaunch         Pass:  66%/3   | Total: 52m 41s | Avg: 17m 33s | Max: 26m 44s
      🟨 TestGPU            Pass:  50%/2   | Total: 34m 20s | Avg: 17m 10s | Max: 25m 50s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 42m 37s | Avg: 21m 18s | Max: 23m 11s
      🟩 90a                Pass: 100%/1   | Total: 24m 44s | Avg: 24m 44s | Max: 24m 44s
    
  • 🟨 thrust: Pass: 97%/43 | Total: 23h 31m | Avg: 32m 50s | Max: 1h 01m | Hits: 262%/7376

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/41  | Total: 22h 32m | Avg: 32m 59s | Max:  1h 01m | Hits: 262%/7376  
      🟩 arm64              Pass: 100%/2   | Total: 59m 17s | Avg: 29m 38s | Max: 32m 29s
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  3h 06m | Avg: 37m 18s | Max: 56m 41s | Hits: 262%/1844  
      🟩 12.5               Pass: 100%/2   | Total:  1h 46m | Avg: 53m 14s | Max: 55m 23s
      🔍 12.6               Pass:  97%/36  | Total: 18h 38m | Avg: 31m 04s | Max:  1h 01m | Hits: 262%/5532  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 51m 48s | Avg: 25m 54s | Max: 27m 13s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 06m | Avg: 37m 18s | Max: 56m 41s | Hits: 262%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 46m | Avg: 53m 14s | Max: 55m 23s
      🔍 nvcc12.6           Pass:  97%/34  | Total: 17h 47m | Avg: 31m 23s | Max:  1h 01m | Hits: 262%/5532  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 51m 48s | Avg: 25m 54s | Max: 27m 13s
      🔍 nvcc               Pass:  97%/41  | Total: 22h 40m | Avg: 33m 10s | Max:  1h 01m | Hits: 262%/7376  
    🔍 cxx: MSVC14.39 🔍
      🟩 Clang14            Pass: 100%/4   | Total:  2h 02m | Avg: 30m 42s | Max: 32m 40s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 45s | Max: 33m 00s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 09s | Max: 33m 19s
      🟩 Clang17            Pass: 100%/2   | Total: 59m 42s | Avg: 29m 51s | Max: 30m 20s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 08m | Avg: 26m 53s | Max: 42m 03s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 00m | Avg: 30m 07s | Max: 32m 09s
      🟩 GCC8               Pass: 100%/1   | Total: 30m 16s | Avg: 30m 16s | Max: 30m 16s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 10m | Avg: 35m 07s | Max: 35m 16s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 36s | Max: 31m 03s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 38s | Max: 33m 57s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 07s | Max: 36m 12s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 58m | Avg: 22m 15s | Max: 33m 48s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 55m | Avg: 57m 35s | Max: 58m 29s | Hits: 262%/3688  
      🔍 MSVC14.39          Pass:  66%/3   | Total:  2h 32m | Avg: 50m 40s | Max:  1h 01m | Hits: 262%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 14s | Max: 55m 23s
    🔍 cxx_family: MSVC 🔍
      🟩 Clang              Pass: 100%/17  | Total:  8h 18m | Avg: 29m 19s | Max: 42m 03s
      🟩 GCC                Pass: 100%/19  | Total:  8h 59m | Avg: 28m 24s | Max: 36m 12s
      🔍 MSVC               Pass:  80%/5   | Total:  4h 27m | Avg: 53m 26s | Max:  1h 01m | Hits: 262%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 46m | Avg: 53m 14s | Max: 55m 23s
    🔍 jobs: TestCPU 🔍
      🟩 Build              Pass: 100%/37  | Total: 21h 33m | Avg: 34m 57s | Max:  1h 01m | Hits: 262%/7376  
      🔍 TestCPU            Pass:  66%/3   | Total: 47m 46s | Avg: 15m 55s | Max: 32m 25s
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 10m | Avg: 23m 34s | Max: 42m 03s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 12h 12m | Avg: 36m 36s | Max: 58m 29s | Hits: 262%/5532  
      🔍 20                 Pass:  95%/21  | Total: 10h 43m | Avg: 30m 37s | Max:  1h 01m | Hits: 262%/1844  
    🟨 gpu
      🟨 v100               Pass:  97%/43  | Total: 23h 31m | Avg: 32m 50s | Max:  1h 01m | Hits: 262%/7376  
    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 36m 29s | Avg: 18m 14s | Max: 25m 06s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 16m 55s | Avg: 16m 55s | Max: 16m 55s
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 46s | Avg: 4m 23s | Max: 6m 45s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  8m 46s | Avg:  4m 23s | Max:  6m 45s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
      🟩 Test               Pass: 100%/1   | Total:  6m 45s | Avg:  6m 45s | Max:  6m 45s
    
  • 🟩 python: Pass: 100%/1 | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 44m 57s | Avg: 44m 57s | Max: 44m 57s
    

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

bernhardmgruber force-pushed the tune_select_if_flag_unique branch from 0cebb2c to 54d6410 on January 28, 2025 13:47
Contributor

🟩 CI finished in 2h 22m: Pass: 100%/89 | Total: 16h 47m | Avg: 11m 19s | Max: 1h 07m | Hits: 422%/10928
  • 🟩 cub: Pass: 100%/44 | Total: 8h 16m | Avg: 11m 16s | Max: 36m 34s | Hits: 540%/3552

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  8h 06m | Avg: 11m 34s | Max: 36m 34s | Hits: 540%/3552  
      🟩 arm64              Pass: 100%/2   | Total: 10m 02s | Avg:  5m 01s | Max:  5m 11s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 23s | Avg:  9m 04s | Max: 23m 03s | Hits: 540%/888   
      🟩 12.5               Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 46s
      🟩 12.6               Pass: 100%/37  | Total:  7h 08m | Avg: 11m 35s | Max: 36m 34s | Hits: 540%/2664  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  9m 03s | Avg:  4m 31s | Max:  4m 39s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 23s | Avg:  9m 04s | Max: 23m 03s | Hits: 540%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 46s
      🟩 nvcc12.6           Pass: 100%/35  | Total:  6h 59m | Avg: 11m 59s | Max: 36m 34s | Hits: 540%/2664  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  9m 03s | Avg:  4m 31s | Max:  4m 39s
      🟩 nvcc               Pass: 100%/42  | Total:  8h 07m | Avg: 11m 35s | Max: 36m 34s | Hits: 540%/3552  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 04s | Avg:  5m 31s | Max:  5m 47s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 37s | Avg:  5m 48s | Max:  6m 03s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 41s | Avg:  5m 50s | Max:  5m 55s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 45s | Avg:  5m 52s | Max:  6m 01s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 30m | Avg: 12m 55s | Max: 36m 34s
      🟩 GCC7               Pass: 100%/2   | Total: 11m 19s | Avg:  5m 39s | Max:  5m 42s
      🟩 GCC8               Pass: 100%/1   | Total:  6m 26s | Avg:  6m 26s | Max:  6m 26s
      🟩 GCC9               Pass: 100%/2   | Total: 12m 35s | Avg:  6m 17s | Max:  6m 26s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 28s | Avg:  5m 44s | Max:  5m 53s
      🟩 GCC11              Pass: 100%/2   | Total: 13m 13s | Avg:  6m 36s | Max:  6m 46s
      🟩 GCC12              Pass: 100%/4   | Total: 37m 09s | Avg:  9m 17s | Max: 19m 44s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 02m | Avg: 15m 18s | Max: 30m 26s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 50m 59s | Avg: 25m 29s | Max: 27m 56s | Hits: 540%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 00m | Avg: 30m 28s | Max: 32m 13s | Hits: 540%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 46s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 27m | Avg:  8m 41s | Max: 36m 34s
      🟩 GCC                Pass: 100%/21  | Total:  3h 34m | Avg: 10m 13s | Max: 30m 26s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 51m | Avg: 27m 59s | Max: 32m 13s | Hits: 540%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total: 21m 53s | Avg: 10m 56s | Max: 11m 46s
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 24m 05s | Avg: 12m 02s | Max: 19m 44s
      🟩 v100               Pass: 100%/42  | Total:  7h 51m | Avg: 11m 14s | Max: 36m 34s | Hits: 540%/3552  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 10m | Avg:  8m 24s | Max: 32m 13s | Hits: 540%/3552  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 26m 40s | Avg: 26m 40s | Max: 26m 40s
      🟩 GraphCapture       Pass: 100%/1   | Total: 18m 08s | Avg: 18m 08s | Max: 18m 08s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 13m | Avg: 24m 26s | Max: 28m 52s
      🟩 TestGPU            Pass: 100%/2   | Total:  1h 07m | Avg: 33m 30s | Max: 36m 34s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 24m 05s | Avg: 12m 02s | Max: 19m 44s
      🟩 90a                Pass: 100%/1   | Total:  4m 33s | Avg:  4m 33s | Max:  4m 33s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 04m | Avg:  9m 12s | Max: 28m 44s | Hits: 540%/2664  
      🟩 20                 Pass: 100%/24  | Total:  5h 11m | Avg: 12m 59s | Max: 36m 34s | Hits: 540%/888   
    
  • 🟩 thrust: Pass: 100%/42 | Total: 7h 11m | Avg: 10m 16s | Max: 32m 26s | Hits: 365%/7376

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 33m 04s | Avg: 16m 32s | Max: 27m 28s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total:  7h 01m | Avg: 10m 32s | Max: 32m 26s | Hits: 365%/7376  
      🟩 arm64              Pass: 100%/2   | Total:  9m 52s | Avg:  4m 56s | Max:  5m 03s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 10m | Avg: 14m 08s | Max: 32m 26s | Hits: 365%/1844  
      🟩 12.5               Pass: 100%/2   | Total: 31m 34s | Avg: 15m 47s | Max: 16m 27s
      🟩 12.6               Pass: 100%/35  | Total:  5h 29m | Avg:  9m 24s | Max: 31m 53s | Hits: 365%/5532  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 33s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 10m | Avg: 14m 08s | Max: 32m 26s | Hits: 365%/1844  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 31m 34s | Avg: 15m 47s | Max: 16m 27s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  5h 18m | Avg:  9m 39s | Max: 31m 53s | Hits: 365%/5532  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 47s | Avg:  5m 23s | Max:  5m 33s
      🟩 nvcc               Pass: 100%/40  | Total:  7h 00m | Avg: 10m 31s | Max: 32m 26s | Hits: 365%/7376  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 22m 16s | Avg:  5m 34s | Max:  6m 01s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  5m 56s
      🟩 Clang16            Pass: 100%/2   | Total: 11m 56s | Avg:  5m 58s | Max:  6m 05s
      🟩 Clang17            Pass: 100%/2   | Total: 11m 24s | Avg:  5m 42s | Max:  5m 46s
      🟩 Clang18            Pass: 100%/7   | Total:  1h 02m | Avg:  8m 58s | Max: 28m 28s
      🟩 GCC7               Pass: 100%/2   | Total: 38m 13s | Avg: 19m 06s | Max: 32m 26s
      🟩 GCC8               Pass: 100%/1   | Total:  6m 00s | Avg:  6m 00s | Max:  6m 00s
      🟩 GCC9               Pass: 100%/2   | Total: 12m 27s | Avg:  6m 13s | Max:  7m 04s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 33s | Avg:  5m 46s | Max:  6m 02s
      🟩 GCC11              Pass: 100%/2   | Total: 13m 03s | Avg:  6m 31s | Max:  6m 47s
      🟩 GCC12              Pass: 100%/2   | Total: 12m 30s | Avg:  6m 15s | Max:  6m 16s
      🟩 GCC13              Pass: 100%/8   | Total:  1h 19m | Avg:  9m 53s | Max: 27m 28s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 47m 56s | Avg: 23m 58s | Max: 25m 55s | Hits: 365%/3688  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 58m 51s | Avg: 29m 25s | Max: 31m 53s | Hits: 365%/3688  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 31m 34s | Avg: 15m 47s | Max: 16m 27s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  2h 00m | Avg:  7m 04s | Max: 28m 28s
      🟩 GCC                Pass: 100%/19  | Total:  2h 52m | Avg:  9m 06s | Max: 32m 26s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 46m | Avg: 26m 41s | Max: 31m 53s | Hits: 365%/7376  
      🟩 NVHPC              Pass: 100%/2   | Total: 31m 34s | Avg: 15m 47s | Max: 16m 27s
    🟩 gpu
      🟩 v100               Pass: 100%/42  | Total:  7h 11m | Avg: 10m 16s | Max: 32m 26s | Hits: 365%/7376  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 44m | Avg:  9m 18s | Max: 32m 26s | Hits: 365%/7376  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 21s | Avg:  7m 40s | Max:  7m 54s
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 11m | Avg: 23m 59s | Max: 28m 28s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 57s | Avg:  4m 57s | Max:  4m 57s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 31m | Avg: 10m 34s | Max: 32m 26s | Hits: 365%/5532  
      🟩 20                 Pass: 100%/20  | Total:  3h 07m | Avg:  9m 21s | Max: 31m 53s | Hits: 365%/1844  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 12m 25s | Avg: 6m 12s | Max: 10m 18s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 12m 25s | Avg:  6m 12s | Max: 10m 18s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 07s | Avg:  2m 07s | Max:  2m 07s
      🟩 Test               Pass: 100%/1   | Total: 10m 18s | Avg: 10m 18s | Max: 10m 18s
    
  • 🟩 python: Pass: 100%/1 | Total: 1h 07m | Avg: 1h 07m | Max: 1h 07m

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total:  1h 07m | Avg:  1h 07m | Max:  1h 07m
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 89)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
8 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber bernhardmgruber force-pushed the tune_select_if_flag_unique branch from 54d6410 to 139edcd Compare January 29, 2025 22:02

🟨 CI finished in 4h 25m: Pass: 95%/89 | Total: 2d 14h | Avg: 41m 55s | Max: 1h 13m | Hits: 226%/10936
  • 🟨 cub: Pass: 90%/44 | Total: 1d 13h | Avg: 51m 03s | Max: 1h 12m | Hits: 331%/3552

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  90%/42  | Total:  1d 11h | Avg: 50m 30s | Max:  1h 12m | Hits: 331%/3552  
      🟩 arm64              Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 07m
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 55m | Avg: 59m 04s | Max:  1h 06m | Hits: 332%/888   
      🟩 12.5               Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 10m
      🔍 12.6               Pass:  89%/37  | Total:  1d 06h | Avg: 48m 54s | Max:  1h 12m | Hits: 331%/2664  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 04m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 55m | Avg: 59m 04s | Max:  1h 06m | Hits: 332%/888   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 10m
      🔍 nvcc12.6           Pass:  88%/35  | Total:  1d 04h | Avg: 48m 12s | Max:  1h 12m | Hits: 331%/2664  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 02m | Avg:  1h 01m | Max:  1h 04m
      🔍 nvcc               Pass:  90%/42  | Total:  1d 11h | Avg: 50m 34s | Max:  1h 12m | Hits: 331%/3552  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 44m 01s | Avg: 22m 00s | Max: 24m 46s
      🔍 v100               Pass:  90%/42  | Total:  1d 12h | Avg: 52m 26s | Max:  1h 12m | Hits: 331%/3552  
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 19h 59m | Avg: 59m 59s | Max:  1h 12m | Hits: 332%/2664  
      🔍 20                 Pass:  83%/24  | Total: 17h 26m | Avg: 43m 36s | Max:  1h 10m | Hits: 331%/888   
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 46m | Avg: 56m 41s | Max: 59m 04s
      🟩 Clang15            Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m
      🟩 Clang16            Pass: 100%/2   | Total:  1h 53m | Avg: 56m 31s | Max: 58m 21s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 14s | Max: 55m 54s
      🟨 Clang18            Pass:  85%/7   | Total:  5h 41m | Avg: 48m 50s | Max:  1h 04m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 51m | Avg: 55m 57s | Max: 56m 52s
      🟩 GCC8               Pass: 100%/1   | Total: 57m 08s | Avg: 57m 08s | Max: 57m 08s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 52s | Max:  1h 00m
      🟩 GCC10              Pass: 100%/2   | Total:  1h 53m | Avg: 56m 56s | Max: 57m 20s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 52m | Avg: 56m 16s | Max: 57m 07s
      🟩 GCC12              Pass: 100%/4   | Total:  2h 35m | Avg: 38m 53s | Max: 56m 16s
      🟨 GCC13              Pass:  62%/8   | Total:  4h 08m | Avg: 31m 04s | Max:  1h 07m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m | Hits: 332%/1776  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 10m | Hits: 331%/1776  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 10m
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 15h 13m | Avg: 53m 44s | Max:  1h 04m
      🟨 GCC                Pass:  85%/21  | Total: 15h 17m | Avg: 43m 41s | Max:  1h 07m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 33m | Avg:  1h 08m | Max:  1h 12m | Hits: 331%/3552  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 21m | Avg:  1h 10m | Max:  1h 10m
    🟨 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 51s | Max:  1h 12m | Hits: 331%/3552  
      🟥 DeviceLaunch       Pass:   0%/1   | Total:  6m 59s | Avg:  6m 59s | Max:  6m 59s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 46s | Avg: 15m 46s | Max: 15m 46s
      🟨 HostLaunch         Pass:  33%/3   | Total: 36m 07s | Avg: 12m 02s | Max: 19m 15s
      🟨 TestGPU            Pass:  50%/2   | Total: 46m 49s | Avg: 23m 24s | Max: 37m 46s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 44m 01s | Avg: 22m 00s | Max: 24m 46s
      🟩 90a                Pass: 100%/1   | Total: 25m 11s | Avg: 25m 11s | Max: 25m 11s
    
  • 🟩 thrust: Pass: 100%/42 | Total: 23h 46m | Avg: 33m 58s | Max: 1h 13m | Hits: 175%/7384

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 42m 07s | Avg: 21m 03s | Max: 24m 21s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total: 22h 47m | Avg: 34m 11s | Max:  1h 13m | Hits: 175%/7384  
      🟩 arm64              Pass: 100%/2   | Total: 58m 55s | Avg: 29m 27s | Max: 31m 13s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 24m | Avg: 40m 57s | Max:  1h 13m | Hits: 175%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 51m | Avg: 55m 40s | Max: 55m 49s
      🟩 12.6               Pass: 100%/35  | Total: 18h 30m | Avg: 31m 43s | Max:  1h 03m | Hits: 175%/5538  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 55m 38s | Avg: 27m 49s | Max: 28m 29s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 24m | Avg: 40m 57s | Max:  1h 13m | Hits: 175%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 51m | Avg: 55m 40s | Max: 55m 49s
      🟩 nvcc12.6           Pass: 100%/33  | Total: 17h 34m | Avg: 31m 57s | Max:  1h 03m | Hits: 175%/5538  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 55m 38s | Avg: 27m 49s | Max: 28m 29s
      🟩 nvcc               Pass: 100%/40  | Total: 22h 51m | Avg: 34m 16s | Max:  1h 13m | Hits: 175%/7384  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 07m | Avg: 31m 49s | Max: 34m 11s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 04m | Avg: 32m 01s | Max: 33m 25s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 00s | Max: 33m 37s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 00s | Max: 33m 13s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 47m | Avg: 23m 54s | Max: 31m 42s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 33m 42s
      🟩 GCC8               Pass: 100%/1   | Total: 31m 39s | Avg: 31m 39s | Max: 31m 39s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 07m | Avg: 33m 43s | Max: 34m 04s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 01m | Avg: 30m 37s | Max: 31m 04s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 22s | Max: 34m 54s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 13s | Max: 34m 14s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 26m | Avg: 25m 51s | Max: 41m 00s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 13m | Hits: 175%/3692  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 03m | Hits: 176%/3692  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 40s | Max: 55m 49s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  8h 10m | Avg: 28m 51s | Max: 34m 11s
      🟩 GCC                Pass: 100%/19  | Total:  9h 26m | Avg: 29m 48s | Max: 41m 00s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 18m | Avg:  1h 04m | Max:  1h 13m | Hits: 175%/7384  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 40s | Max: 55m 49s
    🟩 gpu
      🟩 v100               Pass: 100%/42  | Total: 23h 46m | Avg: 33m 58s | Max:  1h 13m | Hits: 175%/7384  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 22h 30m | Avg: 36m 30s | Max:  1h 13m | Hits: 175%/7384  
      🟩 TestCPU            Pass: 100%/2   | Total: 15m 45s | Avg:  7m 52s | Max:  8m 23s
      🟩 TestGPU            Pass: 100%/3   | Total:  1h 00m | Avg: 20m 01s | Max: 28m 03s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 30s | Avg: 19m 30s | Max: 19m 30s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 58m | Avg: 38m 55s | Max:  1h 13m | Hits: 175%/5538  
      🟩 20                 Pass: 100%/20  | Total: 10h 06m | Avg: 30m 18s | Max:  1h 03m | Hits: 175%/1846  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 13m 04s | Avg: 6m 32s | Max: 11m 02s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total: 13m 04s | Avg:  6m 32s | Max: 11m 02s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 02s | Avg:  2m 02s | Max:  2m 02s
      🟩 Test               Pass: 100%/1   | Total: 11m 02s | Avg: 11m 02s | Max: 11m 02s
    
  • 🟩 python: Pass: 100%/1 | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 45m 00s | Avg: 45m 00s | Max: 45m 00s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 89)

# Runner
65 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
8 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber force-pushed the tune_select_if_flag_unique branch 2 times, most recently from 3d1ce4a to 969866b Compare January 31, 2025 16:23

🟩 CI finished in 1h 59m: Pass: 100%/89 | Total: 2d 12h | Avg: 40m 36s | Max: 1h 12m | Hits: 290%/10896
  • 🟩 cub: Pass: 100%/44 | Total: 1d 13h | Avg: 51m 20s | Max: 1h 12m | Hits: 351%/3512

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 11h | Avg: 51m 03s | Max:  1h 12m | Hits: 351%/3512  
      🟩 arm64              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 13s | Max: 57m 48s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 45m | Avg: 57m 10s | Max:  1h 01m | Hits: 351%/878   
      🟩 12.5               Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 10m
      🟩 12.6               Pass: 100%/37  | Total:  1d 06h | Avg: 49m 38s | Max:  1h 12m | Hits: 351%/2634  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 00m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 45m | Avg: 57m 10s | Max:  1h 01m | Hits: 351%/878   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 10m
      🟩 nvcc12.6           Pass: 100%/35  | Total:  1d 04h | Avg: 49m 00s | Max:  1h 12m | Hits: 351%/2634  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 00m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 11h | Avg: 50m 54s | Max:  1h 12m | Hits: 351%/3512  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 49m | Avg: 57m 26s | Max:  1h 01m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 51m | Avg: 55m 46s | Max: 56m 47s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 14s | Max: 57m 05s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 49m | Avg: 54m 47s | Max: 55m 57s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 36m | Avg: 48m 02s | Max:  1h 00m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 51m | Avg: 55m 50s | Max: 57m 11s
      🟩 GCC8               Pass: 100%/1   | Total: 56m 42s | Avg: 56m 42s | Max: 56m 42s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 03s | Max: 59m 17s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 54m | Avg: 57m 24s | Max:  1h 01m
      🟩 GCC11              Pass: 100%/2   | Total:  1h 50m | Avg: 55m 18s | Max: 55m 58s
      🟩 GCC12              Pass: 100%/4   | Total:  2h 49m | Avg: 42m 17s | Max:  1h 00m
      🟩 GCC13              Pass: 100%/8   | Total:  4h 40m | Avg: 35m 06s | Max:  1h 01m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 07m | Hits: 351%/1756  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 12m | Hits: 351%/1756  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 10m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 14h 57m | Avg: 52m 48s | Max:  1h 01m
      🟩 GCC                Pass: 100%/21  | Total: 15h 55m | Avg: 45m 31s | Max:  1h 01m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 29m | Avg:  1h 07m | Max:  1h 12m | Hits: 351%/3512  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 10m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 48m 31s | Avg: 24m 15s | Max: 24m 16s
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 02m | Avg: 30m 16s | Max:  1h 00m
      🟩 v100               Pass: 100%/34  | Total:  1d 08h | Avg: 57m 53s | Max:  1h 12m | Hits: 351%/3512  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 11h | Avg: 57m 07s | Max:  1h 12m | Hits: 351%/3512  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 51s | Avg: 19m 51s | Max: 19m 51s
      🟩 GraphCapture       Pass: 100%/1   | Total: 15m 51s | Avg: 15m 51s | Max: 15m 51s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 13m | Avg: 24m 21s | Max: 24m 47s
      🟩 TestGPU            Pass: 100%/2   | Total: 37m 04s | Avg: 18m 32s | Max: 18m 56s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 48m 31s | Avg: 24m 15s | Max: 24m 16s
      🟩 90a                Pass: 100%/1   | Total: 23m 53s | Avg: 23m 53s | Max: 23m 53s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 38m | Avg: 58m 55s | Max:  1h 12m | Hits: 351%/2634  
      🟩 20                 Pass: 100%/24  | Total: 18h 00m | Avg: 45m 01s | Max:  1h 10m | Hits: 350%/878   
    
  • 🟩 thrust: Pass: 100%/42 | Total: 22h 02m | Avg: 31m 29s | Max: 1h 08m | Hits: 261%/7384

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 35m 55s | Avg: 17m 57s | Max: 24m 45s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total: 21h 04m | Avg: 31m 37s | Max:  1h 08m | Hits: 261%/7384  
      🟩 arm64              Pass: 100%/2   | Total: 57m 35s | Avg: 28m 47s | Max: 30m 09s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 54m | Avg: 34m 53s | Max: 51m 53s | Hits: 261%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 47m | Avg: 53m 56s | Max: 57m 50s
      🟩 12.6               Pass: 100%/35  | Total: 17h 20m | Avg: 29m 43s | Max:  1h 08m | Hits: 261%/5538  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 57m 36s | Avg: 28m 48s | Max: 28m 54s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 54m | Avg: 34m 53s | Max: 51m 53s | Hits: 261%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 47m | Avg: 53m 56s | Max: 57m 50s
      🟩 nvcc12.6           Pass: 100%/33  | Total: 16h 22m | Avg: 29m 46s | Max:  1h 08m | Hits: 261%/5538  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 57m 36s | Avg: 28m 48s | Max: 28m 54s
      🟩 nvcc               Pass: 100%/40  | Total: 21h 04m | Avg: 31m 37s | Max:  1h 08m | Hits: 261%/7384  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 55m | Avg: 28m 49s | Max: 29m 44s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 04m | Avg: 32m 29s | Max: 33m 08s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 05m | Avg: 32m 36s | Max: 32m 49s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 04m | Avg: 32m 10s | Max: 32m 20s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 42m | Avg: 23m 12s | Max: 30m 32s
      🟩 GCC7               Pass: 100%/2   | Total: 59m 02s | Avg: 29m 31s | Max: 29m 54s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 11s | Avg: 32m 11s | Max: 32m 11s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 52s | Max: 34m 09s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 24s | Max: 31m 47s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 02m | Avg: 31m 02s | Max: 32m 04s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 02s | Max: 32m 13s
      🟩 GCC13              Pass: 100%/8   | Total:  2h 45m | Avg: 20m 37s | Max: 30m 49s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 26s | Max: 55m 00s | Hits: 261%/3692  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 08m | Hits: 262%/3692  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 56s | Max: 57m 50s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 52m | Avg: 27m 46s | Max: 33m 08s
      🟩 GCC                Pass: 100%/19  | Total:  8h 30m | Avg: 26m 53s | Max: 34m 09s
      🟩 MSVC               Pass: 100%/4   | Total:  3h 51m | Avg: 57m 51s | Max:  1h 08m | Hits: 261%/7384  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 56s | Max: 57m 50s
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  2h 12m | Avg: 16m 37s | Max: 30m 49s
      🟩 v100               Pass: 100%/34  | Total: 19h 49m | Avg: 34m 59s | Max:  1h 08m | Hits: 261%/7384  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 21h 13m | Avg: 34m 24s | Max:  1h 08m | Hits: 261%/7384  
      🟩 TestCPU            Pass: 100%/2   | Total: 16m 16s | Avg:  8m 08s | Max:  8m 10s
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 53s | Avg: 10m 57s | Max: 11m 14s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 18m 55s | Avg: 18m 55s | Max: 18m 55s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 46m | Avg: 35m 20s | Max: 55m 51s | Hits: 261%/5538  
      🟩 20                 Pass: 100%/20  | Total:  9h 39m | Avg: 28m 58s | Max:  1h 08m | Hits: 261%/1846  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 54s | Avg: 3m 27s | Max: 4m 52s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 54s | Avg:  3m 27s | Max:  4m 52s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 02s | Avg:  2m 02s | Max:  2m 02s
      🟩 Test               Pass: 100%/1   | Total:  4m 52s | Avg:  4m 52s | Max:  4m 52s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 17s | Avg: 26m 17s | Max: 26m 17s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 89)

# Runner
65 linux-amd64-cpu16
8 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber force-pushed the tune_select_if_flag_unique branch from 969866b to afeb37e Compare February 3, 2025 16:03

github-actions bot commented Feb 3, 2025

🟩 CI finished in 1h 45m: Pass: 100%/90 | Total: 2d 15h | Avg: 42m 14s | Max: 1h 20m | Hits: 303%/12730
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 53m 39s | Max: 1h 20m | Hits: 359%/3500

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 53m 24s | Max:  1h 20m | Hits: 359%/3500  
      🟩 arm64              Pass: 100%/2   | Total:  1h 57m | Avg: 58m 44s | Max:  1h 00m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 51m | Avg: 58m 16s | Max:  1h 06m | Hits: 360%/875   
      🟩 12.5               Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 13m
      🟩 12.8               Pass: 100%/37  | Total:  1d 08h | Avg: 52m 10s | Max:  1h 20m | Hits: 358%/2625  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 51m | Avg: 58m 16s | Max:  1h 06m | Hits: 360%/875   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 13m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 06h | Avg: 51m 28s | Max:  1h 20m | Hits: 358%/2625  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 08m | Avg:  1h 04m | Max:  1h 04m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 53m 08s | Max:  1h 20m | Hits: 359%/3500  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 43m | Avg: 55m 54s | Max: 58m 43s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 00s | Max: 59m 33s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 54m | Avg: 57m 09s | Max: 58m 52s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 01s | Max:  1h 00m
      🟩 Clang18            Pass: 100%/7   | Total:  5h 40m | Avg: 48m 42s | Max:  1h 04m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 28s | Max: 57m 09s
      🟩 GCC8               Pass: 100%/1   | Total: 59m 22s | Avg: 59m 22s | Max: 59m 22s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 59m | Avg: 59m 42s | Max: 59m 58s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 49m | Avg: 54m 43s | Max: 55m 14s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 01s | Max:  1h 00m
      🟩 GCC12              Pass: 100%/2   | Total:  1h 58m | Avg: 59m 24s | Max:  1h 01m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 13m | Avg: 37m 21s | Max:  1h 01m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 22m | Avg:  1h 11m | Max:  1h 16m | Hits: 359%/1750  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 38m | Avg:  1h 19m | Max:  1h 20m | Hits: 358%/1750  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 13m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 10m | Avg: 53m 35s | Max:  1h 04m
      🟩 GCC                Pass: 100%/21  | Total: 16h 49m | Avg: 48m 04s | Max:  1h 01m
      🟩 MSVC               Pass: 100%/4   | Total:  5h 01m | Avg:  1h 15m | Max:  1h 20m | Hits: 359%/3500  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 13m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 50m 36s | Avg: 25m 18s | Max: 27m 13s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 00m | Max:  1h 20m | Hits: 359%/3500  
      🟩 rtxa6000           Pass: 100%/8   | Total:  4h 01m | Avg: 30m 12s | Max: 57m 56s
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 59m 43s | Max:  1h 20m | Hits: 359%/3500  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 21m 44s | Avg: 21m 44s | Max: 21m 44s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 45s | Avg: 14m 45s | Max: 14m 45s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 12m | Avg: 24m 04s | Max: 24m 51s
      🟩 TestGPU            Pass: 100%/2   | Total: 42m 21s | Avg: 21m 10s | Max: 21m 38s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 50m 36s | Avg: 25m 18s | Max: 27m 13s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 22m | Avg:  1h 01m | Max:  1h 20m | Hits: 359%/2625  
      🟩 20                 Pass: 100%/24  | Total: 18h 58m | Avg: 47m 25s | Max:  1h 18m | Hits: 357%/875   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 23h 27m | Avg: 32m 43s | Max: 1h 00m | Hits: 282%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 35m 38s | Avg: 17m 49s | Max: 24m 31s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 22h 29m | Avg: 32m 55s | Max:  1h 00m | Hits: 282%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 57m 46s | Avg: 28m 53s | Max: 30m 50s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 01m | Avg: 36m 16s | Max: 56m 23s | Hits: 262%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 57m 38s
      🟩 12.8               Pass: 100%/36  | Total: 18h 37m | Avg: 31m 02s | Max:  1h 00m | Hits: 287%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 01s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 01m | Avg: 36m 16s | Max: 56m 23s | Hits: 262%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 57m 38s
      🟩 nvcc12.8           Pass: 100%/34  | Total: 17h 43m | Avg: 31m 17s | Max:  1h 00m | Hits: 287%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 53m 45s | Avg: 26m 52s | Max: 28m 01s
      🟩 nvcc               Pass: 100%/41  | Total: 22h 33m | Avg: 33m 00s | Max:  1h 00m | Hits: 282%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 00m | Avg: 30m 11s | Max: 32m 10s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 18s | Max: 33m 02s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 03m | Avg: 31m 40s | Max: 32m 50s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 10s | Max: 31m 48s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 41m | Avg: 23m 00s | Max: 31m 24s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 22s | Max: 35m 40s
      🟩 GCC8               Pass: 100%/1   | Total: 34m 37s | Avg: 34m 37s | Max: 34m 37s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 27s | Max: 34m 31s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 44s | Max: 34m 00s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 10m | Avg: 35m 24s | Max: 37m 07s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 07m | Avg: 33m 40s | Max: 35m 12s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 12m | Avg: 24m 06s | Max: 36m 48s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 52m | Avg: 56m 25s | Max: 56m 27s | Hits: 261%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 35m | Avg: 51m 45s | Max:  1h 00m | Hits: 296%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 57m 38s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 48m | Avg: 27m 32s | Max: 33m 02s
      🟩 GCC                Pass: 100%/19  | Total:  9h 22m | Avg: 29m 37s | Max: 37m 07s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 28m | Avg: 53m 37s | Max:  1h 00m | Hits: 282%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 48m | Avg: 54m 13s | Max: 57m 38s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 19h 33m | Avg: 35m 33s | Max:  1h 00m | Hits: 262%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 53m | Avg: 23m 23s | Max: 58m 45s | Hits: 313%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 22h 02m | Avg: 35m 44s | Max:  1h 00m | Hits: 261%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 51m 12s | Avg: 17m 04s | Max: 35m 41s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 33m 28s | Avg: 11m 09s | Max: 11m 57s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 36m 48s | Avg: 36m 48s | Max: 36m 48s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 11m | Avg: 36m 33s | Max:  1h 00m | Hits: 262%/5538  
      🟩 20                 Pass: 100%/21  | Total: 10h 40m | Avg: 30m 30s | Max: 58m 45s | Hits: 313%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 59s | Avg: 3m 29s | Max: 4m 55s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  4m 55s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 04s | Avg:  2m 04s | Max:  2m 04s
      🟩 Test               Pass: 100%/1   | Total:  4m 55s | Avg:  4m 55s | Max:  4m 55s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 55s | Avg: 26m 55s | Max: 26m 55s
    

👃 Inspect Changes

Modifications in project?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
|     | Thrust                  |
|     | CUDA Experimental       |
|     | python                  |
|     | CCCL C Parallel Library |
|     | Catch2Helper            |

Modifications in project or dependencies?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
| +/- | python                  |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper            |

🏃‍ Runner counts (total jobs: 90)

| #  | Runner                            |
|----|-----------------------------------|
| 65 | linux-amd64-cpu16                 |
| 9  | windows-amd64-cpu16               |
| 6  | linux-amd64-gpu-rtxa6000-latest-1 |
| 4  | linux-arm64-cpu16                 |
| 3  | linux-amd64-gpu-rtx4090-latest-1  |
| 2  | linux-amd64-gpu-rtx2080-latest-1  |
| 1  | linux-amd64-gpu-h100-latest-1     |

Contributor

github-actions bot commented Feb 3, 2025

🟩 CI finished in 1h 33m: Pass: 100%/90 | Total: 2d 14h | Avg: 41m 39s | Max: 1h 19m | Hits: 304%/12730
  • 🟩 cub: Pass: 100%/44 | Total: 1d 15h | Avg: 53m 14s | Max: 1h 19m | Hits: 359%/3500

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 13h | Avg: 52m 53s | Max:  1h 19m | Hits: 359%/3500  
      🟩 arm64              Pass: 100%/2   | Total:  2h 01m | Avg:  1h 00m | Max:  1h 01m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  4h 49m | Avg: 57m 55s | Max:  1h 01m | Hits: 360%/875   
      🟩 12.5               Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
      🟩 12.8               Pass: 100%/37  | Total:  1d 07h | Avg: 51m 44s | Max:  1h 19m | Hits: 359%/2625  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 58m | Avg: 59m 27s | Max:  1h 02m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 49m | Avg: 57m 55s | Max:  1h 01m | Hits: 360%/875   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 05h | Avg: 51m 18s | Max:  1h 19m | Hits: 359%/2625  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 58m | Avg: 59m 27s | Max:  1h 02m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 13h | Avg: 52m 56s | Max:  1h 19m | Hits: 359%/3500  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 52m | Avg: 58m 03s | Max:  1h 01m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 51m | Avg: 55m 52s | Max: 56m 51s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 48m | Avg: 54m 09s | Max: 54m 54s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 29s | Max: 59m 35s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 36m | Avg: 48m 02s | Max:  1h 02m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 40s | Max: 59m 03s
      🟩 GCC8               Pass: 100%/1   | Total: 56m 07s | Avg: 56m 07s | Max: 56m 07s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 55m | Avg: 57m 32s | Max:  1h 01m
      🟩 GCC10              Pass: 100%/2   | Total:  1h 51m | Avg: 55m 35s | Max: 55m 38s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 12s | Max:  1h 02m
      🟩 GCC12              Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 05m
      🟩 GCC13              Pass: 100%/10  | Total:  6h 24m | Avg: 38m 26s | Max:  1h 14m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 09m | Hits: 360%/1750  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 25m | Avg:  1h 12m | Max:  1h 19m | Hits: 359%/1750  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 01m | Avg: 53m 01s | Max:  1h 02m
      🟩 GCC                Pass: 100%/21  | Total: 17h 08m | Avg: 48m 57s | Max:  1h 14m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 34m | Avg:  1h 08m | Max:  1h 19m | Hits: 359%/3500  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 18m | Avg:  1h 09m | Max:  1h 12m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 49m 51s | Avg: 24m 55s | Max: 26m 29s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 00m | Max:  1h 19m | Hits: 359%/3500  
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 55m | Avg: 29m 28s | Max:  1h 02m
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 59m 25s | Max:  1h 19m | Hits: 359%/3500  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 06s | Avg: 19m 06s | Max: 19m 06s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 55s | Avg: 14m 55s | Max: 14m 55s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 10m | Avg: 23m 23s | Max: 23m 37s
      🟩 TestGPU            Pass: 100%/2   | Total: 39m 45s | Avg: 19m 52s | Max: 20m 48s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 49m 51s | Avg: 24m 55s | Max: 26m 29s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 14m | Avg:  1h 14m | Max:  1h 14m
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 19h 46m | Avg: 59m 19s | Max:  1h 12m | Hits: 360%/2625  
      🟩 20                 Pass: 100%/24  | Total: 19h 15m | Avg: 48m 09s | Max:  1h 19m | Hits: 359%/875   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 22h 53m | Avg: 31m 57s | Max: 57m 37s | Hits: 283%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 38m 19s | Avg: 19m 09s | Max: 27m 05s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 21h 56m | Avg: 32m 06s | Max: 57m 37s | Hits: 283%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 57m 18s | Avg: 28m 39s | Max: 30m 13s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  2h 59m | Avg: 35m 58s | Max: 55m 50s | Hits: 262%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 36m | Avg: 48m 12s | Max: 48m 34s
      🟩 12.8               Pass: 100%/36  | Total: 18h 17m | Avg: 30m 29s | Max: 57m 37s | Hits: 288%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 50m 27s | Avg: 25m 13s | Max: 25m 30s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  2h 59m | Avg: 35m 58s | Max: 55m 50s | Hits: 262%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 36m | Avg: 48m 12s | Max: 48m 34s
      🟩 nvcc12.8           Pass: 100%/34  | Total: 17h 27m | Avg: 30m 47s | Max: 57m 37s | Hits: 288%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 50m 27s | Avg: 25m 13s | Max: 25m 30s
      🟩 nvcc               Pass: 100%/41  | Total: 22h 03m | Avg: 32m 16s | Max: 57m 37s | Hits: 283%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  1h 59m | Avg: 29m 55s | Max: 32m 28s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 03m | Avg: 31m 44s | Max: 33m 02s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 02m | Avg: 31m 26s | Max: 33m 04s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 14s | Max: 31m 41s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 37m | Avg: 22m 34s | Max: 31m 57s
      🟩 GCC7               Pass: 100%/2   | Total: 58m 35s | Avg: 29m 17s | Max: 30m 08s
      🟩 GCC8               Pass: 100%/1   | Total: 33m 16s | Avg: 33m 16s | Max: 33m 16s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 06m | Avg: 33m 04s | Max: 33m 32s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 05m | Avg: 32m 55s | Max: 34m 41s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 06m | Avg: 33m 03s | Max: 35m 17s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 10m | Avg: 35m 14s | Max: 35m 25s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 14m | Avg: 24m 21s | Max: 36m 27s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 51m | Avg: 55m 41s | Max: 55m 50s | Hits: 262%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 26m | Avg: 48m 45s | Max: 57m 37s | Hits: 297%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 36m | Avg: 48m 12s | Max: 48m 34s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 44m | Avg: 27m 19s | Max: 33m 04s
      🟩 GCC                Pass: 100%/19  | Total:  9h 15m | Avg: 29m 13s | Max: 36m 27s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 17m | Avg: 51m 31s | Max: 57m 37s | Hits: 283%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 36m | Avg: 48m 12s | Max: 48m 34s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 19h 00m | Avg: 34m 34s | Max: 57m 37s | Hits: 262%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 53m | Avg: 23m 18s | Max: 55m 55s | Hits: 314%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 21h 32m | Avg: 34m 55s | Max: 57m 37s | Hits: 262%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 48m 31s | Avg: 16m 10s | Max: 32m 44s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 33m 17s | Avg: 11m 05s | Max: 11m 27s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 36m 27s | Avg: 36m 27s | Max: 36m 27s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 11h 54m | Avg: 35m 42s | Max: 57m 37s | Hits: 262%/5538  
      🟩 20                 Pass: 100%/21  | Total: 10h 21m | Avg: 29m 35s | Max: 55m 55s | Hits: 314%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 59s | Avg: 3m 29s | Max: 5m 00s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  5m 00s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
      🟩 Test               Pass: 100%/1   | Total:  5m 00s | Avg:  5m 00s | Max:  5m 00s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 32s | Avg: 26m 32s | Max: 26m 32s
    

👃 Inspect Changes

Modifications in project?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
|     | Thrust                  |
|     | CUDA Experimental       |
|     | python                  |
|     | CCCL C Parallel Library |
|     | Catch2Helper            |

Modifications in project or dependencies?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
|     | libcu++                 |
| +/- | CUB                     |
| +/- | Thrust                  |
|     | CUDA Experimental       |
| +/- | python                  |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper            |

🏃‍ Runner counts (total jobs: 90)

| #  | Runner                            |
|----|-----------------------------------|
| 65 | linux-amd64-cpu16                 |
| 9  | windows-amd64-cpu16               |
| 6  | linux-amd64-gpu-rtxa6000-latest-1 |
| 4  | linux-arm64-cpu16                 |
| 3  | linux-amd64-gpu-rtx4090-latest-1  |
| 2  | linux-amd64-gpu-rtx2080-latest-1  |
| 1  | linux-amd64-gpu-h100-latest-1     |

@gonidelis
Member

select.if
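
For reading the comparison table that follows, here is a minimal sketch of how the `Diff`, `%Diff`, and `Status` columns appear to relate to the `Ref`/`Cmp` columns. The function name `compare_row` is hypothetical, and the status rule (a change counts only if it exceeds the smaller of the two noise values) is an assumption that happens to match the rows shown here; it is not taken from the comparison tool's source.

```python
def compare_row(ref_time_us, cmp_time_us, ref_noise_pct, cmp_noise_pct):
    """Return (diff_us, pct_diff, status) for one benchmark row.

    Assumption: a row is FAST/SLOW only when |%Diff| exceeds the smaller
    of the two noise percentages; otherwise it is reported as SAME.
    """
    diff_us = cmp_time_us - ref_time_us          # negative means Cmp is faster
    pct_diff = 100.0 * diff_us / ref_time_us     # relative to the reference run
    threshold = min(ref_noise_pct, cmp_noise_pct)

    if abs(pct_diff) <= threshold:
        status = "SAME"
    elif diff_us < 0:
        status = "FAST"
    else:
        status = "SLOW"
    return diff_us, pct_diff, status


# Row: I8, MayAlias=false, 2^20 elements, entropy 1
print(compare_row(30.536, 24.299, 5.68, 5.32))
# Row: I8, MayAlias=false, 2^24 elements, entropy 0
print(compare_row(75.062, 73.349, 3.12, 2.31))
```

Because the table prints rounded times, values recomputed from it can differ from the printed `Diff` and `%Diff` in the last digit.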

|  T{ct}  |  OffsetT{ct}  |  MayAlias{ct}  |  Elements{io}  |  Entropy  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |        Diff |   %Diff |  Status  |
|---------|---------------|----------------|----------------|-----------|------------|-------------|------------|-------------|-------------|---------|----------|
|   I8    |      I64      |     false      |      2^16      |     1     |  21.173 us |       4.01% |  21.080 us |       3.57% |   -0.093 us |  -0.44% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |     1     |  30.536 us |       5.68% |  24.299 us |       5.32% |   -6.237 us | -20.42% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     1     |  87.911 us |       2.00% |  87.686 us |       1.71% |   -0.225 us |  -0.26% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |     1     |   1.109 ms |       0.53% |   1.093 ms |       0.38% |  -16.294 us |  -1.47% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |   0.544   |  21.868 us |       3.81% |  21.910 us |       3.61% |    0.042 us |   0.19% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |   0.544   |  31.009 us |       5.29% |  23.759 us |       5.14% |   -7.250 us | -23.38% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |   0.544   |  83.715 us |       2.12% |  83.252 us |       1.89% |   -0.462 us |  -0.55% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |   0.544   |   1.046 ms |       0.58% |   1.030 ms |       0.38% |  -16.230 us |  -1.55% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |     0     |  19.140 us |       2.77% |  18.927 us |       2.58% |   -0.213 us |  -1.11% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |     0     |  27.895 us |       6.23% |  23.160 us |       8.62% |   -4.735 us | -16.97% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     0     |  75.062 us |       3.12% |  73.349 us |       2.31% |   -1.714 us |  -2.28% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |     0     | 912.819 us |       0.88% | 878.143 us |       0.51% |  -34.676 us |  -3.80% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     1     |  23.796 us |       2.92% |  21.987 us |       4.76% |   -1.809 us |  -7.60% |   FAST   |
|   I8    |      I64      |      true      |      2^20      |     1     |  30.502 us |       4.85% |  26.156 us |       4.84% |   -4.346 us | -14.25% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     1     | 106.071 us |       2.42% |  91.920 us |       1.89% |  -14.151 us | -13.34% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     1     |   1.377 ms |       0.46% |   1.174 ms |       0.53% | -203.766 us | -14.79% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |   0.544   |  23.585 us |       4.03% |  21.910 us |       3.67% |   -1.675 us |  -7.10% |   FAST   |
|   I8    |      I64      |      true      |      2^20      |   0.544   |  29.970 us |       4.33% |  25.421 us |       4.29% |   -4.549 us | -15.18% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |   0.544   | 102.489 us |       2.48% |  86.484 us |       1.98% |  -16.005 us | -15.62% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |   0.544   |   1.317 ms |       0.48% |   1.097 ms |       0.50% | -219.283 us | -16.65% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     0     |  20.961 us |       2.93% |  19.309 us |       4.15% |   -1.652 us |  -7.88% |   FAST   |
|   I8    |      I64      |      true      |      2^20      |     0     |  29.793 us |       5.35% |  23.933 us |       6.73% |   -5.860 us | -19.67% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     0     |  93.309 us |       2.61% |  76.732 us |       2.44% |  -16.576 us | -17.77% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     0     |   1.184 ms |       0.61% | 965.130 us |       0.80% | -218.637 us | -18.47% |   FAST   |
|   I16   |      I64      |     false      |      2^16      |     1     |  22.610 us |       4.62% |  22.367 us |       5.44% |   -0.243 us |  -1.08% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |     1     |  29.717 us |       6.18% |  30.009 us |       6.91% |    0.292 us |   0.98% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |     1     |  83.966 us |       2.38% |  84.116 us |       2.55% |    0.150 us |   0.18% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |     1     |   1.029 ms |       0.72% |   1.029 ms |       0.75% |    0.009 us |   0.00% |   SAME   |
|   I16   |      I64      |     false      |      2^16      |   0.544   |  20.487 us |       7.78% |  20.535 us |       8.30% |    0.048 us |   0.24% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |   0.544   |  31.548 us |       6.17% |  31.717 us |       6.25% |    0.169 us |   0.54% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |   0.544   |  82.407 us |       2.55% |  82.603 us |       2.53% |    0.196 us |   0.24% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |   0.544   | 992.312 us |       0.75% | 992.481 us |       0.73% |    0.169 us |   0.02% |   SAME   |
|   I16   |      I64      |     false      |      2^16      |     0     |  21.514 us |       3.10% |  21.482 us |       4.24% |   -0.031 us |  -0.15% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |     0     |  28.701 us |       6.52% |  28.773 us |       6.46% |    0.072 us |   0.25% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |     0     |  72.920 us |       3.09% |  72.532 us |       3.09% |   -0.388 us |  -0.53% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |     0     | 845.092 us |       1.01% | 845.217 us |       1.02% |    0.125 us |   0.01% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |     1     |  22.536 us |       4.89% |  22.360 us |       4.99% |   -0.176 us |  -0.78% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     1     |  30.439 us |       5.30% |  30.103 us |       5.86% |   -0.336 us |  -1.10% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |     1     |  84.493 us |       2.47% |  84.216 us |       2.40% |   -0.277 us |  -0.33% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |     1     |   1.028 ms |       0.75% |   1.028 ms |       0.68% |   -0.661 us |  -0.06% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |   0.544   |  20.409 us |       6.95% |  20.790 us |       8.20% |    0.381 us |   1.87% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |   0.544   |  31.075 us |       5.33% |  30.684 us |       4.87% |   -0.391 us |  -1.26% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |   0.544   |  81.964 us |       2.48% |  82.333 us |       2.37% |    0.370 us |   0.45% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |   0.544   | 993.373 us |       0.72% | 993.349 us |       0.72% |   -0.024 us |  -0.00% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |     0     |  21.638 us |       4.21% |  21.893 us |       5.05% |    0.255 us |   1.18% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     0     |  28.890 us |       6.27% |  28.878 us |       6.65% |   -0.012 us |  -0.04% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |     0     |  73.011 us |       3.11% |  72.609 us |       3.23% |   -0.403 us |  -0.55% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |     0     | 844.967 us |       0.96% | 844.861 us |       1.01% |   -0.106 us |  -0.01% |   SAME   |
|   I32   |      I64      |     false      |      2^16      |     1     |  22.548 us |       6.71% |  20.786 us |       7.82% |   -1.763 us |  -7.82% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |     1     |  30.477 us |       4.51% |  25.216 us |       4.10% |   -5.261 us | -17.26% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     1     | 103.396 us |       2.15% |  94.374 us |       3.09% |   -9.021 us |  -8.73% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     1     |   1.323 ms |       0.59% |   1.214 ms |       0.88% | -109.362 us |  -8.27% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |   0.544   |  21.772 us |       3.08% |  20.741 us |       5.46% |   -1.031 us |  -4.74% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |   0.544   |  30.336 us |       4.58% |  25.184 us |       3.12% |   -5.151 us | -16.98% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |   0.544   | 101.407 us |       2.16% |  92.840 us |       3.21% |   -8.568 us |  -8.45% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |   0.544   |   1.297 ms |       0.58% |   1.179 ms |       0.92% | -118.498 us |  -9.14% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |     0     |  21.575 us |       3.59% |  20.571 us |       6.50% |   -1.004 us |  -4.65% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |     0     |  30.232 us |       5.27% |  24.796 us |       7.15% |   -5.436 us | -17.98% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     0     |  89.097 us |       2.65% |  81.421 us |       3.87% |   -7.676 us |  -8.62% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     0     |   1.106 ms |       0.72% | 962.433 us |       1.28% | -143.800 us | -13.00% |   FAST   |
|   I32   |      I64      |      true      |      2^16      |     1     |  22.318 us |       5.33% |  22.293 us |       5.93% |   -0.025 us |  -0.11% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |     1     |  30.738 us |       4.04% |  30.749 us |       4.15% |    0.011 us |   0.04% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |     1     | 103.340 us |       2.25% | 102.997 us |       1.98% |   -0.343 us |  -0.33% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |     1     |   1.323 ms |       0.60% |   1.323 ms |       0.62% |   -0.723 us |  -0.05% |   SAME   |
|   I32   |      I64      |      true      |      2^16      |   0.544   |  21.845 us |       4.46% |  21.530 us |       4.19% |   -0.315 us |  -1.44% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |   0.544   |  30.332 us |       4.58% |  30.199 us |       4.29% |   -0.133 us |  -0.44% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |   0.544   | 101.758 us |       2.10% | 101.043 us |       2.07% |   -0.715 us |  -0.70% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |   0.544   |   1.297 ms |       0.58% |   1.296 ms |       0.56% |   -0.707 us |  -0.05% |   SAME   |
|   I32   |      I64      |      true      |      2^16      |     0     |  21.622 us |       4.22% |  21.673 us |       3.64% |    0.051 us |   0.24% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |     0     |  29.622 us |       4.69% |  29.412 us |       4.84% |   -0.209 us |  -0.71% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |     0     |  89.265 us |       2.59% |  89.621 us |       2.76% |    0.356 us |   0.40% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |     0     |   1.106 ms |       0.75% |   1.106 ms |       0.75% |   -0.007 us |  -0.00% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |     1     |  21.885 us |       4.28% |  21.907 us |       4.05% |    0.022 us |   0.10% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     1     |  32.909 us |       4.11% |  32.978 us |       3.98% |    0.069 us |   0.21% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |     1     | 133.333 us |       2.78% | 133.613 us |       2.84% |    0.280 us |   0.21% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |     1     |   1.832 ms |       0.85% |   1.833 ms |       0.87% |    1.681 us |   0.09% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |   0.544   |  21.343 us |       4.10% |  21.379 us |       4.28% |    0.036 us |   0.17% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |   0.544   |  32.595 us |       5.15% |  32.386 us |       5.18% |   -0.209 us |  -0.64% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |   0.544   | 128.430 us |       2.38% | 128.353 us |       2.47% |   -0.076 us |  -0.06% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |   0.544   |   1.730 ms |       0.71% |   1.730 ms |       0.71% |   -0.387 us |  -0.02% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |     0     |  20.933 us |       6.31% |  20.903 us |       6.07% |   -0.030 us |  -0.14% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     0     |  30.973 us |       5.29% |  30.927 us |       5.29% |   -0.046 us |  -0.15% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |     0     | 106.235 us |       3.27% | 106.001 us |       2.96% |   -0.233 us |  -0.22% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |     0     |   1.362 ms |       1.01% |   1.361 ms |       0.98% |   -1.033 us |  -0.08% |   SAME   |
|   I64   |      I64      |      true      |      2^16      |     1     |  21.960 us |       3.82% |  22.006 us |       4.15% |    0.047 us |   0.21% |   SAME   |
|   I64   |      I64      |      true      |      2^20      |     1     |  32.849 us |       4.62% |  32.864 us |       4.84% |    0.015 us |   0.05% |   SAME   |
|   I64   |      I64      |      true      |      2^24      |     1     | 133.259 us |       2.67% | 133.538 us |       2.91% |    0.279 us |   0.21% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |     1     |   1.834 ms |       0.86% |   1.834 ms |       0.84% |    0.285 us |   0.02% |   SAME   |
|   I64   |      I64      |      true      |      2^16      |   0.544   |  21.382 us |       4.82% |  21.521 us |       4.69% |    0.139 us |   0.65% |   SAME   |
|   I64   |      I64      |      true      |      2^20      |   0.544   |  32.588 us |       4.25% |  32.563 us |       4.11% |   -0.026 us |  -0.08% |   SAME   |
|   I64   |      I64      |      true      |      2^24      |   0.544   | 128.186 us |       2.47% | 128.102 us |       2.41% |   -0.085 us |  -0.07% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |   0.544   |   1.730 ms |       0.75% |   1.733 ms |       0.75% |    3.084 us |   0.18% |   SAME   |
|   I64   |      I64      |      true      |      2^16      |     0     |  21.209 us |       5.37% |  21.299 us |       5.23% |    0.091 us |   0.43% |   SAME   |
|   I64   |      I64      |      true      |      2^20      |     0     |  31.023 us |       4.54% |  30.969 us |       4.51% |   -0.053 us |  -0.17% |   SAME   |
|   I64   |      I64      |      true      |      2^24      |     0     | 106.127 us |       3.04% | 106.371 us |       3.16% |    0.244 us |   0.23% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |     0     |   1.361 ms |       1.04% |   1.361 ms |       1.01% |    0.208 us |   0.02% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     1     |  24.159 us |       4.37% |  24.182 us |       4.41% |    0.024 us |   0.10% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     1     |  43.224 us |       3.86% |  43.386 us |       4.09% |    0.163 us |   0.38% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     1     | 278.994 us |       1.44% | 278.921 us |       1.43% |   -0.073 us |  -0.03% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     1     |   4.145 ms |       0.37% |   4.144 ms |       0.36% |   -1.362 us |  -0.03% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |   0.544   |  23.983 us |       6.02% |  24.086 us |       5.80% |    0.103 us |   0.43% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |   0.544   |  41.915 us |       4.34% |  41.929 us |       4.06% |    0.014 us |   0.03% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |   0.544   | 256.257 us |       1.35% | 256.463 us |       1.32% |    0.206 us |   0.08% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |   0.544   |   3.789 ms |       0.32% |   3.788 ms |       0.35% |   -0.299 us |  -0.01% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     0     |  22.777 us |       6.03% |  22.662 us |       6.30% |   -0.115 us |  -0.50% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     0     |  38.820 us |       4.38% |  38.652 us |       4.43% |   -0.169 us |  -0.43% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     0     | 210.166 us |       1.70% | 209.969 us |       1.64% |   -0.197 us |  -0.09% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     0     |   3.027 ms |       0.39% |   3.027 ms |       0.41% |   -0.207 us |  -0.01% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     1     |  25.869 us |       4.16% |  25.909 us |       4.35% |    0.040 us |   0.16% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     1     |  45.117 us |       3.32% |  45.186 us |       3.39% |    0.069 us |   0.15% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     1     | 321.006 us |       0.90% | 320.940 us |       0.98% |   -0.065 us |  -0.02% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     1     |   4.793 ms |       0.22% |   4.791 ms |       0.21% |   -1.458 us |  -0.03% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |   0.544   |  25.787 us |       5.24% |  25.697 us |       5.11% |   -0.090 us |  -0.35% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |   0.544   |  43.656 us |       3.55% |  43.676 us |       3.67% |    0.019 us |   0.04% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |   0.544   | 298.517 us |       0.97% | 298.451 us |       0.96% |   -0.066 us |  -0.02% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |   0.544   |   4.429 ms |       0.22% |   4.429 ms |       0.22% |   -0.013 us |  -0.00% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     0     |  24.110 us |       4.66% |  24.168 us |       4.60% |    0.058 us |   0.24% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     0     |  39.467 us |       3.39% |  39.572 us |       3.66% |    0.105 us |   0.27% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     0     | 250.085 us |       1.06% | 250.001 us |       1.04% |   -0.083 us |  -0.03% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     0     |   3.662 ms |       0.24% |   3.662 ms |       0.23% |   -0.716 us |  -0.02% |   SAME   |
|   F32   |      I64      |     false      |      2^16      |     1     |  22.811 us |       7.05% |  20.814 us |       7.59% |   -1.997 us |  -8.75% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |     1     |  30.676 us |       4.19% |  25.230 us |       2.93% |   -5.446 us | -17.75% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     1     | 103.441 us |       2.15% |  95.536 us |       2.96% |   -7.905 us |  -7.64% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     1     |   1.330 ms |       0.66% |   1.243 ms |       1.03% |  -87.486 us |  -6.58% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |   0.544   |  21.836 us |       5.67% |  20.581 us |       6.75% |   -1.255 us |  -5.75% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |   0.544   |  29.152 us |       5.54% |  24.542 us |       6.64% |   -4.610 us | -15.81% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |   0.544   |  91.148 us |       2.65% |  83.868 us |       3.70% |   -7.280 us |  -7.99% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |   0.544   |   1.123 ms |       0.78% |   1.011 ms |       1.30% | -111.701 us |  -9.95% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |     0     |  21.714 us |       3.59% |  20.849 us |       5.40% |   -0.864 us |  -3.98% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |     0     |  29.821 us |       5.65% |  24.856 us |       8.47% |   -4.965 us | -16.65% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     0     |  90.036 us |       2.58% |  80.629 us |       3.76% |   -9.408 us | -10.45% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     0     |   1.104 ms |       0.74% | 962.380 us |       1.24% | -141.921 us | -12.85% |   FAST   |
|   F32   |      I64      |      true      |      2^16      |     1     |  22.586 us |       7.13% |  22.247 us |       6.22% |   -0.339 us |  -1.50% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     1     |  30.980 us |       5.41% |  31.176 us |       5.20% |    0.196 us |   0.63% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |     1     | 102.890 us |       2.11% | 103.447 us |       2.21% |    0.558 us |   0.54% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |     1     |   1.331 ms |       0.60% |   1.330 ms |       0.61% |   -0.452 us |  -0.03% |   SAME   |
|   F32   |      I64      |      true      |      2^16      |   0.544   |  22.138 us |       6.37% |  22.471 us |       6.65% |    0.333 us |   1.50% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |   0.544   |  29.094 us |       4.35% |  29.127 us |       4.01% |    0.034 us |   0.12% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |   0.544   |  90.555 us |       2.69% |  91.082 us |       2.67% |    0.527 us |   0.58% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |   0.544   |   1.123 ms |       0.74% |   1.123 ms |       0.77% |    0.442 us |   0.04% |   SAME   |
|   F32   |      I64      |      true      |      2^16      |     0     |  21.099 us |       4.70% |  21.867 us |       5.34% |    0.768 us |   3.64% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     0     |  29.751 us |       4.70% |  29.649 us |       5.18% |   -0.102 us |  -0.34% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |     0     |  89.376 us |       2.62% |  89.249 us |       2.81% |   -0.127 us |  -0.14% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |     0     |   1.104 ms |       0.73% |   1.105 ms |       0.74% |    0.738 us |   0.07% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |     1     |  21.799 us |       3.94% |  21.720 us |       3.71% |   -0.080 us |  -0.37% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |     1     |  32.885 us |       4.06% |  33.009 us |       4.02% |    0.124 us |   0.38% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |     1     | 133.543 us |       2.99% | 133.420 us |       2.89% |   -0.123 us |  -0.09% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |     1     |   1.825 ms |       0.88% |   1.825 ms |       0.81% |    0.513 us |   0.03% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |   0.544   |  21.491 us |       4.21% |  21.638 us |       4.11% |    0.146 us |   0.68% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |   0.544   |  32.729 us |       4.85% |  32.508 us |       4.86% |   -0.221 us |  -0.67% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |   0.544   | 114.056 us |       2.76% | 114.130 us |       2.58% |    0.074 us |   0.06% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |   0.544   |   1.484 ms |       0.92% |   1.485 ms |       0.87% |    1.394 us |   0.09% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |     0     |  21.048 us |       5.56% |  21.057 us |       5.46% |    0.008 us |   0.04% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |     0     |  30.994 us |       4.65% |  30.899 us |       4.55% |   -0.095 us |  -0.31% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |     0     | 105.609 us |       3.30% | 105.542 us |       3.22% |   -0.067 us |  -0.06% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |     0     |   1.346 ms |       1.04% |   1.346 ms |       1.06% |   -0.088 us |  -0.01% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |     1     |  22.085 us |       3.44% |  21.909 us |       4.04% |   -0.176 us |  -0.80% |   SAME   |
|   F64   |      I64      |      true      |      2^20      |     1     |  33.287 us |       4.29% |  32.993 us |       4.21% |   -0.295 us |  -0.88% |   SAME   |
|   F64   |      I64      |      true      |      2^24      |     1     | 133.432 us |       3.00% | 133.444 us |       2.90% |    0.012 us |   0.01% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |     1     |   1.826 ms |       0.86% |   1.825 ms |       0.86% |   -0.870 us |  -0.05% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |   0.544   |  21.606 us |       4.29% |  21.332 us |       4.88% |   -0.274 us |  -1.27% |   SAME   |
|   F64   |      I64      |      true      |      2^20      |   0.544   |  32.795 us |       5.10% |  32.348 us |       4.81% |   -0.447 us |  -1.36% |   SAME   |
|   F64   |      I64      |      true      |      2^24      |   0.544   | 113.907 us |       2.73% | 114.292 us |       2.71% |    0.384 us |   0.34% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |   0.544   |   1.484 ms |       0.90% |   1.486 ms |       0.89% |    2.175 us |   0.15% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |     0     |  20.921 us |       5.07% |  20.861 us |       5.40% |   -0.060 us |  -0.29% |   SAME   |
|   F64   |      I64      |      true      |      2^20      |     0     |  30.814 us |       5.08% |  30.827 us |       5.31% |    0.012 us |   0.04% |   SAME   |
|   F64   |      I64      |      true      |      2^24      |     0     | 105.505 us |       3.25% | 105.455 us |       3.28% |   -0.051 us |  -0.05% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |     0     |   1.347 ms |       1.07% |   1.347 ms |       1.05% |    0.551 us |   0.04% |   SAME   |
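
For reading the last three columns: in the rows above, `Diff` is `Cmp Time - Ref Time` and `%Diff` is that difference relative to `Ref Time`. The `Status` values are consistent with a rule that reports `SAME` when `|%Diff|` does not exceed the smaller of the two noise figures, and `FAST` or `SLOW` otherwise. That rule is an assumption inferred from the rows shown here, not taken from the comparison tool's source, and the helper name `compare_row` is hypothetical. A minimal sketch:

```python
def compare_row(ref_time, ref_noise_pct, cmp_time, cmp_noise_pct):
    """Recompute Diff, %Diff and Status for one table row.

    Times must share one unit (e.g. both in us); noise values are percentages.
    The Status rule is an assumption inferred from the table, not a confirmed spec.
    """
    diff = cmp_time - ref_time
    pct_diff = 100.0 * diff / ref_time
    # Assumed threshold: the smaller of the two noise percentages.
    threshold = min(ref_noise_pct, cmp_noise_pct)
    if abs(pct_diff) <= threshold:
        status = "SAME"
    elif diff < 0:
        status = "FAST"
    else:
        status = "SLOW"
    return diff, pct_diff, status


# F32, MayAlias=false, 2^16 elements, column value 1 (times in us)
print(compare_row(22.811, 7.05, 20.814, 7.59))
# F32, MayAlias=true, 2^16 elements, column value 0 (times in us)
print(compare_row(21.099, 4.70, 21.867, 5.34))
```

Under this reading, a row with a sizeable `%Diff` can still be `SAME` when the measurement noise is larger, which is why the small 2^16 problem sizes are rarely flagged.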

@gonidelis
Member

select.unique

|  T{ct}  |  OffsetT{ct}  |  MayAlias{ct}  |  Elements{io}  |  MaxSegSize  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |        Diff |   %Diff |  Status  |
|---------|---------------|----------------|----------------|--------------|------------|-------------|------------|-------------|-------------|---------|----------|
|   I8    |      I64      |     false      |      2^16      |     2^1      |  21.181 us |       3.97% |  21.135 us |       3.68% |   -0.046 us |  -0.22% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |     2^1      |  30.715 us |       5.64% |  24.443 us |       6.16% |   -6.272 us | -20.42% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     2^1      |  85.744 us |       2.22% |  85.165 us |       1.88% |   -0.579 us |  -0.68% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |     2^1      |   1.072 ms |       0.59% |   1.059 ms |       0.44% |  -13.039 us |  -1.22% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |     2^4      |  21.554 us |       4.19% |  21.495 us |       4.38% |   -0.059 us |  -0.27% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |     2^4      |  30.838 us |       5.78% |  23.727 us |       5.30% |   -7.112 us | -23.06% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     2^4      |  74.995 us |       2.77% |  74.262 us |       2.15% |   -0.734 us |  -0.98% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |     2^4      | 913.101 us |       0.77% | 896.381 us |       0.52% |  -16.720 us |  -1.83% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |     2^8      |  22.040 us |       4.20% |  22.085 us |       2.66% |    0.045 us |   0.21% |   SAME   |
|   I8    |      I64      |     false      |      2^20      |     2^8      |  30.761 us |       6.06% |  24.720 us |       5.58% |   -6.040 us | -19.64% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     2^8      |  79.444 us |       2.81% |  77.790 us |       2.11% |   -1.654 us |  -2.08% |   SAME   |
|   I8    |      I64      |     false      |      2^28      |     2^8      | 965.941 us |       0.78% | 941.270 us |       0.49% |  -24.670 us |  -2.55% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     2^1      |  23.792 us |       4.24% |  22.750 us |       5.81% |   -1.043 us |  -4.38% |   FAST   |
|   I8    |      I64      |      true      |      2^20      |     2^1      |  33.165 us |       5.09% |  26.326 us |       5.04% |   -6.839 us | -20.62% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     2^1      | 104.193 us |       2.23% |  93.362 us |       1.88% |  -10.832 us | -10.40% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     2^1      |   1.337 ms |       0.50% |   1.195 ms |       0.56% | -141.693 us | -10.60% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     2^4      |  22.357 us |       4.97% |  23.961 us |       3.94% |    1.604 us |   7.18% |   SLOW   |
|   I8    |      I64      |      true      |      2^20      |     2^4      |  33.398 us |       5.59% |  25.475 us |       5.52% |   -7.924 us | -23.72% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     2^4      |  93.200 us |       2.64% |  82.111 us |       2.11% |  -11.089 us | -11.90% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     2^4      |   1.184 ms |       0.59% |   1.034 ms |       0.64% | -150.934 us | -12.74% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     2^8      |  23.550 us |       3.47% |  21.793 us |       3.52% |   -1.757 us |  -7.46% |   FAST   |
|   I8    |      I64      |      true      |      2^20      |     2^8      |  32.243 us |       4.55% |  24.514 us |       6.40% |   -7.730 us | -23.97% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     2^8      |  97.093 us |       2.35% |  83.379 us |       2.06% |  -13.714 us | -14.12% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     2^8      |   1.245 ms |       0.50% |   1.059 ms |       0.65% | -185.558 us | -14.91% |   FAST   |
|   I16   |      I64      |     false      |      2^16      |     2^1      |  22.207 us |       5.22% |  22.512 us |       5.12% |    0.304 us |   1.37% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |     2^1      |  31.442 us |       5.33% |  31.444 us |       4.84% |    0.003 us |   0.01% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |     2^1      |  84.519 us |       2.39% |  84.737 us |       2.58% |    0.219 us |   0.26% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |     2^1      |   1.038 ms |       0.76% |   1.038 ms |       0.70% |   -0.120 us |  -0.01% |   SAME   |
|   I16   |      I64      |     false      |      2^16      |     2^4      |  21.261 us |       3.97% |  21.200 us |       3.16% |   -0.062 us |  -0.29% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |     2^4      |  32.497 us |       5.06% |  32.490 us |       5.01% |   -0.008 us |  -0.02% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |     2^4      |  76.135 us |       2.81% |  76.706 us |       2.83% |    0.571 us |   0.75% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |     2^4      | 893.243 us |       0.79% | 893.449 us |       0.76% |    0.206 us |   0.02% |   SAME   |
|   I16   |      I64      |     false      |      2^16      |     2^8      |  22.267 us |       5.66% |  23.109 us |       7.41% |    0.843 us |   3.78% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |     2^8      |  30.964 us |       4.13% |  30.916 us |       4.06% |   -0.048 us |  -0.16% |   SAME   |
|   I16   |      I64      |     false      |      2^24      |     2^8      |  77.099 us |       2.95% |  76.958 us |       3.14% |   -0.140 us |  -0.18% |   SAME   |
|   I16   |      I64      |     false      |      2^28      |     2^8      | 907.496 us |       0.88% | 907.503 us |       0.88% |    0.007 us |   0.00% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |     2^1      |  22.045 us |       4.27% |  22.844 us |       5.30% |    0.799 us |   3.63% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     2^1      |  31.178 us |       4.47% |  31.164 us |       4.50% |   -0.014 us |  -0.05% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |     2^1      |  85.278 us |       2.64% |  85.079 us |       2.59% |   -0.199 us |  -0.23% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |     2^1      |   1.038 ms |       0.74% |   1.037 ms |       0.73% |   -0.213 us |  -0.02% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |     2^4      |  21.243 us |       3.49% |  21.505 us |       4.69% |    0.262 us |   1.23% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     2^4      |  31.759 us |       4.63% |  31.989 us |       4.38% |    0.230 us |   0.73% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |     2^4      |  76.071 us |       2.68% |  75.948 us |       2.52% |   -0.123 us |  -0.16% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |     2^4      | 893.980 us |       0.76% | 893.548 us |       0.77% |   -0.432 us |  -0.05% |   SAME   |
|   I16   |      I64      |      true      |      2^16      |     2^8      |  22.426 us |       6.81% |  22.348 us |       6.06% |   -0.078 us |  -0.35% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     2^8      |  30.825 us |       4.03% |  30.768 us |       4.35% |   -0.057 us |  -0.19% |   SAME   |
|   I16   |      I64      |      true      |      2^24      |     2^8      |  76.805 us |       2.98% |  76.999 us |       2.82% |    0.194 us |   0.25% |   SAME   |
|   I16   |      I64      |      true      |      2^28      |     2^8      | 907.525 us |       0.91% | 907.320 us |       0.89% |   -0.206 us |  -0.02% |   SAME   |
|   I32   |      I64      |     false      |      2^16      |     2^1      |  23.201 us |       6.97% |  21.661 us |       4.60% |   -1.541 us |  -6.64% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |     2^1      |  30.949 us |       4.56% |  25.531 us |       4.03% |   -5.418 us | -17.51% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     2^1      | 106.869 us |       1.79% |  96.777 us |       2.99% |  -10.092 us |  -9.44% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     2^1      |   1.402 ms |       0.55% |   1.247 ms |       0.91% | -155.713 us | -11.10% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |     2^4      |  22.456 us |       5.29% |  21.972 us |       4.27% |   -0.484 us |  -2.16% |   SAME   |
|   I32   |      I64      |     false      |      2^20      |     2^4      |  29.879 us |       4.99% |  25.336 us |       4.61% |   -4.543 us | -15.20% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     2^4      |  94.372 us |       2.66% |  85.092 us |       3.66% |   -9.280 us |  -9.83% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     2^4      |   1.178 ms |       0.69% |   1.042 ms |       1.21% | -136.210 us | -11.56% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |     2^8      |  21.574 us |       2.57% |  21.798 us |       3.94% |    0.224 us |   1.04% |   SAME   |
|   I32   |      I64      |     false      |      2^20      |     2^8      |  30.888 us |       5.43% |  24.986 us |       4.72% |   -5.902 us | -19.11% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     2^8      |  95.836 us |       2.53% |  85.472 us |       3.75% |  -10.364 us | -10.81% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     2^8      |   1.197 ms |       0.61% |   1.024 ms |       1.03% | -172.441 us | -14.41% |   FAST   |
|   I32   |      I64      |      true      |      2^16      |     2^1      |  22.797 us |       4.90% |  22.820 us |       5.03% |    0.024 us |   0.10% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |     2^1      |  31.034 us |       4.49% |  30.998 us |       4.04% |   -0.036 us |  -0.12% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |     2^1      | 107.172 us |       1.82% | 107.224 us |       1.94% |    0.052 us |   0.05% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |     2^1      |   1.402 ms |       0.57% |   1.402 ms |       0.49% |   -0.204 us |  -0.01% |   SAME   |
|   I32   |      I64      |      true      |      2^16      |     2^4      |  22.704 us |       6.89% |  22.231 us |       6.68% |   -0.473 us |  -2.09% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |     2^4      |  29.870 us |       5.11% |  29.859 us |       4.72% |   -0.011 us |  -0.04% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |     2^4      |  93.970 us |       2.56% |  93.782 us |       2.50% |   -0.188 us |  -0.20% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |     2^4      |   1.178 ms |       0.68% |   1.178 ms |       0.68% |    0.355 us |   0.03% |   SAME   |
|   I32   |      I64      |      true      |      2^16      |     2^8      |  21.651 us |       4.66% |  21.535 us |       4.27% |   -0.116 us |  -0.54% |   SAME   |
|   I32   |      I64      |      true      |      2^20      |     2^8      |  30.388 us |       5.03% |  30.223 us |       4.83% |   -0.165 us |  -0.54% |   SAME   |
|   I32   |      I64      |      true      |      2^24      |     2^8      |  95.422 us |       2.52% |  95.374 us |       2.29% |   -0.047 us |  -0.05% |   SAME   |
|   I32   |      I64      |      true      |      2^28      |     2^8      |   1.197 ms |       0.63% |   1.196 ms |       0.62% |   -0.665 us |  -0.06% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |     2^1      |  22.207 us |       4.85% |  22.298 us |       5.02% |    0.091 us |   0.41% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     2^1      |  33.375 us |       4.19% |  33.388 us |       4.13% |    0.013 us |   0.04% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |     2^1      | 140.052 us |       2.10% | 139.925 us |       2.07% |   -0.127 us |  -0.09% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |     2^1      |   1.884 ms |       0.58% |   1.884 ms |       0.61% |    0.168 us |   0.01% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |     2^4      |  21.589 us |       4.57% |  21.718 us |       5.03% |    0.129 us |   0.60% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     2^4      |  31.746 us |       4.62% |  31.823 us |       4.78% |    0.077 us |   0.24% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |     2^4      | 115.338 us |       2.48% | 115.303 us |       2.52% |   -0.035 us |  -0.03% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |     2^4      |   1.514 ms |       0.87% |   1.514 ms |       0.87% |   -0.147 us |  -0.01% |   SAME   |
|   I64   |      I64      |     false      |      2^16      |     2^8      |  21.486 us |       4.78% |  21.525 us |       4.73% |    0.039 us |   0.18% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     2^8      |  31.413 us |       5.04% |  31.231 us |       4.76% |   -0.182 us |  -0.58% |   SAME   |
|   I64   |      I64      |     false      |      2^24      |     2^8      | 112.637 us |       2.99% | 112.825 us |       3.14% |    0.188 us |   0.17% |   SAME   |
|   I64   |      I64      |     false      |      2^28      |     2^8      |   1.451 ms |       0.85% |   1.451 ms |       0.86% |   -0.518 us |  -0.04% |   SAME   |
|   I64   |      I64      |      true      |      2^16      |     2^1      |  22.395 us |       4.97% |  21.958 us |       4.29% |   -0.437 us |  -1.95% |   SAME   |
|   I64   |      I64      |      true      |      2^20      |     2^1      |  33.016 us |       4.61% |  28.005 us |       4.52% |   -5.011 us | -15.18% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |     2^1      | 139.714 us |       2.05% | 141.527 us |       2.80% |    1.813 us |   1.30% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |     2^1      |   1.884 ms |       0.64% |   1.910 ms |       0.74% |   26.690 us |   1.42% |   SLOW   |
|   I64   |      I64      |      true      |      2^16      |     2^4      |  22.120 us |       4.43% |  21.854 us |       3.41% |   -0.266 us |  -1.20% |   SAME   |
|   I64   |      I64      |      true      |      2^20      |     2^4      |  31.998 us |       4.27% |  27.188 us |       5.47% |   -4.810 us | -15.03% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |     2^4      | 116.018 us |       2.85% | 115.337 us |       3.13% |   -0.681 us |  -0.59% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |     2^4      |   1.515 ms |       0.84% |   1.497 ms |       0.91% |  -17.200 us |  -1.14% |   FAST   |
|   I64   |      I64      |      true      |      2^16      |     2^8      |  21.778 us |       3.14% |  19.724 us |       3.26% |   -2.054 us |  -9.43% |   FAST   |
|   I64   |      I64      |      true      |      2^20      |     2^8      |  31.899 us |       4.51% |  26.616 us |       5.85% |   -5.283 us | -16.56% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |     2^8      | 113.014 us |       3.04% | 112.265 us |       3.46% |   -0.749 us |  -0.66% |   SAME   |
|   I64   |      I64      |      true      |      2^28      |     2^8      |   1.450 ms |       0.87% |   1.441 ms |       0.92% |   -9.626 us |  -0.66% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     2^1      |  22.003 us |       5.34% |  21.956 us |       5.35% |   -0.047 us |  -0.21% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     2^1      |  43.033 us |       3.81% |  43.367 us |       3.94% |    0.335 us |   0.78% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     2^1      | 272.820 us |       1.13% | 273.085 us |       1.21% |    0.265 us |   0.10% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     2^1      |   4.057 ms |       0.31% |   4.057 ms |       0.33% |   -0.334 us |  -0.01% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     2^4      |  23.944 us |       5.90% |  23.939 us |       6.18% |   -0.005 us |  -0.02% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     2^4      |  39.926 us |       4.26% |  39.829 us |       4.42% |   -0.097 us |  -0.24% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     2^4      | 229.269 us |       1.54% | 229.218 us |       1.51% |   -0.051 us |  -0.02% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     2^4      |   3.339 ms |       0.34% |   3.338 ms |       0.35% |   -1.141 us |  -0.03% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     2^8      |  23.455 us |       5.84% |  23.331 us |       5.73% |   -0.124 us |  -0.53% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     2^8      |  39.798 us |       4.71% |  39.873 us |       4.59% |    0.074 us |   0.19% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     2^8      | 219.599 us |       1.58% | 219.407 us |       1.59% |   -0.192 us |  -0.09% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     2^8      |   3.168 ms |       0.37% |   3.169 ms |       0.36% |    1.288 us |   0.04% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     2^1      |  23.699 us |       5.07% |  23.678 us |       5.13% |   -0.022 us |  -0.09% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     2^1      |  45.111 us |       3.47% |  45.181 us |       3.63% |    0.070 us |   0.15% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     2^1      | 316.472 us |       0.88% | 316.227 us |       0.85% |   -0.244 us |  -0.08% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     2^1      |   4.718 ms |       0.20% |   4.718 ms |       0.20% |    0.513 us |   0.01% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     2^4      |  25.011 us |       5.69% |  25.007 us |       5.24% |   -0.004 us |  -0.02% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     2^4      |  41.094 us |       3.82% |  41.218 us |       3.65% |    0.124 us |   0.30% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     2^4      | 269.502 us |       0.91% | 269.596 us |       0.91% |    0.094 us |   0.03% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     2^4      |   3.985 ms |       0.22% |   3.985 ms |       0.20% |   -0.498 us |  -0.01% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     2^8      |  25.233 us |       5.72% |  24.861 us |       5.53% |   -0.372 us |  -1.47% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     2^8      |  40.121 us |       2.97% |  40.143 us |       3.12% |    0.022 us |   0.06% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     2^8      | 259.765 us |       0.99% | 260.285 us |       1.01% |    0.520 us |   0.20% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     2^8      |   3.814 ms |       0.20% |   3.814 ms |       0.21% |   -0.112 us |  -0.00% |   SAME   |
|   F32   |      I64      |     false      |      2^16      |     2^1      |  24.051 us |       6.85% |  22.272 us |       5.49% |   -1.779 us |  -7.40% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |     2^1      |  31.064 us |       4.70% |  25.660 us |       4.37% |   -5.404 us | -17.40% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     2^1      | 108.184 us |       1.82% |  97.401 us |       2.95% |  -10.783 us |  -9.97% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     2^1      |   1.231 ms |       0.63% |   1.085 ms |       1.14% | -146.245 us | -11.88% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |     2^4      |  22.423 us |       5.89% |  21.393 us |       3.72% |   -1.030 us |  -4.59% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |     2^4      |  30.481 us |       5.56% |  25.939 us |       4.56% |   -4.542 us | -14.90% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     2^4      |  94.288 us |       2.52% |  84.737 us |       3.05% |   -9.551 us | -10.13% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     2^4      |   1.202 ms |       0.59% |   1.078 ms |       1.09% | -123.613 us | -10.28% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |     2^8      |  21.673 us |       3.02% |  21.683 us |       2.84% |    0.009 us |   0.04% |   SAME   |
|   F32   |      I64      |     false      |      2^20      |     2^8      |  30.492 us |       5.64% |  24.933 us |       5.27% |   -5.558 us | -18.23% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     2^8      |  94.990 us |       2.57% |  84.428 us |       3.43% |  -10.562 us | -11.12% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     2^8      |   1.193 ms |       0.62% |   1.024 ms |       1.01% | -169.211 us | -14.18% |   FAST   |
|   F32   |      I64      |      true      |      2^16      |     2^1      |  23.175 us |       6.26% |  22.992 us |       6.62% |   -0.183 us |  -0.79% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     2^1      |  30.955 us |       4.84% |  31.063 us |       4.84% |    0.108 us |   0.35% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |     2^1      | 107.856 us |       1.82% | 108.168 us |       1.80% |    0.312 us |   0.29% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |     2^1      |   1.231 ms |       0.63% |   1.231 ms |       0.63% |   -0.159 us |  -0.01% |   SAME   |
|   F32   |      I64      |      true      |      2^16      |     2^4      |  23.084 us |       7.35% |  22.934 us |       7.31% |   -0.149 us |  -0.65% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     2^4      |  29.949 us |       4.80% |  29.914 us |       4.80% |   -0.035 us |  -0.12% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |     2^4      |  94.807 us |       2.55% |  94.730 us |       2.59% |   -0.078 us |  -0.08% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |     2^4      |   1.202 ms |       0.57% |   1.202 ms |       0.60% |   -0.132 us |  -0.01% |   SAME   |
|   F32   |      I64      |      true      |      2^16      |     2^8      |  21.664 us |       4.60% |  21.460 us |       4.07% |   -0.205 us |  -0.94% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     2^8      |  30.377 us |       4.60% |  30.482 us |       4.62% |    0.104 us |   0.34% |   SAME   |
|   F32   |      I64      |      true      |      2^24      |     2^8      |  95.290 us |       2.60% |  95.113 us |       2.51% |   -0.177 us |  -0.19% |   SAME   |
|   F32   |      I64      |      true      |      2^28      |     2^8      |   1.193 ms |       0.60% |   1.192 ms |       0.58% |   -0.552 us |  -0.05% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |     2^1      |  21.881 us |       3.86% |  22.099 us |       4.42% |    0.218 us |   1.00% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |     2^1      |  33.156 us |       4.00% |  33.257 us |       4.32% |    0.101 us |   0.31% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |     2^1      | 158.914 us |       1.34% | 159.211 us |       1.54% |    0.297 us |   0.19% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |     2^1      |   2.218 ms |       0.41% |   2.219 ms |       0.41% |    0.930 us |   0.04% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |     2^4      |  21.944 us |       4.30% |  22.109 us |       4.42% |    0.165 us |   0.75% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |     2^4      |  31.608 us |       4.67% |  31.654 us |       4.75% |    0.046 us |   0.14% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |     2^4      | 135.681 us |       1.89% | 135.748 us |       1.94% |    0.067 us |   0.05% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |     2^4      |   1.824 ms |       0.49% |   1.824 ms |       0.48% |   -0.009 us |  -0.00% |   SAME   |
|   F64   |      I64      |     false      |      2^16      |     2^8      |  21.712 us |       4.66% |  21.715 us |       4.63% |    0.003 us |   0.01% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |     2^8      |  31.631 us |       4.53% |  31.550 us |       4.74% |   -0.081 us |  -0.26% |   SAME   |
|   F64   |      I64      |     false      |      2^24      |     2^8      | 135.056 us |       1.98% | 135.020 us |       1.95% |   -0.036 us |  -0.03% |   SAME   |
|   F64   |      I64      |     false      |      2^28      |     2^8      |   1.794 ms |       0.47% |   1.793 ms |       0.48% |   -0.213 us |  -0.01% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |     2^1      |  22.064 us |       4.25% |  21.901 us |       3.51% |   -0.163 us |  -0.74% |   SAME   |
|   F64   |      I64      |      true      |      2^20      |     2^1      |  33.399 us |       3.97% |  28.159 us |       4.78% |   -5.240 us | -15.69% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |     2^1      | 159.237 us |       1.44% | 159.058 us |       1.66% |   -0.179 us |  -0.11% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |     2^1      |   2.218 ms |       0.41% |   2.220 ms |       0.44% |    2.145 us |   0.10% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |     2^4      |  21.652 us |       4.92% |  21.260 us |       3.25% |   -0.392 us |  -1.81% |   SAME   |
|   F64   |      I64      |      true      |      2^20      |     2^4      |  31.569 us |       4.82% |  26.466 us |       5.10% |   -5.103 us | -16.16% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |     2^4      | 134.889 us |       1.88% | 134.097 us |       2.08% |   -0.792 us |  -0.59% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |     2^4      |   1.824 ms |       0.48% |   1.817 ms |       0.49% |   -7.011 us |  -0.38% |   SAME   |
|   F64   |      I64      |      true      |      2^16      |     2^8      |  21.337 us |       3.98% |  19.544 us |       5.38% |   -1.793 us |  -8.40% |   FAST   |
|   F64   |      I64      |      true      |      2^20      |     2^8      |  31.205 us |       5.15% |  25.988 us |       4.88% |   -5.217 us | -16.72% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |     2^8      | 134.570 us |       1.94% | 133.627 us |       2.13% |   -0.942 us |  -0.70% |   SAME   |
|   F64   |      I64      |      true      |      2^28      |     2^8      |   1.794 ms |       0.47% |   1.789 ms |       0.48% |   -4.577 us |  -0.26% |   SAME   |

@gonidelis
Member

select.flagged

|  T{ct}  |  OffsetT{ct}  |  MayAlias{ct}  |  Elements{io}  |  Entropy  |   Ref Time |   Ref Noise |   Cmp Time |   Cmp Noise |        Diff |   %Diff |  Status  |
|---------|---------------|----------------|----------------|-----------|------------|-------------|------------|-------------|-------------|---------|----------|
|   I8    |      I64      |     false      |      2^16      |     1     |  23.128 us |       3.15% |  25.399 us |       3.86% |    2.272 us |   9.82% |   SLOW   |
|   I8    |      I64      |     false      |      2^20      |     1     |  29.817 us |       7.32% |  27.541 us |       4.02% |   -2.276 us |  -7.63% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     1     | 126.524 us |       1.07% |  99.822 us |       1.31% |  -26.702 us | -21.10% |   FAST   |
|   I8    |      I64      |     false      |      2^28      |     1     |   1.698 ms |       0.23% |   1.230 ms |       0.34% | -468.427 us | -27.58% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |   0.544   |  21.646 us |       3.67% |  25.748 us |       3.23% |    4.101 us |  18.95% |   SLOW   |
|   I8    |      I64      |     false      |      2^20      |   0.544   |  28.823 us |       5.46% |  26.642 us |       6.79% |   -2.180 us |  -7.56% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |   0.544   | 122.516 us |       1.13% |  97.080 us |       1.40% |  -25.437 us | -20.76% |   FAST   |
|   I8    |      I64      |     false      |      2^28      |   0.544   |   1.650 ms |       0.24% |   1.189 ms |       0.37% | -461.258 us | -27.96% |   FAST   |
|   I8    |      I64      |     false      |      2^16      |     0     |  23.380 us |       4.33% |  25.956 us |       3.27% |    2.576 us |  11.02% |   SLOW   |
|   I8    |      I64      |     false      |      2^20      |     0     |  30.916 us |       4.54% |  25.594 us |       4.14% |   -5.322 us | -17.22% |   FAST   |
|   I8    |      I64      |     false      |      2^24      |     0     | 112.751 us |       1.22% |  86.163 us |       1.53% |  -26.587 us | -23.58% |   FAST   |
|   I8    |      I64      |     false      |      2^28      |     0     |   1.494 ms |       0.30% |   1.032 ms |       0.47% | -461.222 us | -30.88% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     1     |  25.561 us |       3.64% |  30.064 us |       3.03% |    4.503 us |  17.62% |   SLOW   |
|   I8    |      I64      |      true      |      2^20      |     1     |  30.100 us |       5.07% |  31.050 us |       5.67% |    0.950 us |   3.16% |   SAME   |
|   I8    |      I64      |      true      |      2^24      |     1     | 147.969 us |       1.04% | 109.801 us |       1.63% |  -38.167 us | -25.79% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     1     |   2.044 ms |       0.19% |   1.362 ms |       0.22% | -681.634 us | -33.35% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |   0.544   |  23.846 us |       2.71% |  28.634 us |       4.05% |    4.788 us |  20.08% |   SLOW   |
|   I8    |      I64      |      true      |      2^20      |   0.544   |  29.893 us |       4.72% |  30.159 us |       5.43% |    0.266 us |   0.89% |   SAME   |
|   I8    |      I64      |      true      |      2^24      |   0.544   | 144.194 us |       0.97% | 107.225 us |       1.29% |  -36.968 us | -25.64% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |   0.544   |   1.990 ms |       0.20% |   1.326 ms |       0.26% | -664.314 us | -33.38% |   FAST   |
|   I8    |      I64      |      true      |      2^16      |     0     |  24.364 us |       3.07% |  27.872 us |       2.40% |    3.508 us |  14.40% |   SLOW   |
|   I8    |      I64      |      true      |      2^20      |     0     |  31.716 us |       4.32% |  28.219 us |       4.65% |   -3.496 us | -11.02% |   FAST   |
|   I8    |      I64      |      true      |      2^24      |     0     | 134.928 us |       0.94% |  95.733 us |       1.23% |  -39.195 us | -29.05% |   FAST   |
|   I8    |      I64      |      true      |      2^28      |     0     |   1.847 ms |       0.21% |   1.160 ms |       0.31% | -687.885 us | -37.23% |   FAST   |
|   I16   |      I64      |     false      |      2^16      |     1     |  24.261 us |       3.80% |  21.983 us |       4.29% |   -2.277 us |  -9.39% |   FAST   |
|   I16   |      I64      |     false      |      2^20      |     1     |  32.274 us |       4.79% |  26.438 us |       6.53% |   -5.837 us | -18.08% |   FAST   |
|   I16   |      I64      |     false      |      2^24      |     1     | 137.177 us |       1.12% | 122.920 us |       1.40% |  -14.256 us | -10.39% |   FAST   |
|   I16   |      I64      |     false      |      2^28      |     1     |   1.863 ms |       0.20% |   1.673 ms |       0.25% | -189.942 us | -10.20% |   FAST   |
|   I16   |      I64      |     false      |      2^16      |   0.544   |  21.384 us |       3.82% |  21.486 us |       3.45% |    0.102 us |   0.48% |   SAME   |
|   I16   |      I64      |     false      |      2^20      |   0.544   |  29.789 us |       5.77% |  26.281 us |       4.61% |   -3.508 us | -11.77% |   FAST   |
|   I16   |      I64      |     false      |      2^24      |   0.544   | 129.332 us |       1.11% | 121.886 us |       1.35% |   -7.445 us |  -5.76% |   FAST   |
|   I16   |      I64      |     false      |      2^28      |   0.544   |   1.755 ms |       0.24% |   1.648 ms |       0.26% | -107.216 us |  -6.11% |   FAST   |
|   I16   |      I64      |     false      |      2^16      |     0     |  23.027 us |       4.58% |  21.720 us |       5.70% |   -1.307 us |  -5.68% |   FAST   |
|   I16   |      I64      |     false      |      2^20      |     0     |  29.023 us |       6.00% |  25.259 us |       4.31% |   -3.764 us | -12.97% |   FAST   |
|   I16   |      I64      |     false      |      2^24      |     0     | 117.118 us |       1.19% | 110.988 us |       1.80% |   -6.130 us |  -5.23% |   FAST   |
|   I16   |      I64      |     false      |      2^28      |     0     |   1.576 ms |       0.30% |   1.479 ms |       0.37% |  -96.829 us |  -6.14% |   FAST   |
|   I16   |      I64      |      true      |      2^16      |     1     |  25.940 us |       2.33% |  24.731 us |       4.79% |   -1.209 us |  -4.66% |   FAST   |
|   I16   |      I64      |      true      |      2^20      |     1     |  32.955 us |       3.91% |  26.755 us |       4.68% |   -6.200 us | -18.81% |   FAST   |
|   I16   |      I64      |      true      |      2^24      |     1     | 155.987 us |       0.82% | 109.282 us |       2.13% |  -46.705 us | -29.94% |   FAST   |
|   I16   |      I64      |      true      |      2^28      |     1     |   2.185 ms |       0.15% |   1.437 ms |       0.56% | -747.702 us | -34.22% |   FAST   |
|   I16   |      I64      |      true      |      2^16      |   0.544   |  23.614 us |       2.87% |  23.643 us |       3.96% |    0.029 us |   0.12% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |   0.544   |  31.188 us |       3.97% |  27.371 us |       4.76% |   -3.817 us | -12.24% |   FAST   |
|   I16   |      I64      |      true      |      2^24      |   0.544   | 150.900 us |       0.85% | 106.865 us |       2.23% |  -44.034 us | -29.18% |   FAST   |
|   I16   |      I64      |      true      |      2^28      |   0.544   |   2.075 ms |       0.18% |   1.391 ms |       0.58% | -683.390 us | -32.94% |   FAST   |
|   I16   |      I64      |      true      |      2^16      |     0     |  24.613 us |       5.56% |  23.908 us |       4.17% |   -0.705 us |  -2.86% |   SAME   |
|   I16   |      I64      |      true      |      2^20      |     0     |  29.339 us |       3.79% |  26.114 us |       5.62% |   -3.224 us | -10.99% |   FAST   |
|   I16   |      I64      |      true      |      2^24      |     0     | 137.787 us |       0.95% |  94.901 us |       2.71% |  -42.886 us | -31.12% |   FAST   |
|   I16   |      I64      |      true      |      2^28      |     0     |   1.894 ms |       0.18% |   1.221 ms |       0.76% | -672.818 us | -35.53% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |     1     |  23.351 us |       5.27% |  21.969 us |       5.02% |   -1.381 us |  -5.92% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |     1     |  31.316 us |       5.41% |  26.065 us |       5.37% |   -5.251 us | -16.77% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     1     | 125.211 us |       1.34% | 117.432 us |       1.59% |   -7.778 us |  -6.21% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     1     |   1.689 ms |       0.28% |   1.555 ms |       0.39% | -134.513 us |  -7.96% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |   0.544   |  23.494 us |       3.76% |  21.841 us |       2.93% |   -1.653 us |  -7.04% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |   0.544   |  31.193 us |       5.63% |  25.869 us |       4.97% |   -5.324 us | -17.07% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |   0.544   | 122.524 us |       1.47% | 114.101 us |       1.67% |   -8.423 us |  -6.87% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |   0.544   |   1.642 ms |       0.29% |   1.499 ms |       0.38% | -142.689 us |  -8.69% |   FAST   |
|   I32   |      I64      |     false      |      2^16      |     0     |  22.872 us |       5.16% |  21.791 us |       3.79% |   -1.081 us |  -4.73% |   FAST   |
|   I32   |      I64      |     false      |      2^20      |     0     |  29.809 us |       5.82% |  24.524 us |       7.38% |   -5.285 us | -17.73% |   FAST   |
|   I32   |      I64      |     false      |      2^24      |     0     | 111.248 us |       1.87% | 101.057 us |       2.95% |  -10.191 us |  -9.16% |   FAST   |
|   I32   |      I64      |     false      |      2^28      |     0     |   1.444 ms |       0.42% |   1.287 ms |       0.70% | -157.387 us | -10.90% |   FAST   |
|   I32   |      I64      |      true      |      2^16      |     1     |  24.990 us |       4.51% |  23.432 us |       5.06% |   -1.558 us |  -6.24% |   FAST   |
|   I32   |      I64      |      true      |      2^20      |     1     |  33.656 us |       4.93% |  27.870 us |       4.38% |   -5.786 us | -17.19% |   FAST   |
|   I32   |      I64      |      true      |      2^24      |     1     | 145.440 us |       1.05% | 117.951 us |       2.74% |  -27.489 us | -18.90% |   FAST   |
|   I32   |      I64      |      true      |      2^28      |     1     |   2.007 ms |       0.21% |   1.570 ms |       0.74% | -437.072 us | -21.78% |   FAST   |
|   I32   |      I64      |      true      |      2^16      |   0.544   |  24.386 us |       5.74% |  22.762 us |       7.35% |   -1.623 us |  -6.66% |   FAST   |
|   I32   |      I64      |      true      |      2^20      |   0.544   |  33.393 us |       4.59% |  27.869 us |       4.40% |   -5.524 us | -16.54% |   FAST   |
|   I32   |      I64      |      true      |      2^24      |   0.544   | 141.988 us |       1.09% | 114.900 us |       2.62% |  -27.088 us | -19.08% |   FAST   |
|   I32   |      I64      |      true      |      2^28      |   0.544   |   1.952 ms |       0.24% |   1.520 ms |       0.75% | -432.470 us | -22.15% |   FAST   |
|   I32   |      I64      |      true      |      2^16      |     0     |  24.268 us |       2.53% |  21.736 us |       3.16% |   -2.532 us | -10.43% |   FAST   |
|   I32   |      I64      |      true      |      2^20      |     0     |  32.429 us |       4.98% |  26.294 us |       5.09% |   -6.135 us | -18.92% |   FAST   |
|   I32   |      I64      |      true      |      2^24      |     0     | 130.634 us |       1.16% | 102.884 us |       3.94% |  -27.750 us | -21.24% |   FAST   |
|   I32   |      I64      |      true      |      2^28      |     0     |   1.760 ms |       0.24% |   1.297 ms |       1.17% | -463.267 us | -26.32% |   FAST   |
|   I64   |      I64      |     false      |      2^16      |     1     |  23.767 us |       3.89% |  21.908 us |       3.80% |   -1.859 us |  -7.82% |   FAST   |
|   I64   |      I64      |     false      |      2^20      |     1     |  32.747 us |       4.84% |  28.238 us |       5.32% |   -4.509 us | -13.77% |   FAST   |
|   I64   |      I64      |     false      |      2^24      |     1     | 174.084 us |       1.85% | 160.595 us |       2.23% |  -13.490 us |  -7.75% |   FAST   |
|   I64   |      I64      |     false      |      2^28      |     1     |   2.431 ms |       0.43% |   2.260 ms |       0.76% | -171.725 us |  -7.06% |   FAST   |
|   I64   |      I64      |     false      |      2^16      |   0.544   |  21.716 us |       4.41% |  21.700 us |       4.28% |   -0.016 us |  -0.07% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |   0.544   |  32.804 us |       4.80% |  28.281 us |       4.83% |   -4.523 us | -13.79% |   FAST   |
|   I64   |      I64      |     false      |      2^24      |   0.544   | 168.377 us |       1.54% | 155.836 us |       2.01% |  -12.541 us |  -7.45% |   FAST   |
|   I64   |      I64      |     false      |      2^28      |   0.544   |   2.345 ms |       0.38% |   2.165 ms |       0.68% | -179.813 us |  -7.67% |   FAST   |
|   I64   |      I64      |     false      |      2^16      |     0     |  22.006 us |       6.38% |  20.722 us |       7.03% |   -1.284 us |  -5.83% |   SAME   |
|   I64   |      I64      |     false      |      2^20      |     0     |  31.385 us |       5.75% |  26.801 us |       5.33% |   -4.584 us | -14.61% |   FAST   |
|   I64   |      I64      |     false      |      2^24      |     0     | 144.279 us |       2.09% | 135.039 us |       3.12% |   -9.240 us |  -6.40% |   FAST   |
|   I64   |      I64      |     false      |      2^28      |     0     |   1.953 ms |       0.50% |   1.823 ms |       0.99% | -130.028 us |  -6.66% |   FAST   |
|   I64   |      I64      |      true      |      2^16      |     1     |  24.915 us |       4.56% |  23.941 us |       2.80% |   -0.974 us |  -3.91% |   FAST   |
|   I64   |      I64      |      true      |      2^20      |     1     |  35.571 us |       4.16% |  29.921 us |       3.99% |   -5.650 us | -15.89% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |     1     | 199.174 us |       0.91% | 169.815 us |       2.54% |  -29.359 us | -14.74% |   FAST   |
|   I64   |      I64      |      true      |      2^28      |     1     |   2.847 ms |       0.23% |   2.364 ms |       0.61% | -482.797 us | -16.96% |   FAST   |
|   I64   |      I64      |      true      |      2^16      |   0.544   |  23.796 us |       3.69% |  21.867 us |       3.43% |   -1.929 us |  -8.11% |   FAST   |
|   I64   |      I64      |      true      |      2^20      |   0.544   |  34.835 us |       5.08% |  29.064 us |       6.45% |   -5.771 us | -16.57% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |   0.544   | 193.860 us |       1.05% | 157.705 us |       2.47% |  -36.155 us | -18.65% |   FAST   |
|   I64   |      I64      |      true      |      2^28      |   0.544   |   2.759 ms |       0.24% |   2.164 ms |       0.59% | -594.543 us | -21.55% |   FAST   |
|   I64   |      I64      |      true      |      2^16      |     0     |  23.795 us |       3.45% |  21.857 us |       4.33% |   -1.938 us |  -8.15% |   FAST   |
|   I64   |      I64      |      true      |      2^20      |     0     |  32.815 us |       4.44% |  27.754 us |       6.21% |   -5.060 us | -15.42% |   FAST   |
|   I64   |      I64      |      true      |      2^24      |     0     | 169.100 us |       1.09% | 134.744 us |       3.70% |  -34.356 us | -20.32% |   FAST   |
|   I64   |      I64      |      true      |      2^28      |     0     |   2.374 ms |       0.21% |   1.797 ms |       1.01% | -576.971 us | -24.31% |   FAST   |
|  I128   |      I64      |     false      |      2^16      |     1     |  26.189 us |       4.71% |  25.988 us |       5.19% |   -0.201 us |  -0.77% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     1     |  37.498 us |       4.43% |  37.626 us |       4.83% |    0.129 us |   0.34% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     1     | 293.365 us |       2.05% | 294.223 us |       1.96% |    0.857 us |   0.29% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     1     |   4.442 ms |       0.60% |   4.442 ms |       0.59% |   -0.297 us |  -0.01% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |   0.544   |  25.363 us |       6.54% |  25.269 us |       6.24% |   -0.094 us |  -0.37% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |   0.544   |  36.349 us |       4.76% |  36.135 us |       4.79% |   -0.214 us |  -0.59% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |   0.544   | 274.360 us |       2.07% | 274.828 us |       1.96% |    0.468 us |   0.17% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |   0.544   |   4.125 ms |       0.65% |   4.125 ms |       0.63% |    0.054 us |   0.00% |   SAME   |
|  I128   |      I64      |     false      |      2^16      |     0     |  25.438 us |       4.96% |  25.370 us |       5.57% |   -0.068 us |  -0.27% |   SAME   |
|  I128   |      I64      |     false      |      2^20      |     0     |  33.175 us |       5.61% |  33.634 us |       5.61% |    0.459 us |   1.38% |   SAME   |
|  I128   |      I64      |     false      |      2^24      |     0     | 228.188 us |       2.28% | 228.894 us |       2.41% |    0.706 us |   0.31% |   SAME   |
|  I128   |      I64      |     false      |      2^28      |     0     |   3.354 ms |       0.77% |   3.354 ms |       0.75% |    0.362 us |   0.01% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     1     |  26.656 us |       4.87% |  26.440 us |       4.79% |   -0.216 us |  -0.81% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     1     |  42.614 us |       4.27% |  43.013 us |       4.32% |    0.400 us |   0.94% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     1     | 348.983 us |       1.17% | 349.182 us |       1.16% |    0.199 us |   0.06% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     1     |   5.264 ms |       0.32% |   5.263 ms |       0.30% |   -1.404 us |  -0.03% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |   0.544   |  26.352 us |       5.09% |  26.157 us |       5.11% |   -0.195 us |  -0.74% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |   0.544   |  41.106 us |       4.11% |  41.195 us |       4.65% |    0.089 us |   0.22% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |   0.544   | 327.496 us |       1.14% | 327.833 us |       1.12% |    0.336 us |   0.10% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |   0.544   |   4.908 ms |       0.30% |   4.909 ms |       0.30% |    0.555 us |   0.01% |   SAME   |
|  I128   |      I64      |      true      |      2^16      |     0     |  25.467 us |       5.21% |  25.693 us |       5.20% |    0.226 us |   0.89% |   SAME   |
|  I128   |      I64      |      true      |      2^20      |     0     |  38.408 us |       4.26% |  38.350 us |       3.92% |   -0.059 us |  -0.15% |   SAME   |
|  I128   |      I64      |      true      |      2^24      |     0     | 280.739 us |       1.21% | 281.034 us |       1.24% |    0.295 us |   0.11% |   SAME   |
|  I128   |      I64      |      true      |      2^28      |     0     |   4.159 ms |       0.32% |   4.158 ms |       0.31% |   -0.844 us |  -0.02% |   SAME   |
|   F32   |      I64      |     false      |      2^16      |     1     |  23.549 us |       6.22% |  22.188 us |       5.99% |   -1.361 us |  -5.78% |   SAME   |
|   F32   |      I64      |     false      |      2^20      |     1     |  31.090 us |       4.91% |  26.087 us |       5.74% |   -5.004 us | -16.09% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     1     | 125.726 us |       1.37% | 117.206 us |       1.58% |   -8.520 us |  -6.78% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     1     |   1.689 ms |       0.27% |   1.555 ms |       0.39% | -134.612 us |  -7.97% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |   0.544   |  23.386 us |       5.33% |  21.742 us |       4.59% |   -1.645 us |  -7.03% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |   0.544   |  31.115 us |       5.87% |  25.874 us |       5.36% |   -5.240 us | -16.84% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |   0.544   | 123.082 us |       1.41% | 114.263 us |       1.62% |   -8.819 us |  -7.17% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |   0.544   |   1.643 ms |       0.28% |   1.499 ms |       0.37% | -143.393 us |  -8.73% |   FAST   |
|   F32   |      I64      |     false      |      2^16      |     0     |  22.939 us |       5.40% |  21.745 us |       2.72% |   -1.194 us |  -5.21% |   FAST   |
|   F32   |      I64      |     false      |      2^20      |     0     |  30.314 us |       6.45% |  24.334 us |       7.50% |   -5.979 us | -19.73% |   FAST   |
|   F32   |      I64      |     false      |      2^24      |     0     | 110.939 us |       1.86% | 101.927 us |       2.92% |   -9.012 us |  -8.12% |   FAST   |
|   F32   |      I64      |     false      |      2^28      |     0     |   1.445 ms |       0.41% |   1.286 ms |       0.74% | -158.721 us | -10.99% |   FAST   |
|   F32   |      I64      |      true      |      2^16      |     1     |  24.559 us |       5.72% |  23.222 us |       7.79% |   -1.337 us |  -5.44% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |     1     |  33.165 us |       5.21% |  27.168 us |       4.70% |   -5.997 us | -18.08% |   FAST   |
|   F32   |      I64      |      true      |      2^24      |     1     | 145.257 us |       1.04% | 118.020 us |       2.46% |  -27.237 us | -18.75% |   FAST   |
|   F32   |      I64      |      true      |      2^28      |     1     |   2.007 ms |       0.21% |   1.570 ms |       0.72% | -437.040 us | -21.77% |   FAST   |
|   F32   |      I64      |      true      |      2^16      |   0.544   |  24.217 us |       5.16% |  23.106 us |       8.10% |   -1.111 us |  -4.59% |   SAME   |
|   F32   |      I64      |      true      |      2^20      |   0.544   |  33.079 us |       4.53% |  27.850 us |       4.30% |   -5.229 us | -15.81% |   FAST   |
|   F32   |      I64      |      true      |      2^24      |   0.544   | 142.710 us |       1.16% | 115.061 us |       2.53% |  -27.649 us | -19.37% |   FAST   |
|   F32   |      I64      |      true      |      2^28      |   0.544   |   1.952 ms |       0.22% |   1.520 ms |       0.72% | -432.088 us | -22.13% |   FAST   |
|   F32   |      I64      |      true      |      2^16      |     0     |  23.950 us |       4.12% |  21.542 us |       3.93% |   -2.407 us | -10.05% |   FAST   |
|   F32   |      I64      |      true      |      2^20      |     0     |  32.379 us |       5.06% |  26.263 us |       5.63% |   -6.116 us | -18.89% |   FAST   |
|   F32   |      I64      |      true      |      2^24      |     0     | 130.412 us |       1.22% | 102.523 us |       3.98% |  -27.889 us | -21.39% |   FAST   |
|   F32   |      I64      |      true      |      2^28      |     0     |   1.760 ms |       0.24% |   1.296 ms |       1.20% | -464.201 us | -26.37% |   FAST   |
|   F64   |      I64      |     false      |      2^16      |     1     |  23.676 us |       3.26% |  21.750 us |       2.94% |   -1.926 us |  -8.13% |   FAST   |
|   F64   |      I64      |     false      |      2^20      |     1     |  33.072 us |       5.16% |  27.950 us |       5.27% |   -5.122 us | -15.49% |   FAST   |
|   F64   |      I64      |     false      |      2^24      |     1     | 173.220 us |       1.77% | 161.564 us |       2.25% |  -11.656 us |  -6.73% |   FAST   |
|   F64   |      I64      |     false      |      2^28      |     1     |   2.432 ms |       0.45% |   2.262 ms |       0.80% | -170.167 us |  -7.00% |   FAST   |
|   F64   |      I64      |     false      |      2^16      |   0.544   |  22.095 us |       3.37% |  21.751 us |       3.96% |   -0.344 us |  -1.56% |   SAME   |
|   F64   |      I64      |     false      |      2^20      |   0.544   |  33.188 us |       4.74% |  28.209 us |       4.22% |   -4.979 us | -15.00% |   FAST   |
|   F64   |      I64      |     false      |      2^24      |   0.544   | 167.866 us |       1.52% | 155.458 us |       2.04% |  -12.408 us |  -7.39% |   FAST   |
|   F64   |      I64      |     false      |      2^28      |   0.544   |   2.344 ms |       0.36% |   2.165 ms |       0.67% | -179.069 us |  -7.64% |   FAST   |
|   F64   |      I64      |     false      |      2^16      |     0     |  22.585 us |       4.42% |  20.783 us |       6.58% |   -1.802 us |  -7.98% |   FAST   |
|   F64   |      I64      |     false      |      2^20      |     0     |  30.796 us |       5.71% |  26.610 us |       6.23% |   -4.186 us | -13.59% |   FAST   |
|   F64   |      I64      |     false      |      2^24      |     0     | 144.284 us |       2.06% | 134.653 us |       3.00% |   -9.630 us |  -6.67% |   FAST   |
|   F64   |      I64      |     false      |      2^28      |     0     |   1.953 ms |       0.49% |   1.824 ms |       0.97% | -129.133 us |  -6.61% |   FAST   |
|   F64   |      I64      |      true      |      2^16      |     1     |  24.813 us |       4.79% |  23.936 us |       3.34% |   -0.877 us |  -3.53% |   FAST   |
|   F64   |      I64      |      true      |      2^20      |     1     |  35.252 us |       4.97% |  30.065 us |       5.93% |   -5.187 us | -14.71% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |     1     | 199.216 us |       1.05% | 169.777 us |       2.45% |  -29.439 us | -14.78% |   FAST   |
|   F64   |      I64      |      true      |      2^28      |     1     |   2.847 ms |       0.23% |   2.367 ms |       0.65% | -479.958 us | -16.86% |   FAST   |
|   F64   |      I64      |      true      |      2^16      |   0.544   |  23.522 us |       4.14% |  21.512 us |       4.67% |   -2.010 us |  -8.54% |   FAST   |
|   F64   |      I64      |      true      |      2^20      |   0.544   |  35.166 us |       4.31% |  29.137 us |       4.42% |   -6.029 us | -17.15% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |   0.544   | 193.907 us |       1.00% | 157.819 us |       2.52% |  -36.088 us | -18.61% |   FAST   |
|   F64   |      I64      |      true      |      2^28      |   0.544   |   2.759 ms |       0.23% |   2.165 ms |       0.60% | -594.223 us | -21.53% |   FAST   |
|   F64   |      I64      |      true      |      2^16      |     0     |  23.593 us |       3.90% |  21.468 us |       3.48% |   -2.126 us |  -9.01% |   FAST   |
|   F64   |      I64      |      true      |      2^20      |     0     |  33.946 us |       4.11% |  27.747 us |       4.68% |   -6.200 us | -18.26% |   FAST   |
|   F64   |      I64      |      true      |      2^24      |     0     | 169.143 us |       1.05% | 135.948 us |       4.03% |  -33.194 us | -19.63% |   FAST   |
|   F64   |      I64      |      true      |      2^28      |     0     |   2.374 ms |       0.22% |   1.797 ms |       0.97% | -577.067 us | -24.31% |   FAST   |
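
For reading the Diff, %Diff and Status columns above, here is a minimal sketch of how they appear to be derived. It assumes the comparison rule is "call it SAME when the relative difference is within the smaller of the two noise values, otherwise FAST or SLOW by sign", which matches every row in this table but is not taken from the tool's source. The function name `classify` and its parameters are hypothetical.

```python
def classify(ref_time, ref_noise, cmp_time, cmp_noise):
    """Return (diff, frac_diff, status) for one benchmark row.

    Times share one unit (e.g. microseconds); noise values are fractions
    (3.15% -> 0.0315). The SAME/FAST/SLOW rule is an assumption inferred
    from the table, not a confirmed implementation.
    """
    diff = cmp_time - ref_time           # negative means the new policy is faster
    frac_diff = diff / ref_time          # relative change against the reference
    min_noise = min(ref_noise, cmp_noise)

    if abs(frac_diff) <= min_noise:      # change is within measurement noise
        status = "SAME"
    elif diff < 0:
        status = "FAST"
    else:
        status = "SLOW"
    return diff, frac_diff, status


# Rows taken from the select.flagged table above.
rows = [
    ("I8  2^24 entropy 1", 126.524, 0.0107, 99.822, 0.0131),
    ("I8  2^16 entropy 1", 23.128, 0.0315, 25.399, 0.0386),
    ("I64 2^16 entropy 0", 22.006, 0.0638, 20.722, 0.0703),
]
for label, ref_t, ref_n, cmp_t, cmp_n in rows:
    diff, frac, status = classify(ref_t, ref_n, cmp_t, cmp_n)
    print(f"{label}: {diff:+.3f} us, {frac:+.2%}, {status}")
```

Under this rule the third row stays SAME even though it is about 5.8% faster, because both runs report more than 6% noise.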

Contributor

github-actions bot commented Feb 4, 2025

🟩 CI finished in 1h 37m: Pass: 100%/90 | Total: 2d 14h | Avg: 41m 38s | Max: 1h 16m | Hits: 304%/12730
  • 🟩 cub: Pass: 100%/44 | Total: 1d 14h | Avg: 52m 52s | Max: 1h 16m | Hits: 359%/3500

    🟩 cpu
      🟩 amd64              Pass: 100%/42  | Total:  1d 12h | Avg: 52m 24s | Max:  1h 16m | Hits: 359%/3500  
      🟩 arm64              Pass: 100%/2   | Total:  2h 05m | Avg:  1h 02m | Max:  1h 03m
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  5h 05m | Avg:  1h 01m | Max:  1h 10m | Hits: 360%/875   
      🟩 12.5               Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 12m
      🟩 12.8               Pass: 100%/37  | Total:  1d 07h | Avg: 50m 53s | Max:  1h 16m | Hits: 359%/2625  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 57m | Avg: 58m 47s | Max:  1h 01m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 05m | Avg:  1h 01m | Max:  1h 10m | Hits: 360%/875   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 12m
      🟩 nvcc12.8           Pass: 100%/35  | Total:  1d 05h | Avg: 50m 26s | Max:  1h 16m | Hits: 359%/2625  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 57m | Avg: 58m 47s | Max:  1h 01m
      🟩 nvcc               Pass: 100%/42  | Total:  1d 12h | Avg: 52m 35s | Max:  1h 16m | Hits: 359%/3500  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 06m | Avg:  1h 01m | Max:  1h 10m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 11s | Max: 57m 29s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 27s | Max: 55m 55s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 48m | Avg: 54m 15s | Max: 55m 51s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 33m | Avg: 47m 35s | Max:  1h 02m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 45s | Max: 58m 58s
      🟩 GCC8               Pass: 100%/1   | Total: 58m 01s | Avg: 58m 01s | Max: 58m 01s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 57m | Avg: 58m 49s | Max: 59m 59s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 28s | Max:  1h 00m
      🟩 GCC11              Pass: 100%/2   | Total:  1h 48m | Avg: 54m 02s | Max: 55m 00s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 55m | Avg: 57m 53s | Max: 59m 55s
      🟩 GCC13              Pass: 100%/10  | Total:  6h 13m | Avg: 37m 20s | Max:  1h 03m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 16m | Avg:  1h 08m | Max:  1h 16m | Hits: 360%/1750  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 08m | Hits: 359%/1750  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 12m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total: 15h 09m | Avg: 53m 30s | Max:  1h 10m
      🟩 GCC                Pass: 100%/21  | Total: 16h 47m | Avg: 47m 58s | Max:  1h 03m
      🟩 MSVC               Pass: 100%/4   | Total:  4h 31m | Avg:  1h 07m | Max:  1h 16m | Hits: 359%/3500  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 17m | Avg:  1h 08m | Max:  1h 12m
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 48m 21s | Avg: 24m 10s | Max: 24m 51s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 10h | Avg:  1h 00m | Max:  1h 16m | Hits: 359%/3500  
      🟩 rtxa6000           Pass: 100%/8   | Total:  3h 57m | Avg: 29m 44s | Max:  1h 01m
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 12h | Avg: 58m 56s | Max:  1h 16m | Hits: 359%/3500  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 19m 26s | Avg: 19m 26s | Max: 19m 26s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 09s | Avg: 16m 09s | Max: 16m 09s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 42s | Max: 24m 17s
      🟩 TestGPU            Pass: 100%/2   | Total: 38m 58s | Avg: 19m 29s | Max: 20m 32s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 48m 21s | Avg: 24m 10s | Max: 24m 51s
      🟩 90;90a;100         Pass: 100%/1   | Total: 59m 03s | Avg: 59m 03s | Max: 59m 03s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 20h 05m | Avg:  1h 00m | Max:  1h 16m | Hits: 360%/2625  
      🟩 20                 Pass: 100%/24  | Total: 18h 40m | Avg: 46m 41s | Max:  1h 12m | Hits: 359%/875   
    
  • 🟩 thrust: Pass: 100%/43 | Total: 23h 06m | Avg: 32m 14s | Max: 58m 12s | Hits: 283%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 35m 25s | Avg: 17m 42s | Max: 24m 17s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total: 22h 09m | Avg: 32m 25s | Max: 58m 12s | Hits: 283%/9230  
      🟩 arm64              Pass: 100%/2   | Total: 56m 36s | Avg: 28m 18s | Max: 30m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 06m | Avg: 37m 14s | Max: 56m 27s | Hits: 262%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  1h 46m | Avg: 53m 11s | Max: 56m 25s
      🟩 12.8               Pass: 100%/36  | Total: 18h 13m | Avg: 30m 22s | Max: 58m 12s | Hits: 288%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 54m 01s | Avg: 27m 00s | Max: 27m 30s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 06m | Avg: 37m 14s | Max: 56m 27s | Hits: 262%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 46m | Avg: 53m 11s | Max: 56m 25s
      🟩 nvcc12.8           Pass: 100%/34  | Total: 17h 19m | Avg: 30m 34s | Max: 58m 12s | Hits: 288%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 54m 01s | Avg: 27m 00s | Max: 27m 30s
      🟩 nvcc               Pass: 100%/41  | Total: 22h 12m | Avg: 32m 29s | Max: 58m 12s | Hits: 283%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 09m | Avg: 32m 15s | Max: 33m 43s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 06m | Avg: 33m 10s | Max: 33m 46s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 04m | Avg: 32m 03s | Max: 32m 05s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 00m | Avg: 30m 19s | Max: 31m 47s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 39m | Avg: 22m 48s | Max: 31m 36s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 03m | Avg: 31m 32s | Max: 33m 23s
      🟩 GCC8               Pass: 100%/1   | Total: 28m 54s | Avg: 28m 54s | Max: 28m 54s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 05m | Avg: 32m 56s | Max: 34m 10s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 29s | Max: 34m 02s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 04m | Avg: 32m 29s | Max: 33m 43s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 08m | Avg: 34m 11s | Max: 35m 38s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 04m | Avg: 23m 07s | Max: 36m 18s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 04s | Max: 56m 27s | Hits: 262%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 28m | Avg: 49m 35s | Max: 58m 12s | Hits: 297%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 46m | Avg: 53m 11s | Max: 56m 25s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  7h 59m | Avg: 28m 13s | Max: 33m 46s
      🟩 GCC                Pass: 100%/19  | Total:  9h 01m | Avg: 28m 28s | Max: 36m 18s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 18m | Avg: 51m 46s | Max: 58m 12s | Hits: 283%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 46m | Avg: 53m 11s | Max: 56m 25s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 19h 14m | Avg: 34m 59s | Max: 58m 12s | Hits: 262%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  3h 51m | Avg: 23m 09s | Max: 57m 40s | Hits: 314%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total: 21h 44m | Avg: 35m 15s | Max: 58m 12s | Hits: 262%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 47m 57s | Avg: 15m 59s | Max: 32m 54s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 33m 43s | Avg: 11m 14s | Max: 12m 02s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 12h 01m | Avg: 36m 03s | Max: 58m 12s | Hits: 262%/5538  
      🟩 20                 Pass: 100%/21  | Total: 10h 29m | Avg: 29m 58s | Max: 57m 40s | Hits: 314%/3692  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 8m 22s | Avg: 4m 11s | Max: 6m 21s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  8m 22s | Avg:  4m 11s | Max:  6m 21s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 01s | Avg:  2m 01s | Max:  2m 01s
      🟩 Test               Pass: 100%/1   | Total:  6m 21s | Avg:  6m 21s | Max:  6m 21s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 14s | Avg: 26m 14s | Max: 26m 14s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 90)

# Runner
65 linux-amd64-cpu16
9 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber merged commit 743d491 into NVIDIA:main Feb 4, 2025
102 of 105 checks passed

github-actions bot commented Feb 4, 2025

Backport failed for branch/2.8.x, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin branch/2.8.x
git worktree add -d .worktree/backport-3545-to-branch/2.8.x origin/branch/2.8.x
cd .worktree/backport-3545-to-branch/2.8.x
git switch --create backport-3545-to-branch/2.8.x
git cherry-pick -x 743d49141d09ab68ead0b945783811e480dcbbdd

@bernhardmgruber

select.if and select.unique currently share the tuning here, but we want to separate them in a follow-up PR.

bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this pull request Feb 4, 2025
* Add b200 policies for device.select.if,flagged,unique
* Fix interpretation of tuning results
* Default back i16,i64,true and f32,i64,true workloads due to regressions
* Default even more workloads that regress for large input sizes

Co-authored-by: Giannis Gonidelis <ggonidelis@nvidia.com>
miscco added a commit that referenced this pull request Feb 4, 2025
* Add b200 policies for device.select.if,flagged,unique
* Fix interpretation of tuning results
* Default back i16,i64,true and f32,i64,true workloads due to regressions
* Default even more workloads that regress for large input sizes

Co-authored-by: Giannis Gonidelis <ggonidelis@nvidia.com>
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>