Replace cub::Traits by numeric_limits and deprecate it #3384

Open
wants to merge 11 commits into
base: main

bernhardmgruber
Contributor

@bernhardmgruber bernhardmgruber commented Jan 14, 2025

Fixes: #3381

@bernhardmgruber bernhardmgruber added the cub For all items related to CUB label Jan 14, 2025
@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber
Contributor Author

bernhardmgruber commented Jan 14, 2025

@miscco I would love to deprecate cub::Traits in favor of standard facilities in libcu++. As it currently stands, we would still need:

  • support for FP16, BF16 and FP8 types by cuda::std::is_floating_point
  • support for FP16, BF16 and FP8 types by cuda::std::numeric_limits (only min and lowest)

Do you think it's possible we can have this support soonish?
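To illustrate the two requirements listed above, here is a minimal host-side sketch of the kind of generic code that relies on them. It uses `std::` as a stand-in for `cuda::std::`, assuming the libcu++ facilities follow the standard-library interface; the helper names `max_reduction_identity` and `takes_floating_point_path` are hypothetical and not part of CUB.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper (not a CUB API): the identity element of a max-reduction
// is the smallest finite value of T, which numeric_limits exposes as lowest().
template <typename T>
constexpr T max_reduction_identity() noexcept
{
  static_assert(std::numeric_limits<T>::is_specialized,
                "numeric_limits must be specialized for T");
  return std::numeric_limits<T>::lowest();
}

// Hypothetical helper (not a CUB API): generic code can only route a type to a
// floating-point code path if is_floating_point reports it as such.
template <typename T>
constexpr bool takes_floating_point_path() noexcept
{
  return std::is_floating_point<T>::value;
}
```

For the built-in types this already works; for FP16, BF16 and FP8 types the same calls would only compile and give the intended answers once the trait and `numeric_limits` support them, which is what the request above is about.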

@bernhardmgruber bernhardmgruber force-pushed the depr_cub_traits branch 7 times, most recently from cdf13ed to ac81fd5 Compare January 22, 2025 15:50

copy-pr-bot bot commented Jan 22, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@bernhardmgruber bernhardmgruber changed the title from "Deprecate cub::Traits" to "Replace cub::Traits by numeric_limits and deprecate it" Jan 22, 2025
@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber
Contributor Author

/ok to test

@bernhardmgruber bernhardmgruber marked this pull request as ready for review January 22, 2025 19:28
@bernhardmgruber bernhardmgruber requested review from a team as code owners January 22, 2025 19:28
Contributor

🟨 CI finished in 4h 49m: Pass: 91%/78 | Total: 2d 06h | Avg: 41m 37s | Max: 1h 14m | Hits: 183%/11826
  • 🟨 cub: Pass: 81%/38 | Total: 1d 08h | Avg: 51m 44s | Max: 1h 14m | Hits: 81%/2646

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  80%/36  | Total:  1d 06h | Avg: 50m 51s | Max:  1h 14m | Hits:  81%/2646  
      🟩 arm64              Pass: 100%/2   | Total:  2h 15m | Avg:  1h 07m | Max:  1h 09m
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🔍 nvcc               Pass:  80%/36  | Total:  1d 06h | Avg: 51m 16s | Max:  1h 14m | Hits:  81%/2646  
    🔍 gpu: v100 🔍
      🟩 h100               Pass: 100%/2   | Total: 45m 20s | Avg: 22m 40s | Max: 25m 54s
      🔍 v100               Pass:  80%/36  | Total:  1d 08h | Avg: 53m 21s | Max:  1h 14m | Hits:  81%/2646  
    🟨 ctk
      🟥 12.0               Pass:   0%/5   | Total:  3h 18m | Avg: 39m 43s | Max:  1h 01m
      🟩 12.5               Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
      🟨 12.6               Pass:  93%/31  | Total:  1d 03h | Avg: 52m 15s | Max:  1h 11m | Hits:  81%/2646  
    🟨 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 00m | Avg:  1h 00m | Max:  1h 01m
      🟥 nvcc12.0           Pass:   0%/5   | Total:  3h 18m | Avg: 39m 43s | Max:  1h 01m
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
      🟨 nvcc12.6           Pass:  93%/29  | Total:  1d 00h | Avg: 51m 42s | Max:  1h 11m | Hits:  81%/2646  
    🟨 cxx
      🟨 Clang14            Pass:  50%/4   | Total:  3h 10m | Avg: 47m 32s | Max:  1h 02m
      🟩 Clang15            Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟩 Clang16            Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 Clang17            Pass: 100%/1   | Total: 59m 06s | Avg: 59m 06s | Max: 59m 06s
      🟨 Clang18            Pass:  85%/7   | Total:  6h 20m | Avg: 54m 19s | Max:  1h 09m
      🟨 GCC7               Pass:  50%/2   | Total:  1h 35m | Avg: 47m 52s | Max:  1h 02m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 02m | Avg:  1h 02m | Max:  1h 02m
      🟨 GCC9               Pass:  50%/2   | Total:  1h 35m | Avg: 47m 35s | Max:  1h 01m
      🟩 GCC10              Pass: 100%/1   | Total:  1h 03m | Avg:  1h 03m | Max:  1h 03m
      🟩 GCC11              Pass: 100%/1   | Total: 56m 34s | Avg: 56m 34s | Max: 56m 34s
      🟩 GCC12              Pass: 100%/3   | Total:  1h 47m | Avg: 35m 52s | Max:  1h 02m
      🟨 GCC13              Pass:  87%/8   | Total:  5h 12m | Avg: 39m 06s | Max:  1h 06m
      🟨 MSVC14.29          Pass:  50%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 10m | Hits:  84%/882   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 19m | Avg:  1h 09m | Max:  1h 11m | Hits:  80%/1764  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
    🟨 cxx_family
      🟨 Clang              Pass:  78%/14  | Total: 12h 33m | Avg: 53m 50s | Max:  1h 09m
      🟨 GCC                Pass:  83%/18  | Total: 13h 14m | Avg: 44m 07s | Max:  1h 06m
      🟨 MSVC               Pass:  75%/4   | Total:  4h 30m | Avg:  1h 07m | Max:  1h 11m | Hits:  81%/2646  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 14m
    🟨 jobs
      🟨 Build              Pass:  83%/31  | Total:  1d 05h | Avg: 57m 03s | Max:  1h 14m | Hits:  81%/2646  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 29m 04s | Avg: 29m 04s | Max: 29m 04s
      🟩 GraphCapture       Pass: 100%/1   | Total: 17m 49s | Avg: 17m 49s | Max: 17m 49s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 18m | Avg: 26m 15s | Max: 30m 00s
      🟥 TestGPU            Pass:   0%/2   | Total:  1h 12m | Avg: 36m 02s | Max: 45m 58s
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 45m 20s | Avg: 22m 40s | Max: 25m 54s
      🟩 90a                Pass: 100%/1   | Total: 26m 26s | Avg: 26m 26s | Max: 26m 26s
    🟨 std
      🟨 17                 Pass:  71%/14  | Total: 13h 18m | Avg: 57m 02s | Max:  1h 13m | Hits:  84%/1764  
      🟨 20                 Pass:  87%/24  | Total: 19h 27m | Avg: 48m 39s | Max:  1h 14m | Hits:  77%/882   
    
  • 🟩 thrust: Pass: 100%/37 | Total: 20h 24m | Avg: 33m 05s | Max: 1h 03m | Hits: 212%/9180

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 41m 31s | Avg: 20m 45s | Max: 27m 07s
    🟩 cpu
      🟩 amd64              Pass: 100%/35  | Total: 19h 24m | Avg: 33m 16s | Max:  1h 03m | Hits: 212%/9180  
      🟩 arm64              Pass: 100%/2   | Total: 59m 22s | Avg: 29m 41s | Max: 31m 07s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 03m | Avg: 36m 46s | Max: 53m 41s | Hits: 173%/1836  
      🟩 12.5               Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
      🟩 12.6               Pass: 100%/30  | Total: 15h 23m | Avg: 30m 47s | Max:  1h 03m | Hits: 221%/7344  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 52m 56s | Avg: 26m 28s | Max: 26m 57s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 03m | Avg: 36m 46s | Max: 53m 41s | Hits: 173%/1836  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
      🟩 nvcc12.6           Pass: 100%/28  | Total: 14h 30m | Avg: 31m 05s | Max:  1h 03m | Hits: 221%/7344  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 52m 56s | Avg: 26m 28s | Max: 26m 57s
      🟩 nvcc               Pass: 100%/35  | Total: 19h 31m | Avg: 33m 28s | Max:  1h 03m | Hits: 212%/9180  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 07m | Avg: 31m 55s | Max: 33m 27s
      🟩 Clang15            Pass: 100%/1   | Total: 32m 16s | Avg: 32m 16s | Max: 32m 16s
      🟩 Clang16            Pass: 100%/1   | Total: 31m 34s | Avg: 31m 34s | Max: 31m 34s
      🟩 Clang17            Pass: 100%/1   | Total: 29m 57s | Avg: 29m 57s | Max: 29m 57s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 40m | Avg: 22m 52s | Max: 30m 12s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 04m | Avg: 32m 07s | Max: 32m 36s
      🟩 GCC8               Pass: 100%/1   | Total: 32m 55s | Avg: 32m 55s | Max: 32m 55s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 08m | Avg: 34m 19s | Max: 35m 45s
      🟩 GCC10              Pass: 100%/1   | Total: 35m 15s | Avg: 35m 15s | Max: 35m 15s
      🟩 GCC11              Pass: 100%/1   | Total: 36m 31s | Avg: 36m 31s | Max: 36m 31s
      🟩 GCC12              Pass: 100%/1   | Total: 36m 03s | Avg: 36m 03s | Max: 36m 03s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 01m | Avg: 22m 42s | Max: 38m 59s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 47m | Avg: 53m 54s | Max: 54m 08s | Hits: 173%/3672  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 42m | Avg: 54m 17s | Max:  1h 03m | Hits: 237%/5508  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/14  | Total:  6h 21m | Avg: 27m 15s | Max: 33m 27s
      🟩 GCC                Pass: 100%/16  | Total:  7h 35m | Avg: 28m 27s | Max: 38m 59s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 30m | Avg: 54m 08s | Max:  1h 03m | Hits: 212%/9180  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 23s | Max: 59m 22s
    🟩 gpu
      🟩 v100               Pass: 100%/37  | Total: 20h 24m | Avg: 33m 05s | Max:  1h 03m | Hits: 212%/9180  
    🟩 jobs
      🟩 Build              Pass: 100%/31  | Total: 18h 54m | Avg: 36m 36s | Max:  1h 03m | Hits: 173%/7344  
      🟩 TestCPU            Pass: 100%/3   | Total: 50m 57s | Avg: 16m 59s | Max: 35m 42s | Hits: 365%/1836  
      🟩 TestGPU            Pass: 100%/3   | Total: 38m 20s | Avg: 12m 46s | Max: 14m 24s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 18m 32s | Avg: 18m 32s | Max: 18m 32s
    🟩 std
      🟩 17                 Pass: 100%/14  | Total:  9h 05m | Avg: 38m 59s | Max:  1h 03m | Hits: 173%/5508  
      🟩 20                 Pass: 100%/21  | Total: 10h 36m | Avg: 30m 19s | Max:  1h 03m | Hits: 269%/3672  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 40s | Avg: 4m 50s | Max: 7m 28s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 40s | Avg:  4m 50s | Max:  7m 28s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 12s | Avg:  2m 12s | Max:  2m 12s
      🟩 Test               Pass: 100%/1   | Total:  7m 28s | Avg:  7m 28s | Max:  7m 28s
    
  • 🟩 python: Pass: 100%/1 | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 46m 17s | Avg: 46m 17s | Max: 46m 17s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
+/- Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 78)

# Runner
53 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16
1 linux-amd64-gpu-h100-latest-1-testing

@bernhardmgruber
Contributor Author

It increasingly seems that replacing cub::Traits will break a lot of behavior in CUB, since users need to move over to using and specializing numeric_limits. We should probably split this PR into the pure deprecation, which we backport to 2.8, and the replacement, which should target 3.0.
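As a rough illustration of what "specializing numeric_limits" could look like for a user-defined type, here is a minimal host-side sketch. The type `fixed16` is hypothetical, `std::numeric_limits` stands in for whichever `numeric_limits` CUB would actually consult (an assumption), and only a small subset of the members is provided; a real specialization would likely need more.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical user-defined type, used only for illustration.
struct fixed16
{
  std::int16_t raw;
};

constexpr bool operator==(fixed16 a, fixed16 b) noexcept
{
  return a.raw == b.raw;
}

namespace std
{
// Specializing numeric_limits for a user-defined type is permitted by the standard.
// Only a subset of the members is shown here.
template <>
struct numeric_limits<fixed16>
{
  static constexpr bool is_specialized = true;
  static constexpr bool is_signed      = true;
  static constexpr bool is_integer     = false;

  // Smallest positive representable value (assumed semantics for this toy type).
  static constexpr fixed16 min() noexcept
  {
    return fixed16{1};
  }
  // Most negative representable value.
  static constexpr fixed16 lowest() noexcept
  {
    return fixed16{numeric_limits<int16_t>::lowest()};
  }
  // Largest representable value.
  static constexpr fixed16 max() noexcept
  {
    return fixed16{numeric_limits<int16_t>::max()};
  }
};
} // namespace std
```

Code that previously relied on a custom trait for such a type would, under this assumption, have to provide a specialization along these lines instead, which is the user-facing migration cost mentioned above.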

@bernhardmgruber bernhardmgruber force-pushed the depr_cub_traits branch 2 times, most recently from cc83a5c to 3b27583 Compare January 26, 2025 19:59
Contributor

🟨 CI finished in 2h 00m: Pass: 97%/89 | Total: 1d 16h | Avg: 27m 31s | Max: 1h 06m | Hits: 419%/10896
  • 🟨 cub: Pass: 95%/44 | Total: 1d 10h | Avg: 46m 44s | Max: 1h 06m | Hits: 533%/3512

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/42  | Total:  1d 08h | Avg: 46m 10s | Max:  1h 06m | Hits: 533%/3512  
      🟩 arm64              Pass: 100%/2   | Total:  1h 56m | Avg: 58m 18s | Max:  1h 00m
    🔍 ctk: 12.6 🔍
      🟩 12.0               Pass: 100%/5   | Total:  4h 04m | Avg: 48m 51s | Max: 57m 20s | Hits: 532%/878   
      🟩 12.5               Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m
      🔍 12.6               Pass:  94%/37  | Total:  1d 04h | Avg: 45m 25s | Max:  1h 00m | Hits: 533%/2634  
    🔍 cudacxx: nvcc12.6 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 45m | Avg: 52m 46s | Max: 53m 02s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  4h 04m | Avg: 48m 51s | Max: 57m 20s | Hits: 532%/878   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m
      🔍 nvcc12.6           Pass:  94%/35  | Total:  1d 02h | Avg: 44m 59s | Max:  1h 00m | Hits: 533%/2634  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 45m | Avg: 52m 46s | Max: 53m 02s
      🔍 nvcc               Pass:  95%/42  | Total:  1d 08h | Avg: 46m 26s | Max:  1h 06m | Hits: 533%/3512  
    🔍 gpu: rtxa6000 🔍
      🟩 h100               Pass: 100%/2   | Total: 47m 25s | Avg: 23m 42s | Max: 24m 05s
      🔍 rtxa6000           Pass:  75%/8   | Total:  3h 55m | Avg: 29m 26s | Max: 57m 26s
      🟩 v100               Pass: 100%/34  | Total:  1d 05h | Avg: 52m 09s | Max:  1h 06m | Hits: 533%/3512  
    🚨 jobs: TestGPU 🚨
      🟩 Build              Pass: 100%/37  | Total:  1d 07h | Avg: 51m 36s | Max:  1h 06m | Hits: 533%/3512  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 09s | Avg: 20m 09s | Max: 20m 09s
      🟩 GraphCapture       Pass: 100%/1   | Total: 14m 52s | Avg: 14m 52s | Max: 14m 52s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 11m | Avg: 23m 53s | Max: 24m 05s
      🔥 TestGPU            Pass:   0%/2   | Total: 39m 55s | Avg: 19m 57s | Max: 20m 49s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 17h 18m | Avg: 51m 55s | Max:  1h 05m | Hits: 533%/2634  
      🔍 20                 Pass:  91%/24  | Total: 16h 57m | Avg: 42m 24s | Max:  1h 06m | Hits: 533%/878   
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  3h 35m | Avg: 53m 52s | Max: 55m 05s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 50m | Avg: 55m 05s | Max: 55m 40s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 52m | Avg: 56m 12s | Max: 59m 10s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 49m | Avg: 54m 50s | Max: 56m 27s
      🟨 Clang18            Pass:  85%/7   | Total:  5h 26m | Avg: 46m 39s | Max:  1h 00m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 52m | Avg: 56m 01s | Max: 57m 20s
      🟩 GCC8               Pass: 100%/1   | Total: 59m 19s | Avg: 59m 19s | Max: 59m 19s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 49m | Avg: 54m 55s | Max: 54m 59s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 54s | Max: 54m 37s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 47m | Avg: 53m 31s | Max: 54m 12s
      🟩 GCC12              Pass: 100%/4   | Total:  2h 43m | Avg: 40m 45s | Max:  1h 00m
      🟨 GCC13              Pass:  87%/8   | Total:  4h 29m | Avg: 33m 38s | Max: 56m 14s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 55m 53s | Avg: 27m 56s | Max: 30m 38s | Hits: 533%/1756  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 06m | Avg: 33m 11s | Max: 34m 28s | Hits: 533%/1756  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 14h 34m | Avg: 51m 25s | Max:  1h 00m
      🟨 GCC                Pass:  95%/21  | Total: 15h 28m | Avg: 44m 12s | Max:  1h 00m
      🟩 MSVC               Pass: 100%/4   | Total:  2h 02m | Avg: 30m 33s | Max: 34m 28s | Hits: 533%/3512  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 11m | Avg:  1h 05m | Max:  1h 06m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 47m 25s | Avg: 23m 42s | Max: 24m 05s
      🟩 90a                Pass: 100%/1   | Total: 23m 52s | Avg: 23m 52s | Max: 23m 52s
    
  • 🟩 thrust: Pass: 100%/42 | Total: 5h 59m | Avg: 8m 33s | Max: 29m 50s | Hits: 365%/7384

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 16m 39s | Avg:  8m 19s | Max: 10m 42s
    🟩 cpu
      🟩 amd64              Pass: 100%/40  | Total:  5h 49m | Avg:  8m 44s | Max: 29m 50s | Hits: 365%/7384  
      🟩 arm64              Pass: 100%/2   | Total:  9m 51s | Avg:  4m 55s | Max:  5m 19s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total: 45m 02s | Avg:  9m 00s | Max: 23m 40s | Hits: 365%/1846  
      🟩 12.5               Pass: 100%/2   | Total: 29m 04s | Avg: 14m 32s | Max: 15m 32s
      🟩 12.6               Pass: 100%/35  | Total:  4h 45m | Avg:  8m 09s | Max: 29m 50s | Hits: 365%/5538  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 10m 25s | Avg:  5m 12s | Max:  5m 13s
      🟩 nvcc12.0           Pass: 100%/5   | Total: 45m 02s | Avg:  9m 00s | Max: 23m 40s | Hits: 365%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total: 29m 04s | Avg: 14m 32s | Max: 15m 32s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  4h 35m | Avg:  8m 20s | Max: 29m 50s | Hits: 365%/5538  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 10m 25s | Avg:  5m 12s | Max:  5m 13s
      🟩 nvcc               Pass: 100%/40  | Total:  5h 49m | Avg:  8m 43s | Max: 29m 50s | Hits: 365%/7384  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 21m 37s | Avg:  5m 24s | Max:  5m 46s
      🟩 Clang15            Pass: 100%/2   | Total: 11m 18s | Avg:  5m 39s | Max:  5m 53s
      🟩 Clang16            Pass: 100%/2   | Total: 10m 46s | Avg:  5m 23s | Max:  5m 28s
      🟩 Clang17            Pass: 100%/2   | Total: 10m 41s | Avg:  5m 20s | Max:  5m 23s
      🟩 Clang18            Pass: 100%/7   | Total: 44m 17s | Avg:  6m 19s | Max: 10m 16s
      🟩 GCC7               Pass: 100%/2   | Total: 10m 50s | Avg:  5m 25s | Max:  5m 30s
      🟩 GCC8               Pass: 100%/1   | Total:  5m 22s | Avg:  5m 22s | Max:  5m 22s
      🟩 GCC9               Pass: 100%/2   | Total: 12m 01s | Avg:  6m 00s | Max:  6m 11s
      🟩 GCC10              Pass: 100%/2   | Total: 11m 46s | Avg:  5m 53s | Max:  6m 00s
      🟩 GCC11              Pass: 100%/2   | Total: 11m 11s | Avg:  5m 35s | Max:  5m 36s
      🟩 GCC12              Pass: 100%/2   | Total: 12m 38s | Avg:  6m 19s | Max:  6m 22s
      🟩 GCC13              Pass: 100%/8   | Total: 57m 42s | Avg:  7m 12s | Max: 11m 04s
      🟩 MSVC14.29          Pass: 100%/2   | Total: 51m 58s | Avg: 25m 59s | Max: 28m 18s | Hits: 365%/3692  
      🟩 MSVC14.39          Pass: 100%/2   | Total: 58m 20s | Avg: 29m 10s | Max: 29m 50s | Hits: 365%/3692  
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 29m 04s | Avg: 14m 32s | Max: 15m 32s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  1h 38m | Avg:  5m 48s | Max: 10m 16s
      🟩 GCC                Pass: 100%/19  | Total:  2h 01m | Avg:  6m 23s | Max: 11m 04s
      🟩 MSVC               Pass: 100%/4   | Total:  1h 50m | Avg: 27m 34s | Max: 29m 50s | Hits: 365%/7384  
      🟩 NVHPC              Pass: 100%/2   | Total: 29m 04s | Avg: 14m 32s | Max: 15m 32s
    🟩 gpu
      🟩 rtx4090            Pass: 100%/8   | Total:  1h 05m | Avg:  8m 14s | Max: 11m 04s
      🟩 v100               Pass: 100%/34  | Total:  4h 53m | Avg:  8m 38s | Max: 29m 50s | Hits: 365%/7384  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  5h 11m | Avg:  8m 24s | Max: 29m 50s | Hits: 365%/7384  
      🟩 TestCPU            Pass: 100%/2   | Total: 16m 28s | Avg:  8m 14s | Max:  8m 27s
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 02s | Avg: 10m 40s | Max: 11m 04s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total:  4m 21s | Avg:  4m 21s | Max:  4m 21s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total:  3h 05m | Avg:  9m 16s | Max: 29m 50s | Hits: 365%/5538  
      🟩 20                 Pass: 100%/20  | Total:  2h 37m | Avg:  7m 52s | Max: 28m 30s | Hits: 365%/1846  
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 6m 51s | Avg: 3m 25s | Max: 4m 46s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  6m 51s | Avg:  3m 25s | Max:  4m 46s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 05s | Avg:  2m 05s | Max:  2m 05s
      🟩 Test               Pass: 100%/1   | Total:  4m 46s | Avg:  4m 46s | Max:  4m 46s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 49s | Avg: 26m 49s | Max: 26m 49s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
+/- Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 89)

# Runner
65 linux-amd64-cpu16
8 windows-amd64-cpu16
6 linux-amd64-gpu-rtxa6000-latest-1
4 linux-arm64-cpu16
3 linux-amd64-gpu-rtx4090-latest-1
2 linux-amd64-gpu-rtx2080-latest-1
1 linux-amd64-gpu-h100-latest-1

@bernhardmgruber bernhardmgruber marked this pull request as draft February 4, 2025 23:39

copy-pr-bot bot commented Feb 4, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bernhardmgruber
Contributor Author

/ok to test

@miscco miscco self-assigned this Feb 5, 2025
cub/cub/util_type.cuh Outdated
@miscco
Collaborator

miscco commented Feb 5, 2025

/ok to test

@miscco miscco marked this pull request as ready for review February 5, 2025 08:07
@miscco
Collaborator

miscco commented Feb 5, 2025

huge shout-out to @davebayer for implementing the extended floating point support for numeric_limits

Contributor

github-actions bot commented Feb 5, 2025

🟨 CI finished in 1h 58m: Pass: 98%/151 | Total: 3d 20h | Avg: 36m 33s | Max: 1h 29m | Hits: 221%/24193
  • 🟨 cub: Pass: 95%/44 | Total: 1d 18h | Avg: 57m 44s | Max: 1h 29m | Hits: 30%/4168

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  95%/42  | Total:  1d 16h | Avg: 57m 19s | Max:  1h 29m | Hits:  30%/4168  
      🟩 arm64              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 07m
    🔍 ctk: 12.8 🔍
      🟩 12.0               Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 08m | Hits:  30%/1042  
      🟩 12.5               Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
      🔍 12.8               Pass:  94%/37  | Total:  1d 10h | Avg: 56m 12s | Max:  1h 29m | Hits:  30%/3126  
    🔍 cudacxx: nvcc12.8 🔍
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🟩 nvcc12.0           Pass: 100%/5   | Total:  5h 12m | Avg:  1h 02m | Max:  1h 08m | Hits:  30%/1042  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
      🔍 nvcc12.8           Pass:  94%/35  | Total:  1d 08h | Avg: 55m 47s | Max:  1h 29m | Hits:  30%/3126  
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total:  2h 07m | Avg:  1h 03m | Max:  1h 04m
      🔍 nvcc               Pass:  95%/42  | Total:  1d 16h | Avg: 57m 27s | Max:  1h 29m | Hits:  30%/4168  
    🔍 gpu: rtxa6000 🔍
      🟩 h100               Pass: 100%/2   | Total: 55m 58s | Avg: 27m 59s | Max: 30m 47s
      🟩 rtx2080            Pass: 100%/34  | Total:  1d 13h | Avg:  1h 05m | Max:  1h 29m | Hits:  30%/4168  
      🔍 rtxa6000           Pass:  75%/8   | Total:  4h 22m | Avg: 32m 49s | Max:  1h 06m
    🚨 jobs: TestGPU 🚨
      🟩 Build              Pass: 100%/37  | Total:  1d 15h | Avg:  1h 04m | Max:  1h 29m | Hits:  30%/4168  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 20m 18s | Avg: 20m 18s | Max: 20m 18s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 33s | Avg: 16m 33s | Max: 16m 33s
      🟩 HostLaunch         Pass: 100%/3   | Total:  1h 14m | Avg: 24m 54s | Max: 25m 11s
      🔥 TestGPU            Pass:   0%/2   | Total: 44m 45s | Avg: 22m 22s | Max: 22m 31s
    🔍 std: 20 🔍
      🟩 17                 Pass: 100%/20  | Total: 21h 24m | Avg:  1h 04m | Max:  1h 21m | Hits:  30%/3126  
      🔍 20                 Pass:  91%/24  | Total: 20h 56m | Avg: 52m 20s | Max:  1h 29m | Hits:  30%/1042  
    🟨 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  4h 05m | Avg:  1h 01m | Max:  1h 07m
      🟩 Clang15            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 10s | Max: 59m 47s
      🟩 Clang16            Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 03m
      🟩 Clang17            Pass: 100%/2   | Total:  1h 58m | Avg: 59m 28s | Max:  1h 00m
      🟨 Clang18            Pass:  85%/7   | Total:  6h 04m | Avg: 52m 05s | Max:  1h 05m
      🟩 GCC7               Pass: 100%/2   | Total:  1h 58m | Avg: 59m 24s | Max:  1h 00m
      🟩 GCC8               Pass: 100%/1   | Total:  1h 01m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC9               Pass: 100%/2   | Total:  2h 04m | Avg:  1h 02m | Max:  1h 05m
      🟩 GCC10              Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 01m
      🟩 GCC11              Pass: 100%/2   | Total:  2h 13m | Avg:  1h 06m | Max:  1h 08m
      🟩 GCC12              Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 09m
      🟨 GCC13              Pass:  90%/10  | Total:  6h 50m | Avg: 41m 03s | Max:  1h 16m
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 27m | Avg:  1h 13m | Max:  1h 18m | Hits:  30%/2084  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 50m | Avg:  1h 25m | Max:  1h 29m | Hits:  30%/2084  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
    🟨 cxx_family
      🟨 Clang              Pass:  94%/17  | Total: 16h 11m | Avg: 57m 08s | Max:  1h 07m
      🟨 GCC                Pass:  95%/21  | Total: 18h 23m | Avg: 52m 31s | Max:  1h 16m
      🟩 MSVC               Pass: 100%/4   | Total:  5h 17m | Avg:  1h 19m | Max:  1h 29m | Hits:  30%/4168  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 28m | Avg:  1h 14m | Max:  1h 16m
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 55m 58s | Avg: 27m 59s | Max: 30m 47s
      🟩 90;90a;100         Pass: 100%/1   | Total:  1h 16m | Avg:  1h 16m | Max:  1h 16m
    
  • 🟩 thrust: Pass: 100%/43 | Total: 1d 03h | Avg: 38m 41s | Max: 1h 19m | Hits: 126%/9230

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 42m 30s | Avg: 21m 15s | Max: 31m 21s
    🟩 cpu
      🟩 amd64              Pass: 100%/41  | Total:  1d 02h | Avg: 38m 53s | Max:  1h 19m | Hits: 126%/9230  
      🟩 arm64              Pass: 100%/2   | Total:  1h 09m | Avg: 34m 46s | Max: 36m 09s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  3h 32m | Avg: 42m 28s | Max:  1h 05m | Hits:  90%/1846  
      🟩 12.5               Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
      🟩 12.8               Pass: 100%/36  | Total: 21h 41m | Avg: 36m 08s | Max:  1h 19m | Hits: 135%/7384  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 03m | Avg: 31m 51s | Max: 32m 02s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  3h 32m | Avg: 42m 28s | Max:  1h 05m | Hits:  90%/1846  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
      🟩 nvcc12.8           Pass: 100%/34  | Total: 20h 37m | Avg: 36m 23s | Max:  1h 19m | Hits: 135%/7384  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 03m | Avg: 31m 51s | Max: 32m 02s
      🟩 nvcc               Pass: 100%/41  | Total:  1d 02h | Avg: 39m 01s | Max:  1h 19m | Hits: 126%/9230  
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total:  2h 21m | Avg: 35m 23s | Max: 35m 54s
      🟩 Clang15            Pass: 100%/2   | Total:  1h 15m | Avg: 37m 41s | Max: 37m 48s
      🟩 Clang16            Pass: 100%/2   | Total:  1h 13m | Avg: 36m 55s | Max: 38m 40s
      🟩 Clang17            Pass: 100%/2   | Total:  1h 14m | Avg: 37m 21s | Max: 37m 26s
      🟩 Clang18            Pass: 100%/7   | Total:  3h 13m | Avg: 27m 35s | Max: 39m 19s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 11m | Avg: 35m 56s | Max: 36m 03s
      🟩 GCC8               Pass: 100%/1   | Total: 36m 06s | Avg: 36m 06s | Max: 36m 06s
      🟩 GCC9               Pass: 100%/2   | Total:  1h 19m | Avg: 39m 58s | Max: 40m 03s
      🟩 GCC10              Pass: 100%/2   | Total:  1h 14m | Avg: 37m 22s | Max: 37m 37s
      🟩 GCC11              Pass: 100%/2   | Total:  1h 17m | Avg: 38m 48s | Max: 40m 01s
      🟩 GCC12              Pass: 100%/2   | Total:  1h 25m | Avg: 42m 31s | Max: 47m 25s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 37m | Avg: 27m 14s | Max: 41m 37s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  2h 10m | Avg:  1h 05m | Max:  1h 05m | Hits:  74%/3692  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  3h 01m | Avg:  1h 00m | Max:  1h 19m | Hits: 161%/5538  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/17  | Total:  9h 18m | Avg: 32m 51s | Max: 39m 19s
      🟩 GCC                Pass: 100%/19  | Total: 10h 43m | Avg: 33m 51s | Max: 47m 25s
      🟩 MSVC               Pass: 100%/5   | Total:  5h 11m | Avg:  1h 02m | Max:  1h 19m | Hits: 126%/9230  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 30m | Avg:  1h 15m | Max:  1h 18m
    🟩 gpu
      🟩 rtx2080            Pass: 100%/33  | Total: 23h 13m | Avg: 42m 13s | Max:  1h 18m | Hits:  69%/5538  
      🟩 rtx4090            Pass: 100%/10  | Total:  4h 30m | Avg: 27m 03s | Max:  1h 19m | Hits: 212%/3692  
    🟩 jobs
      🟩 Build              Pass: 100%/37  | Total:  1d 02h | Avg: 42m 45s | Max:  1h 19m | Hits:  66%/7384  
      🟩 TestCPU            Pass: 100%/3   | Total: 49m 06s | Avg: 16m 22s | Max: 33m 04s | Hits: 365%/1846  
      🟩 TestGPU            Pass: 100%/3   | Total: 32m 39s | Avg: 10m 53s | Max: 11m 23s
    🟩 sm
      🟩 90;90a;100         Pass: 100%/1   | Total: 39m 35s | Avg: 39m 35s | Max: 39m 35s
    🟩 std
      🟩 17                 Pass: 100%/20  | Total: 14h 28m | Avg: 43m 24s | Max:  1h 12m | Hits:  69%/5538  
      🟩 20                 Pass: 100%/21  | Total: 12h 33m | Avg: 35m 52s | Max:  1h 19m | Hits: 212%/3692  
    
  • 🟩 libcudacxx: Pass: 100%/41 | Total: 16h 48m | Avg: 24m 36s | Max: 50m 30s | Hits: 392%/10273

    🟩 cpu
      🟩 amd64              Pass: 100%/39  | Total: 16h 02m | Avg: 24m 40s | Max: 50m 30s | Hits: 392%/10273 
      🟩 arm64              Pass: 100%/2   | Total: 46m 21s | Avg: 23m 10s | Max: 23m 13s
    🟩 ctk
      🟩 12.0               Pass: 100%/5   | Total:  1h 09m | Avg: 13m 54s | Max: 31m 21s | Hits: 393%/2523  
      🟩 12.5               Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
      🟩 12.8               Pass: 100%/34  | Total: 14h 26m | Avg: 25m 29s | Max: 50m 30s | Hits: 391%/7750  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 44m 59s | Avg: 22m 29s | Max: 23m 48s
      🟩 nvcc12.0           Pass: 100%/5   | Total:  1h 09m | Avg: 13m 54s | Max: 31m 21s | Hits: 393%/2523  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
      🟩 nvcc12.8           Pass: 100%/32  | Total: 13h 41m | Avg: 25m 41s | Max: 50m 30s | Hits: 391%/7750  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 44m 59s | Avg: 22m 29s | Max: 23m 48s
      🟩 nvcc               Pass: 100%/39  | Total: 16h 04m | Avg: 24m 43s | Max: 50m 30s | Hits: 392%/10273 
    🟩 cxx
      🟩 Clang14            Pass: 100%/4   | Total: 59m 24s | Avg: 14m 51s | Max: 24m 08s
      🟩 Clang15            Pass: 100%/2   | Total: 50m 42s | Avg: 25m 21s | Max: 26m 43s
      🟩 Clang16            Pass: 100%/2   | Total: 49m 52s | Avg: 24m 56s | Max: 27m 45s
      🟩 Clang17            Pass: 100%/2   | Total: 50m 07s | Avg: 25m 03s | Max: 25m 33s
      🟩 Clang18            Pass: 100%/6   | Total:  2h 44m | Avg: 27m 29s | Max: 46m 56s
      🟩 GCC7               Pass: 100%/2   | Total: 29m 07s | Avg: 14m 33s | Max: 22m 30s
      🟩 GCC8               Pass: 100%/1   | Total: 22m 34s | Avg: 22m 34s | Max: 22m 34s
      🟩 GCC9               Pass: 100%/2   | Total: 43m 33s | Avg: 21m 46s | Max: 23m 11s
      🟩 GCC10              Pass: 100%/2   | Total: 47m 32s | Avg: 23m 46s | Max: 24m 26s
      🟩 GCC11              Pass: 100%/2   | Total: 47m 09s | Avg: 23m 34s | Max: 24m 33s
      🟩 GCC12              Pass: 100%/2   | Total: 47m 33s | Avg: 23m 46s | Max: 25m 30s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 04m | Avg: 23m 07s | Max: 50m 30s
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 05m | Avg: 32m 45s | Max: 34m 09s | Hits: 392%/5056  
      🟩 MSVC14.39          Pass: 100%/2   | Total:  1h 13m | Avg: 36m 45s | Max: 40m 51s | Hits: 391%/5217  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/16  | Total:  6h 15m | Avg: 23m 26s | Max: 46m 56s
      🟩 GCC                Pass: 100%/19  | Total:  7h 02m | Avg: 22m 13s | Max: 50m 30s
      🟩 MSVC               Pass: 100%/4   | Total:  2h 19m | Avg: 34m 45s | Max: 40m 51s | Hits: 392%/10273 
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 12m | Avg: 36m 15s | Max: 38m 05s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/41  | Total: 16h 48m | Avg: 24m 36s | Max: 50m 30s | Hits: 392%/10273 
    🟩 jobs
      🟩 Build              Pass: 100%/36  | Total: 14h 38m | Avg: 24m 24s | Max: 40m 51s | Hits: 392%/10273 
      🟩 NVRTC              Pass: 100%/2   | Total: 30m 43s | Avg: 15m 21s | Max: 15m 41s
      🟩 Test               Pass: 100%/2   | Total:  1h 37m | Avg: 48m 43s | Max: 50m 30s
      🟩 VerifyCodegen      Pass: 100%/1   | Total:  1m 59s | Avg:  1m 59s | Max:  1m 59s
    🟩 sm
      🟩 75                 Pass: 100%/2   | Total: 30m 43s | Avg: 15m 21s | Max: 15m 41s
      🟩 90;90a;100         Pass: 100%/1   | Total: 30m 26s | Avg: 30m 26s | Max: 30m 26s
    🟩 std
      🟩 17                 Pass: 100%/21  | Total:  7h 56m | Avg: 22m 42s | Max: 34m 26s | Hits: 392%/7589  
      🟩 20                 Pass: 100%/19  | Total:  8h 50m | Avg: 27m 53s | Max: 50m 30s | Hits: 391%/2684  
    
  • 🟩 cudax: Pass: 100%/20 | Total: 4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits: 78%/522

    🟩 cpu
      🟩 amd64              Pass: 100%/16  | Total:  3h 39m | Avg: 13m 42s | Max: 18m 53s | Hits:  78%/522   
      🟩 arm64              Pass: 100%/4   | Total: 54m 09s | Avg: 13m 32s | Max: 14m 29s
    🟩 ctk
      🟩 12.0               Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 12.5               Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
      🟩 12.8               Pass: 100%/17  | Total:  4h 05m | Avg: 14m 24s | Max: 18m 53s | Hits:  75%/261   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
      🟩 nvcc12.8           Pass: 100%/17  | Total:  4h 05m | Avg: 14m 24s | Max: 18m 53s | Hits:  75%/261   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/20  | Total:  4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits:  78%/522   
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total: 14m 32s | Avg: 14m 32s | Max: 14m 32s
      🟩 Clang15            Pass: 100%/1   | Total: 17m 30s | Avg: 17m 30s | Max: 17m 30s
      🟩 Clang16            Pass: 100%/1   | Total: 16m 32s | Avg: 16m 32s | Max: 16m 32s
      🟩 Clang17            Pass: 100%/1   | Total: 17m 15s | Avg: 17m 15s | Max: 17m 15s
      🟩 Clang18            Pass: 100%/4   | Total: 54m 40s | Avg: 13m 40s | Max: 16m 00s
      🟩 GCC10              Pass: 100%/1   | Total: 15m 58s | Avg: 15m 58s | Max: 15m 58s
      🟩 GCC11              Pass: 100%/1   | Total: 15m 41s | Avg: 15m 41s | Max: 15m 41s
      🟩 GCC12              Pass: 100%/2   | Total: 31m 15s | Avg: 15m 37s | Max: 18m 53s
      🟩 GCC13              Pass: 100%/4   | Total: 51m 02s | Avg: 12m 45s | Max: 14m 29s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 14s | Avg: 10m 14s | Max: 10m 14s | Hits:  80%/261   
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 35s | Avg: 10m 35s | Max: 10m 35s | Hits:  75%/261   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/8   | Total:  2h 00m | Avg: 15m 03s | Max: 17m 30s
      🟩 GCC                Pass: 100%/8   | Total:  1h 53m | Avg: 14m 14s | Max: 18m 53s
      🟩 MSVC               Pass: 100%/2   | Total: 20m 49s | Avg: 10m 24s | Max: 10m 35s | Hits:  78%/522   
      🟩 NVHPC              Pass: 100%/2   | Total: 18m 12s | Avg:  9m 06s | Max:  9m 18s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/20  | Total:  4h 33m | Avg: 13m 40s | Max: 18m 53s | Hits:  78%/522   
    🟩 jobs
      🟩 Build              Pass: 100%/18  | Total:  4h 08m | Avg: 13m 49s | Max: 18m 53s | Hits:  78%/522   
      🟩 Test               Pass: 100%/2   | Total: 24m 29s | Avg: 12m 14s | Max: 12m 22s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total: 11m 19s | Avg: 11m 19s | Max: 11m 19s
      🟩 90a                Pass: 100%/1   | Total: 12m 07s | Avg: 12m 07s | Max: 12m 07s
    🟩 std
      🟩 17                 Pass: 100%/4   | Total: 46m 03s | Avg: 11m 30s | Max: 13m 07s
      🟩 20                 Pass: 100%/16  | Total:  3h 47m | Avg: 14m 12s | Max: 18m 53s | Hits:  78%/522   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 7m 36s | Avg: 3m 48s | Max: 5m 18s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 ctk
      🟩 12.8               Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/2   | Total:  7m 36s | Avg:  3m 48s | Max:  5m 18s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 18s | Avg:  2m 18s | Max:  2m 18s
      🟩 Test               Pass: 100%/1   | Total:  5m 18s | Avg:  5m 18s | Max:  5m 18s
    
  • 🟩 python: Pass: 100%/1 | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 ctk
      🟩 12.8               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cudacxx
      🟩 nvcc12.8           Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 26m 08s | Avg: 26m 08s | Max: 26m 08s
    

👃 Inspect Changes

Modifications in project?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
| +/- | libcu++                 |
| +/- | CUB                     |
|     | Thrust                  |
|     | CUDA Experimental       |
|     | python                  |
|     | CCCL C Parallel Library |
| +/- | Catch2Helper            |

Modifications in project or dependencies?

|     | Project                 |
|-----|-------------------------|
|     | CCCL Infrastructure     |
| +/- | libcu++                 |
| +/- | CUB                     |
| +/- | Thrust                  |
| +/- | CUDA Experimental       |
| +/- | python                  |
| +/- | CCCL C Parallel Library |
| +/- | Catch2Helper            |

🏃‍ Runner counts (total jobs: 151)

| #   | Runner                            |
|-----|-----------------------------------|
| 108 | linux-amd64-cpu16                 |
| 15  | windows-amd64-cpu16               |
| 10  | linux-arm64-cpu16                 |
| 8   | linux-amd64-gpu-rtx2080-latest-1  |
| 6   | linux-amd64-gpu-rtxa6000-latest-1 |
| 3   | linux-amd64-gpu-rtx4090-latest-1  |
| 1   | linux-amd64-gpu-h100-latest-1     |

Labels
cub For all items related to CUB
Projects
Status: In Review
Development

Successfully merging this pull request may close these issues.

Replace parts of cub::Traits by numeric_limits and deprecate those
2 participants