Enable new CUDA visibility flags #9679

Draft
wants to merge 2 commits into base: IB/CMSSW_15_1_X/master

fwyzard
Contributor

@fwyzard fwyzard commented Feb 9, 2025

CUDA 12.8.0 introduces two flags that restrict the linkage and visibility of __global__ functions (kernels) and global device variables.

Enabling them should make the CUDA device code and kernel launch interface more self-contained, potentially reducing conflicts across shared libraries.

@fwyzard
Contributor Author

fwyzard commented Feb 9, 2025

enable gpu

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2025

cms-bot internal usage

@fwyzard
Contributor Author

fwyzard commented Feb 9, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2025

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-593db5/44283/summary.html
COMMIT: 0ec301d
CMSSW: CMSSW_15_1_X_2025-02-09-0000/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9679/44283/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation warning when building: See details on the summary page.

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2025

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-593db5/44286/summary.html
COMMIT: 0ec301d
CMSSW: CMSSW_15_1_X_2025-02-09-0000/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9679/44286/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

[1147/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API -Wno-deprecated-gpu-targets --static-global-template-stub=true -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" -Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT 
CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm70.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm70.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm70.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm70.cu.o
[1148/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API -Wno-deprecated-gpu-targets --static-global-template-stub=true -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" -Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT 
CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o
[1149/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API -Wno-deprecated-gpu-targets --static-global-template-stub=true -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" -Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT 
CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim128_bf16_sm80.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim128_bf16_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim128_bf16_sm80.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim128_bf16_sm80.cu.o
[1150/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API -Wno-deprecated-gpu-targets --static-global-template-stub=true -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" -Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT 
CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o
ninja: build stopped: subcommand failed.
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.X0pKEB (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+onnxruntime+1.20.1-e35ea1b21c6163717a6cadfc46ef1910
Macro expanded in comment on line 383: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


@fwyzard
Contributor Author

fwyzard commented Feb 9, 2025

OK, onnxruntime does not like the new mode:

/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-e35ea1b21c6163717a6cadfc46ef1910/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/fastertransformer_decoder_attention/decoder_masked_multihead_attention_impl.h(24):
error #20280-D:
when "-static-global-template-stub=true" in whole program compilation mode ("-rdc=false"), a __global__ function template instantiation or specialization ("onnxruntime::contrib::cuda::masked_multihead_attention_kernel<float, (int)64, (int)4, (int)16, (int)64> ") must have a definition in the current translation unit.
To resolve this issue, either use separate compilation mode ("-rdc=true"), or explicitly set "-static-global-template-stub=false" (but see nvcc documentation about downsides of turning it off)

fwyzard force-pushed the IB/CMSSW_15_1_X/master_CUDA_flags branch from 0ec301d to d07c73e on February 9, 2025 18:51
@fwyzard
Contributor Author

fwyzard commented Feb 9, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2025

Pull request #9679 was updated.

@cmsbuild
Contributor

cmsbuild commented Feb 9, 2025

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-593db5/44289/summary.html
COMMIT: d07c73e
CMSSW: CMSSW_15_1_X_2025-02-09-0000/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9679/44289/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

-- Configuring incomplete, errors occurred!
CMake Warning:
Value of CMAKE_CUDA_FLAGS contained a newline; truncating


error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.mijelU (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+onnxruntime+1.20.1-7732de1f458751000cb6ebfbdb9d9438
Macro expanded in comment on line 383: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}


onnxruntime.spec Outdated
@@ -36,7 +36,7 @@ cmake ../%{n}-%{realversion}/cmake -GNinja \
    -Donnxruntime_CUDNN_HOME="${CUDNN_ROOT}" \
    -Donnxruntime_NVCC_THREADS=1 \
    -DCMAKE_CUDA_ARCHITECTURES=$(echo %{cuda_arch} | tr ' ' ';' | sed 's|;;*|;|') \
-   -DCMAKE_CUDA_FLAGS="-DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API -Wno-deprecated-gpu-targets --static-global-template-stub=false -cudart shared" \
+   -DCMAKE_CUDA_FLAGS="-DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API %{nvcc_cuda_flags}" \
Contributor

@fwyzard, I would suggest adding

CUDA_FLAGS=$(echo "-DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API %{nvcc_cuda_flags}" | tr '\n' ' ')

before the cmake command and then use -DCMAKE_CUDA_FLAGS="${CUDA_FLAGS}" here
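As a rough illustration of why the `tr` step helps: CMake reported that the value of CMAKE_CUDA_FLAGS contained a newline and truncated it, so flattening the expanded macro onto one line keeps every flag. The sketch below assumes the macro expands to a multi-line string; the shell variable `nvcc_cuda_flags` and its contents are hypothetical stand-ins for the real `%{nvcc_cuda_flags}` expansion.

```shell
#!/bin/sh
# Hypothetical stand-in for the expanded %{nvcc_cuda_flags} macro.
# The embedded newlines are an assumption, chosen to mimic the CMake warning.
nvcc_cuda_flags='-Wno-deprecated-gpu-targets
--static-global-template-stub=false
-cudart shared'

# Same pipeline as suggested in the review: turn every newline into a space.
CUDA_FLAGS=$(echo "-DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API ${nvcc_cuda_flags}" | tr '\n' ' ')

# The result is a single line and can be passed as -DCMAKE_CUDA_FLAGS="${CUDA_FLAGS}".
printf '%s\n' "${CUDA_FLAGS}"
```

Note that `tr` also converts the trailing newline emitted by `echo`, so the value ends with a single space; that should be harmless in a flags string.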

Contributor Author

thanks

fwyzard force-pushed the IB/CMSSW_15_1_X/master_CUDA_flags branch from d07c73e to 7168672 on February 11, 2025 09:08
@cmsbuild
Contributor

Pull request #9679 was updated.

@fwyzard
Contributor Author

fwyzard commented Feb 11, 2025

please test

@@ -21,6 +21,7 @@ rm -rf ../build; mkdir ../build; cd ../build
  USE_CUDA=OFF
  if [ "%{cuda_gcc_support}" = "true" ] ; then
  USE_CUDA=ON
+ CUDA_FLAGS=$(echo "-DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API %{nvcc_cuda_flags}" | tr '\n' ' ')
Contributor Author

This actually includes too many flags: some of them (the -gencode architecture flags and --cudart shared) are already set by CMAKE_CUDA_ARCHITECTURES=... and CMAKE_CUDA_RUNTIME_LIBRARY=Shared.
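
One possible way to drop the duplicated flags, shown only as a sketch and not as the change made in this pull request, is to filter them out before passing the rest to CMake. `NVCC_CUDA_FLAGS` and `FILTERED_FLAGS` are hypothetical names, the sample value is a shortened version of the flags visible in the build log, and removing `-std=...` is an additional assumption prompted by the nvcc warning about the redefined 'std' option.

```shell
# Hypothetical stand-in for the expanded %{nvcc_cuda_flags} macro (shortened).
NVCC_CUDA_FLAGS='-std=c++20 -O3 --extended-lambda --static-global-template-stub=true --device-entity-has-hidden-visibility=true -gencode arch=compute_80,code=[sm_80,compute_80] -gencode arch=compute_89,code=[sm_89,compute_89] -Wno-deprecated-gpu-targets --cudart shared'

# Flatten to one line, then remove the flags that CMake is assumed to add itself:
#   -gencode <spec>            (from CMAKE_CUDA_ARCHITECTURES)
#   -cudart / --cudart <kind>  (from CMAKE_CUDA_RUNTIME_LIBRARY)
#   -std=<standard>            (assumption: set by the project's own C++ standard)
# The last three expressions collapse repeated spaces and trim both ends.
FILTERED_FLAGS=$(echo "${NVCC_CUDA_FLAGS}" | tr '\n' ' ' | sed -E \
  -e 's/(^| )-gencode +[^ ]+/ /g' \
  -e 's/(^| )--?cudart +[^ ]+/ /g' \
  -e 's/(^| )-std=[^ ]+/ /g' \
  -e 's/ +/ /g' -e 's/^ //' -e 's/ $//')

printf '%s\n' "${FILTERED_FLAGS}"
```

Whether filtering in the spec file or exporting a smaller set of flags from the CUDA toolfile is preferable is a design choice this sketch does not settle.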

@cmsbuild
Contributor

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-593db5/44315/summary.html
COMMIT: 7168672
CMSSW: CMSSW_15_1_X_2025-02-10-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmsdist/9679/44315/install.sh to create a dev area with all the needed externals and cmssw changes.

External Build

I found compilation error when building:

[1149/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API    -std=c++20 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda --static-global-template-stub=true --device-entity-has-hidden-visibility=true    -gencode arch=compute_60,code=[sm_60,compute_60] -gencode arch=compute_70,code=[sm_70,compute_70] -gencode arch=compute_75,code=[sm_75,compute_75] -gencode arch=compute_80,code=[sm_80,compute_80] -gencode arch=compute_89,code=[sm_89,compute_89] -Wno-deprecated-gpu-targets -diag-suppress=3012 -diag-suppress=3189 -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored -Xcudafe --gnu_version=120300 --cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" 
-Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu.o
nvcc warning : incompatible redefinition for option 'std', the last value of this option was used
[1150/1422] /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DDISABLE_CUSPARSE_DEPRECATED -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DENABLE_CPU_FP16_TRAINING_OPS -DNSYNC_ATOMIC_CPP11 -DONLY_C_LOCALE=0 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DPROTOBUF_USE_DLLS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_LEAN_ATTENTION=1 -DUSE_MEMORY_EFFICIENT_ATTENTION=1 -D_GNU_SOURCE -Donnxruntime_providers_cuda_EXPORTS -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/utf8_range-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/include/onnxruntime -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/include/onnxruntime/core/session -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/pytorch_cpuinfo-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/google_nsync-src/public -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/abseil_cpp-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/safeint-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/gsl-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/date-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/onnx-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/onnx-build -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/protobuf/3.21.9-3b4dc3892e5c8da1829fdf0bfe570820/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/flatbuffers-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/examples -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cutlass-src/tools/util/include 
-I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/eigen-src -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/cudnn_frontend-src/include -I/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/build/_deps/mp11-src/include -isystem /cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02876/el8_amd64_gcc12/external/cuda/12.8.0-15bfa86985d46d842bb5ecc3aca6c676/include -isystem /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/cudnn/9.6.0.74-9f9593d5e60f18ef40154a558d2268da/include -DTHRUST_IGNORE_DEPRECATED_API -DCUB_IGNORE_DEPRECATED_API    -std=c++20 -O3 --generate-line-info --source-in-ptx --display-error-number --expt-relaxed-constexpr --extended-lambda --static-global-template-stub=true --device-entity-has-hidden-visibility=true    -gencode arch=compute_60,code=[sm_60,compute_60] -gencode arch=compute_70,code=[sm_70,compute_70] -gencode arch=compute_75,code=[sm_75,compute_75] -gencode arch=compute_80,code=[sm_80,compute_80] -gencode arch=compute_89,code=[sm_89,compute_89] -Wno-deprecated-gpu-targets -diag-suppress=3012 -diag-suppress=3189 -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored -Xcudafe --gnu_version=120300 --cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" 
-Xcompiler=-fPIC -Xcudafe --diag_suppress=conversion_function_not_usable --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -Xcompiler -Wno-nonnull-compare --threads 1 -Xcompiler -Wno-reorder -Xcompiler -Wno-error=sign-compare -Werror all-warnings -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o -MF CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o.d -x cu -c /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu -o CMakeFiles/onnxruntime_providers_cuda.dir/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el8_amd64_gcc12/external/onnxruntime/1.20.1-dc2a4f648e0f3548871281e756da7311/onnxruntime-1.20.1/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu.o
nvcc warning : incompatible redefinition for option 'std', the last value of this option was used
ninja: build stopped: subcommand failed.
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.Y5zCar (%build)


RPM build errors:
line 37: It's not recommended to have unversioned Obsoletes: Obsoletes: external+onnxruntime+1.20.1-dc2a4f648e0f3548871281e756da7311
Macro expanded in comment on line 383: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}



3 participants