This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Update for cub::FutureValue PR (NVIDIA/cub#305) #1519

Merged
merged 1 commit into from
Oct 15, 2021

Conversation

zasdfgbnm
Contributor

Please review NVIDIA/cub#305

@GPUtester
Collaborator

Can one of the admins verify this patch?

@zasdfgbnm zasdfgbnm changed the title from "Update for cub change cub/pull/305" to "Update for cub change NVIDIA/cub#305" Sep 7, 2021
@alliepiper alliepiper self-assigned this Sep 21, 2021
@alliepiper alliepiper added this to the 1.15.0 milestone Sep 21, 2021
@alliepiper
Collaborator

My last push just updated the CUB submodule so gpuCI can pick it up.

DVS CL: 30512893

run tests

@alliepiper alliepiper added the "testing: gpuCI in progress" (Started gpuCI testing.) label Oct 8, 2021
@zasdfgbnm zasdfgbnm force-pushed the device-scan-future branch 3 times, most recently from e1d0681 to 600e57e Compare October 12, 2021 05:45
@zasdfgbnm
Contributor Author

@allisonvacanti The compilation failures should be fixed now.

@alliepiper alliepiper added "P1: should have" (Necessary, but not critical.) and "helps: pytorch" (Helps or needed by PyTorch.) and removed "testing: gpuCI in progress" (Started gpuCI testing.) labels Oct 14, 2021
@alliepiper
Collaborator

DVS CL: 30535270

run tests

@alliepiper alliepiper added "testing: gpuCI in progress" (Started gpuCI testing.) and "testing: internal ci in progress" (Currently testing on internal NVIDIA CI (DVS).) labels Oct 14, 2021
@alliepiper alliepiper changed the title from "Update for cub change NVIDIA/cub#305" to "Update for cub::FutureValue PR (NVIDIA/cub#305)" Oct 14, 2021
@zasdfgbnm
Contributor Author

[5648/6174] Building CUDA object dependencies/cub/test/CMakeFiles/cub.cpp14.test.block_radix_sort.dir/test_block_radix_sort.cu.o
FAILED: dependencies/cub/test/CMakeFiles/cub.cpp14.test.block_radix_sort.dir/test_block_radix_sort.cu.o 
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/usr/bin/g++-10 -DCUB_IGNORE_DEPRECATED_CPP_11 -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -I../dependencies/cub/test -I../dependencies/cub -I../ -gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -O3 -DNDEBUG -Xcompiler=-Werror -Xcompiler=-Wall -Xcompiler=-Wextra -Xcompiler=-Winit-self -Xcompiler=-Woverloaded-virtual -Xcompiler=-Wcast-qual -Xcompiler=-Wpointer-arith -Xcompiler=-Wvla -Xcompiler=-Wno-gnu-zero-variadic-macro-arguments -Xcompiler=-Wno-unused-function -Xcompiler=-Wno-deprecated-declarations -Xcompiler=-Wno-noexcept-type -Xcudafe=--display_error_number -Xcudafe=--promote_warnings -Wno-deprecated-gpu-targets -Wno-deprecated-declarations -std=c++14 -MD -MT dependencies/cub/test/CMakeFiles/cub.cpp14.test.block_radix_sort.dir/test_block_radix_sort.cu.o -MF dependencies/cub/test/CMakeFiles/cub.cpp14.test.block_radix_sort.dir/test_block_radix_sort.cu.o.d -x cu -c ../dependencies/cub/test/test_block_radix_sort.cu -o dependencies/cub/test/CMakeFiles/cub.cpp14.test.block_radix_sort.dir/test_block_radix_sort.cu.o
/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154:   required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-10/README.Bugs> for instructions

Seems unrelated

@alliepiper alliepiper added "testing: gpuCI passed" (Passed gpuCI testing.) and removed "testing: internal ci in progress" (Currently testing on internal NVIDIA CI (DVS).) and "testing: gpuCI in progress" (Started gpuCI testing.) labels Oct 15, 2021
@alliepiper alliepiper added the "testing: internal ci passed" (Passed internal NVIDIA CI (DVS).) label Oct 15, 2021
@alliepiper
Collaborator

@zasdfgbnm Yeah, that's a known compiler bug in gcc 10; it can be ignored.

Internal testing on DVS also looks clean, so this is ready to go in -- thanks for advocating for this and doing the work, I really like how this feature turned out! 😀

I'll get this merged in a moment.

@alliepiper
Collaborator

Last couple of pushes just updated the CUB submodule.

@alliepiper alliepiper merged commit 2fff110 into NVIDIA:main Oct 15, 2021
@zasdfgbnm zasdfgbnm deleted the device-scan-future branch October 15, 2021 20:13
Labels
helps: pytorch (Helps or needed by PyTorch.)
P1: should have (Necessary, but not critical.)
testing: gpuCI passed (Passed gpuCI testing.)
testing: internal ci passed (Passed internal NVIDIA CI (DVS).)

3 participants