Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix clang 17 compilation #394

Merged
merged 2 commits into from
Dec 21, 2023
Merged

Conversation

AustinSchuh
Copy link
Contributor

This makes it so I'm able to compile basic code which uses cuco with clang 17.

Copy link

copy-pr-bot bot commented Nov 16, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@sleeepyjack
Copy link
Collaborator

/ok to test

@sleeepyjack
Copy link
Collaborator

@AustinSchuh FYI if you sign all of your existing and future commits to this PR then CI will run automatically on every push.

@AustinSchuh
Copy link
Contributor Author

@AustinSchuh FYI if you sign all of your existing and future commits to this PR then CI will run automatically on every push.

I thought I had, looks like I hadn't. I'll work on getting the build to pass.

@PointKernel PointKernel added the type: bug Something isn't working label Nov 16, 2023
@PointKernel
Copy link
Member

/ok to test

@PointKernel
Copy link
Member

sign all of your existing and future commits to this PR then CI will run automatically on every push.

I think this is valid only for contributors who have write access to the repo.

@sleeepyjack sleeepyjack added type: feature request New feature request P1: Should have Necessary but not critical labels Nov 17, 2023
@AustinSchuh
Copy link
Contributor Author

AustinSchuh commented Nov 17, 2023

OK, I think I need to wait for #343 to merge first. There have been a lot of changes in thrust which change how the pieces interact quite a bit. I'd rather debug those after the update, especially since some of the older versions of thrust and friends don't work with clang.

@AustinSchuh AustinSchuh reopened this Nov 17, 2023
@PointKernel
Copy link
Member

OK, I think I need to wait for #343 to merge first. There have been a lot of changes in thrust which change how the pieces interact quite a bit. I'd rather debug those after the update, especially since some of the older versions of thrust and friends don't work with clang.

Make sense. I will ping you once it's ready.

@sleeepyjack
Copy link
Collaborator

@AustinSchuh FYI #343 was superseeded by #404 which has been merged. We are now using CCCL 2.2.0.

Clang complains:

  In file included from cuco/static_map.cuh:24:
  In file included from cuco/pair.cuh:147:
  In file included from cuco/detail/pair/pair.inl:54:
  cuco/detail/pair/tuple_helpers.inl:18:8: error:
        cannot specialize a dependent template
     18 | struct tuple_size<cuco::pair<T1, T2>> : integral_constant<size_t, 2> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:22:8: error:
        cannot specialize a dependent template
     22 | struct tuple_size<const cuco::pair<T1, T2>> : tuple_size<cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:26:8: error:
        cannot specialize a dependent template
     26 | struct tuple_size<volatile cuco::pair<T1, T2>> : tuple_size<cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:30:8: error:
        cannot specialize a dependent template
     30 | struct tuple_size<const volatile cuco::pair<T1, T2>> : tuple_size<cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:34:8: error:
        cannot specialize a dependent template
     34 | struct tuple_element<I, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:39:8: error:
        cannot specialize a dependent template
     39 | struct tuple_element<0, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:44:8: error:
        cannot specialize a dependent template
     44 | struct tuple_element<1, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:49:8: error:
        cannot specialize a dependent template
     49 | struct tuple_element<0, const cuco::pair<T1, T2>> : tuple_element<0, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:53:8: error:
        cannot specialize a dependent template
     53 | struct tuple_element<1, const cuco::pair<T1, T2>> : tuple_element<1, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:57:8: error:
        cannot specialize a dependent template
     57 | struct tuple_element<0, volatile cuco::pair<T1, T2>> : tuple_element<0, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:61:8: error:
        cannot specialize a dependent template
     61 | struct tuple_element<1, volatile cuco::pair<T1, T2>> : tuple_element<1, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:65:8: error:
        cannot specialize a dependent template
     65 | struct tuple_element<0, const volatile cuco::pair<T1, T2>> : tuple_element<0, cuco::pair<T1, T2>> {
        |        ^
  cuco/detail/pair/tuple_helpers.inl:69:8: error:
        cannot specialize a dependent template
     69 | struct tuple_element<1, const volatile cuco::pair<T1, T2>> : tuple_element<1, cuco::pair<T1, T2>> {
        |        ^

If we dig in, include/cuco/detail/pair/pair.inl is trying to teach
tuple_size and tuple_element to work with cuco::pair's.

thrust/pair.h defines thrust::tuple_size to be:

  template <class T>
  using tuple_size = ::cuda::std::tuple_size<T>;

This is a using statement, not a class.  So, we only need to teach
cuda::std::{tuple_size, tuple_element} how to work with cuco::pair, not
thrust::{tuple_size, tuple_element}.

Signed-off-by: Austin Schuh <austin.linux@gmail.com>
Clang complains:

  external/com_github_nvidia_cuco/include/cuco/detail/static_map.inl:446:21: error:
        explicit qualification required to use member 'initial_slot' from dependent base class
    446 |   auto current_slot{initial_slot(insert_pair.first, hash)};
        |                     ^
  frc971/orin/cuda.cc:219:32: note: in instantiation of function template
        specialization 'cuco::static_map<unsigned long, unsigned
        long>::device_mutable_view::insert_and_find<cuco::detail::XXHash_32<unsigned long>,
        thrust::equal_to<unsigned long>>' requested here
    219 |   auto [iter, inserted] = view.insert_and_find(thrust::make_pair(union_markers[i], 1));
        |                                ^

Signed-off-by: Austin Schuh <austin.linux@gmail.com>
@AustinSchuh
Copy link
Contributor Author

Thanks @sleeepyjack ! I rebased the pieces which still apply, hopefully CI will be happy now.

@PointKernel
Copy link
Member

/ok to test

Comment on lines -53 to -55
namespace thrust {
#include <cuco/detail/pair/tuple_helpers.inl>
} // namespace thrust
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sleeepyjack

Just a reminder to ourselves, thrust::pair is the same as cuda::std::pair so the current pair trait logic can be simplified.

@@ -443,7 +443,7 @@ __device__
"insert_and_find is not supported for unpackable data on pre-Volta GPUs.");
#endif

auto current_slot{initial_slot(insert_pair.first, hash)};
auto current_slot{this->initial_slot(insert_pair.first, hash)};
Copy link
Member

@PointKernel PointKernel Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand the logic here since we have many places missing this-> but how can they be built properly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be genuinely surprised if this is the only place where we forgot to use this->. So yeah, not sure why clang is complaining here but not in the other cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suspicion is that this is the only code I am calling in my application and exercising. I need to go dig in and get you a real answer though, not a guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only code I am calling in my application and exercising.

Make sense.

@sleeepyjack we can merge the current PR as it is and fix other uncovered issues in your clang CI PR. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with that, especially if it unblocks @AustinSchuh 's work for now.

@sleeepyjack
Copy link
Collaborator

I'll enable llvm in CI later today. Unfortunately the newest version we can use in CI is llvm16. Is this a problem?

@AustinSchuh
Copy link
Contributor Author

I'll enable llvm in CI later today. Unfortunately the newest version we can use in CI is llvm16. Is this a problem?

(answering the easy questions while I'm distracted) 16 should be fine. My understanding is it doesn't support the latest and greatest CUDA version available today, but that statement will be true with 17 at some point soon anyways.

@sleeepyjack
Copy link
Collaborator

I'll enable llvm in CI later today.

I think I underestimated this project a little. Adding llvm tests requires more changes to you CI setup than I expected. I think we can proceed with this PR without CI coverage for now.

Copy link
Collaborator

@sleeepyjack sleeepyjack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. Since we have no CI coverage yet, we must make sure that we test everything, i.e., make sure all of the examples+tests+benchmarks compile without any problems.

@@ -443,7 +443,7 @@ __device__
"insert_and_find is not supported for unpackable data on pre-Volta GPUs.");
#endif

auto current_slot{initial_slot(insert_pair.first, hash)};
auto current_slot{this->initial_slot(insert_pair.first, hash)};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be genuinely surprised if this is the only place where we forgot to use this->. So yeah, not sure why clang is complaining here but not in the other cases.

Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@sleeepyjack sleeepyjack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

@sleeepyjack sleeepyjack merged commit 679612a into NVIDIA:dev Dec 21, 2023
15 checks passed
@PointKernel PointKernel mentioned this pull request Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1: Should have Necessary but not critical type: bug Something isn't working type: feature request New feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants