This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
The goal is to clean up the code and reduce build time:
Before:
After:
Thrust's build time isn't significantly affected, as most of it is spent building non-CUB algorithms (the set operation and minmax tests in particular). We can probably bring that down by removing old SM policies in a follow-up patch, though Thrust currently only appears to go down to sm30, so there are fewer policies to take out.
CUB's compile time is reduced by nearly 40%. Comparing CPU time to wall time (the build used 12 cores) shows there is still a lot of room for improvement: one or two tests take a very long time to compile, so the build ends up waiting on them while most cores sit idle. We can improve this by splitting up those tests so they can be compiled in parallel.
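
To make the CPU vs wall time comparison concrete, here is a minimal sketch of how core utilization could be estimated from a build's timings. The helper name `parallel_efficiency` and the timings in the example are hypothetical and are not measurements from this PR; only the 12-core figure comes from the description above.

```python
def parallel_efficiency(cpu_seconds, wall_seconds, cores):
    """Return (average number of busy cores, fraction of core-time used).

    cpu_seconds is the total CPU time (user + sys) summed over all compile
    jobs; wall_seconds is the elapsed time of the whole build.
    """
    if wall_seconds <= 0 or cores <= 0:
        raise ValueError("wall_seconds and cores must be positive")
    busy_cores = cpu_seconds / wall_seconds
    return busy_cores, busy_cores / cores


if __name__ == "__main__":
    # Hypothetical timings, chosen only to illustrate the calculation.
    busy, utilization = parallel_efficiency(
        cpu_seconds=3600.0, wall_seconds=600.0, cores=12
    )
    print(f"{busy:.1f} of 12 cores busy on average ({utilization:.0%} utilization)")
```

A utilization well below 100% is consistent with a few long-running translation units dominating the tail of the build, which is the case that splitting up the slow tests is meant to address.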