This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Templated type of num_items in DeviceRadixSort.
List of individual changes:
- OffsetT == unsigned long long for the 64-bit case
- using std::{is_same,conditional}
- using "portion" consistently for 2^28-2^30-sized chunks of the input array
- HasEnoughMemory() takes overwrite into account.
- moved checking for enough memory earlier.
- added a CTA_SYNC() to the histogram kernel
- disabled tests with NumItemsT != int for segmented sort
- testing with 4.5 bln. items
- tests for different NumItemsT
- NumItemsT for all device sorting functions
- wrapped ChooseOffsetT into namespace detail
- fixed typos
- templatized the type of num_items in 2 methods of DeviceRadixSort
- tuned radix sort with 64-bit OffsetT for V100
- tuned for 64-bit OffsetT for A100
- separate tuning parameters for 64-bit OffsetT
- improved downsweep policy for GP100
- option for 64-bit num_items with 32-bit shared memory histogram counters.
- introduced PartOffsetT into Onesweep kernel.
  - OffsetT is now only used for offsets into the whole array (e.g. bin counts or global read/write offsets)
  - PartOffsetT is used for offsets that do not exceed a single part (e.g. decoupled look-back, block index, number of items inside a part)
  - this fixes problems when OffsetT is unsigned, and also contributes towards supporting 64-bit num_items
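The offset-type selection and the whole-array versus per-portion split described in the list can be illustrated with a minimal host-side sketch. It is not the code from this commit: the namespace `sketch`, the constant `kPortionSize` (set here to 2^30, the upper end of the range quoted above), and the helpers `NumPortions` and `PortionItems` are hypothetical names chosen for illustration. Only `ChooseOffsetT`, `OffsetT`, `PartOffsetT`, and `NumItemsT` appear in the commit message, and the selection rule shown (32-bit offsets unless `NumItemsT` is wider than 4 bytes) is an assumption.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not cub::detail

// Pick an unsigned offset type wide enough for NumItemsT (assumed rule:
// 32-bit offsets unless NumItemsT itself is wider than 4 bytes).
template <typename NumItemsT>
struct ChooseOffsetT {
  static_assert(
      std::is_integral<NumItemsT>::value &&
          !std::is_same<typename std::remove_cv<NumItemsT>::type, bool>::value,
      "NumItemsT must be an integral type other than bool");
  using Type = typename std::conditional<sizeof(NumItemsT) <= 4,
                                         std::uint32_t,
                                         unsigned long long>::type;
};

// Offsets that never exceed one portion fit in a 32-bit signed type.
using PartOffsetT = int;

// Assumed portion size (2^30 items); the commit only gives a 2^28-2^30 range.
constexpr unsigned long long kPortionSize = 1ull << 30;

// Number of portions needed to cover num_items (ceiling division).
template <typename OffsetT>
constexpr OffsetT NumPortions(OffsetT num_items) {
  return static_cast<OffsetT>((num_items + kPortionSize - 1) / kPortionSize);
}

// Number of items in the given portion; always <= kPortionSize, so the
// result fits in PartOffsetT even when num_items needs 64 bits.
template <typename OffsetT>
constexpr PartOffsetT PortionItems(OffsetT num_items, OffsetT portion) {
  return static_cast<PartOffsetT>(
      (num_items - portion * kPortionSize) < kPortionSize
          ? (num_items - portion * kPortionSize)
          : kPortionSize);
}

}  // namespace sketch
```

Under these assumptions, an input of 4.5 billion items (the size used in the tests mentioned above) would be covered by five portions, and every per-portion count stays within 32 bits, while global offsets use the 64-bit `OffsetT`.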
1 parent 5d31d2d, commit 5912195
Showing 5 changed files with 342 additions and 227 deletions.