This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

In-place guarantees for reduce #500

Merged: 4 commits merged into NVIDIA:main on Jun 25, 2022

Conversation

gevtushenko
Collaborator

@gevtushenko gevtushenko commented May 31, 2022

Reduce consists of two kernels. The first reads the input data and accumulates partial sums in temporary storage. The second reads the temporary storage and writes the final result. Therefore, it's safe to alias the input and output arrays.
Nonetheless, allowing in-place execution would limit our ability to optimize the algorithm later. Since aliasing doesn't provide significant memory savings, I'd rather not allow it.
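
To illustrate the two-kernel argument, here is a minimal host-side C++ model. It is only a sketch, not CUB's implementation: the function name `two_pass_sum` and the `tile_size` parameter are hypothetical, and each loop stands in for one kernel launch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of a two-pass sum (illustrative only; not CUB's code).
// `tile_size` is a hypothetical stand-in for the work of one thread block.
void two_pass_sum(const int* in, int* out, std::size_t n, std::size_t tile_size)
{
    assert(tile_size > 0);

    // Pass 1 ("first kernel"): read every input element and store one partial
    // sum per tile in temporary storage. Nothing is written to `out` here.
    std::vector<int> partials;
    for (std::size_t begin = 0; begin < n; begin += tile_size)
    {
        const std::size_t end = std::min(n, begin + tile_size);
        int sum = 0;
        for (std::size_t i = begin; i < end; ++i)
        {
            sum += in[i];
        }
        partials.push_back(sum);
    }

    // Pass 2 ("second kernel"): read only the temporary storage and write the
    // single result. The input is no longer needed, so `out` may alias `in`.
    int total = 0;
    for (int p : partials)
    {
        total += p;
    }
    *out = total;
}
```

Calling `two_pass_sum(data, data, n, tile_size)` gives the same total as the out-of-place call in this model, because the only write happens after all reads have finished. A single-pass optimization would break that property, which is the reason for not documenting aliasing as allowed.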

Regarding the ByKey variant, it relies on decoupled look-back, so it should be safe to alias the input and output data as long as the value types of the input and output iterators match exactly. The only limiting factor is LOAD_LDG, which makes aliasing in this case undefined behavior. In-place execution here would provide significant memory savings. If there's a request, I suggest we add an overload that allows in-place execution.
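
As a rough intuition for why in-place reduce-by-key can work when the value types match, here is a sequential host-side C++ sketch. It does not model decoupled look-back or LOAD_LDG, and the function name `reduce_by_key_in_place` is hypothetical. It only shows that the write position never overtakes the read position, so the key and value buffers can be reused for the output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential model of an in-place reduce-by-key (illustrative only).
// Consecutive equal keys are collapsed and their values summed. The results
// are compacted into the front of the same buffers; the run count is returned.
std::size_t reduce_by_key_in_place(std::vector<int>& keys, std::vector<int>& values)
{
    assert(keys.size() == values.size());
    if (keys.empty())
    {
        return 0;
    }

    std::size_t write = 0;      // next output slot
    int run_key = keys[0];      // key of the run being accumulated
    int run_sum = values[0];    // running sum for that key

    for (std::size_t read = 1; read < keys.size(); ++read)
    {
        if (keys[read] == run_key)
        {
            run_sum += values[read];
        }
        else
        {
            // Here write < read, so this store only touches consumed input.
            keys[write] = run_key;
            values[write] = run_sum;
            ++write;
            run_key = keys[read];
            run_sum = values[read];
        }
    }

    // Flush the final run.
    keys[write] = run_key;
    values[write] = run_sum;
    return write + 1;
}
```

In this model no second pair of key/value arrays is allocated, which is where the memory savings of an in-place overload would come from.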

Regarding the Segmented version, one block is assigned per segment. Results are written without synchronization between blocks; therefore, any aliasing with the output array would introduce a data race.
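
The effect of that race can be made visible with a deterministic host-side C++ model. This is a sketch under stated assumptions, not CUB code: `segmented_sum` and its `order` parameter are hypothetical, and `order` simply stands in for whichever block happens to finish first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of a segmented sum with one "block" per segment
// (illustrative only). Segment s covers [offsets[s], offsets[s + 1]).
// `order` is the sequence in which the blocks run to completion.
void segmented_sum(const int* in,
                   int* out,
                   const std::vector<std::size_t>& offsets,
                   const std::vector<std::size_t>& order)
{
    for (std::size_t s : order)
    {
        assert(s + 1 < offsets.size());
        int sum = 0;
        for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i)
        {
            sum += in[i];
        }
        // Written with no coordination with the other segments. If `out`
        // aliases `in`, this may overwrite input another segment still needs.
        out[s] = sum;
    }
}
```

With input `{1, 2, 3, 4}` and offsets `{0, 2, 4}`, separate buffers give `{3, 7}` for any order. When `out` aliases `in`, running segment 1 before segment 0 overwrites `in[1]` with 7 before segment 0 reads it, so segment 0 produces 8 instead of 3. On a GPU the block order is not defined, so the aliased result is unpredictable.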

@gevtushenko gevtushenko requested a review from alliepiper May 31, 2022 12:15
@gevtushenko gevtushenko added only: docs Documentation changes only. Doesn't need code CI. area: docs Related to documentation. labels May 31, 2022
@alliepiper alliepiper added type: enhancement New feature or request. P1: should have Necessary, but not critical. labels Jun 3, 2022
@alliepiper alliepiper added this to the 2.0.0 milestone Jun 3, 2022
Collaborator

@alliepiper alliepiper left a comment

LGTM, just a couple of style nits.

cub/device/device_segmented_reduce.cuh: two review threads (outdated, resolved)
@gevtushenko gevtushenko force-pushed the enh-main/github/in_place_reduce branch from 07be755 to 5732130 Compare June 25, 2022 11:34
@gevtushenko gevtushenko merged commit 8cd8b55 into NVIDIA:main Jun 25, 2022