This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Optimize device to device copying #1536

Closed
gevtushenko opened this issue Oct 11, 2021 · 0 comments · Fixed by NVIDIA/cccl#211
Labels
P2: nice to have Desired, but not necessary. thrust type: enhancement New feature or request.

Comments

@gevtushenko
Collaborator

The current implementation of thrust::copy uses a transform to perform device-to-device copies. For trivially copyable types, I think we should dispatch to some form of memcpy instead. Here's the expected performance improvement for std::uint8_t:

[image: expected performance improvement for std::uint8_t]
