forked from cms-sw/cmssw
Replace use of CUDA API wrapper unique_ptrs with CUDAUtilities unique_ptrs #396
Merged
fwyzard merged 5 commits into cms-patatrack:CMSSW_11_0_X_Patatrack from waredjeb:replace_cudaMakeUnique on Oct 31, 2019
Commits:
a0e2de8 Replace cuda::memory::device::make_unique() calls with cudautils::mak… (waredjeb)
0ae4faf Replace not complete, problem with std::swap (waredjeb)
4da1c01 Fixes problems replacing cuda::memory::device::unique_ptr() with cuda… (waredjeb)
2e1dc28 -Replace cuda::memory::host::make_unique() with cudautils::make_host_… (waredjeb)
54d2576 Comments cleanup (waredjeb)

Do people think it would make sense to (ab)use `cudaStreamDefault` instead of `nullptr` to specify the default stream?

I say "abuse" because `cudaStreamDefault` is meant to specify the default stream creation flags; however, the name and value (0x00) would make it a good candidate...
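
To make the "(ab)use" concrete, here is a minimal sketch of why `cudaStreamDefault` would be accepted wherever a stream handle is expected. The declarations are stand-ins written so the snippet compiles without the CUDA toolkit; they are assumed to mirror the runtime headers (a flag macro with value 0x00 and an opaque pointer type for streams). `isDefaultStream` is a hypothetical helper, not part of CUDA or cudautils.

```cpp
// Stand-ins (assumptions) for the CUDA runtime declarations, so that the
// sketch builds with a plain C++ compiler.
struct CUstream_st;                 // opaque stream type
using cudaStream_t = CUstream_st*;  // a stream handle is a pointer
#define cudaStreamDefault 0x00      // stream *creation flag*, not a handle

// Hypothetical helper: true when the handle is the null "default stream" handle.
inline bool isDefaultStream(cudaStream_t stream) { return stream == nullptr; }

// The literal 0x00 is a null pointer constant, so the flag macro converts
// silently to a null stream handle. This is what makes the abuse possible,
// and also why it could be confusing: nothing in the type system objects.
inline cudaStream_t streamFromFlagMacro() { return cudaStreamDefault; }
```

With these stand-ins, `isDefaultStream(streamFromFlagMacro())` and `isDefaultStream(nullptr)` give the same answer, so the two spellings are indistinguishable to the callee.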

I'm a bit afraid that the "abuse" would lead to confusion at some point.

I'm thinking (*) of adding an overload on the caching allocator that would not take a stream at all (or would use `nullptr` to signify no stream, although that choice would make it impossible to use the allocator with the default stream), in which case the memory block is truly freed in the destructor of the `unique_ptr` (instead of delaying the "true free" until the work using the memory block has finished). My main challenge is the naming of the smart pointers: using `unique_ptr` for both would likely be confusing (in a sense, the current `unique_ptr` could be argued to be confusing as well).

(*) e.g. for caching memory allocations in ESProducts, and to reduce the use of CUDA events in the caching allocator
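
The difference between the two freeing behaviours can be sketched on the host with `std::unique_ptr` and two deleters. This is only an illustration under stated assumptions: `FakeStream`, `ImmediateDeleter`, `DeferredDeleter`, `make_sync`, `make_async` and `synchronize` are hypothetical names, `std::malloc`/`std::free` stand in for device allocation, and the real caching allocator is not modelled.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

// Stand-in for a CUDA stream: remembers blocks whose release is deferred
// until the work queued on the stream has finished.
struct FakeStream {
  std::vector<void*> pendingFrees;
};

// Stream-less variant: the block is truly freed in the destructor.
struct ImmediateDeleter {
  void operator()(void* p) const { std::free(p); }
};

// Stream-aware variant: the destructor only hands the block back;
// the "true free" happens later, in synchronize().
struct DeferredDeleter {
  FakeStream* stream;
  void operator()(void* p) const { stream->pendingFrees.push_back(p); }
};

using sync_unique_ptr = std::unique_ptr<void, ImmediateDeleter>;
using async_unique_ptr = std::unique_ptr<void, DeferredDeleter>;

inline sync_unique_ptr make_sync(std::size_t bytes) {
  return sync_unique_ptr(std::malloc(bytes));
}

inline async_unique_ptr make_async(std::size_t bytes, FakeStream& stream) {
  return async_unique_ptr(std::malloc(bytes), DeferredDeleter{&stream});
}

// Models "the work using the memory has finished": releases every deferred
// block and returns how many were released.
inline std::size_t synchronize(FakeStream& stream) {
  const std::size_t released = stream.pendingFrees.size();
  for (void* p : stream.pendingFrees) std::free(p);
  stream.pendingFrees.clear();
  return released;
}
```

Both aliases are spelled as a `unique_ptr`, yet only one of them releases memory when it goes out of scope, which is the naming concern raised above.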

Reading up on the CUDA documentation, there are actually two options for the "default" stream:
- the legacy default stream
- the per-thread default stream

Passing `0` or `nullptr` will use either of those behaviours depending on the `nvcc --default-stream` option or the `CUDA_API_PER_THREAD_DEFAULT_STREAM` symbol; the default is the "legacy" stream.

Purely from the API point of view, I would use
- `cudautils::make_device_unique<T>(size, nullptr);` for the unspecified default stream
- `cudautils::make_device_unique<T>(size, cudaStreamLegacy);` for the legacy default stream
- `cudautils::make_device_unique<T>(size, cudaStreamPerThread);` for the per-thread default stream
- `cudautils::make_device_unique<T>(size);` for the synchronous behaviour

to keep the possibility of passing `nullptr` for the generic default stream.

With that naming scheme, `cudaStreamDefault` makes a lot of sense for the unspecified default stream.

Then I would suggest `unique_ptr` and `make_device_unique` for the synchronous behaviour, and something like `async_unique_ptr` and `make_device_async_unique`, or `unique_ptr_async` and `make_device_unique_async`, for the ones that use a stream?
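
A rough sketch of how the proposed overload set would dispatch. It models only which behaviour each call selects and allocates nothing: the functions return a tag instead of a smart pointer. Everything in the `sketch` namespace is hypothetical, and the stream type and the two special handle values are stand-ins assumed to mirror the CUDA runtime headers.

```cpp
#include <cstddef>

// Stand-ins (assumptions) for the CUDA runtime declarations.
struct CUstream_st;
using cudaStream_t = CUstream_st*;
#define cudaStreamLegacy ((cudaStream_t)0x1)
#define cudaStreamPerThread ((cudaStream_t)0x2)

namespace sketch {

  enum class Mode { Synchronous, UnspecifiedDefault, LegacyDefault, PerThreadDefault, UserStream };

  // Maps a stream handle to the behaviour it would request.
  inline Mode classify(cudaStream_t stream) {
    if (stream == nullptr) return Mode::UnspecifiedDefault;  // resolved by build options
    if (stream == cudaStreamLegacy) return Mode::LegacyDefault;
    if (stream == cudaStreamPerThread) return Mode::PerThreadDefault;
    return Mode::UserStream;
  }

  // No stream argument: synchronous behaviour, freed in the destructor.
  template <typename T>
  Mode make_device_unique(std::size_t /*size*/) {
    return Mode::Synchronous;
  }

  // Stream argument: the free is tied to the work queued on that stream.
  template <typename T>
  Mode make_device_unique(std::size_t /*size*/, cudaStream_t stream) {
    return classify(stream);
  }

}  // namespace sketch
```

Because the stream-less overload is a separate function rather than a defaulted argument, `nullptr` stays available to mean the unspecified default stream.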

Or just stick to `unique_ptr` ...