Fences as a replacement for Captures #151
From live discussion with @psalz:
This can be implemented with [...]
Eventually we want to allow futures in range mappers and other places to push back wait-points as far as possible.
I have updated this PR to use asynchronous fences through futures as discussed above. The new interface is as follows:
namespace celerity::experimental {
template <typename DataT, int Dims>
class buffer_snapshot {
...
};
template <typename T>
std::future<T> fence(distr_queue&, const host_object<T>& obj);
template <typename DataT, int Dims>
std::future<buffer_snapshot<DataT, Dims>>
fence(distr_queue&, const buffer<DataT, Dims>& buf, const subrange<Dims>& sr);
template <typename DataT, int Dims>
std::future<buffer_snapshot<DataT, Dims>>
fence(distr_queue& q, const buffer<DataT, Dims>& buf);
}
The DPC++ CI failure is not related to these changes.
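To illustrate what the future-based interface buys the caller, here is a minimal sketch that does not use Celerity at all. The `sketch::fence` function below is a hypothetical stand-in that uses `std::async` in place of the runtime's executor; only the idea of returning a `std::future` and deferring the wait point is taken from the interface above.

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in, NOT Celerity API: std::async plays the role of the
// executor thread that makes the fenced value available at some later time.
namespace sketch {

template <typename T>
std::future<T> fence(T value_at_fence) {
    // Launch on a separate thread so the caller is not blocked by the fence itself.
    return std::async(std::launch::async,
        [v = std::move(value_at_fence)]() mutable { return std::move(v); });
}

// Issues two fences back to back and only waits once both are in flight.
inline int sum_of_two_fences() {
    std::future<int> a = fence(40);
    std::future<std::vector<int>> b = fence(std::vector<int>{1, 1});

    // Both fences are pending here; the main thread could submit more work.

    const int first = a.get();                // wait point for the first fence
    const std::vector<int> second = b.get();  // wait point for the second fence
    assert(second.size() == 2);
    return first + second[0] + second[1];
}

} // namespace sketch
```

Because each call returns immediately, the caller decides where the wait happens by choosing where to call `get()`, which is the "push back wait-points" behaviour mentioned in the discussion.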
Nice, this feels like a good abstraction. I've added a few minor notes!
clang-tidy made some suggestions
As discussed in #94, the current proposal attaching data extraction facilities to epochs is suboptimal. The distinction between slow_full_sync and drain is subtle and confusing.

This PR implements fences in the form of the user-facing celerity::experimental::fence(q, captures...) and the new fence task and command types. Fences synchronize between the executor and main threads for data extraction, but allow independent tasks that were submitted earlier to execute while the application waits for the fence operation to complete. To avoid serialization around multiple fences, a tuple of captures can be passed to the fence() call, which then returns a tuple of values.

A user can fence three different structures:
- buffer, returning an experimental::buffer_snapshot, which is just a std::vector with metadata.
- experimental::buffer_subrange, which combines a buffer with a subrange to extract. It likewise returns a buffer_snapshot.
- host_object, whose interior needs to be copyable.

Example
DAGs for fencing a buffer subrange
DAGs for fencing a host object
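
The description characterizes buffer_snapshot as "just a std::vector with metadata". The sketch below shows one way such a type could look. It is an assumption for illustration only: the members of the real class are elided in this PR, so every accessor name here, the use of std::array for offset and range, and the row-major layout are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch, NOT the Celerity class: a vector of elements plus the
// offset and range of the buffer region it was copied from.
namespace sketch {

template <typename DataT, int Dims>
class buffer_snapshot {
  public:
    using coord = std::array<std::size_t, Dims>;

    buffer_snapshot(coord offset, coord range, std::vector<DataT> data)
        : m_offset(offset), m_range(range), m_data(std::move(data)) {
        // The data must contain exactly one element per cell of the range.
        std::size_t cells = 1;
        for (const std::size_t extent : m_range) cells *= extent;
        assert(m_data.size() == cells);
    }

    const coord& get_offset() const { return m_offset; }
    const coord& get_range() const { return m_range; }
    const std::vector<DataT>& get_data() const { return m_data; }

    // Element access with an index relative to the snapshot origin,
    // assuming row-major storage.
    const DataT& operator[](const coord& index) const {
        std::size_t linear = 0;
        for (std::size_t d = 0; d < static_cast<std::size_t>(Dims); ++d) {
            linear = linear * m_range[d] + index[d];
        }
        return m_data[linear];
    }

  private:
    coord m_offset;
    coord m_range;
    std::vector<DataT> m_data;
};

} // namespace sketch
```

Keeping the offset alongside the data lets a caller map a snapshot element back to its position in the original buffer, which matters when only a subrange was fenced.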