two-level parallel version of transpose GEMV #514
Why do we need to write an optimized GEMV? Why can't we just call cuBLAS for CUDA and the BLAS for non-CUDA, and have a slow fall-back for unsupported types?
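One way to read the dispatch proposed in this question, as a minimal sketch in plain C++ rather than Kokkos code: a trait decides whether a scalar type takes the vendor path, and everything else goes to a simple reference loop. The names `vendor_supports`, `gemv_t`, `gemv_t_impl`, and `gemv_t_fallback` are hypothetical, the list of "supported" types is an assumption for illustration, and the vendor branch only forwards to the fallback here because no BLAS library is linked.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical trait: scalar types assumed to have a vendor BLAS kernel.
template <class T> struct vendor_supports : std::false_type {};
template <> struct vendor_supports<float> : std::true_type {};
template <> struct vendor_supports<double> : std::true_type {};

// Slow reference fallback: y = beta*y + alpha*A^T*x, with A stored
// row-major as an m-by-n matrix, x of length m, and y of length n.
template <class T>
void gemv_t_fallback(std::size_t m, std::size_t n, T alpha,
                     const std::vector<T>& A, const std::vector<T>& x,
                     T beta, std::vector<T>& y) {
  assert(A.size() == m * n && x.size() == m && y.size() == n);
  for (std::size_t j = 0; j < n; ++j) {
    T acc = T();
    for (std::size_t i = 0; i < m; ++i) acc += A[i * n + j] * x[i];
    y[j] = beta * y[j] + alpha * acc;
  }
}

// Vendor path. A real build would call a BLAS or cuBLAS GEMV routine here;
// this sketch forwards to the fallback so that it stays self-contained.
template <class T>
void gemv_t_impl(std::true_type, std::size_t m, std::size_t n, T alpha,
                 const std::vector<T>& A, const std::vector<T>& x,
                 T beta, std::vector<T>& y) {
  gemv_t_fallback(m, n, alpha, A, x, beta, y);
}

// Unsupported scalar types always use the reference loop.
template <class T>
void gemv_t_impl(std::false_type, std::size_t m, std::size_t n, T alpha,
                 const std::vector<T>& A, const std::vector<T>& x,
                 T beta, std::vector<T>& y) {
  gemv_t_fallback(m, n, alpha, A, x, beta, y);
}

// Entry point: tag dispatch on the trait picks the path at compile time.
template <class T>
void gemv_t(std::size_t m, std::size_t n, T alpha,
            const std::vector<T>& A, const std::vector<T>& x,
            T beta, std::vector<T>& y) {
  gemv_t_impl<T>(typename vendor_supports<T>::type(), m, n, alpha, A, x, beta, y);
}
```

With this shape, a call such as `gemv_t(3, 2, 2.0, A, x, 1.0, y)` on `std::vector<double>` arguments takes the vendor overload, while the same call on `std::vector<int>` takes the fallback.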
const bool conj,
class IndexType = typename AViewType::size_type>
struct TwoLevelTransposeGEMV {
typedef typename YViewType::non_const_value_type y_value_type;
Prefer the C++11 type alias syntax `using new_type = old_type;` to `typedef old_type new_type;`.
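To illustrate the suggestion, here is a small sketch showing both spellings side by side. `ToyView`, `WithTypedef`, `WithUsing`, and `value_of_t` are hypothetical stand-ins written for this example, not Kokkos types. The two forms name the same type; the alias form can also be templated, which `typedef` cannot.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a view type; this is not the Kokkos::View API.
template <class T>
struct ToyView {
  using non_const_value_type = typename std::remove_const<T>::type;
};

// typedef spelling, in the same shape as the line under review.
template <class YViewType>
struct WithTypedef {
  typedef typename YViewType::non_const_value_type y_value_type;
};

// Equivalent C++11 alias spelling: the new name comes first.
template <class YViewType>
struct WithUsing {
  using y_value_type = typename YViewType::non_const_value_type;
};

// Alias templates are only possible with the using form.
template <class V>
using value_of_t = typename V::non_const_value_type;

// Both spellings resolve to exactly the same type.
static_assert(
    std::is_same<WithTypedef<ToyView<const double> >::y_value_type,
                 WithUsing<ToyView<const double> >::y_value_type>::value,
    "typedef and using name the same type");
```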
@ndellingwood wrote:
Cool, I'm OK with it as long as it's fixing a bug.
Looks OK. The usual kokkos-kernels idiom that Christian has been promoting is to reduce the number of instantiations of the functor by having the function that calls the functor assign the input Views to "canonical" View types. For example, the (outer) function might be templated on `ViewType`, but if the functor really only needs `View<const T*, AnonymousSpace>`, it would be best to have the functor be templated only on `T`.
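The idiom can be sketched in plain C++ without Kokkos. In this analogue, `ConstSpan<T>` is a hypothetical stand-in for a canonical type such as `View<const T*, AnonymousSpace>`, and `SumFunctor` and `sum_any` are made-up names. The outer function is templated on the caller's container type, but it converts to the canonical type before building the functor, so the functor is instantiated once per scalar type rather than once per container type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "canonical" read-only 1-D view: a pointer plus a length.
template <class T>
struct ConstSpan {
  const T* data;
  std::size_t size;
};

// The functor depends only on T, not on the caller's container type.
template <class T>
struct SumFunctor {
  ConstSpan<T> x;
  T operator()() const {
    T s = T();
    for (std::size_t i = 0; i < x.size; ++i) s += x.data[i];
    return s;
  }
};

// The outer function is templated on the container, but it assigns the
// input to the canonical type before constructing the functor.
template <class Container>
typename Container::value_type sum_any(const Container& c) {
  using T = typename Container::value_type;
  ConstSpan<T> canonical{c.data(), c.size()};
  return SumFunctor<T>{canonical}();
}
```

Calling `sum_any` on a `std::vector<double>` and on a `std::array<double, 3>` instantiates `sum_any` twice but `SumFunctor<double>` only once, which is the saving the comment describes.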
Thank you for the comments @mhoemmen. I am running the spot-check on white, but it is taking a long time @ndellingwood.
@iyamazaki spot-check on kokkos-dev-2 passed, merging in. Thanks!
No description provided.