Add sort and make sure using host mirror on host memory in kspiluk_symbolic #951

Merged (2 commits) on May 11, 2021

52 changes: 24 additions & 28 deletions src/sparse/impl/KokkosSparse_spiluk_symbolic_impl.hpp
@@ -51,6 +51,7 @@
#include <KokkosKernels_config.h>
#include <Kokkos_ArithTraits.hpp>
#include <KokkosSparse_spiluk_handle.hpp>
#include <Kokkos_Sort.hpp>

//#define SYMBOLIC_OUTPUT_INFO

@@ -171,49 +172,32 @@ void iluk_symbolic ( IlukHandle& thandle,
{
// Scheduling and symbolic phase currently compute on host - need host copy of all views

typedef typename ARowMapType::HostMirror AHostRowMapType;
typedef typename AEntriesType::HostMirror AHostEntriesType;
typedef typename LRowMapType::HostMirror LHostRowMapType;
typedef typename LEntriesType::HostMirror LHostEntriesType;
typedef typename URowMapType::HostMirror UHostRowMapType;
typedef typename UEntriesType::HostMirror UHostEntriesType;

typedef typename IlukHandle::size_type size_type;
typedef typename IlukHandle::nnz_lno_t nnz_lno_t;

typedef typename IlukHandle::nnz_lno_view_t HandleDeviceEntriesType;
typedef typename IlukHandle::nnz_lno_view_t::HostMirror HandleHostEntriesType;

typedef typename IlukHandle::nnz_row_view_t HandleDeviceRowMapType;
typedef typename IlukHandle::nnz_row_view_t::HostMirror HandleHostRowMapType;

//typedef typename IlukHandle::signed_integral_t signed_integral_t;

size_type nrows = thandle.get_nrows();

AHostRowMapType A_row_map = Kokkos::create_mirror_view(A_row_map_d);
Kokkos::deep_copy(A_row_map, A_row_map_d);

AHostEntriesType A_entries = Kokkos::create_mirror_view(A_entries_d);
Kokkos::deep_copy(A_entries, A_entries_d);
auto A_row_map = Kokkos::create_mirror_view_and_copy( Kokkos::HostSpace(), A_row_map_d );
auto A_entries = Kokkos::create_mirror_view_and_copy( Kokkos::HostSpace(), A_entries_d );
auto L_row_map = Kokkos::create_mirror_view(Kokkos::HostSpace(), L_row_map_d);
auto L_entries = Kokkos::create_mirror_view(Kokkos::HostSpace(), L_entries_d);
auto U_row_map = Kokkos::create_mirror_view(Kokkos::HostSpace(), U_row_map_d);
auto U_entries = Kokkos::create_mirror_view(Kokkos::HostSpace(), U_entries_d);

LHostRowMapType L_row_map = Kokkos::create_mirror_view(L_row_map_d);
LHostEntriesType L_entries = Kokkos::create_mirror_view(L_entries_d);
UHostRowMapType U_row_map = Kokkos::create_mirror_view(U_row_map_d);
UHostEntriesType U_entries = Kokkos::create_mirror_view(U_entries_d);

HandleDeviceRowMapType dlevel_list = thandle.get_level_list();
HandleHostRowMapType level_list = Kokkos::create_mirror_view(dlevel_list);
Kokkos::deep_copy(level_list, dlevel_list);

auto level_list = Kokkos::create_mirror_view_and_copy( Kokkos::HostSpace(), dlevel_list );

HandleDeviceEntriesType dlevel_ptr = thandle.get_level_ptr();
HandleHostEntriesType level_ptr = Kokkos::create_mirror_view(dlevel_ptr);
Kokkos::deep_copy(level_ptr, dlevel_ptr);
auto level_ptr = Kokkos::create_mirror_view_and_copy( Kokkos::HostSpace(), dlevel_ptr );

HandleDeviceEntriesType dlevel_idx = thandle.get_level_idx();
HandleHostEntriesType level_idx = Kokkos::create_mirror_view(dlevel_idx);
Kokkos::deep_copy(level_idx, dlevel_idx);

auto level_idx = Kokkos::create_mirror_view_and_copy( Kokkos::HostSpace(), dlevel_idx );

size_type nlev = 0;

//Level scheduling on A???
@@ -358,6 +342,18 @@ void iluk_symbolic ( IlukHandle& thandle,
thandle.set_nnzL(cntL);
thandle.set_nnzU(cntU);

// Sort
for (size_type row_id = 0; row_id < static_cast<size_type>(L_row_map.extent(0))-1; row_id++) {
size_type row_start = L_row_map(row_id);
size_type row_end = L_row_map(row_id + 1);
Kokkos::sort(subview(L_entries, Kokkos::make_pair(row_start, row_end)));
}
Contributor

@vqd8a For this, you could do:

#include "KokkosKernels_Sorting.hpp"
...
KokkosKernels::sort_crs_graph<exec_space, decltype(L_row_map_d), decltype(L_entries_d)>
(L_row_map_d, L_entries_d);

Then it would sort the rows in parallel on device.
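
For reference, the operation being discussed is a per-row sort of the column indices of a CRS graph. The sketch below shows it in plain C++, with `std::vector` standing in for the host mirror views; `sort_crs_rows` is a hypothetical name, and this is neither the Kokkos Kernels implementation nor the exact loop in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sort the column indices of every row of a CRS graph in place.
// row_map holds nrows + 1 offsets into entries; entries holds column indices.
// std::vector is an assumption standing in for the host views used in the PR.
inline void sort_crs_rows(const std::vector<std::size_t>& row_map,
                          std::vector<int>& entries) {
  if (row_map.empty()) return;  // no rows; also avoids unsigned wrap of size() - 1
  assert(row_map.back() <= entries.size());
  for (std::size_t row = 0; row + 1 < row_map.size(); ++row) {
    // Row `row` occupies the half-open range [row_map[row], row_map[row + 1]).
    const auto first = entries.begin() + static_cast<std::ptrdiff_t>(row_map[row]);
    const auto last = entries.begin() + static_cast<std::ptrdiff_t>(row_map[row + 1]);
    std::sort(first, last);
  }
}
```

Each row is sorted independently, so entries never move across row boundaries. That independence is also what makes a parallel, one-row-per-thread version possible.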

Contributor Author

@brian-kelley Thanks. This is useful. I can try it. Is KokkosKernels::sort_crs_graph in KokkosKernels_SparseUtils.hpp or in KokkosKernels_Sorting.hpp?
It looks to me like in the Kokkos Kernels develop branch it is in KokkosKernels_Sorting.hpp, but in Trilinos it is in KokkosKernels_SparseUtils.hpp.

Contributor (brian-kelley, May 6, 2021)

@vqd8a I moved all the sorting stuff into KokkosKernels_Sorting.hpp and out of the Impl namespace, but it just missed the 3.4 release. So if you're in Trilinos, you would do

#include "KokkosKernels_SparseUtils.hpp"
KokkosKernels::Impl::sort_crs_graph...

and if you're in KokkosKernels develop, do

#include "KokkosKernels_Sorting.hpp"
KokkosKernels::sort_crs_graph...
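
One way to support both layouts from a single source file is to pick the header at compile time. The sketch below assumes a compiler that provides `__has_include` (standard since C++17) and that KokkosKernels_Sorting.hpp is absent from the 3.4 release, as described above. `SPILUK_SORT_CRS_GRAPH_NS` and `have_sort_crs_graph` are hypothetical names, not part of Kokkos Kernels.

```cpp
// Pick the header and namespace that provide sort_crs_graph for whichever
// Kokkos Kernels version is on the include path.
// SPILUK_SORT_CRS_GRAPH_NS is a hypothetical macro introduced for this sketch.
#if defined(__has_include)
#if __has_include("KokkosKernels_Sorting.hpp")
// Kokkos Kernels develop: sorting lives in the public namespace.
#include "KokkosKernels_Sorting.hpp"
#define SPILUK_SORT_CRS_GRAPH_NS KokkosKernels
#elif __has_include("KokkosKernels_SparseUtils.hpp")
// 3.4 release (as in Trilinos): still in the Impl namespace.
#include "KokkosKernels_SparseUtils.hpp"
#define SPILUK_SORT_CRS_GRAPH_NS KokkosKernels::Impl
#endif
#endif

// True when one of the two headers was found, so callers can fall back to
// the serial host sort otherwise.
#ifdef SPILUK_SORT_CRS_GRAPH_NS
constexpr bool have_sort_crs_graph = true;
#else
constexpr bool have_sort_crs_graph = false;
#endif
```

With this in place the call from the earlier comment would be written as `SPILUK_SORT_CRS_GRAPH_NS::sort_crs_graph<exec_space, decltype(L_row_map_d), decltype(L_entries_d)>(L_row_map_d, L_entries_d);`. The Sorting header is checked first because the selection relies on it existing only in the newer layout.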

Contributor Author

@brian-kelley in this case, how can I match this Kokkos Kernels PR with trilinos/Trilinos#9088? Or does trilinos/Trilinos#9088 have to wait for the next release of Kokkos Kernels to be in Trilinos?

Another thing is I tried

#include "KokkosKernels_SparseUtils.hpp"
KokkosKernels::Impl::sort_crs_graph...

in Trilinos, but I got this compile error: static assertion failed with "View allocation constructor requires managed memory"
It looks to me that sort_crs_graph does not work with unmanaged views (e.g. L_row_map_d, L_entries_d). Do you have any idea?

Contributor (brian-kelley, May 6, 2021)

@vqd8a Oh, I know what's going on. sort_crs_graph sometimes allocates temporary buffers with the same type as the Entries, and yours are unmanaged. But that case should be supported. I'll fix it (#960). If you want to get this PR merged and then patched into Trilinos quickly, then you should just keep the serial host sorting for now.
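
The failure mode described here can be illustrated without Kokkos. In the plain C++ analogy below, `BorrowedView` is a hypothetical non-owning type standing in for an unmanaged view, and `sort_via_scratch` is a hypothetical routine that needs a temporary buffer. It is only a sketch of the general pattern and says nothing about how #960 actually addresses the problem.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Non-owning view over memory allocated elsewhere; a stand-in for an
// unmanaged view. It deliberately has no allocating constructor.
template <class T>
struct BorrowedView {
  using value_type = T;
  T* ptr;
  std::size_t len;
};

// Generic routine that needs a temporary buffer. Declaring the scratch as
// `ViewT scratch(n)` would not compile for BorrowedView, so the buffer is an
// owning container built from the element type instead.
template <class ViewT>
void sort_via_scratch(ViewT v) {
  std::vector<typename ViewT::value_type> scratch(v.ptr, v.ptr + v.len);
  std::sort(scratch.begin(), scratch.end());
  std::copy(scratch.begin(), scratch.end(), v.ptr);
}
```

The point of the analogy is that a temporary should be declared with an owning type derived from the element type, not with the caller's view type, which may not be allowed to allocate.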

Contributor Author

@brian-kelley Will #960 be in Trilinos soon?

Contributor

@vqd8a Probably in the minor release 3.4.1. I don't know when that will be but it'll be much quicker than a normal promotion.

Contributor Author

@brian-kelley Thanks. I think for now I will just keep the serial host sorting in this PR. But I will keep this in mind and change it to use sort_crs_graph in another PR.

for (size_type row_id = 0; row_id < static_cast<size_type>(U_row_map.extent(0))-1; row_id++) {
size_type row_start = U_row_map(row_id);
size_type row_end = U_row_map(row_id + 1);
Kokkos::sort(subview(U_entries, Kokkos::make_pair(row_start, row_end)));
}

//Level scheduling on L
level_sched (thandle, L_row_map, L_entries, nrows, level_list, level_ptr, level_idx, nlev);
