Reference matrix conversion of Csr and Hybrid #302

yhmtsai · 2019-05-08T18:33:22Z

Reference matrix conversion of Csr and Hybrid

hartwiganzt

I did not find anything to complain about.

thoasm

Looks good to me.

Just 2 small nits (adding a const), and one general question about relying on the CSR row_ptrs for nnz_per_row instead of counting, which I would like to discuss before merging.
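To make the row_ptrs question concrete, here is a minimal sketch of the two ways of obtaining nnz_per_row. Both helper names are hypothetical and are not Ginkgo functions; the arrays are plain `std::vector`s rather than Ginkgo matrices. The two results differ exactly when the CSR matrix stores explicit zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; these are not Ginkgo functions.

// Stored entries per row, read directly from the CSR row pointers.
// Explicit zeros are included because they occupy a stored slot.
std::vector<std::size_t> nnz_per_row_from_row_ptrs(
    const std::vector<int>& row_ptrs)
{
    assert(!row_ptrs.empty());
    std::vector<std::size_t> nnz(row_ptrs.size() - 1);
    for (std::size_t row = 0; row < nnz.size(); ++row) {
        nnz[row] = static_cast<std::size_t>(row_ptrs[row + 1] - row_ptrs[row]);
    }
    return nnz;
}

// Entries per row whose stored value is not zero.
std::vector<std::size_t> nnz_per_row_by_counting(
    const std::vector<int>& row_ptrs, const std::vector<double>& vals)
{
    assert(!row_ptrs.empty());
    std::vector<std::size_t> nnz(row_ptrs.size() - 1, 0);
    for (std::size_t row = 0; row < nnz.size(); ++row) {
        for (int idx = row_ptrs[row]; idx < row_ptrs[row + 1]; ++idx) {
            if (vals[idx] != 0.0) {
                ++nnz[row];
            }
        }
    }
    return nnz;
}
```

For row_ptrs = {0, 2, 3, 3} and vals = {1.0, 0.0, 2.0}, the first helper yields {2, 1, 0} and the second {1, 1, 0}.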

thoasm

LGTM!

tcojean

LGTM, apart from the question of whether or not to preserve explicit zeros.

tcojean · 2019-05-16T14:02:20Z

reference/matrix/csr_kernels.cpp

+        size_type ell_idx = 0;
+        while (csr_idx < csr_row_ptrs[row + 1]) {
+            const auto val = csr_vals[csr_idx];
+            if (val != zero<ValueType>()) {


Is that test correct? When counting max_nnz you also count zeros. Doesn't that mean you should also store explicit zeros here? In general, what is our policy on these issues? Some conversions, I think, preserve zeros (CSR <-> COO, CSR -> SELLP), while for others that does not seem to be the case (ELL <-> CSR). The lists are not exhaustive.

yhmtsai · 2019-05-16

You are right. The result is not the expected hybrid matrix.
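A minimal sketch of a CSR -> Hybrid conversion that stays consistent with a width derived from the row pointers: it has no test against zero, so every stored entry, explicit zeros included, lands in the ELL or COO part. The `HybridSketch` struct, the row-major ELL layout, and the `ell_width` parameter are assumptions made for illustration and do not reflect Ginkgo's actual data structures or strategy classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container, not Ginkgo's Hybrid class.
struct HybridSketch {
    std::size_t num_rows = 0;
    std::size_t ell_width = 0;
    std::vector<double> ell_vals;  // row-major, num_rows x ell_width
    std::vector<int> ell_cols;     // padding slots hold value 0 and column 0
    std::vector<int> coo_rows;
    std::vector<int> coo_cols;
    std::vector<double> coo_vals;
};

// Fill the first ell_width stored entries of each row into the ELL part
// and the remainder into the COO part. Explicit zeros are kept.
HybridSketch csr_to_hybrid_keep_zeros(const std::vector<int>& row_ptrs,
                                      const std::vector<int>& col_idxs,
                                      const std::vector<double>& vals,
                                      std::size_t ell_width)
{
    assert(!row_ptrs.empty());
    HybridSketch h;
    h.num_rows = row_ptrs.size() - 1;
    h.ell_width = ell_width;
    h.ell_vals.assign(h.num_rows * ell_width, 0.0);
    h.ell_cols.assign(h.num_rows * ell_width, 0);
    for (std::size_t row = 0; row < h.num_rows; ++row) {
        std::size_t ell_idx = 0;
        for (int idx = row_ptrs[row]; idx < row_ptrs[row + 1]; ++idx) {
            if (ell_idx < ell_width) {
                h.ell_vals[row * ell_width + ell_idx] = vals[idx];
                h.ell_cols[row * ell_width + ell_idx] = col_idxs[idx];
                ++ell_idx;
            } else {
                h.coo_rows.push_back(static_cast<int>(row));
                h.coo_cols.push_back(col_idxs[idx]);
                h.coo_vals.push_back(vals[idx]);
            }
        }
    }
    return h;
}
```

If the loop skipped zero values instead, a row with an explicit zero would push fewer entries into the ELL part than the width computation assumed, which is the mismatch described above.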

tcojean · 2019-05-16T14:03:02Z

reference/matrix/hybrid_kernels.cpp

+        // Ell part
+        for (IndexType col = 0; col < max_nnz_per_row; col++) {
+            const auto val = ell->val_at(row, col);
+            if (val != zero<ValueType>()) {


Same question here about keeping explicit zeros or not, and again further down.

yhmtsai · 2019-05-16

How about we keep explicit zeros in all matrix conversions?
We would then also need to implement a function that removes all zeros,
so that users can decide whether they need it.
When reading a matrix file, we currently seem to drop all zero values, so maybe we should keep them there as well.
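The proposed function that removes all zeros could look roughly like the following for CSR. This is only a sketch under the assumption of plain `std::vector` storage; `CsrSketch` and `remove_explicit_zeros` are hypothetical names, not part of Ginkgo.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container, not Ginkgo's Csr class.
struct CsrSketch {
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> vals;
};

// Return a copy of the matrix without explicitly stored zeros.
CsrSketch remove_explicit_zeros(const CsrSketch& in)
{
    assert(!in.row_ptrs.empty());
    CsrSketch out;
    out.row_ptrs.reserve(in.row_ptrs.size());
    out.row_ptrs.push_back(0);
    for (std::size_t row = 0; row + 1 < in.row_ptrs.size(); ++row) {
        for (int idx = in.row_ptrs[row]; idx < in.row_ptrs[row + 1]; ++idx) {
            if (in.vals[idx] != 0.0) {
                out.col_idxs.push_back(in.col_idxs[idx]);
                out.vals.push_back(in.vals[idx]);
            }
        }
        // The row ends at the number of entries kept so far.
        out.row_ptrs.push_back(static_cast<int>(out.vals.size()));
    }
    return out;
}
```

With such a helper, conversions could keep explicit zeros by default and leave the clean-up to the user as an opt-in step.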

thoasm · 2019-05-16

As far as I know, we should not put explicit zeros into the ELL format, since we might need even more storage that way, and I am not sure whether our code already has the SpMV improvement that stops as soon as a zero is found.

Actually, for Hybrid, I thought it should be fine to ignore all explicit zeros, since COO is only used for the parts that don't properly fit into the ELL part.

yhmtsai · 2019-05-17

For the ELL part, it does stop when a zero is found. In the CUDA kernel, this depends on the column index, not the value.
Hybrid uses nnz_per_row to decide the number of ELL columns, so we do not get the expected hybrid matrix when we skip the explicit zeros.

hartwiganzt · 2019-05-17T14:49:33Z

Can this be merged? Thanks, @tcojean!

tcojean · 2019-05-17T14:53:51Z

@hartwiganzt There is an ongoing discussion, actually. If you have time, could you give your input?

Summary:
Currently, zeros are taken into account when computing nnz_per_row, but they are then removed during the conversion, which means that you do not get the hybrid matrix you set up in the beginning.

We are also wondering in general whether and when we should ignore zeros during conversions (for Dense <-> anything the answer is obvious, but for the rest?).

Currently, we preserve zeros in some cases (CSR <-> COO, CSR -> SELLP), but in others that does not seem to be the case (ELL <-> CSR, all methods that read from files).

hartwiganzt · 2019-05-17T15:39:32Z

This is a difficult question. As pointed out, sometimes it can be helpful to have explicit zeros stored. At the same time, if converting ELL->CSR you want to have the zeros removed, obviously. I don't think I have an overall best solution, but maybe different routines handle this differently. Obviously, the documentation should be explicit about it. Is that a variant?

tcojean · 2019-05-17T15:46:38Z

I guess for this PR we can focus only on making the Hybrid <-> CSR conversion correct in terms of how it uses nnz_per_row. You are right that this should be decided case by case anyway, maybe with a loose default policy of keeping explicit zeros whenever possible and sensible.

yhmtsai · 2019-05-17T17:30:47Z

Csr -> Hybrid: keep the explicit zeros.
Hybrid -> Csr: delete the explicit zeros in the COO or ELL part.
I also added another test for them.
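The Hybrid -> Csr direction with zeros dropped can be sketched as follows. The structs, the row-major ELL layout, and the assumption that the COO entries are sorted by row are all illustrative choices and are not taken from the Ginkgo sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical containers, not Ginkgo classes.
struct HybridInput {
    std::size_t num_rows = 0;
    std::size_t ell_width = 0;
    std::vector<double> ell_vals;  // row-major, num_rows x ell_width
    std::vector<int> ell_cols;
    std::vector<int> coo_rows;     // assumed sorted by row
    std::vector<int> coo_cols;
    std::vector<double> coo_vals;
};

struct CsrOutput {
    std::vector<int> row_ptrs;
    std::vector<int> col_idxs;
    std::vector<double> vals;
};

// For each row, copy the ELL entries and then the COO entries,
// skipping zeros (both ELL padding and explicitly stored zeros).
CsrOutput hybrid_to_csr_drop_zeros(const HybridInput& h)
{
    assert(h.ell_vals.size() == h.num_rows * h.ell_width);
    assert(h.ell_cols.size() == h.ell_vals.size());
    CsrOutput csr;
    csr.row_ptrs.push_back(0);
    std::size_t coo_idx = 0;
    for (std::size_t row = 0; row < h.num_rows; ++row) {
        for (std::size_t col = 0; col < h.ell_width; ++col) {
            const double val = h.ell_vals[row * h.ell_width + col];
            if (val != 0.0) {
                csr.col_idxs.push_back(h.ell_cols[row * h.ell_width + col]);
                csr.vals.push_back(val);
            }
        }
        while (coo_idx < h.coo_vals.size() &&
               h.coo_rows[coo_idx] == static_cast<int>(row)) {
            if (h.coo_vals[coo_idx] != 0.0) {
                csr.col_idxs.push_back(h.coo_cols[coo_idx]);
                csr.vals.push_back(h.coo_vals[coo_idx]);
            }
            ++coo_idx;
        }
        csr.row_ptrs.push_back(static_cast<int>(csr.vals.size()));
    }
    return csr;
}
```

Because ELL padding and explicit zeros are indistinguishable by value in this layout, dropping zeros in this direction also strips the padding, which is why a Csr -> Hybrid -> Csr round trip does not preserve explicit zeros.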

yhmtsai · 2019-05-20T02:47:37Z

@tcojean I see it failed in the pipeline.
It says `/usr/bin/ld: final link failed: No space left on device`.
Is the workstation full, or is there something wrong in the code?

tcojean · 2019-05-20T07:39:21Z

@yhmtsai There was indeed a space problem. It builds now, but you have a problem in a kernel.

yhmtsai · 2019-05-20T09:44:06Z

Fixed it

thoasm

LGTM!
Just some minor style suggestions.

The Ginkgo team is proud to announce the new minor release of Ginkgo version 1.1.0. This release brings several performance improvements, adds Windows support, adds support for factorizations inside Ginkgo and a new ILU preconditioner based on ParILU algorithm, among other things. For detailed information, check the respective issue.

Supported systems and requirements:
+ For all platforms, cmake 3.9+
+ Linux and MacOS
  + gcc: 5.3+, 6.3+, 7.3+, 8.1+
  + clang: 3.9+
  + Intel compiler: 2017+
  + Apple LLVM: 8.0+
  + CUDA module: CUDA 9.0+
+ Windows
  + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, 8.1+
  + Microsoft Visual Studio: VS 2017 15.7+
  + CUDA module: CUDA 9.0+, Microsoft Visual Studio
  + OpenMP module: MinGW or Cygwin

The current known issues can be found in the [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues).

Additions:
+ Upper and lower triangular solvers ([#327](#327), [#336](#336), [#341](#341), [#342](#342))
+ New factorization support in Ginkgo, and addition of the ParILU algorithm ([#305](#305), [#315](#315), [#319](#319), [#324](#324))
+ New ILU preconditioner ([#348](#348), [#353](#353))
+ Windows MinGW and Cygwin support ([#347](#347))
+ Windows Visual Studio support ([#351](#351))
+ New example showing how to use ParILU as a preconditioner ([#358](#358))
+ New example on using loggers for debugging ([#360](#360))
+ Add two new 9pt and 27pt stencil examples ([#300](#300), [#306](#306))
+ Allow benchmarking CuSPARSE spmv formats through Ginkgo's benchmarks ([#303](#303))
+ New benchmark for sparse matrix format conversions ([#312](https://github.com/ginkgo-project/ginkgo/issues/312), [#317](https://github.com/ginkgo-project/ginkgo/issues/317))
+ Add conversions between CSR and Hybrid formats ([#302](#302), [#310](#310))
+ Support for sorting rows in the CSR format by column indices ([#322](#322))
+ Addition of a CUDA COO SpMM kernel for improved performance ([#345](#345))
+ Addition of a LinOp to handle perturbations of the form (identity + scalar * basis * projector) ([#334](#334))
+ New sparsity matrix representation format with Reference and OpenMP kernels ([#349](#349), [#350](#350))

Fixes:
+ Accelerate GMRES solver for CUDA executor ([#363](#363))
+ Fix BiCGSTAB solver convergence ([#359](#359))
+ Fix CGS logging by reporting the residual for every sub iteration ([#328](#328))
+ Fix CSR,Dense->Sellp conversion's memory access violation ([#295](#295))
+ Accelerate CSR->Ell,Hybrid conversions on CUDA ([#313](#313), [#318](#318))
+ Fixed slowdown of COO SpMV on OpenMP ([#340](#340))
+ Fix gcc 6.4.0 internal compiler error ([#316](#316))
+ Fix compilation issue on Apple clang++ 10 ([#322](#322))
+ Make Ginkgo able to compile on Intel 2017 and above ([#337](#337))
+ Make the benchmarks spmv/solver use the same matrix formats ([#366](#366))
+ Fix self-written isfinite function ([#348](#348))
+ Fix Jacobi issues shown by cuda-memcheck

Tools and ecosystem:
+ Multiple improvements to the CI system and tools ([#296](#296), [#311](#311), [#365](#365))
+ Multiple improvements to the Ginkgo containers ([#328](#328), [#361](#361))
+ Add sonarqube analysis to Ginkgo ([#304](#304), [#308](#308), [#309](#309))
+ Add clang-tidy and iwyu support to Ginkgo ([#298](#298))
+ Improve Ginkgo's support of xSDK M12 policy by adding the `TPL_` arguments to CMake ([#300](#300))
+ Add support for the xSDK R7 policy ([#325](#325))
+ Fix examples in html documentation ([#367](#367))

Related PR: #370

tcojean requested review from thoasm, pratikvn and tcojean and removed request for thoasm and pratikvn May 9, 2019 10:20

tcojean assigned yhmtsai May 9, 2019

tcojean added the is:enhancement, type:matrix-format, 1:ST:ready-for-review, and mod:reference labels May 9, 2019

hartwiganzt approved these changes May 9, 2019

thoasm reviewed May 13, 2019

thoasm approved these changes May 14, 2019

tcojean approved these changes May 16, 2019

yhmtsai and others added 8 commits May 18, 2019 01:26

create the functions

a7ef23d

add csr -> hybrid reference

bc9f957

create hybrid->csr functions

05eef75

hybrid->csr reference

fb411a6

Update reference/matrix/hybrid_kernels.cpp

5f35fc3

Co-Authored-By: Thomas Grützmacher <thomas.gruetzmacher@kit.edu>

Update reference/matrix/csr_kernels.cpp

b544a8c

Co-Authored-By: Thomas Grützmacher <thomas.gruetzmacher@kit.edu>

csr->hybrid preserve explict zeros

5fa432c

hybrid->csr without explict zero

1b69ea9

yhmtsai force-pushed the reference_csr_hybrid_converter branch from bd480e8 to 1b69ea9 Compare May 17, 2019 17:28

Fix rebase error

a99edf4

thoasm approved these changes May 20, 2019

tcojean added the 1:ST:do-not-merge label and removed the 1:ST:ready-for-review label May 20, 2019

yhmtsai and others added 3 commits May 20, 2019 21:19

Update reference/test/matrix/csr_kernels.cpp

8d472b2

Co-Authored-By: Thomas Grützmacher <thomas.gruetzmacher@kit.edu>

Fix style of test csr_kernel.cpp

ec01b75

fix style of test hybrid_kernel.cpp

3aef7cb

tcojean added the 1:ST:ready-for-review and 1:ST:ready-to-merge labels and removed the 1:ST:do-not-merge and 1:ST:ready-for-review labels May 21, 2019

tcojean merged commit c9be444 into ginkgo-project:develop May 21, 2019
