TridiagSolver (dist): STEP1 rank-independent sort of eigenvalues by column type for rank1 solver #967

albestro · 2023-09-04T10:09:37Z

This is a starting point towards the same GEMM cost reduction we achieved for the local variant of the algorithm. Unfortunately, it requires more work than the local one, so this PR is only the first step of the work to come.

- Rebase after merging "TridiagSolver (local): reduce GEMM step computational cost" #951 (to avoid conflicts)
- ~~Implement permuteJustLocal for GPU~~
- Factor out the work on Distribution helpers into a separate PR (@rasolca)
- Use snake_case

Note:
A stablePartition function has been dropped and a new one has been added, which makes the diff hard to read (sorry about that: after merge-squashing I have no way to isolate the change and make it easier to review). I also took the chance to update the test to cover the functionality of the new stablePartition implementation.

- Develop: TridiagSolver (dist): STEP1 rank-independent sort of eigenvalues by column type for rank1 solver (#967)
- Develop: TridiagSolver (dist): STEP2 Make rank1 work just on local non-deflated eigenvectors (#996)
- Develop: TridiagSolver (dist): STEP2b Make rank1 work just on local non-deflated eigenvectors (multi-threaded) (#997)
- Develop: TridiagSolver (dist): STEP3 reduce GEMM step computational cost (#998)

albestro · 2023-11-30T17:14:52Z

cscs-ci run

albestro · 2023-11-30T17:16:37Z

include/dlaf/eigensolver/tridiag_solver/merge.h

+    // TODO remove this branch. It exists just because GPU permuteJustLocal is not implemented yet
+    copy(idx_loc_begin, sz_loc_tiles, ws.e0, ws_hm.e0);
+    dlaf::permutations::internal::permuteJustLocal<Backend::MC, Device::CPU, T, Coord::Col>(
+        i_begin, i_end, ws_hm.i5, ws_hm.e0, ws_hm.e2);
+    copy(idx_loc_begin, sz_loc_tiles, ws_hm.e2, ws.e1);


About permuteJustLocal: currently only the MC backend is implemented. The GPU variant of the tridiag_solver uses it by copying the data back and forth.

Since we may drop it in favour of reusing the "classic" permute, as discussed with @rasolca, we will not spend time implementing tests for it in this PR.
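
For readers unfamiliar with the operation, here is a minimal CPU-only sketch of what a local column permutation does. It is not the DLA-Future implementation: `permute_cols` is a hypothetical helper, the matrix is a plain column-major `std::vector`, and gather semantics (`out(:, j) = in(:, perm[j])`) are an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of DLA-Future).
// `in` is a column-major m x n matrix, with n == perm.size().
// Assumed gather semantics: column j of the result is column perm[j] of `in`.
std::vector<double> permute_cols(const std::vector<double>& in, std::size_t m,
                                 const std::vector<std::size_t>& perm) {
  const std::size_t n = perm.size();
  assert(in.size() == m * n);

  std::vector<double> out(m * n);
  for (std::size_t j = 0; j < n; ++j) {
    assert(perm[j] < n);
    for (std::size_t i = 0; i < m; ++i)
      out[i + j * m] = in[i + perm[j] * m];
  }
  return out;
}
```

In the GPU branch quoted above, a step of this kind runs on host copies of the data, which is why the two `copy` calls surround the permutation.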

sort eigenvectors by column type (non-deflated{upper,dense,lower}, non-deflated)

albestro · 2023-12-04T17:22:20Z

include/dlaf/eigensolver/tridiag_solver/merge.h

@@ -432,54 +373,187 @@ auto stablePartitionIndexForDeflationArrays(const SizeType n, const ColType* typ
  return std::tuple(k, std::move(n_udl));
 }

+// This function returns number of global non-deflated eigenvectors, together with two permutations:
+// - @p index_sorted          (sort(non-deflated)|sort(deflated)) -> initial.
+// - @p index_sorted_coltype  (sort(upper)|sort(dense)|sort(lower)|sort(deflated)) -> initial


This describes how it is implemented, but the fact that U, D and L are sorted is actually an implementation detail (as noted in the code at lines 456-459), while X (deflated) should be sorted (IIRC; @rasolca, can you confirm?).

I would change the doc accordingly, i.e. drop the sort for UDL while keeping it for X (also in the parameter list below).

Note: the example I used below in the doc is not the most generic one, because it is given already sorted. Let me know if you want to have a more generic example.

No need for U, D and L to be sorted.
Up to you if you want to leave it in the doc.
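
To make the two permutations discussed in this thread concrete, here is a small standalone sketch. It is an illustration under stated assumptions, not the code in merge.h: `ColKind`, `PartitionResult` and `partition_by_coltype` are hypothetical names, and the input is a plain vector of eigenvalues rather than distributed matrices. Both index arrays map a sorted position to the initial index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical column kinds; the declaration order defines the group order.
enum class ColKind { Upper, Dense, Lower, Deflated };

struct PartitionResult {
  std::size_t k = 0;                              // number of non-deflated columns
  std::vector<std::size_t> index_sorted;          // (non-deflated | deflated) -> initial
  std::vector<std::size_t> index_sorted_coltype;  // (upper | dense | lower | deflated) -> initial
};

PartitionResult partition_by_coltype(const std::vector<double>& evals,
                                     const std::vector<ColKind>& types) {
  assert(evals.size() == types.size());

  // Indices ordered by ascending eigenvalue.
  std::vector<std::size_t> by_eval(evals.size());
  std::iota(by_eval.begin(), by_eval.end(), std::size_t{0});
  std::stable_sort(by_eval.begin(), by_eval.end(),
                   [&](std::size_t a, std::size_t b) { return evals[a] < evals[b]; });

  PartitionResult r;

  // Non-deflated first, deflated last; eigenvalue order is kept inside each group.
  r.index_sorted = by_eval;
  const auto first_deflated =
      std::stable_partition(r.index_sorted.begin(), r.index_sorted.end(),
                            [&](std::size_t i) { return types[i] != ColKind::Deflated; });
  r.k = static_cast<std::size_t>(first_deflated - r.index_sorted.begin());

  // Group by kind. Stability keeps every group sorted by eigenvalue, which is
  // required only for the deflated group and incidental for U, D and L.
  r.index_sorted_coltype = by_eval;
  std::stable_sort(r.index_sorted_coltype.begin(), r.index_sorted_coltype.end(),
                   [&](std::size_t a, std::size_t b) {
                     return static_cast<int>(types[a]) < static_cast<int>(types[b]);
                   });
  return r;
}
```

Because the sketch relies on stable algorithms, U, D and L come out sorted as a side effect, which mirrors the point made above: only the ordering of the deflated group is part of the contract.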

include/dlaf/permutations/general/impl.h

albestro · 2023-12-04T17:25:13Z

cscs-ci run

codecov-commenter · 2023-12-04T18:01:30Z

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (e311f6a) 94.04% compared to head (0d16047) 94.05%.

| Files | Patch % | Lines |
|---|---|---|
| include/dlaf/permutations/general/impl.h | 93.93% | 1 Missing and 1 partial ⚠️ |
| include/dlaf/eigensolver/tridiag_solver/merge.h | 98.30% | 1 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #967   +/-   ##
=======================================
  Coverage   94.04%   94.05%           
=======================================
  Files         145      145           
  Lines        9008     9068   +60     
  Branches     1156     1160    +4     
=======================================
+ Hits         8472     8529   +57     
- Misses        319      321    +2     
- Partials      217      218    +1

albestro · 2023-12-05T17:53:49Z

include/dlaf/eigensolver/tridiag_solver/merge.h

-                                      Matrix<const T, Device::CPU>& evals,
-                                      Matrix<const SizeType, Device::CPU>& in,
-                                      Matrix<SizeType, Device::CPU>& out) {
+SizeType stablePartitionIndexForDeflationArrays(const matrix::Distribution& dist_sub, const SizeType n,


note-to-self: the n parameter can probably be removed, since it is implicitly defined by dist_sub.

albestro added Priority:Medium Type:Optimization labels Sep 4, 2023

albestro added this to the Optimizations milestone Sep 4, 2023

albestro self-assigned this Sep 4, 2023

albestro mentioned this pull request Sep 4, 2023

Permute (locally) to get the optimal matrix shape #914

Closed

2 tasks

albestro force-pushed the alby/trisolver-gemm-sub branch from b9fddda to 36527f5 Compare September 4, 2023 14:03

albestro force-pushed the alby/trisolver-sort-for-gemm-dist branch from 02bd56a to 31503bb Compare September 4, 2023 14:06

albestro force-pushed the alby/trisolver-gemm-sub branch from 36527f5 to 0a9482d Compare September 5, 2023 09:16

albestro force-pushed the alby/trisolver-sort-for-gemm-dist branch 2 times, most recently from 3095aa4 to bd90d4b Compare September 6, 2023 12:40

albestro force-pushed the alby/trisolver-gemm-sub branch from 0a9482d to 86cc933 Compare September 6, 2023 12:41

albestro changed the title ~~TridiagSolver (dist): rank-independent sort of eigenvalues by column type for rank1 solver~~ TridiagSolver (dist) STEP-1: rank-independent sort of eigenvalues by column type for rank1 solver Sep 29, 2023

albestro changed the title ~~TridiagSolver (dist) STEP-1: rank-independent sort of eigenvalues by column type for rank1 solver~~ TridiagSolver (dist): STEP1 rank-independent sort of eigenvalues by column type for rank1 solver Sep 29, 2023

albestro force-pushed the alby/trisolver-sort-for-gemm-dist branch from b1d0c7e to e80f01a Compare September 29, 2023 10:13

albestro force-pushed the alby/trisolver-gemm-sub branch 2 times, most recently from f9b5825 to 7c45e8f Compare November 27, 2023 16:36

albestro added the Priority:on hold label Nov 28, 2023

albestro force-pushed the alby/trisolver-gemm-sub branch from d7b2b41 to d26d9f4 Compare November 30, 2023 08:21

Base automatically changed from alby/trisolver-gemm-sub to master November 30, 2023 10:24

albestro force-pushed the alby/trisolver-sort-for-gemm-dist branch from e80f01a to 65349b5 Compare November 30, 2023 16:29

albestro commented Nov 30, 2023

albestro mentioned this pull request Nov 30, 2023

Tridiagonal Solver (dist): Migrate permutation of local eigenvectors to GPU #1058

Closed

merge-squashed: propedeuthic work towards gemm cost reduction

51109c9

sort eigenvectors by column type (non-deflated{upper,dense,lower}, non-deflated)

albestro force-pushed the alby/trisolver-sort-for-gemm-dist branch from 61b3265 to 51109c9 Compare December 4, 2023 15:17

albestro added 4 commits December 4, 2023 17:54

snake_case and minor change to cleanup

865fd74

fix doc

2c48ccb

remove superfluous doc

28d260a

doc minor fixes

0d16047

albestro commented Dec 4, 2023

include/dlaf/permutations/general/impl.h

albestro requested review from rasolca and msimberg December 4, 2023 17:24

albestro marked this pull request as ready for review December 4, 2023 17:24

albestro added Priority:High and removed Priority:Medium Priority:on hold labels Dec 4, 2023

msimberg approved these changes Dec 5, 2023

albestro commented Dec 5, 2023

rasolca approved these changes Dec 11, 2023

rasolca merged commit b99bb16 into master Dec 11, 2023
4 checks passed

rasolca deleted the alby/trisolver-sort-for-gemm-dist branch December 11, 2023 10:15

github-actions bot pushed a commit that referenced this pull request Dec 11, 2023

Doc: TridiagSolver (dist): STEP1 rank-independent sort of eigenvalues…

cfee939

… by column type for rank1 solver (#967)

albestro mentioned this pull request Apr 9, 2024

TriSolver (dist): move sorting permutation from CPU to GPU #1118

Merged

6 tasks
