TridiagSolver (dist): STEP2b Make rank1 work just on local non-deflated eigenvectors (multi-threaded) #997

Merged
merged 3 commits into from
Dec 12, 2023

Conversation

albestro
Collaborator

@albestro albestro commented Sep 29, 2023

This PR makes the changes from #996 multithreaded.

Main changes:

  • some sections are split into batches that are assigned to multiple threads
  • the number of working threads is a function of how much work the rank1 solver has to do (the size of the problem, with each thread getting at least 2 tiles)
  • some loops have been refactored to reduce indexing cost
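
As an illustration of the first point, here is a minimal sketch of how a range of work items could be split into contiguous batches, one per thread. This is not the code from the PR: `batch_range` and its parameters are hypothetical names, and the even, contiguous split is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of the PR): returns the half-open range
// [begin, end) of work items assigned to thread `tid` when `n` items are
// split into contiguous batches over `nthreads` threads.
inline std::pair<std::size_t, std::size_t> batch_range(std::size_t n,
                                                       std::size_t nthreads,
                                                       std::size_t tid) {
  assert(nthreads > 0 && tid < nthreads);
  // Batch size is ceil(n / nthreads), so the last batches may be shorter or empty.
  const std::size_t batch = (n + nthreads - 1) / nthreads;
  const std::size_t begin = std::min(tid * batch, n);
  const std::size_t end = std::min(begin + batch, n);
  return {begin, end};
}
```

With this split, a thread whose range is empty (`begin == end`) simply has nothing to do, which is one reason to keep the number of threads proportional to the amount of work.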

Suggestion for reviewers: turn on the "hide whitespace" feature, since the additional nesting introduced for the dynamic nthreads changes the indentation of large parts of the code.

TODO:

@albestro albestro added this to the Optimizations milestone Sep 29, 2023
@albestro albestro self-assigned this Sep 29, 2023
@albestro albestro changed the title TridiagSolver (dist): STEP2 Make rank1 work just on local non-deflated eigenvectors (multi-threaded) TridiagSolver (dist): STEP2b Make rank1 work just on local non-deflated eigenvectors (multi-threaded) Sep 29, 2023
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2-multi branch from a36baa8 to 05d4760 Compare September 29, 2023 10:14
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2 branch from c22b56f to 42d6681 Compare September 29, 2023 10:16
albestro added a commit that referenced this pull request Oct 16, 2023
Develop: TridiagSolver (dist): STEP1 rank-independent sort of eigenvalues by column type for rank1 solver (#967)
Develop: TridiagSolver (dist): STEP2 Make rank1 work just on local non-deflated eigenvectors (#996)
Develop: TridiagSolver (dist): STEP2b Make rank1 work just on local non-deflated eigenvectors (multi-threaded) (#997)
Develop: TridiagSolver (dist): STEP3 reduce GEMM step computational cost (#998)
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2 branch from 42d6681 to 477b2e2 Compare December 4, 2023 15:17
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2-multi branch from 05d4760 to fffad13 Compare December 4, 2023 15:17
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2 branch from 477b2e2 to 0d17094 Compare December 11, 2023 10:44
@albestro albestro force-pushed the alby/trisolver-dist-opt-step2-multi branch from fffad13 to a538c2e Compare December 12, 2023 08:16
Comment on lines +1279 to +1288
const std::size_t nthreads = [dist_sub, k_lc] {
  const std::size_t workload = to_sizet(dist_sub.localSize().rows() * k_lc);
  const std::size_t workload_unit = 2 * to_sizet(dist_sub.tile_size().linear_size());

  const std::size_t min_workers = 1;
  const std::size_t available_workers = getTridiagRank1NWorkers();

  const std::size_t ideal_workers = util::ceilDiv(to_sizet(workload), workload_unit);
  return std::clamp(ideal_workers, min_workers, available_workers);
}();
Collaborator Author

Just pointing out the logic, which might deserve a comment/note.

The "ideal" number of threads is computed dynamically as a function of how much work the rank1 solver has to do. The work is measured by the local problem size, and each thread should have at least 2 tiles to work on.

This ideal number is then clamped to the range defined by the configuration parameters (at least 1, at most the configured number of workers) to get the number of threads actually used.
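
To make the arithmetic concrete, here is a stand-alone sketch of the same heuristic. It is not the code in `merge.h`: `ceil_div` and `rank1_worker_count` are hypothetical names, the DLA-Future types are replaced by plain integers, and `k_local` is assumed to be the number of local non-deflated eigenvectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: ceiling division for unsigned integers.
constexpr std::size_t ceil_div(std::size_t num, std::size_t den) {
  return (num + den - 1) / den;
}

// Hypothetical stand-alone version of the thread-count heuristic.
//   local_rows  : rows of the local part of the sub-problem
//   k_local     : assumed to be the number of local non-deflated eigenvectors
//   tile_elems  : number of elements in one tile
//   max_workers : upper bound coming from configuration (must be >= 1)
inline std::size_t rank1_worker_count(std::size_t local_rows, std::size_t k_local,
                                      std::size_t tile_elems, std::size_t max_workers) {
  assert(tile_elems > 0 && max_workers >= 1);
  const std::size_t workload = local_rows * k_local;
  const std::size_t workload_unit = 2 * tile_elems;  // at least 2 tiles per thread
  const std::size_t ideal_workers = ceil_div(workload, workload_unit);
  // Clamp to [1, max_workers]; std::clamp requires C++17.
  return std::clamp(ideal_workers, std::size_t{1}, max_workers);
}
```

For example, with 1024 local rows, 512 local eigenvectors, 256x256 tiles and at most 8 workers, the workload is 524288 elements and the unit is 131072 elements, so 4 threads would be used. An empty local problem still gets 1 thread, and a large problem is capped at the configured maximum.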

@albestro albestro marked this pull request as ready for review December 12, 2023 09:21
Base automatically changed from alby/trisolver-dist-opt-step2 to master December 12, 2023 09:45
@rasolca
Collaborator

rasolca commented Dec 12, 2023

cscs-ci run

@codecov-commenter

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (e312c25) 93.98% compared to head (ec9ebf6) 94.04%.

| Files | Patch % | Lines |
|---|---|---|
| include/dlaf/eigensolver/tridiag_solver/merge.h | 94.50% | 3 Missing and 7 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #997      +/-   ##
==========================================
+ Coverage   93.98%   94.04%   +0.05%     
==========================================
  Files         145      145              
  Lines        9034     9050      +16     
  Branches     1159     1157       -2     
==========================================
+ Hits         8491     8511      +20     
  Misses        321      321              
+ Partials      222      218       -4     

@rasolca rasolca merged commit 2425d86 into master Dec 12, 2023
4 checks passed
@rasolca rasolca deleted the alby/trisolver-dist-opt-step2-multi branch December 12, 2023 12:42
github-actions bot pushed a commit that referenced this pull request Dec 12, 2023