Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Threading for periodic FDSBP flux differencing #1029

Merged
merged 17 commits into from
Jan 13, 2022

Conversation

jlchan
Copy link
Contributor

@jlchan jlchan commented Dec 30, 2021

Trixi parallelizes over elements, but periodic FDSBP solvers are "single element" and can't take advantage of threading. This implements a naive threading (without exploiting symmetry or skew-symmetry to reduce total flux evaluations). which gives about 4x speedup on my 8-core machine.

On main

rhs!                      435    21.5s  99.1%  49.5ms    971KiB  4.52%  2.23KiB
   volume integral         435    21.0s  96.7%  48.3ms    136KiB  0.63%     319B

On this PR

 rhs!                      435    5.55s  92.1%  12.8ms   1.08MiB  3.56%  2.54KiB
   volume integral         435    5.07s  84.1%  11.6ms    265KiB  0.85%     624B

@codecov
Copy link

codecov bot commented Dec 30, 2021

Codecov Report

Merging #1029 (77d7055) into main (5986c2b) will decrease coverage by 0.08%.
The diff coverage is 48.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1029      +/-   ##
==========================================
- Coverage   93.99%   93.91%   -0.08%     
==========================================
  Files         291      291              
  Lines       22518    22550      +32     
==========================================
+ Hits        21164    21177      +13     
- Misses       1354     1373      +19     
Flag Coverage Δ
unittests 93.91% <48.65%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/solvers/dgmulti/sbp.jl 88.10% <36.67%> (-8.57%) ⬇️
src/solvers/dgmulti/dg.jl 96.15% <100.00%> (+0.09%) ⬆️
src/solvers/dgmulti/flux_differencing.jl 84.30% <100.00%> (-0.13%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5986c2b...77d7055. Read the comment docs.

@jlchan
Copy link
Contributor Author

jlchan commented Jan 11, 2022

The most recent changes invoke existing calc_volume_integral! routines for SBP flux differencing if Threads.nthreads() < 2. This is motivated by the fact that the threading implementation does not exploit symmetry, so it loses roughly a factor of 2 on a single thread compared with previous optimized implementations.

On one thread N=32

 rhs!                      246    1.32s  97.8%  5.37ms    402KiB  13.4%  1.63KiB
   volume integral         246    1.29s  95.7%  5.26ms   34.6KiB  1.15%     144B

On 8 threads with N=32

 rhs!                      246    370ms  94.1%  1.51ms    629KiB  19.5%  2.56KiB
   volume integral         246    358ms  91.0%  1.45ms    150KiB  4.64%     624B

This is about a 4x speedup, which we'd expect (8x speedup from threading but 1/2 efficiency due to not exploiting symmetry).

@jlchan jlchan marked this pull request as ready for review January 11, 2022 04:11
@jlchan jlchan changed the title WIP: threading for periodic FDSBP flux differencing Threading for periodic FDSBP flux differencing Jan 11, 2022
@jlchan jlchan requested review from sloede and ranocha January 11, 2022 04:11
Copy link
Member

@ranocha ranocha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! I just have some minor comments.

examples/dgmulti_3d/elixir_euler_tgv_fdsbp_periodic.jl Outdated Show resolved Hide resolved
src/solvers/dgmulti/sbp.jl Outdated Show resolved Hide resolved
@ranocha ranocha linked an issue Jan 11, 2022 that may be closed by this pull request
@jlchan jlchan requested a review from ranocha January 11, 2022 14:40
Copy link
Member

@sloede sloede left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor comment from my side (which can be resolved or just left as it is). Otherwise it LGTM

src/solvers/dgmulti/sbp.jl Show resolved Hide resolved
ranocha
ranocha previously approved these changes Jan 11, 2022
@ranocha ranocha enabled auto-merge (squash) January 11, 2022 15:33
@jlchan
Copy link
Contributor Author

jlchan commented Jan 11, 2022

Hopefully tests should pass this time

@jlchan jlchan closed this Jan 11, 2022
auto-merge was automatically disabled January 11, 2022 20:45

Pull request was closed

@jlchan jlchan reopened this Jan 11, 2022
@jlchan
Copy link
Contributor Author

jlchan commented Jan 12, 2022

@sloede @ranocha the coverage tests are failing, and I think it's because a branch with Threads.nthreads() > 1 isn't getting picked up. I added a periodic FDSBP test in test_threaded.jl though, so I'm not sure why it's not detecting coverage. Any ideas?

@ranocha
Copy link
Member

ranocha commented Jan 12, 2022

I guess it's just running with two threads and you check for more than two threads?

We do not collect coverage info for threaded tests:

run(`$(Base.julia_cmd()) --threads=$TRIXI_NTHREADS --check-bounds=yes --code-coverage=none $(abspath("test_threaded.jl"))`)

Collecting coverage info with multiple threads is awfully slow, see JuliaLang/julia#36142...

From my point of view, it's okay to merge this as it is, even with reduced coverage info (since we know that we test the new code paths, just without coverage due to the issue mentioned above). What do you think, @sloede?

@jlchan
Copy link
Contributor Author

jlchan commented Jan 12, 2022

@sloede are you OK if I merge this manually? The coverage drop is due to threading tests not being run w/coverage.

@ranocha ranocha merged commit a58e1a6 into trixi-framework:main Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Threading does nothing for periodic FDSBP solvers
3 participants