perf: run map_variations in reconsensus concurrently #107

ivan-aksamentov · 2025-01-15T05:16:55Z

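This uses rayon's parallel iterator to run `map_variations()` (basically Nextclade) for individual sequences concurrently in the `reconsensus()` step.

We already run it concurrently in the `solve_promise()` step, [for individual blocks](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/graph_merging.rs#L145-L150) and then concurrently [for alignment within each block](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/reweave.rs#L38-L49), so I thought we could repeat this success in the `reconsensus()` step as well.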
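To illustrate the shape of the change, here is a minimal sketch, not the actual pangraph code. The `map_variations` below is a hypothetical stand-in that only counts mismatches, and the function name `map_all_concurrently` is made up for this example. To stay dependency-free it uses `std::thread::scope` instead of rayon; the rayon form is shown in a comment. Unlike rayon's work-stealing pool, this sketch spawns one OS thread per sequence, which is fine for a demo but not for large inputs.

```rust
use std::thread;

// Hypothetical stand-in for pangraph's `map_variations()`: it only counts
// mismatching positions between a sequence and the consensus.
fn map_variations(consensus: &str, seq: &str) -> usize {
    consensus
        .chars()
        .zip(seq.chars())
        .filter(|(a, b)| a != b)
        .count()
}

// Runs `map_variations` for every sequence on its own scoped thread.
// With rayon the body would be roughly:
//   seqs.par_iter().map(|s| map_variations(consensus, s)).collect()
fn map_all_concurrently(consensus: &str, seqs: &[&str]) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = seqs
            .iter()
            .map(|seq| scope.spawn(move || map_variations(consensus, seq)))
            .collect();
        // Join in input order, so the output order matches the input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let consensus = "ACGTACGT";
    let seqs = ["ACGTACGT", "ACGAACGT", "TCGTACGA"];
    let counts = map_all_concurrently(consensus, &seqs);
    println!("{counts:?}"); // prints [0, 1, 2]
}
```

Each call only reads shared data (the consensus) and produces an independent result, which is what makes this kind of loop a candidate for parallelization.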

This change results in a 64.4% speedup in my measurements on the E. coli dataset (`data/ecoli.fa.gz`).

Command:

/usr/bin/time -qf 'Cmd : %C\nTime: %E\nMem : %M KB' pangraph build -b 20 -l 500 --circular data/ecoli.fa.gz -o tmp/ecoli.fa.gz.json -v

| Branch                                  | Time    |
|-----------------------------------------|---------|
| rust (commit `2d2e7b0`)                 | 5m 51s  |
| perf/parallel-nextclade-reconsensus     | 3m 33s  |

Things to watch out for:

* increased memory usage due to increased concurrency
* potential reordering of the results (see "Ordering question", rayon-rs/rayon#551)

Both are also true for all our existing parallel loops.

Related: #108


Similar to #107 but less successful. Here I tried to introduce more parallelism in random places where I found `.iter()` or a plain loop and where rayon's `.par_iter()` could be used (technically; scientific correctness is yet to be verified). In my measurements this brings no speedup at all compared to the base branch. I'll just leave this as an idea for future optimization. Perhaps there's a smarter way to find places where we can squeeze out some more parallelism, or we could restructure the algorithm so that new parallelization opportunities appear. For now, this is low priority I think, so let's focus on other things.

mmolari

I think it's a very good change! And the fact that the order is not guaranteed should not be a problem, since the results should be key-value pairs for a dictionary.
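A small sketch of why arrival order stops mattering once results are keyed. Everything here is hypothetical: the `u32` sequence ids, `sequence_length` as the per-sequence work, and `collect_by_key` are invented for illustration and are not pangraph's types. Worker threads send `(key, value)` pairs over a channel in whatever order they finish, and collecting into a map yields the same map either way.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical per-sequence work; stands in for the real mapping step.
fn sequence_length(seq: &str) -> usize {
    seq.len()
}

// Computes one value per keyed sequence on separate threads and gathers
// the (key, value) pairs into a map. Pairs may arrive in any order.
fn collect_by_key(seqs: &[(u32, &str)]) -> BTreeMap<u32, usize> {
    let (tx, rx) = mpsc::channel();
    thread::scope(|scope| {
        for &(id, seq) in seqs {
            let tx = tx.clone();
            scope.spawn(move || {
                tx.send((id, sequence_length(seq)))
                    .expect("receiver should still be alive");
            });
        }
    });
    // All workers have finished; drop the last sender so the receiver
    // iterator terminates after draining the buffered pairs.
    drop(tx);
    rx.into_iter().collect()
}

fn main() {
    let input = [(3, "ACGT"), (1, "AC"), (2, "ACGTAC")];
    let by_id = collect_by_key(&input);

    // The same map built sequentially, for comparison.
    let expected: BTreeMap<u32, usize> =
        input.iter().map(|&(id, s)| (id, s.len())).collect();
    assert_eq!(by_id, expected);
    println!("{by_id:?}");
}
```

If a later step does depend on positional order (for example, zipping results with another list), the keyed form would not be enough and the order would need to be restored explicitly.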


ivan-aksamentov requested a review from mmolari January 15, 2025 05:17

ivan-aksamentov mentioned this pull request Jan 15, 2025

perf: more par_iter #108

Closed

mmolari approved these changes Jan 17, 2025


Merge remote-tracking branch 'origin/rust' into perf/parallel-nextcla…

06750c5

…de-reconsensus

ivan-aksamentov merged commit 7765f0d into rust Jan 17, 2025
9 checks passed

ivan-aksamentov deleted the perf/parallel-nextclade-reconsensus branch January 17, 2025 18:24
