perf: run map_variations in reconsensus concurrently #107

Merged: 2 commits, Jan 17, 2025

@ivan-aksamentov ivan-aksamentov commented Jan 15, 2025

This uses rayon's parallel iterator to run `map_variations()` (basically Nextclade) for individual sequences concurrently in the `reconsensus()` step.

We already run it concurrently in the `solve_promise()` step, [for individual blocks](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/graph_merging.rs#L145-L150) and then concurrently [for alignment within each block](https://github.com/neherlab/pangraph/blob/2d2e7b046cbbbc24d1c364f5a0afe2176b662a99/packages/pangraph/src/pangraph/reweave.rs#L38-L49), so I thought we could repeat this success in the `reconsensus()` step as well.

This change results in a 64.4% speedup in my measurements on E. coli.

Command:

```bash
/usr/bin/time -qf 'Cmd : %C\nTime: %E\nMem : %M KB' pangraph build -b 20 -l 500 --circular data/ecoli.fa.gz -o tmp/ecoli.fa.gz.json -v
```

| Branch                                  | Time    |
|-----------------------------------------|---------|
| rust (commit 2d2e7b0)                   | 5m 51s  |
| perf/parallel-nextclade-reconsensus     | 3m 33s  |

Things to watch out for:
 * increased memory usage due to increased concurrency
 * potential reordering of the results

Both are also true for all our existing parallel loops.

Related: #108

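
For readers unfamiliar with the pattern, here is a minimal sketch of what "run it per sequence, concurrently" looks like. It is not the pangraph code: `map_variations_stub` and `reconsensus_parallel` are hypothetical names, the stub only counts mismatches, and scoped threads from the standard library stand in for rayon so that the example needs no dependencies. With rayon, the same shape would typically be written as `seqs.par_iter().map(...).collect()` on a thread pool, rather than one thread per item as here.

```rust
use std::collections::BTreeMap;
use std::thread;

/// Hypothetical stand-in for `map_variations()`: counts positions where
/// a sequence differs from the consensus (the real function does much more).
fn map_variations_stub(consensus: &str, seq: &str) -> usize {
    consensus.chars().zip(seq.chars()).filter(|(a, b)| a != b).count()
}

/// Runs the stub for every sequence on its own scoped thread and collects
/// the results keyed by sequence id. One thread per item is only reasonable
/// for a toy input; a thread pool (as rayon provides) avoids that cost.
fn reconsensus_parallel(consensus: &str, seqs: &[(u32, String)]) -> BTreeMap<u32, usize> {
    thread::scope(|s| {
        // Spawn all workers first so that they actually run concurrently.
        let handles: Vec<_> = seqs
            .iter()
            .map(|(id, seq)| s.spawn(move || (*id, map_variations_stub(consensus, seq))))
            .collect();

        // Join the workers; keying by id makes completion order irrelevant.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect::<BTreeMap<_, _>>()
    })
}
```

Each worker holds its own intermediate data while it runs, which is where the extra memory use mentioned above comes from.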
ivan-aksamentov added a commit that referenced this pull request Jan 15, 2025
Similar to #107 but less successful.

Here I tried to introduce more parallelism in random places where I found `.iter()` or a plain loop and where rayon's `.par_iter()` could be used (technically; scientific correctness is still to be verified).

In my measurements this brings no speedup at all compared to the base branch.

I'll just leave this as an idea for future optimization. Perhaps there's a smarter way to find places where we can squeeze out some more parallelism, or we could restructure the algorithm such that new parallelization opportunities appear.

For now, I think this is low priority, so let's focus on other things.

@mmolari mmolari left a comment

I think it's a very good change! The fact that the order is not guaranteed should not be a problem, since the results should be key-value pairs for a dictionary.
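
A small sketch of that argument, assuming each result carries a unique key (the function names and the `(u32, String)` pair type are illustrative, not taken from pangraph): collecting the same pairs into a map gives an equal map whatever order they arrive in.

```rust
use std::collections::HashMap;

/// Builds a keyed map from (id, result) pairs, whatever order they arrive in.
fn collect_by_key(results: Vec<(u32, String)>) -> HashMap<u32, String> {
    results.into_iter().collect()
}

/// Compares the map built from results in input order with the map built
/// from the same results in a different (e.g. completion) order.
fn order_does_not_matter() -> bool {
    let in_order = vec![
        (1, "A".to_string()),
        (2, "B".to_string()),
        (3, "C".to_string()),
    ];
    let shuffled = vec![
        (3, "C".to_string()),
        (1, "A".to_string()),
        (2, "B".to_string()),
    ];
    collect_by_key(in_order) == collect_by_key(shuffled)
}
```

This only holds while keys are unique: with duplicate keys the last pair inserted wins, so the result would depend on arrival order again.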

@ivan-aksamentov ivan-aksamentov merged commit 7765f0d into rust Jan 17, 2025
9 checks passed
@ivan-aksamentov ivan-aksamentov deleted the perf/parallel-nextclade-reconsensus branch January 17, 2025 18:24