v0.2.1 #42

Merged 9 commits on Jan 2, 2024
6 changes: 3 additions & 3 deletions .github/workflows/nightly.yaml
Expand Up @@ -39,7 +39,7 @@ jobs:
binary: ${{ matrix.binary }}

# ---------------------------------------------------------------------------
latest:
nightly:

needs: compile

Expand Down Expand Up @@ -88,7 +88,7 @@ jobs:
uses: actions/upload-artifact@v3
if: always()
with:
name: validate-${{ matrix.arch }}
path: output/sars-cov-2/nightly/validate
name: validate-linelist_${{ matrix.arch }}
path: output/sars-cov-2/nightly/validate/linelist.tsv
if-no-files-found: error
retention-days: 7
6 changes: 5 additions & 1 deletion .github/workflows/test.yaml
Expand Up @@ -151,6 +151,10 @@ jobs:
if: always()
with:
name: output-${{ matrix.arch }}
path: output
path: |
output/alignment
output/toy1
output/populations/linelist.tsv

if-no-files-found: error
retention-days: 7
5 changes: 2 additions & 3 deletions README.md
Expand Up @@ -30,11 +30,10 @@

## Install

`rebar` is a standalone binary file that you can simply download and run:
`rebar` is a standalone binary file. We recommend installing it with [conda](https://anaconda.org/bioconda/rebar) or by [direct download](https://github.com/phac-nml/rebar/releases/latest/download/rebar-x86_64-unknown-linux-musl).

```bash
wget -O rebar https://github.com/phac-nml/rebar/releases/latest/download/rebar-x86_64-unknown-linux-musl
./rebar --help
conda install -c bioconda rebar
```

- Please see the [install](docs/install.md) docs for Windows, macOS, Docker, Singularity, and Conda.
Expand Down
2 changes: 0 additions & 2 deletions docs/install.md
Expand Up @@ -25,8 +25,6 @@ wget "https://github.com/phac-nml/rebar/releases/latest/download/rebar-x86_64-pc

## Conda

> **Coming Soon!** The conda install option will be available when this page is live: https://anaconda.org/bioconda/rebar

```bash
conda create -c bioconda -n rebar rebar
conda activate rebar
Expand Down
17 changes: 17 additions & 0 deletions docs/notes/Notes_v0.2.1.md
@@ -0,0 +1,17 @@
# v0.2.1

This patch release documents the new conda installation method, resolves a plotting artifact and improves performance of the `phylogeny` function `get_common_ancestor`.

## New

- Issue #10, PR #34 | Add documentation for Conda installation.

## Fixes

- Issue #27, PR #37 | Fix legend overflow in plot.

## Changes

- Issue #23, PR #39 | Change `parsimony::from_sequence` param `coordinates` from Vector to Slice.
- Issue #28, PR #41 | Improve performance of `phylogeny::get_common_ancestor`.
- Issue #29, PR #38 | Reduce large artifact uploads in CI.
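
To illustrate the `coordinates` change from Vector to Slice listed above, here is a minimal sketch of what a slice parameter allows on the caller side. The function `count_coordinates` and `demo` are hypothetical stand-ins and not part of `rebar`; only the parameter type `Option<&[usize]>` comes from this release.

```rust
/// Hypothetical stand-in for a function that takes optional coordinates
/// as a slice, mirroring the new `Option<&[usize]>` parameter type.
fn count_coordinates(coordinates: Option<&[usize]>) -> usize {
    // `None` is treated as "no coordinates supplied" in this sketch.
    coordinates.map_or(0, |c| c.len())
}

/// Shows the kinds of callers a slice parameter accepts.
fn demo() -> (usize, usize, usize, usize) {
    let owned: Vec<usize> = vec![10, 20, 30];
    let optional: Option<Vec<usize>> = Some(vec![1, 2]);
    let array = [5usize, 6, 7, 8];

    (
        // Borrow a whole Vec as a slice.
        count_coordinates(Some(owned.as_slice())),
        // Convert `Option<Vec<usize>>` into `Option<&[usize]>` without cloning.
        count_coordinates(optional.as_deref()),
        // Pass a sub-range of a fixed-size array; no Vec allocation needed.
        count_coordinates(Some(&array[1..3])),
        // No coordinates at all.
        count_coordinates(None),
    )
}
```

With `Option<&Vec<usize>>`, the array sub-range case would first have to be copied into a new `Vec`; a slice parameter avoids that.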
2 changes: 1 addition & 1 deletion src/dataset/mod.rs
Expand Up @@ -141,7 +141,7 @@ impl Dataset {
&self,
sequence: &Sequence,
populations: Option<&Vec<&String>>,
coordinates: Option<&Vec<usize>>,
coordinates: Option<&[usize]>,
) -> Result<SearchResult, Report> {
// initialize an empty result, this will be the final product of this function
let mut result = SearchResult::new(sequence);
Expand Down
104 changes: 50 additions & 54 deletions src/phylogeny/mod.rs
@@ -1,5 +1,5 @@
use crate::utils;
use color_eyre::eyre::{eyre, Report, Result, WrapErr};
use color_eyre::eyre::{eyre, ContextCompat, Report, Result, WrapErr};
use color_eyre::Help;
use itertools::Itertools;
use log::debug;
Expand All @@ -9,7 +9,6 @@ use petgraph::visit::{Dfs, IntoNodeReferences};
use petgraph::Direction;
use serde::{Deserialize, Serialize};
use serde_json;
use std::collections::HashMap;
use std::fs::File;
use std::io::Write;
use std::path::Path;
Expand Down Expand Up @@ -369,65 +368,62 @@ impl Phylogeny {
}

/// Identify the most recent common ancestor shared between all node names.
pub fn get_common_ancestor(&self, names: &Vec<String>) -> Result<String, Report> {
pub fn get_common_ancestor(&self, names: &[String]) -> Result<String, Report> {
// if only one node name was provided, just return it
if names.len() == 1 {
let common_ancestor = names[0].clone();
return Ok(common_ancestor);
}

// Phase 1: Count up the ancestors shared between all named populations
let mut ancestor_counts: HashMap<String, Vec<String>> = HashMap::new();
let mut ancestor_depths: HashMap<String, isize> = HashMap::new();

for name in names {
// directly use the get_paths method over get_ancestors, because
// get_ancestors removes the self node name from the list,
// but some datasets have named internal nodes, so a listed
// node could be a common ancestor!
let ancestor_paths = self.get_paths("root", name, petgraph::Outgoing)?;

for ancestor_path in ancestor_paths {
for (depth, ancestor) in ancestor_path.iter().enumerate() {
let depth = depth as isize;
// add ancestor if first time encountered
ancestor_depths.entry(ancestor.clone()).or_insert(depth);

// recombinants can appear multiple times in ancestors, update
// depth map to use deepest one
if depth > ancestor_depths[ancestor] {
ancestor_depths.insert(ancestor.clone(), depth);
}
ancestor_counts
.entry(ancestor.clone())
.and_modify(|p| {
p.push(name.clone());
p.dedup();
})
.or_insert(vec![name.clone()]);
}
}
}
// mass pile of all ancestors of all named nodes
let ancestors: Vec<_> = names
.iter()
.map(|pop| {
let paths = self.get_paths(pop, "root", Direction::Incoming)?;
let ancestors = paths.into_iter().flatten().unique().collect_vec();
debug!("{pop}: {ancestors:?}");
Ok(ancestors)
})
.collect::<Result<Vec<_>, Report>>()?
.into_iter()
.flatten()
.collect::<Vec<_>>();

// get ancestors shared by all sequences
let common_ancestors: Vec<_> = ancestors
.iter()
.unique()
.filter(|anc| {
let count = ancestors.iter().filter(|pop| pop == anc).count();
count == names.len()
})
.collect();

debug!("common_ancestors: {common_ancestors:?}");

// get the depths (distance to root) of the common ancestors
let depths = common_ancestors
.into_iter()
.map(|pop| {
let paths = self.get_paths(pop, "root", Direction::Incoming)?;
let longest_path = paths
.into_iter()
.max_by(|a, b| a.len().cmp(&b.len()))
.unwrap_or_default();
let depth = longest_path.len();
debug!("{pop}: {depth}");
Ok((pop, depth))
})
.collect::<Result<Vec<_>, Report>>()?;

// Phase 2: Find the highest depth ancestor shared between all
let mut common_ancestor = "root".to_string();
let mut max_depth = 0;

for (ancestor, populations) in ancestor_counts {
// Which ancestors were found in all populations?
if populations.len() == names.len() {
// Which ancestor has the max depth?

let depth = ancestor_depths
.get(&ancestor)
.cloned()
.expect("Ancestor {ancestor} was not found in ancestor depths.");
if depth > max_depth {
max_depth = depth;
common_ancestor = ancestor;
}
}
}
// get the deepest (ie. most recent common ancestor)
let deepest_ancestor = depths
.into_iter()
.max_by(|a, b| a.1.cmp(&b.1))
.context("Failed to get common ancestor.")?;

// tuple (population name, depth)
let common_ancestor = deepest_ancestor.0.to_string();

Ok(common_ancestor)
}
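
The approach of the rewritten `get_common_ancestor` (collect each name's ancestors, keep those shared by all names, then pick the deepest) can be sketched in standalone form. This is an illustrative sketch only, not the code in this PR: it replaces the `petgraph` graph with a plain child-to-parents map, and `Parents`, `ancestors_inclusive`, `depth`, `common_ancestor`, and `example` are all hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical minimal DAG: each node maps to its parents.
/// Recombinants have more than one parent; the root has no entry.
type Parents = HashMap<String, Vec<String>>;

/// Collect a node together with all of its ancestors. The node itself is
/// included because a named internal node can be the common ancestor.
fn ancestors_inclusive(parents: &Parents, name: &str) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut stack = vec![name.to_string()];
    while let Some(node) = stack.pop() {
        if seen.insert(node.clone()) {
            if let Some(ps) = parents.get(&node) {
                stack.extend(ps.iter().cloned());
            }
        }
    }
    seen
}

/// Number of nodes on the longest path from `name` up to the root.
fn depth(parents: &Parents, name: &str) -> usize {
    match parents.get(name) {
        Some(ps) if !ps.is_empty() => {
            1 + ps.iter().map(|p| depth(parents, p)).max().unwrap_or(0)
        }
        _ => 1,
    }
}

/// Deepest node shared by the ancestor sets of all names, if any.
fn common_ancestor(parents: &Parents, names: &[String]) -> Option<String> {
    let mut sets = names.iter().map(|n| ancestors_inclusive(parents, n));
    let first = sets.next()?;
    let shared = sets.fold(first, |acc, s| acc.intersection(&s).cloned().collect());
    shared.into_iter().max_by_key(|n| depth(parents, n))
}

/// Small example graph: root -> A -> {B, C}, and X is a recombinant of B and C.
fn example() -> Parents {
    let mut p = Parents::new();
    p.insert("A".to_string(), vec!["root".to_string()]);
    p.insert("B".to_string(), vec!["A".to_string()]);
    p.insert("C".to_string(), vec!["A".to_string()]);
    p.insert("X".to_string(), vec!["B".to_string(), "C".to_string()]);
    p
}
```

In the example graph, `B` and `C` share `A`, while `X` and `B` share `B` itself, because the recombinant `X` descends from `B`. Set intersection is used here in place of the occurrence counting in the diff; both select the ancestors common to every name.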
Expand Down
10 changes: 9 additions & 1 deletion src/plot/mod.rs
Expand Up @@ -268,11 +268,19 @@ pub fn create(
.ok_or_else(|| eyre!("Failed to calculate the maximum coord length"))?;

// longest legend label (in pixels)
let default_labels =
vec!["Reference", "Private Mutation"].into_iter().map(String::from).collect_vec();

let longest_legend_label = parents
.iter()
.chain(default_labels.clone().iter())
.map(|id| {
let label = match default_labels.contains(id) {
true => id.to_string(),
false => format!("{id} Reference"),
};
text::to_image(
&format!("{id} Reference"),
&label,
constants::FONT_REGULAR,
constants::FONT_SIZE,
&constants::TEXT_COLOR,
Expand Down
2 changes: 1 addition & 1 deletion src/sequence/parsimony.rs
Expand Up @@ -30,7 +30,7 @@ impl Summary {
pub fn from_sequence(
sequence: &Sequence,
query: &Sequence,
coordinates: Option<&Vec<usize>>,
coordinates: Option<&[usize]>,
) -> Result<Self, Report> {
let mut parsimony_summary = Summary::new();

Expand Down