Release 0.8.0 #17

Merged
merged 369 commits into from
Apr 17, 2024

@farchaab (Collaborator) commented Apr 9, 2024

PR for version 0.8.0

Summary

Big release with multiple bug fixes and new features:

Changes

command line

  • Added snaketool for an easy-to-use command-line interface
  • Added rich-click for richly formatted command-line output
  • Added commands for genome download, read simulation and HMP template generation
  • Added a test command that checks genome download and read simulation
  • Added example scheduler profiles for SLURM

input

  • Renamed input table headers (see examples)
  • Support paths to local FASTA files as input for read simulation

community design

  • Use bases instead of reads for calculating sequencing coverage per genome
  • Support reads, bases, sequence and taxonomic abundances for calculating genome abundances
  • Multiple samples with different genomes can be simulated in a single run
  • Added optional replicate generation
  • Added example HMP communities (see #5, "Standard input tables for human samples")
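
For readers unfamiliar with base-based coverage, here is a minimal sketch of the idea: a sample's base budget is split across genomes by relative abundance, and each genome's coverage is its share of bases divided by its size. The function names, the abundance dictionary and the example numbers are hypothetical and for illustration only; they are not taken from the pipeline's code.

```python
# Illustrative sketch only: names and numbers are assumptions, not the pipeline's API.

def split_bases(total_bases: int, abundances: dict[str, float]) -> dict[str, float]:
    """Split a sample's base budget across genomes by relative abundance."""
    total_abundance = sum(abundances.values())
    return {
        genome: total_bases * abundance / total_abundance
        for genome, abundance in abundances.items()
    }


def coverage(bases: float, genome_size: int) -> float:
    """Sequencing coverage = sequenced bases / genome size."""
    return bases / genome_size


# Hypothetical two-genome community sharing 1,000,000 bases.
genome_sizes = {"genome_a": 100_000, "genome_b": 50_000}
abundances = {"genome_a": 0.75, "genome_b": 0.25}

bases = split_bases(1_000_000, abundances)
coverages = {g: coverage(bases[g], genome_sizes[g]) for g in genome_sizes}
print(coverages)
```

Working in bases rather than reads keeps coverage comparable across read lengths, since the same number of reads yields very different coverage for short and long reads.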

genome download

  • assembly_finder now uses ncbi-datasets-cli for fast genome downloads
  • An NCBI API key or email is no longer required for downloads (but still recommended)

fasta processing

  • Added rule for renaming fastas (avoids file names that break Snakemake)
  • Added rule for extracting fastas with sequence id in header (short sequence names for BAM files)
  • Added rule for splitting fastas into contigs
  • Added rule for calculating genome stats (size, contig number...) when no assembly summary table is provided
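
As a rough illustration of what genome stats such as size and contig number mean for a FASTA file, the sketch below counts header lines and sums sequence lengths. It is a plain-Python assumption of how such stats can be derived, not the rule shipped in this release; `fasta_stats` is a hypothetical helper.

```python
# Illustrative sketch only: not the pipeline's rule, just the underlying idea.
from typing import Iterable


def fasta_stats(lines: Iterable[str]) -> dict[str, int]:
    """Return total sequence length and contig count for FASTA-formatted lines."""
    contig_lengths: list[int] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            contig_lengths.append(0)  # a header starts a new contig
        elif contig_lengths:
            contig_lengths[-1] += len(line)  # sequence may wrap over several lines
    return {"total_length": sum(contig_lengths), "contig_count": len(contig_lengths)}


example = [">contig_1", "ACGT", "AC", ">contig_2", "GG"]
print(fasta_stats(example))
```

The same function accepts an open file handle, since iterating a text file yields its lines.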

read simulation

  • Updated long-read simulation (pbsim3) to support ONT R10.4.1 and PacBio HiFi read generation
  • Parallel read compression, shuffling and anonymization
  • Optionally output sorted and indexed BAM files
  • Optionally output a biobox taxonomic profile
  • Support custom error profiles for art_illumina

GitHub workflows

Fixes

#14
#13

@farchaab farchaab removed the request for review from tpillone April 16, 2024 15:33
@farchaab farchaab marked this pull request as ready for review April 17, 2024 10:30
@farchaab farchaab merged commit 87637f5 into main Apr 17, 2024