Merge pull request #7 from EBI-Metagenomics/feature/restructure_outputs_mbc

Some tweaks on top of the restructure_outputs branch.
mberacochea authored May 30, 2024
2 parents 5bc3cfa + a80aac1 commit 1963f75
Showing 49 changed files with 1,122 additions and 549 deletions.
11 changes: 7 additions & 4 deletions .github/CONTRIBUTING.md
@@ -23,8 +23,11 @@ If you're not used to this workflow with git, you can start with some [docs from

## Tests

-You can optionally test your changes by running the pipeline locally. Then it is recommended to use the `debug` profile to
-receive warnings about process selectors and other debug info. Example: `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`.
+You have the option to test your changes locally by running the pipeline. For receiving warnings about process selectors and other `debug` information, it is recommended to use the debug profile. Execute all the tests with the following command:
+
+```bash
+nf-test test --profile debug,test,docker --verbose
+```

When you create a pull request with changes, [GitHub Actions](https://github.com/features/actions) will run automatic tests.
Typically, pull-requests are only fully reviewed when these tests are passing, though of course we can help out before then.
@@ -40,7 +43,7 @@ If any failures or warnings are encountered, please follow the listed URL for mo

### Pipeline tests

-Each `nf-core` pipeline should be set up with a minimal set of test-data.
+Each of the Microbiome Informatics pipelines should be set up with a minimal set of test-data.
`GitHub Actions` then runs the pipeline on this data to ensure that it exits successfully.
If there are any failures then the automated tests fail.
These tests are run both with the latest available version of `Nextflow` and also the minimum required version that is stated in the pipeline code.
@@ -82,7 +85,7 @@ Once there, use `nf-core schema build` to add to `nextflow_schema.json`.

Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/master/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels.

-The process resources can be passed on to the tool dynamically within the process with the `${task.cpu}` and `${task.memory}` variables in the `script:` block.
+The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block.

### Naming schemes

36 changes: 36 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,36 @@
name: nf-test CI
on:
  push:
    branches:
      - dev
  pull_request:
  release:
    types: [published]

env:
  NXF_ANSI_LOG: false
  NFTEST_VER: "0.8.4"

jobs:
  test:
    name: Run pipeline with test data
    runs-on: ubuntu-latest

    steps:
      - name: Check out pipeline code
        uses: actions/checkout@v4

      - uses: actions/setup-java@99b8673ff64fbf99d8d325f52d9a5bdedb8483e9 # v4
        with:
          distribution: "temurin"
          java-version: "17"

      - name: Setup Nextflow
        uses: nf-core/setup-nextflow@v2

      - name: Install nf-test
        uses: nf-core/setup-nf-test@v1

      - name: Run pipeline with test data
        run: |
          nf-test test
5 changes: 4 additions & 1 deletion .gitignore
@@ -9,5 +9,8 @@ testing*
results/

*.pyc
+.pytest_cache/

-assets/fetch_tool_credentials.json
+assets/fetch_tool_credentials.json
+.nf-test.log
+.nf-test/
30 changes: 23 additions & 7 deletions .nf-core.yml
@@ -1,32 +1,48 @@
repository_type: pipeline
template:
prefix: ebi-metagenomics
skip:
- ci
- github_badges
lint:
files_exist:
- CODE_OF_CONDUCT.md
- assets/nf-core-miassembler_logo_light.png
- docs/images/nf-core-miassembler_logo_light.png
- docs/images/nf-core-miassembler_logo_dark.png
- docs/output.md
- docs/usage.md
- .github/ISSUE_TEMPLATE/config.yml
- .github/workflows/awstest.yml
- .github/workflows/awsfulltest.yml
- .github/workflows/branch.yml
- .github/workflows/ci.yml
- .github/workflows/linting_comment.yml
- .github/workflows/linting.yml
- conf/test_full.config
- lib/Utils.groovy
- lib/WorkflowMain.groovy
- lib/NfcoreTemplate.groovy
- lib/WorkflowMiassembler.groovy
- lib/nfcore_external_java_deps.jar
files_unchanged:
- CODE_OF_CONDUCT.md
- assets/nf-core-miassembler_logo_light.png
- docs/images/nf-core-miassembler_logo_light.png
- docs/images/nf-core-miassembler_logo_dark.png
- .github/ISSUE_TEMPLATE/bug_report.yml
- .github/CONTRIBUTING.md
- LICENSE
- docs/README.md
- .gitignore
multiqc_config:
- report_comment
nextflow_config:
nextflow_config: False
- params.input
- params.validationSchemaIgnoreParams
- params.custom_config_version
- params.custom_config_base
- manifest.name
- manifest.homePage
readme:
- nextflow_badge
repository_type: pipeline
template:
prefix: ebi-metagenomics
skip:
- ci
- github_badges
56 changes: 52 additions & 4 deletions README.md
@@ -10,6 +10,9 @@

This pipeline is still in early development. It's mostly a direct port of the mi-automation assembly generation pipeline. Some of the bespoke scripts used to remove contaminated contigs or to calculate the coverage of the assembly were replaced with tools provided by the community ([SeqKit](https://doi.org/10.1371/journal.pone.0163962) and [quast](https://doi.org/10.1093/bioinformatics/btu153) respectively).

> [!NOTE]
> This pipeline uses the nf-core template with some tweaks, but it's not part of nf-core.
## Usage

> [!WARNING]
@@ -23,12 +26,21 @@ nextflow run ebi-metagenomics/miassembler --help
Input/output options
--study_accession [string] The ENA Study secondary accession
--reads_accession [string] The ENA Run primary accession
--assembler [string] The short reads assembler (accepted: spades, metaspades, megahit) [default: metaspades for PE, megahit for SE]
--private_study [boolean] To use if the ENA study is private [default: false]
--assembler [string] The short reads assembler (accepted: spades, metaspades, megahit) [default: metaspades]
--reference_genome [string] The genome to be used to clean the assembly, the genome will be taken from the Microbiome Informatics internal
directory (accepted: chicken.fna, salmon.fna, cod.fna, pig.fna, cow.fna, mouse.fna, honeybee.fna,
rainbow_trout.fna, ...) [default: human+phiX]
--reference_genomes_folder [string] The folder with the reference genome blast indexes, defaults to the Microbiome Informatics internal directory
[default: /nfs/production/rdf/metagenomics/pipelines/prod/assembly-pipeline/blast_dbs/]
rainbow_trout.fna, rat.fna, ...)
--blast_reference_genomes_folder [string] The folder with the reference genome blast indexes, defaults to the Microbiome Informatics internal
directory.
--bwamem2_reference_genomes_folder [string] The folder with the reference genome bwa-mem2 indexes, defaults to the Microbiome Informatics internal
directory.
--remove_human_phix [boolean] Remove human and phiX reads pre assembly, and contigs matching those genomes. [default: true]
--human_phix_blast_index_name [string] Combined Human and phiX BLAST db. [default: human_phix]
--human_phix_bwamem2_index_name [string] Combined Human and phiX bwa-mem2 index. [default: human_phix]
--min_contig_length [integer] Minimum contig length filter. [default: 500]
--assembly_memory [integer] Default memory allocated for the assembly process. [default: 100]
--spades_only_assembler [boolean] Run SPAdes/metaSPAdes without the error correction step. [default: true]
--outdir [string] The output directory where the results will be saved. You have to use absolute paths to storage on Cloud
infrastructure.
--email [string] Email address for completion summary.
@@ -50,7 +62,43 @@ nextflow run ebi-metagenomics/miassembler \
--reads_accession SRR1631361
```
## Outputs
The outputs of the pipeline are organized as follows:
```
results/SRP1154
└── SRP115494
    └── SRR6180
        └── SRR6180434
            ├── assembly
            │   └── metaspades
            │       └── 3.15.5
            │           ├── coverage
            │           ├── decontamination
            │           └── qc
            │               ├── multiqc
            │               └── quast
            └── qc
                ├── fastp
                └── fastqc

```
The nested structure based on ENA Study and Reads accessions was created to suit the Microbiome Informatics team’s needs. The benefit of this structure is that results from different runs of the same study won’t overwrite any results.
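As a rough illustration of the layout above, the sketch below rebuilds the per-run results path from the two accessions. The function names `accession_prefix` and `run_results_dir` are hypothetical and not part of the pipeline, and the rule that each prefix folder is the first 7 characters of the accession is an assumption inferred from the single example tree (`SRP115494` under `SRP1154`, `SRR6180434` under `SRR6180`).

```bash
#!/bin/sh
# Hypothetical helpers, not part of the pipeline.
# Assumption: the prefix folder is the first 7 characters of the accession,
# inferred from the example tree only.
accession_prefix() {
    printf '%.7s' "$1"
}

# Arguments: <outdir> <study_accession> <reads_accession>
run_results_dir() {
    printf '%s/%s/%s/%s/%s\n' \
        "$1" \
        "$(accession_prefix "$2")" "$2" \
        "$(accession_prefix "$3")" "$3"
}

run_results_dir results SRP115494 SRR6180434
# prints: results/SRP1154/SRP115494/SRR6180/SRR6180434
```

A helper along these lines could be handy for locating the `assembly` and `qc` folders of a given run from a script, provided the prefix assumption holds for the accessions in question.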
## Tests
There is a very small test data set ready to use:
```bash
nextflow run main.nf -resume -profile test,docker
```
### End to end tests
Two end-to-end tests can be launched (with megahit and metaspades) with the following command:
```bash
pytest tests/workflows/ --verbose
```
2 changes: 1 addition & 1 deletion assets/email_template.html
@@ -12,7 +12,7 @@

<img src="cid:nfcorepipelinelogo">

-<h1>ebi-metagenomics/miassembler v${version}</h1>
+<h1>ebi-metagenomics/miassembler ${version}</h1>
<h2>Run Name: $runName</h2>

<% if (!success){
Binary file added assets/mgnify_logo.png
7 changes: 4 additions & 3 deletions assets/multiqc_config.yml
@@ -1,16 +1,17 @@
report_comment: >
-This report has been generated by the <a href="https://github.com/ebi-metagenomics/miassembler/tree/dev" target="_blank">ebi-metagenomics/miassembler</a>
+This report has been generated by the <a href="https://github.com/ebi-metagenomics/miassembler/" target="_blank">ebi-metagenomics/miassembler</a>
analysis pipeline.
report_section_order:
"ebi-metagenomics-miassembler-methods-description":
order: -1000
software_versions:
order: -1001
"ebi-metagenomics-miassembler-summary":
order: -1002

export_plots: true

skip_versions_section: true

top_modules:
- fastqc
- quast
3 changes: 0 additions & 3 deletions assets/samplesheet.csv

This file was deleted.
