This repository has been archived by the owner on Jul 17, 2023. It is now read-only.

Workflow specification is complete?: False #143

Closed
t-neumann opened this issue Aug 8, 2018 · 6 comments

Comments

@t-neumann

Dear Manta Team,

I'm seeing a reproducible issue where identically processed datasets get stuck in Manta. The following status messages are updated hourly, but the workflow never finishes:

[2018-08-08T11:55:56.989475Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] ===== MantaWorkflow StatusUpdate =====
[2018-08-08T11:55:56.995621Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Workflow specification is complete?: False
[2018-08-08T11:55:56.996339Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Task status (waiting/queued/running/complete/error): 11849/0/1/1/0
[2018-08-08T11:55:56.996960Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing queued task time (hrs): 0.0000
[2018-08-08T11:55:56.997640Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing queued task name: ''
[2018-08-08T11:55:56.998454Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing running task time (hrs): 0.9997
[2018-08-08T11:55:56.999157Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing running task name: 'getAlignmentStats_generateStats_000'
[2018-08-08T12:55:57.163218Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] ===== MantaWorkflow StatusUpdate =====
[2018-08-08T12:55:57.181497Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Workflow specification is complete?: False
[2018-08-08T12:55:57.182282Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Task status (waiting/queued/running/complete/error): 11849/0/1/1/0
[2018-08-08T12:55:57.183055Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing queued task time (hrs): 0.0000
[2018-08-08T12:55:57.183700Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing queued task name: ''
[2018-08-08T12:55:57.184375Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing running task time (hrs): 2.0000
[2018-08-08T12:55:57.186269Z] [cn-11] [4907_1] [WorkflowRunner] [StatusUpdate] Longest ongoing running task name: 'getAlignmentStats_generateStats_000'
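
For anyone wanting to detect this kind of stall automatically, here is a minimal sketch that parses the `Task status` lines from a saved copy of the log above and flags a run whose task counts have stopped changing. The function names and the "two identical updates" threshold are assumptions made for illustration; they are not part of Manta or pyflow.

```python
import re

# Matches the task-count line of each StatusUpdate block, e.g.
# "... Task status (waiting/queued/running/complete/error): 11849/0/1/1/0"
STATUS_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*"
    r"Task status \(waiting/queued/running/complete/error\): "
    r"(?P<counts>\d+/\d+/\d+/\d+/\d+)\s*$"
)


def parse_task_status(lines):
    """Return [(timestamp, (waiting, queued, running, complete, error)), ...]."""
    updates = []
    for line in lines:
        m = STATUS_RE.match(line)
        if m:
            counts = tuple(int(x) for x in m.group("counts").split("/"))
            updates.append((m.group("ts"), counts))
    return updates


def looks_stalled(updates, min_repeats=2):
    """Hypothetical heuristic: the last `min_repeats` updates report identical
    counts while tasks are still waiting, queued, or running."""
    if len(updates) < min_repeats:
        return False
    recent = [counts for _, counts in updates[-min_repeats:]]
    waiting, queued, running, _complete, _error = recent[-1]
    return len(set(recent)) == 1 and (waiting + queued + running) > 0
```

Applied to the two updates quoted above, both report `11849/0/1/1/0`, so the heuristic would flag the run as stalled. With hourly updates, a larger `min_repeats` is probably safer for long-running tasks.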

I ran two test files produced by bwa: one is a paired-end WGS dataset, the other RNA-seq. Both were processed identically: same modules, same resources, same node.

module load samtools/1.4-foss-2017a bwa/0.7.15-foss-2017a

bwa mem -M -R '@RG\tID:MCC_1T\tPL:illumina\tSM:MCC_1T' -t 20 /groups/obenauf/Software/indices/virus/Centrifuge/bwa/p+h+v_unique.fa /groups/obenauf/NGS_data/Leiendecker_MCC/WGS/FASTQ/LP2100019-DNA_A01_1.fastq.gz /groups/obenauf/NGS_data/Leiendecker_MCC/WGS/FASTQ/LP2100019-DNA_A01_2.fastq.gz | samtools view -b - | samtools sort -@ 4 -o MCC_1T.virus.bam

bwa mem -M -R '@RG\tID:MCC_1T\tPL:illumina\tSM:MCC_1T' -t 20 /groups/obenauf/Software/indices/virus/Centrifuge/bwa/p+h+v_unique.fa /scratch-ii2/users/tobias.neumann/95/4f153e2c9c8d88ed3a80209f6e8046/CAFW2ANXX_6#RNAseq-MCC-25-1-GCCACA_1.fq /scratch-ii2/users/tobias.neumann/95/4f153e2c9c8d88ed3a80209f6e8046/CAFW2ANXX_6#RNAseq-MCC-25-1-GCCACA_2.fq | samtools view -b - | samtools sort -@ 4 -o MCC-25-1.virus.bam

Then I ran Manta from the official Docker container. The MCC_1T sample (WGS, 157 GB) completes successfully after a couple of hours. The MCC-25 sample (RNA-seq, 1.5 GB) loops with the status messages shown above.

singularity exec docker://quay.io/biocontainers/manta:1.4.0--py27_1 configManta.py --bam ../bwa/MCC_1T.virus.bam --referenceFasta /groups/obenauf/Software/indices/virus/Centrifuge/bwa/p+h+v_unique.fa --runDir MCC_1T_manta
singularity exec docker://quay.io/biocontainers/manta:1.4.0--py27_1 MCC_1T_manta/runWorkflow.py -m local -j 24 -g 100

singularity exec docker://quay.io/biocontainers/manta:1.4.0--py27_1 configManta.py --bam ../bwa/MCC-25-1.virus.bam --referenceFasta /groups/obenauf/Software/indices/virus/Centrifuge/bwa/p+h+v_unique.fa --runDir testrun_self
singularity exec docker://quay.io/biocontainers/manta:1.4.0--py27_1 testrun_self/runWorkflow.py -m local -j 24 -g 100

I also tried running the RNA-seq sample with the --rna flag, but that didn't help.

Can you help me with this? I'm out of options.

@x-chen
Contributor

x-chen commented Aug 8, 2018

Yes, RNA-seq samples need to be run with the --rna option.

Manta seems to choke at the stats generation step, which essentially subsamples reads and estimates the insert size from the input bam. Could you describe more about the RNA-seq bam? Are they paired-end reads?

My intuition is that there might be some unexpected characteristics of the RNA bam. If that's not easy to identify, I would need a sample of your bam files to debug. Is it possible to share a bamlet that can demonstrate the same problem?
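
As a rough way to inspect what such a stats step has to work with, the sketch below estimates the insert size from SAM-formatted text (for example, lines produced by `samtools view -h`). It is an illustration only and not Manta's implementation: the function names, the `max_pairs` cap, and the choice of the median as summary are all assumptions.

```python
import statistics


def insert_sizes(sam_lines, max_pairs=100000):
    """Collect positive TLEN values from properly paired primary alignments.

    Only the leftmost mate of each pair has TLEN > 0, so every pair is
    counted once. Stops after `max_pairs` pairs (a simple form of subsampling).
    """
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        if not flag & 0x2:                # keep properly paired reads only
            continue
        if flag & (0x4 | 0x100 | 0x800):  # drop unmapped/secondary/supplementary
            continue
        if tlen <= 0:
            continue
        sizes.append(tlen)
        if len(sizes) >= max_pairs:
            break
    return sizes


def summarize(sizes):
    """Return the pair count and median insert size, or None if there is no data."""
    if not sizes:
        return None
    return {"pairs": len(sizes), "median": statistics.median(sizes)}
```

If `summarize` returns `None` or a very small pair count for a given bam, that may hint at the kind of unexpected input characteristics mentioned above.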

@t-neumann
Author

Yes, they are paired-end reads. Could you provide me with an email address so that I can send you an example bam via a file-share link?

@t-neumann
Author

Dear @x-chen, I sent you some test data last week and was just wondering whether you need anything else.

@x-chen
Contributor

x-chen commented Aug 23, 2018

I was on vacation last week. I will take a look once time allows.

@x-chen
Contributor

x-chen commented Aug 31, 2018

There was a bug in stats generation that was triggered by reference sequences shorter than 100 bp. I just created a branch, bug-MANTA-1459, with the bug fixed.
https://github.com/Illumina/manta/tree/bug-MANTA-1459

I have tested it against the test data you provided, and the bug fix will be rolled into the next release.
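
For users on an affected version who want to check whether their reference could trigger this, here is a minimal sketch that lists sequences shorter than 100 bp from a FASTA index (`.fai`) file, whose first two tab-separated columns are the sequence name and its length. The function name `short_contigs` and the default threshold parameter are assumptions made for illustration.

```python
def short_contigs(fai_lines, min_length=100):
    """Return (name, length) for every sequence shorter than `min_length` bp.

    Expects lines in FASTA index format: NAME<TAB>LENGTH<TAB>OFFSET<TAB>...
    """
    short = []
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:       # skip blank or malformed lines
            continue
        name, length = fields[0], int(fields[1])
        if length < min_length:
            short.append((name, length))
    return short
```

Passing the open file handle of the reference's `.fai` file (the index created by `samtools faidx`) to `short_contigs` would list the candidate sequences; an empty result suggests the reference is not affected by this particular bug.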

@x-chen
Contributor

x-chen commented Nov 13, 2018

The bug fix is now in v1.5.0.

x-chen closed this as completed Nov 13, 2018