How can this pipeline be used to analyze non-cancer data #13

Open
Jokendo-collab opened this issue May 22, 2020 · 7 comments

Comments

@Jokendo-collab

I find this pipeline very nice, but how can it be used to analyze non-cancer data?

@yafeng
Collaborator

yafeng commented Jun 3, 2020

Hi @javanOkendo, if you are not interested in detecting SNP variants or somatic mutations, you can remove the CanProVar and COSMIC entries from the VarDB database before you apply the workflow. Then you can follow the same steps for non-cancer data.
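One way to do that removal is sketched below. It is only a sketch: it assumes the CanProVar and COSMIC entries can be recognized by those strings appearing somewhere in the FASTA header line, and the function names are made up for illustration rather than taken from the pipeline. Sequences are written back on a single line each.

```python
from typing import Iterable, Iterator, Tuple

# Header substrings to drop. Assumption: the labels appear verbatim in the
# FASTA header of every entry that should be removed.
EXCLUDED_LABELS = ("CanProVar", "COSMIC")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_labeled_entries(lines: Iterable[str], labels=EXCLUDED_LABELS) -> str:
    """Return FASTA text without entries whose header contains any label."""
    kept = []
    for header, seq in read_fasta(lines):
        if any(label in header for label in labels):
            continue  # skip cancer-specific entries
        kept.append(f">{header}\n{seq}\n")
    return "".join(kept)


def filter_file(in_path: str, out_path: str) -> None:
    """Filter a FASTA file on disk (both paths are placeholders)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.write(drop_labeled_entries(src))
```

For example, `filter_file("VarDB.fasta", "VarDB.filtered.fasta")` would write the reduced database, which can then be passed to `--tdb`. Check a few headers in your copy of the database first to confirm the labels match.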

@Jokendo-collab
Author

@yafeng I am getting the following error:
executor > local (30)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[59/5329f4] process > percolator [100%] 2 of 2 ✔
[69/7b5133] process > getNovelPercolator [100%] 2 of 2 ✔
[e0/43a2dd] process > getVariantPercolator [100%] 2 of 2 ✔
[3c/290056] process > filterPercolator [100%] 4 of 4 ✔
[db/b21fa7] process > svmToTSV [100%] 4 of 4 ✔
[f8/a2085e] process > createPSMTables [100%] 4 of 4 ✔
[2a/17247d] process > prePeptideTable [ 0%] 0 of 4
[48/0aed18] process > prepSpectrumAI [ 0%] 0 of 2
[31/1fa8ca] process > mergeSetPSMtable [ 0%] 0 of 2
ERROR ~ Error executing process > 'prepSpectrumAI (1)'

Caused by:
Process prepSpectrumAI (1) terminated with an error exit status (255)

Command executed:

label_sub_pos.py --input_psm sampleA_variant_psmtable.txt --output specai_in.txt

Command exit status:
255

Command output:
(empty)

Command error:
FATAL: container creation failed: mount /etc/localtime->/etc/localtime error: while mounting /etc/localtime: could not mount /etc/localtime: input/output error

Work dir:
/scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/workflow/work/dd/98189bad1266cd02493be6e46b1815
Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

executor > local (30)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[59/5329f4] process > percolator [100%] 2 of 2 ✔
[69/7b5133] process > getNovelPercolator [100%] 2 of 2 ✔
[e0/43a2dd] process > getVariantPercolator [100%] 2 of 2 ✔
[3c/290056] process > filterPercolator [100%] 4 of 4 ✔
[db/b21fa7] process > svmToTSV [100%] 4 of 4 ✔
[f8/a2085e] process > createPSMTables [100%] 4 of 4 ✔
[2a/17247d] process > prePeptideTable [ 0%] 0 of 4
[dd/98189b] process > prepSpectrumAI [ 50%] 1 of 2, failed: 1
[31/1fa8ca] process > mergeSetPSMtable [ 0%] 0 of 2
WARN: Killing pending tasks (7)
ERROR ~ Error executing process > 'prepSpectrumAI (1)'

Caused by:
Process prepSpectrumAI (1) terminated with an error exit status (255)

Command executed:

label_sub_pos.py --input_psm sampleA_variant_psmtable.txt --output specai_in.txt

Command exit status:
255
Command output:
(empty)

Command error:
FATAL: container creation failed: mount /etc/localtime->/etc/localtime error: while mounting /etc/localtime: could not mount /etc/localtime: input/output error

Work dir:
/scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/workflow/work/dd/98189bad1266cd02493be6e46b1815

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

executor > local (30)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[59/5329f4] process > percolator [100%] 2 of 2 ✔
[69/7b5133] process > getNovelPercolator [100%] 2 of 2 ✔
[e0/43a2dd] process > getVariantPercolator [100%] 2 of 2 ✔
[3c/290056] process > filterPercolator [100%] 4 of 4 ✔
[db/b21fa7] process > svmToTSV [100%] 4 of 4 ✔
[f8/a2085e] process > createPSMTables [100%] 4 of 4 ✔
[2c/8dba81] process > prePeptideTable [ 75%] 3 of 4, failed: 3
[48/0aed18] process > prepSpectrumAI [100%] 2 of 2, failed: 2
[31/1fa8ca] process > mergeSetPSMtable [ 0%] 0 of 2
WARN: Killing pending tasks (7)
ERROR ~ Error executing process > 'prepSpectrumAI (1)'

Caused by:
Process prepSpectrumAI (1) terminated with an error exit status (255)
Command executed:

label_sub_pos.py --input_psm sampleA_variant_psmtable.txt --output specai_in.txt

Command exit status:
255

Command output:
(empty)

Command error:
FATAL: container creation failed: mount /etc/localtime->/etc/localtime error: while mounting /etc/localtime: could not mount /etc/localtime: input/output error

Work dir:
/scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/workflow/work/dd/98189bad1266cd02493be6e46b1815

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

executor > local (30)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[59/5329f4] process > percolator [100%] 2 of 2 ✔
[69/7b5133] process > getNovelPercolator [100%] 2 of 2 ✔
[e0/43a2dd] process > getVariantPercolator [100%] 2 of 2 ✔
[3c/290056] process > filterPercolator [100%] 4 of 4 ✔
[db/b21fa7] process > svmToTSV [100%] 4 of 4 ✔
[f8/a2085e] process > createPSMTables [100%] 4 of 4 ✔
[2a/17247d] process > prePeptideTable [100%] 4 of 4, failed: 4
[48/0aed18] process > prepSpectrumAI [100%] 2 of 2, failed: 2
[31/1fa8ca] process > mergeSetPSMtable [100%] 2 of 2, failed: 2
WARN: Killing pending tasks (7)
ERROR ~ Error executing process > 'prepSpectrumAI (1)'

@Jokendo-collab
Author

N E X T F L O W ~ version 19.04.1
Launching main.nf [small_banach] - revision: 6ff5d47fa2
2 mzML files in analysis
Detected setnames: sampleA, sampleB
[warm up] executor > local
WARN: Input tuple does not match input set cardinality declared by process splitSetNormalSearchPsms -- offending value: sampleA
executor > local (1)
[ea/c37f3b] process > makeTargetSeqLookup [ 0%] 0 of 1
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔

executor > local (2)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [ 0%] 0 of 1

executor > local (2)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔

executor > local (4)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[58/a326b6] process > msgfPlus [ 0%] 0 of 2

executor > local (4)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[58/a326b6] process > msgfPlus [ 50%] 1 of 2
executor > local (6)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[d8/3b6275] process > percolator [ 0%] 0 of 2

executor > local (8)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[d8/3b6275] process > percolator [ 50%] 1 of 2
[0c/7c7b80] process > getNovelPercolator [ 0%] 0 of 1
[11/f42b61] process > getVariantPercolator [ 0%] 0 of 1

executor > local (8)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[d8/3b6275] process > percolator [ 50%] 1 of 2
[0c/7c7b80] process > getNovelPercolator [100%] 1 of 1
[11/f42b61] process > getVariantPercolator [100%] 1 of 1

executor > local (9)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[d8/3b6275] process > percolator [ 50%] 1 of 2
[0c/7c7b80] process > getNovelPercolator [100%] 1 of 1
[11/f42b61] process > getVariantPercolator [100%] 1 of 1
--More--(20%)
Work dir:
/scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/workflow/work/dd/98189bad1266cd02493be6e46b1815

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run

-- Check '.nextflow.log' file for details

executor > local (30)
[ea/c37f3b] process > makeTargetSeqLookup [100%] 1 of 1 ✔
[8e/4f629c] process > makeTrypSeq [100%] 1 of 1, cached: 1 ✔
[b6/342beb] process > createSpectraLookup [100%] 1 of 1, cached: 1 ✔
[0a/a91539] process > makeProtSeq [100%] 1 of 1, cached: 1 ✔
[a3/5e3717] process > concatFasta [100%] 1 of 1 ✔
[9e/513c28] process > msgfPlus [100%] 2 of 2 ✔
[59/5329f4] process > percolator [100%] 2 of 2 ✔
[69/7b5133] process > getNovelPercolator [100%] 2 of 2 ✔
[e0/43a2dd] process > getVariantPercolator [100%] 2 of 2 ✔
[3c/290056] process > filterPercolator [100%] 4 of 4 ✔
[db/b21fa7] process > svmToTSV [100%] 4 of 4 ✔
[f8/a2085e] process > createPSMTables [100%] 4 of 4 ✔
[2a/17247d] process > prePeptideTable [100%] 4 of 4, failed: 4
[48/0aed18] process > prepSpectrumAI [100%] 2 of 2, failed: 2
[31/1fa8ca] process > mergeSetPSMtable [100%] 2 of 2, failed: 2
WARN: Killing pending tasks (7)
ERROR ~ Error executing process > 'prepSpectrumAI (1)'

Caused by:
Process prepSpectrumAI (1) terminated with an error exit status (255)

Command executed:

label_sub_pos.py --input_psm sampleA_variant_psmtable.txt --output specai_in.txt

Command exit status:
255

Command output:
(empty)

Command error:
FATAL: container creation failed: mount /etc/localtime->/etc/localtime error: while mounting /etc/localtime: could not mount /etc/localtime: input/output error
--More--(98%)


@yafeng
Collaborator

yafeng commented Jun 30, 2020

I haven't seen this error before.
Can you paste the command you used, and tell me which database you used?

@Jokendo-collab
Author

@yafeng I am analyzing data from Mycobacterium tuberculosis samples. I created a custom database with the customProDB software and used it as my variantDB for the search. Below is the command I used:
#!/bin/sh
#SBATCH --account=cbio
#SBATCH --partition=ada
#SBATCH --nodes=2 --ntasks=40
#SBATCH --time=170:00:00
#SBATCH --job-name="protgActualAnalysis"
#SBATCH --mail-user=oknjav001@myuct.ac.za
#SBATCH --mail-type=END,FAIL

module load software/nextflow-19.04
module load software/R-3.6.0
nextflow run main.nf -resume \
  --tdb /scratch/oknjav001/transcriptomics/proteogenomics/ms_rnaseqdata/rnasq/variantcallresults/16bcg/T016BCGmerge-var.fasta \
  --mzmldef rawfiles.txt \
  --activation hcd \
  --gtf /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/downloads/dbvar/VarDB.gtf \
  --mods /scratch/oknjav001/bal_mzML_raw_files/databaseComparisonProject/msgfplus/searchEngine/MSGFPlus_Mods1.txt \
  --knownproteins /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/downloads/homo/Homo_sapiens.GRCh38.pep.all.fa \
  --blastdb /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/downloads/dbvar/UniProteome+Ensembl87+refseq+GENCODE24.proteins.fasta \
  --snpfa /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/downloads/dbvar/MSCanProVar_ensemblV79.filtered.fasta \
  --genome /scratch/oknjav001/genome/hg19.fasta \
  --annovar_dir /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/annovar \
  --bigwigs /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/bigwigs \
  --bamfiles /scratch/oknjav001/transcriptomics/proteogenomics/ms_rnaseqdata/rnasq/LTB/trimmedres/bamfiles/tempanalysis/*.bam \
  --outdir /scratch/oknjav001/transcriptomics/proteogenomics/yafengpipeline/actualAnalysis \
  --profile slurm -c singularity.config

@Jokendo-collab
Author

I am not sure whether this problem is caused by the "Setname" in the text file containing the full paths of the raw files. This may be a silly question, but what should the text file containing the full paths of the mzML files look like?

ColumnA ColumnB
path/to/sampleA.mzML sampleA
path/to/sampleB.mzML sampleB
I prepared my txt file following the above example. I do not know whether this is what causes the above error, but I also get the following warning:
WARN: Input tuple does not match input set cardinality declared by process splitSetNormalSearchPsms -- offending value: sampleA
Could you help with this? This information is missing from the README. Adding an example of how that file should look would help other researchers.

@yafeng
Collaborator

yafeng commented Jul 1, 2020

Your text file for the input MS data looks fine to me.
The first column is the full path of the MS data file; the second column is the set name, which groups MS raw files from the same sample (label-free) or from the same TMT/iTRAQ set.
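The two-column layout described above can be checked before launching the workflow. The sketch below is hypothetical and not part of the pipeline; it assumes the columns are separated by a tab (this thread does not state the delimiter, so adjust `sep` if needed) and that the file has no header row.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_set(lines: Iterable[str], sep: str = "\t") -> Dict[str, List[str]]:
    """Map each set name to the mzML paths listed for it.

    Raises ValueError on any non-blank line that is not exactly
    '<path><sep><set name>' with both fields non-empty.
    """
    sets: Dict[str, List[str]] = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = [field.strip() for field in line.split(sep)]
        if len(fields) != 2 or not all(fields):
            raise ValueError(
                f"line {number}: expected '<path>{sep!r}<set name>', got {line!r}"
            )
        path, setname = fields
        sets[setname].append(path)
    return dict(sets)
```

Calling `group_by_set(open("rawfiles.txt"))` returns a dictionary such as `{"sampleA": [...], "sampleB": [...]}`, so a missing or misplaced set name shows up immediately instead of partway through a run.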

I suspect the error is due to the custom database you used. The pipeline is designed for VarDB, in which the sequences carry specific FASTA header labels, such as "PGOHUM", "lncRNA", "CanProVar", and "COSMIC", that mark the different classes of novel/variant peptides. These labels are used later in the pipeline to calculate class-specific FDR and to split the peptides into different processes. If your database doesn't contain such headers, the pipeline will not be able to recognize the entries and will generate empty output, which probably causes errors in later steps.
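A quick way to see whether a custom database carries any of these labels is to count the headers per class. This is a rough sketch under assumptions: it uses only the four labels quoted above (the list was introduced with "such as", so it may be incomplete), it treats a label as present when the string occurs anywhere in the header, and the pipeline's own matching rules may differ.

```python
from collections import Counter
from typing import Iterable

# Labels quoted in this thread; the real VarDB may use more.
CLASS_LABELS = ("PGOHUM", "lncRNA", "CanProVar", "COSMIC")


def count_header_classes(lines: Iterable[str], labels=CLASS_LABELS) -> Counter:
    """Count FASTA headers per class label; others go under 'unlabeled'."""
    counts: Counter = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        matched = [label for label in labels if label in line]
        if matched:
            counts.update(matched)
        else:
            counts["unlabeled"] += 1
    return counts
```

If every header of the custom FASTA lands in `unlabeled`, the database has none of the labels listed here, which would be consistent with the empty intermediate output described above.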
