You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have encountered what seems to be the same problem as this thread and this more recent thread. OrthoFinder fails during MSA, producing (in addition to many normal orthogroup files) an empty orthogroup file and halting the program. First, I'll paste the log with error, then follow up with some more information below.
(I've bolded my text to make it stand out from log snippets, since I have some comments in the middle of two big chunks. This is my first time posting on GitHub so my formatting's clumsy.)
orthofinder -f RemoveIfHaveStop/ -S diamond -M msa -A mafft -T fasttree
OrthoFinder version 2.3.12 Copyright (C) 2014 David Emms
2020-07-13 22:29:39 : Starting OrthoFinder
8 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
----------------------------------------
Test can run "mcl -h" - ok
Test can run "mafft /home/chase/OrthoFinder/Endobacteria/AllFrames/CombinedFiles/Linearize/RemoveIfHaveStop/OrthoFinder/Results_Jul13/WorkingDirectory//_dependencies_check/SimpleTest.fa" - ok
Test can run "FastTree /home/chase/OrthoFinder/Endobacteria/AllFrames/CombinedFiles/Linearize/RemoveIfHaveStop/OrthoFinder/Results_Jul13/WorkingDirectory//_dependencies_check/SimpleTest.fa" - ok
Dividing up work for BLAST for parallel processing
--------------------------------------------------
2020-07-13 22:29:42 : Creating diamond database 1 of 6
2020-07-13 22:29:42 : Creating diamond database 2 of 6
2020-07-13 22:29:42 : Creating diamond database 3 of 6
2020-07-13 22:29:42 : Creating diamond database 4 of 6
2020-07-13 22:29:42 : Creating diamond database 5 of 6
2020-07-13 22:29:42 : Creating diamond database 6 of 6
Running diamond all-versus-all
------------------------------
Using 8 thread(s)
2020-07-13 22:29:42 : This may take some time....
2020-07-13 22:29:43 : Done 0 of 36
2020-07-13 22:30:22 : Done 10 of 36
2020-07-13 22:30:35 : Done 20 of 36
2020-07-13 22:30:47 : Done all-versus-all sequence search
Running OrthoFinder algorithm
-----------------------------
2020-07-13 22:30:47 : Initial processing of each species
2020-07-13 22:30:48 : Initial processing of species 0 complete
2020-07-13 22:30:51 : Initial processing of species 1 complete
2020-07-13 22:30:52 : Initial processing of species 2 complete
2020-07-13 22:30:53 : Initial processing of species 3 complete
2020-07-13 22:30:53 : Initial processing of species 4 complete
2020-07-13 22:30:54 : Initial processing of species 5 complete
2020-07-13 22:30:58 : Connected putative homologues
2020-07-13 22:30:59 : Written final scores for species 0 to graph file
2020-07-13 22:31:01 : Written final scores for species 1 to graph file
2020-07-13 22:31:01 : Written final scores for species 2 to graph file
2020-07-13 22:31:02 : Written final scores for species 3 to graph file
2020-07-13 22:31:02 : Written final scores for species 4 to graph file
2020-07-13 22:31:02 : Written final scores for species 5 to graph file
2020-07-13 22:31:10 : Ran MCL
Writing orthogroups to file
---------------------------
OrthoFinder assigned 7471 genes (91.9% of total) to 2204 orthogroups. Fifty percent of all genes were in orthogroups with 4 or more genes (G50 was 4) and were contained in the largest 748 orthogroups (O50 was 748). There were 127 orthogroups with all species present and 115 of these consisted entirely of single-copy genes.
2020-07-13 22:31:22 : Done orthogroups
Analysing Orthogroups
=====================
2020-07-13 22:31:22 : Starting MSA/Trees
Species tree: Using 707 orthogroups with minimum of 66.7% of species having single-copy genes in any orthogroup
Inferring multiple sequence alignments for species tree
-------------------------------------------------------
2020-07-13 22:31:32 : Done 0 of 707
2020-07-13 22:32:45 : Done 100 of 707
2020-07-13 22:33:45 : Done 200 of 707
2020-07-13 22:34:54 : Done 300 of 707
2020-07-13 22:36:16 : Done 400 of 707
2020-07-13 22:37:15 : Done 500 of 707
2020-07-13 22:38:01 : Done 600 of 707
Traceback (most recent call last):
File "/home/chase/anaconda3/envs/py37/bin/orthofinder", line 7, in <module>
main(args)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/__main__.py", line 1731, in main
GetOrthologues(speciesInfoObj, options, prog_caller)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/__main__.py", line 1511, in GetOrthologues
options.name)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/orthologues.py", line 905, in OrthologuesWorkflow
seqs_alignments_dirs = treeGen.DoTrees(ogSet.OGs(qInclAll=True), ogSet.OrthogroupMatrix(), ogSet.Spec_SeqDict(), ogSet.SpeciesDict(), ogSet.speciesToUse, nHighParallel, qStopAfterSeqs, qStopAfterAlign or qPhyldog, qDoSpeciesTree=qDoMSASpeciesTree)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/trees_msa.py", line 357, in DoTrees
CreateConcatenatedAlignment(iOgsForSpeciesTree, ogs, self.GetAlignmentFilename, concatenated_algn_fn, fSingleCopy)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/trees_msa.py", line 222, in CreateConcatenatedAlignment
alignment = ReadAlignment(alignment_filename_function(iOg))
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/trees_msa.py", line 214, in ReadAlignment
return MSA(msa)
File "/home/chase/anaconda3/envs/py37/bin/scripts_of/trees_msa.py", line 189, in __init__
self.length = len(list(msa_dict.values())[0])
IndexError: list index out of range
To explain some of the directories here, I've activated the conda environment "py37" which is running Python 3.7. Further system info is at the bottom of this post.
Like you predicted in the previous threads, an empty orthogroup file was produced in /Alignments_ids. I've attached the file from /Sequences_ids that has the same name (OG0000725.fa) as the empty 0KB file from /Alignments_ids (OG0000725.fa). It's had ".txt" added to it to allow uploading here.
Like you suggested in the previous thread, I ran $ mafft --localpair --maxiterate 1000 --anysymbol OG0000725.fa > OG0000725.aln.fa on this exact file, getting the following (apparently problem-free) result:
inputfile = orig
4 x 261 - 204 p
outputhat23=16
treein = 0
compacttree = 0
stacksize: 8192 kb
rescale = 1
All-to-all alignment.
tbfast-pair (aa) Version 7.471
alg=L, model=BLOSUM62, 2.00, -0.10, +0.10, noshift, amax=0.0
0 thread(s)
outputhat23=16
Loading 'hat3.seed' ...
done.
Writing hat3 for iterative refinement
rescale = 1
Gap Penalty = -1.53, +0.00, +0.00
tbutree = 1, compacttree = 0
Constructing a UPGMA tree ...
0 / 4
done.
Progressive alignment ...
STEP 3 /3
done.
tbfast (aa) Version 7.471
alg=A, model=BLOSUM62, 1.53, -0.00, -0.00, noshift, amax=0.0
1 thread(s)
minimumweight = 0.000010
autosubalignment = 0.000000
nthread = 0
randomseed = 0
blosum 62 / kimura 200
poffset = 0
niter = 16
sueff_global = 0.100000
nadd = 16
Loading 'hat3' ... done.
rescale = 1
0 / 4
Segment 1/ 1 1- 264
STEP 003-001-0 identical.
Converged.
done
dvtditr (aa) Version 7.471
alg=A, model=BLOSUM62, 1.53, -0.00, -0.00, noshift, amax=0.0
0 thread(s)
Strategy:
L-INS-i (Probably most accurate, very slow)
Iterative refinement method (<16) with LOCAL pairwise alignment information
If unsure which option to use, try 'mafft --auto input > output'.
For more information, see 'mafft --help', 'mafft --man' and the mafft page.
The default gap scoring scheme has been changed in version 7.110 (2013 Oct).
It tends to insert more gaps into gap-rich regions than previous versions.
To disable this change, add the --leavegappyregion option.
In the most recent thread linked at the top, this was all fixed by simply running the analysis on another computer, but I don't have that luxury at the moment. I wonder if there's any clues in common between my data and the previous poster's? If it's a per-machine thing, it must be something about the local environment, not the data itself.
For full information, I have OrthoFinder 2.3.12 and I'm running Ubuntu 18.04.4 LTS in "Windows Subsystem for Linux" compatibility layer in Windows 10 64-bit (ver. 10.0.18362 Build 18362), with 32 GB RAM; cpu is i5-8350U; machine is Lenovo Thinkpad T480.
For good measure, I've zipped the entire folder (original input files and Orthofinder results) and attached it here, too.
In the most recent thread linked at the top of this post, you mentioned looking for Species_Tree/Orthogroups_for_concatenated_alignment.txt, however this folder doesn't exist for me anywhere.
Is there anything else useful I could provide?
Thank you for your time! Chase
The text was updated successfully, but these errors were encountered:
Thanks for the details, from the looks of things this should be resolved by the fix that was submitted for the previously reported issue. I've pushed out a new release yesterday with this fix in so would you be able to try updating to that and letting me know how it goes? I'd also be interested to know whether OG0000725 is successful or not this time. OrthoFinder should be fine in either case but it'd be interesting to know if there is an occasional & intermittent problem with the alignments.
Hi David,
I have encountered what seems to be the same problem as this thread and this more recent thread. OrthoFinder fails during MSA, producing (in addition to many normal orthogroup files) an empty orthogroup file and halting the program. First, I'll paste the log with error, then follow up with some more information below.
(I've bolded my text to make it stand out from log snippets, since I have some comments in the middle of two big chunks. This is my first time posting on GitHub so my formatting's clumsy.)
To explain some of the directories here, I've activated the conda environment "py37" which is running Python 3.7. Further system info is at the bottom of this post.
Like you predicted in the previous threads, an empty orthogroup file was produced in /Alignments_ids. I've attached the file from /Sequences_ids that has the same name (OG0000725.fa) as the empty 0KB file from /Alignments_ids (OG0000725.fa). It's had ".txt" added to it to allow uploading here.
OG0000725.fa.txt
Like you suggested in the previous thread, I ran
$ mafft --localpair --maxiterate 1000 --anysymbol OG0000725.fa > OG0000725.aln.fa
on this exact file, getting the following (apparently problem-free) result:In the most recent thread linked at the top, this was all fixed by simply running the analysis on another computer, but I don't have that luxury at the moment. I wonder if there's any clues in common between my data and the previous poster's? If it's a per-machine thing, it must be something about the local environment, not the data itself.
For full information, I have OrthoFinder 2.3.12 and I'm running Ubuntu 18.04.4 LTS in "Windows Subsystem for Linux" compatibility layer in Windows 10 64-bit (ver. 10.0.18362 Build 18362), with 32 GB RAM; cpu is i5-8350U; machine is Lenovo Thinkpad T480.
For good measure, I've zipped the entire folder (original input files and Orthofinder results) and attached it here, too.
RemoveIfHaveStop2_DelLastLine.zip
In the most recent thread linked at the top of this post, you mentioned looking for
Species_Tree/Orthogroups_for_concatenated_alignment.txt
, however this folder doesn't exist for me anywhere.Is there anything else useful I could provide?
Thank you for your time!
Chase
The text was updated successfully, but these errors were encountered: