Bioconda has two (redundant) T-COFFEE packages #10614

Closed
apcamargo opened this issue Aug 22, 2018 · 41 comments

@apcamargo
Contributor

apcamargo commented Aug 22, 2018

https://bioconda.github.io/recipes/t_coffee/README.html

https://bioconda.github.io/recipes/t-coffee/README.html

The first one ("t_coffee") is older, but doesn't compile on macOS and doesn't have some dependencies that a member of the T-COFFEE team said are important to the software.

@apcamargo apcamargo changed the title Bioconda has two redundant T-COFFEE packages Bioconda has two (redundant) T-COFFEE packages Aug 22, 2018
@corburn

corburn commented Aug 23, 2018

t_coffee was introduced in:

commit d9afcdccb98a638afc5d2f2136b17b8fb8baad9d
Author: Alexey Strokach <ostrokach@gmail.com>
Date:   Tue Oct 27 17:16:36 2015 -0400

t-coffee was introduced in:

commit 854c5ba7921dcd715d328672de2b9f7b9bb6d61e (fast5-research)
Author: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Date:   Wed Aug 8 22:52:32 2018 +0200

Given that t-coffee has only been around for a couple of weeks, perhaps it should be removed and t_coffee updated?

@bgruening
Member

bgruening commented Aug 23, 2018

Let's ask @pditommaso - what do you think?

@pditommaso
Contributor

Interesting, I was not aware of it. The problem with that recipe is that it uses the binary installer. I agree that the two can be merged, using the latest one.

@apcamargo
Contributor Author

In this case, what should be done with the other recipes that depend on t_coffee? Should we update them all?

@pditommaso
Contributor

Good point. Do you have any idea how many recipes depend on t_coffee?

@corburn

corburn commented Sep 17, 2018

@pditommaso for the bioconda channel, grep -rn "t_coffee" bioconda-recipes/recipes turned up perl-bio-tools-run-alignment-tcoffee
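
A plain substring search also matches the t_coffee recipe itself and any comments that mention it. The sketch below narrows the search to meta.yaml lines that list the package as a requirement. It assumes a checkout laid out as recipes/<name>/meta.yaml; the mock tree, the other-tool recipe and the file contents are made up for illustration and are not copied from bioconda-recipes.

```shell
#!/usr/bin/env bash
set -eu

# Print the names of recipes whose meta.yaml lists PKG as a requirement,
# i.e. a YAML list entry such as "- t_coffee" or "- t_coffee >=11".
find_dependents() {
    pkg="$1"; recipes_dir="$2"
    grep -rlE --include=meta.yaml \
        "^[[:space:]]*-[[:space:]]+${pkg}([[:space:]<>=]|\$)" "$recipes_dir" |
        while IFS= read -r meta; do
            basename "$(dirname "$meta")"
        done | sort
}

# Mock recipe tree (hypothetical contents) to exercise the function.
tmp="$(mktemp -d)"
mkdir -p "$tmp/recipes/perl-bio-tools-run-alignment-tcoffee" \
         "$tmp/recipes/other-tool" "$tmp/recipes/t_coffee"
printf 'requirements:\n  run:\n    - perl\n    - t_coffee\n' \
    > "$tmp/recipes/perl-bio-tools-run-alignment-tcoffee/meta.yaml"
printf 'requirements:\n  run:\n    - mafft\n' > "$tmp/recipes/other-tool/meta.yaml"
printf 'package:\n  name: t_coffee\n' > "$tmp/recipes/t_coffee/meta.yaml"

dependents="$(find_dependents t_coffee "$tmp/recipes")"
echo "$dependents"
```

Against a real checkout the call would be find_dependents t_coffee bioconda-recipes/recipes.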

@pditommaso
Contributor

I see. Anyhow, I would just keep it as it is for now. The new recipe does not include all required deps and should be considered experimental.

@SchwarzMarek
Contributor

Hi @pditommaso, @apcamargo and others,
I would like to point out that the t_coffee recipe does not install T-Coffee in such a way that all functionality is accessible from a call to t_coffee (for example, t_coffee -mode rcoffee does not work), while with t-coffee it does. I have not, however, tried all T-Coffee modes.

WORKING

t_coffee /tmp/rba_fw55wzu1_39 -mode rcoffee
# Command Line: t_coffee /tmp/rba_fw55wzu1_39 -mode rcoffee  [PROGRAM:T-COFFEE]
# T-COFFEE Memory Usage: Current= 48.796 Mb, Max= 1071.462 Mb
# Results Produced with T-COFFEE Version_11.00.8cbe486 (2014-08-12 22:05:29 - Revision 8cbe486 - Build 477)
# T-COFFEE is available from http://www.tcoffee.org
# Register on: https://groups.google.com/group/tcoffee/

conda list
...
t-coffee                  11.00.8cbe486        h26a2512_0    bioconda
...

NOT WORKING

t_coffee /tmp/rba_fw55wzu1_39 -mode rcoffee
********************************************************************************
Command line arguments: ['/home/uc/miniconda3/bin/t_coffee', '/tmp/rba_fw55wzu1_39', '-mode', 'rcoffee']
Install folder: /home/uc/miniconda3/lib/t_coffee-11.0.8
********************************************************************************


#*****************************************************************
--ERROR: #  [FATAL:T-COFFEE]

# The Program /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee Needed by T-COFFEE Could not be found
# If /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee is installed on your system:
#	     -Make sure /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee is in your $path:
# If /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee is NOT installed obtain a copy from:
#	(null)
#
#
# and install it manualy
******************************************************************


*************************************************************************************************
*                        FULL TRACE BACK PID: 6590                                    
6590 -- ERROR: #  [FATAL:T-COFFEE]
6590 -- COM: /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee /tmp/rba_fw55wzu1_39 -mode rcoffee 
6590 -- STACK: 6589 -> 6590
*************************************************************************************************

# TERMINATION STATUS: FAILURE [PROGRAM: T-COFFEE pid 6590 ppid 6589
#CL: /home/uc/miniconda3/lib/t_coffee-11.0.8/bin/t_coffee /tmp/rba_fw55wzu1_39 -mode rcoffee 

********************************************************************************
Result: None
Error message: None
********************************************************************************

conda list
...
t_coffee                  11.0.8                   py36_2    bioconda
...

If you plan to remove one of the packages, please ensure that the remaining one works completely.

Best regards

@apcamargo
Contributor Author

Pinging @bioconda/core

@epruesse
Member

epruesse commented Dec 8, 2018

I don't think we need to (or should) remove any old packages - they are not broken, but removing them might break things. Removing either of the two recipes from the repo and updating all recipes requiring t-coffee to match the name makes sense though, so we don't have the duplicate in the future.
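
Updating a dependent recipe to the surviving name is a one-line change per meta.yaml. A minimal sketch of that rename, assuming the requirement appears as a YAML list entry; the rename_requirement helper and the sample file are hypothetical, and a real change would still go through the usual PR and build-number bump.

```shell
#!/usr/bin/env bash
set -eu

# Rewrite a requirement entry "- OLD" to "- NEW" in one meta.yaml,
# leaving any version constraint after the name untouched.
rename_requirement() {
    old="$1"; new="$2"; meta="$3"
    sed -E "s/^([[:space:]]*-[[:space:]]+)${old}([[:space:]<>=]|\$)/\1${new}\2/" \
        "$meta" > "$meta.tmp"
    mv "$meta.tmp" "$meta"
}

# Hypothetical recipe fragment to demonstrate the rewrite.
meta="$(mktemp)"
printf 'requirements:\n  run:\n    - perl\n    - t_coffee >=11\n' > "$meta"
rename_requirement t_coffee t-coffee "$meta"
cat "$meta"
```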

@epruesse
Member

epruesse commented Dec 8, 2018

W.r.t. distributing mafft etc within the t-coffee package: I vote against that.

The practice makes sense outside of a package manager. It might even make limited sense with other package managers, to be able to ensure compatible versions of the incorporated tools are available. With conda, they can be pinned as needed though, so there is no need for that.

It would be important to have a test case in the recipe testing t-coffee in a mode calling all the aligners. CLIs are APIs, but people tend to forget that and so a change in command line interface by one of the tools might break t-coffee. It might be prudent to pin strongly.

@epruesse
Member

epruesse commented Dec 8, 2018

@SchwarzMarek - Any chance you could add your tests to the t-coffee recipe?

@pditommaso
Contributor

W.r.t. distributing mafft etc within the t-coffee package

This was exactly the reason why I created the t-coffee recipe, i.e. to include all the required dependencies as external bioconda packages.

However, it turned out that T-Coffee uses an insane number of external packages, most of which have no Bioconda package. I've posted the list here.

@SchwarzMarek
Contributor

Hi @epruesse - It would certainly be nice to have run-tests for every aligner, though the tests would be more complicated.
We could have something like

printf ">a\nACGTCGATGCTA\n>b\nCAGTGTCAGCTG\n" | t_coffee -in=stdin -mode rcoffee

and check the exit code.

@epruesse
Member

epruesse commented Dec 9, 2018

@SchwarzMarek That looks good!

If it gets too complicated to have individual lines, you can write a run_tests.sh to contain all tests (or actually even run_tests.py). Those would be run automatically.
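
A run_tests.sh along those lines could loop over modes and reuse the two-sequence input from the one-liner above. This is only a sketch: the run_mode_tests helper is hypothetical, the mode list in the trailing comment is an illustrative choice, and only the exit code of each run is checked.

```shell
#!/usr/bin/env bash
# run_tests.sh (sketch): smoke-test several T-Coffee modes by exit code.
set -u

# Usage: run_mode_tests BINARY MODE...
# Prints PASS/FAIL per mode and returns 1 if any mode failed.
run_mode_tests() {
    bin="$1"; shift
    failed=""
    for mode in "$@"; do
        if printf '>a\nACGTCGATGCTA\n>b\nCAGTGTCAGCTG\n' |
            "$bin" -in=stdin -mode "$mode" >/dev/null 2>&1; then
            echo "PASS $mode"
        else
            echo "FAIL $mode"
            failed="$failed $mode"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failing modes:$failed" >&2
        return 1
    fi
}

# In a recipe, the last line would be something like:
#   run_mode_tests t_coffee regular quickaln rcoffee
```

Passing the binary as an argument keeps the loop testable with a stub, without T-Coffee installed.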

@SchwarzMarek
Contributor

Hi again,
I must apologise: I had the viennarna package installed in my env, which is why the t-coffee package worked with -mode rcoffee. On a clean install it does not work either. However, the t_coffee package installed viennarna as well and still didn't work.

I've dug up the -mode options from https://github.com/cbcrg/tcoffee/blob/master/lib/t_coffee_lib/t_coffee.c (line 814 onwards) and tested them on sample DNA, RNA and protein; see the table below.

I'm inclined to change the t-coffee recipe to include as many dependencies as possible, test the working modes, and skip the dependencies that are not available now. What do you think?

Table

MODE                      DNA                                        PROTEIN                                        RNA
genepredx                 FAIL (with -method specified -> COREDUMP)  FAIL                                           FAIL
genepredpx                FAIL as above                              FAIL                                           FAIL
regular                   OK                                         OK                                             OK
genome                    WARN galign NOT found                      galign NOT found                               OK
quickaln                  OK                                         OK                                             OK
dali                      OK                                         OK                                             OK
evaluate                  [input is MSA]
precomputed               [input is MSA]
3dcoffee                  WARNS SAP not installed (wants TMalign instead - TMalign not installed -> FAIL) -> https://anaconda.org/speleo3/tmalign -> can't be used (no templates) -> probapair
expresso                  WARNS SAP not installed (wants TMalign instead - TMalign not installed -> FAIL) -> https://anaconda.org/speleo3/tmalign -> Impossible to find EXPRESSO Templates
repeats                   OK                                         OK                                             OK
sample                    OK                                         OK                                             OK
highlow                   unknown mode
psicoffee                 [WARN - "could not use email", Impossible to find BLAST Templates, Check that your blast server is properly installed [See documentation][FATAL:T-COFFEE]]
procoffee                 OK                                         working [outputs JJJJJJTN for RFVAVRVN input]  OK
blastr                    OK                                         working [as above]                             RNA backtranscribed to DNA - OK
accurate                  OK                                         AA WARNS "Impossible to find BLAST Templates"  RNA FAIL (RNAplfold needed)
accurate4DNA              OK                                         --------                                       --------
accurate4RNA              --------                                   --------                                       +viennarna OK
best4RNA                  --------                                   --------                                       FAIL RNAplfold, PDB templates not found
accurate4PROTEIN          --------                                   Templates NOT FOUND                            --------
low_memory                OK                                         OK                                             OK
dna                       OK                                         --------                                       --------
cdna                      OK                                         --------                                       --------
protein                   --------                                   OK                                             OK
mcoffee                   FAIL dialign-t needed                      FAIL dialign-t needed                          FAIL dialign-t needed (only dialign-tx available in conda)
xcoffee                   OK                                         OK                                             OK
dmcoffee                  FAIL kalign needed                         FAIL kalign needed                             FAIL kalign needed; https://anaconda.org/etetoolkit/kalign (not working for me)
fmcoffee                  FAIL kalign needed                         FAIL kalign needed                             FAIL kalign needed
rcoffee_consan            --------                                   --------                                       FAIL need sfold (and RNAplfold) (sfold not found in bioconda)
rmcoffee                                                                                                            FAIL need (RNAplfold and probcons) (probcons part of ete3_external_apps package)
rcoffee                   --------                                   --------                                       +viennarna OK
rcoffee_slow_accurate     RNAplfold sfold
rcoffee_fast_approximate  RNAplfold probcons
t_coffee                  OK                                         OK                                             OK
saracoffee                FAIL sara
sracoffee                 FAIL sap

@pditommaso
Contributor

Folks, as the author of the t-coffee recipe, I propose to retire it until we manage to add all the required deps as bioconda packages. In its current state it only creates confusion.

@SchwarzMarek
Contributor

I would prefer to keep t-coffee. Installing the required tools is easier with this recipe.

@pditommaso
Contributor

pditommaso commented Dec 11, 2018 via email

@SchwarzMarek
Contributor

Well, as was argued in the linked post (here), having the dependencies is important; however, not all of them are easily accessible. For example, sfold is only available via a license request form (here). So getting every dependency might be hard, but I agree that having most of them would be good.

Also, not all modes are frequently used (some are not even in the docs (I've looked here)).

Maybe we could focus on dependencies for basic modes covered in tcoffee docs. That is regressive, accurate, rcoffee, quickaln, low_mem, expresso and mcoffee.

As of now, I can add kalign.

@pditommaso
Contributor

Is there a way to print a warning message when installing a conda package? It could be used to warn the user that some modes are not available.
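
One mechanism conda offers for this is a post-link script: text appended to $PREFIX/.messages.txt is shown to the user once the package has been linked. A minimal sketch follows; the wording of the message is made up, and the mktemp fallback is only there so the script can be tried outside conda, which normally sets PREFIX itself.

```shell
#!/bin/bash
# post-link.sh (sketch): conda runs this after linking the package.
# PREFIX is provided by conda; the fallback is for dry runs only.
PREFIX="${PREFIX:-$(mktemp -d)}"

# Append (do not overwrite) so messages from other packages survive.
cat >> "${PREFIX}/.messages.txt" <<'EOF'
t-coffee: this recipe is under development and does not install the
external tools needed by every T-Coffee mode; some modes may fail.
EOF
```

The script would sit next to meta.yaml in the recipe directory.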

@corburn

corburn commented Dec 11, 2018

@epruesse
Member

You could use outputs: to create meta packages - e.g. t-coffee-minimal, t-coffee and t-coffee-full.
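
A rough sketch of that layout, written as a shell heredoc so the fragment can be inspected without conda-build. Only the three output names come from the comment above; the top-level name, the version, and the way dependencies are split across the outputs are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -eu

recipe_dir="$(mktemp -d)"

# Hypothetical meta.yaml fragment: one recipe, three outputs that differ
# only in their run requirements.
cat > "$recipe_dir/meta.yaml" <<'EOF'
package:
  name: t-coffee-suite        # hypothetical top-level name
  version: "0.0.0"            # placeholder

outputs:
  - name: t-coffee-minimal    # T-Coffee itself, no external aligners
  - name: t-coffee            # adds commonly used tools (assumed split)
    requirements:
      run:
        - {{ pin_subpackage('t-coffee-minimal', exact=True) }}
        - mafft
  - name: t-coffee-full       # adds every dependency that is packaged
    requirements:
      run:
        - {{ pin_subpackage('t-coffee', exact=True) }}
        - viennarna
EOF

# List the output names declared in the fragment.
output_names="$(sed -n 's/^  - name: \([^ ]*\).*/\1/p' "$recipe_dir/meta.yaml")"
echo "$output_names"
```

Users would then pick the variant they need, for example conda install t-coffee-full, once such packages existed.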

@SchwarzMarek SchwarzMarek mentioned this issue Dec 12, 2018
5 tasks
@pditommaso
Contributor

pditommaso commented Dec 12, 2018

OK, great. I will add a disclaimer to warn users that this is a recipe under development and does not support all t-coffee modes.

@pditommaso
Contributor

FYI #12594

@pditommaso
Contributor

pditommaso commented Dec 13, 2018

It turns out that a fair number of deps are already on bioconda. I've updated the recipe to add them; however, the build is failing with this message:

conda.exceptions.UnsatisfiableError: The following specifications were found to be in conflict:

  - dca

  - pasta

Use "conda info <package>" to see the dependencies for each package.

Any suggestions on how to solve it?

@bgruening
Member

@pditommaso this can happen if both packages were compiled against different sub-dependencies. Let's say zlib 1 and zlib 2. Rebuild/rerender both and it should fix it.

@pditommaso
Contributor

@corburn Sorry, still about this. I've added a pre-link script to the t-coffee recipe (see here), but it's completely ignored.

Any idea why it's not working as expected?

This was referenced Dec 14, 2018
@druvus
Member

druvus commented Dec 14, 2018

I think the issue is that pasta is incorrectly limited to py27 only, while dca is py3 only. However, pasta should work with py3 according to its documentation: Python (version 2.7 or later, including python 3).

@SchwarzMarek SchwarzMarek mentioned this issue Dec 14, 2018
2 tasks
@druvus
Member

druvus commented Dec 14, 2018

Hopefully fixed the conflict in #12648

@epruesse
Member

T-coffee also installs blast and blast-legacy, which clobber each other's files. See #14331

@glichtenstein

Hello, I was searching for T-Coffee on Anaconda Cloud and came across this discussion. Which one should I install on my production server? Thanks in advance.

@pditommaso
Contributor

@edgano what's the status of the t-coffee recipe?

@dpryan79
Contributor

@glichtenstein t-coffee is a much newer version than t_coffee, so presumably the more recent version (t-coffee) would be preferred.

@glichtenstein

@dpryan79 I agree, will go with t-coffee then, thank you very much.

@amizeranschi
Contributor

Is anyone still able to help with this issue? It's preventing packages from upgrading beyond python 3.7 and it's also been documented in a newer issue here: #32576 (comment)

@dpryan79
Contributor

I rebuilt t-coffee about 2 weeks ago, so packages should switch to that, as it doesn't pin a python version.

@amizeranschi
Contributor

@dpryan79 I see that the recipe for perl-bio-tools-run-alignment-tcoffee still points to the old problematic package t_coffee (with underscore), and not the more recent one t-coffee (with a hyphen).

Should that be changed? This comment suggests so and I've submitted a PR implementing it here.

@0xaf1f
Contributor

0xaf1f commented Dec 1, 2022

@amizeranschi I don't see a PR -- your link is to a commit on your fork of bioconda-recipes. Did you miss the last step of creating the PR on the official bioconda repository or am I just not finding it?

@amizeranschi
Contributor

@0xaf1f Apologies, you're right. It looks like I forgot to submit the PR itself: #38199

@0xaf1f
Contributor

0xaf1f commented Dec 2, 2022

Awesome! I'm excited to finally not be stuck on old python 3 versions.
