reproducibility of results between github, galaxy instance and biobakery's LEfSe scripts #13

Open
glucksfall opened this issue Sep 8, 2021 · 1 comment

Comments

@glucksfall

Dear developers,

Hope everything's ok.

I'm writing because I'm having trouble reproducing locally the results of the Galaxy instance at https://huttenhower.sph.harvard.edu/galaxy/ (I need to run locally because my file is bigger than the size allowed on the Galaxy instance).

Firstly, I'm confused because the Galaxy instance offers all-against-all as the stricter option (see image below), while lefse_run.py --help reports that the stricter option is one-against-one and lists no all-against-all option:

-y {0,1}        (for multiclass tasks) set whether the test is performed in a one-against-one ( 1 - more strict!) or in a one-against-all setting ( 0 - less strict) (default 0)

image

Secondly, after running LEfSe from galaxy and biobakery locally with python2.7 (Ubuntu 20.04 and rpy2 compatible, R 3.6.3) and from GitHub with python3, I have found that the LDA scores differ subtly, but enough to drop one or two features below the threshold, while the Galaxy instance still reports them. I used abs(LDA) > 2. Do you have any idea what interferes with reproducibility? I see that you set a random number seed here https://github.com/biobakery/galaxy_lefse/blob/2ca4bf39cbbe588b979873b234636670565b4caf/lefse.py#L9, but many other factors could still affect the results.
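To make the comparison concrete, here is a minimal sketch of how two result tables could be checked for features that fall on different sides of the abs(LDA) > 2 cutoff. It assumes each result file is tab-separated with the columns feature, log of the highest class mean, class, LDA score and p-value, and that non-discriminative features have an empty score column; that layout is an assumption and should be checked against the actual output. The helper names (read_lda_scores, compare_runs) and the two inline tables are made up for illustration and are not part of LEfSe.

```python
import csv
import io

THRESHOLD = 2.0  # same cutoff as in the report above: abs(LDA) > 2


def read_lda_scores(handle):
    """Return {feature: LDA score} for the rows that carry a score.

    Assumed layout (not verified against every LEfSe version):
    feature <TAB> log highest class mean <TAB> class <TAB> LDA score <TAB> p-value
    """
    scores = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip short rows and rows whose score column is empty.
        if len(row) < 4 or not row[3].strip():
            continue
        try:
            scores[row[0]] = float(row[3])
        except ValueError:
            continue  # score column is not numeric; ignore the row
    return scores


def compare_runs(scores_a, scores_b, threshold=THRESHOLD):
    """List features that pass the threshold in one run but not in the other."""
    differences = []
    for feature in sorted(set(scores_a) | set(scores_b)):
        a = scores_a.get(feature)
        b = scores_b.get(feature)
        passes_a = a is not None and abs(a) > threshold
        passes_b = b is not None and abs(b) > threshold
        if passes_a != passes_b:
            differences.append((feature, a, b))
    return differences


# Made-up example tables, standing in for a Galaxy result and a local result.
galaxy_text = "taxon_A\t4.1\tcase\t2.03\t0.01\ntaxon_B\t3.2\t\t\t-\n"
local_text = "taxon_A\t4.1\tcase\t1.98\t0.01\ntaxon_B\t3.2\t\t\t-\n"

diffs = compare_runs(
    read_lda_scores(io.StringIO(galaxy_text)),
    read_lda_scores(io.StringIO(local_text)),
)
for feature, score_a, score_b in diffs:
    print(feature, score_a, score_b)
```

With real files, the two io.StringIO objects would be replaced by open(path, newline="") handles for the Galaxy and local outputs. Listing only the features whose pass/fail status changes keeps the report focused on the cases that alter the final figure.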

Finally, I don't know whether you maintain the LEfSe versions at https://toolshed.g2.bx.psu.edu/. I installed both available versions on a local Galaxy server but couldn't run Format Data for LEfSe, because my local instance has python3 instead of python2. biobakery's LEfSe for Galaxy also needs python2 to run properly.

If you need more details or a better explanation, please don't hesitate to ask.

Best regards

@lauramason326

Hi - I am having a similar issue when comparing the GUI version of LEfSe from https://huttenhower.sph.harvard.edu/galaxy/ with Lefse-1.1.2 on the command line. Like @glucksfall, I get similar LDA scores and mostly the same OTUs, but there are some differences. Do you know why this might be?
Thanks
Laura
