add wrapper for training Spec2Vec in Galaxy #314

Merged
merged 60 commits into RECETOX:master
Jan 5, 2023

Conversation

maximskorik
Member

@maximskorik maximskorik commented Dec 6, 2022

Updates

  • add spec2vec_training tool to Galaxy

Description

This tool trains a Spec2Vec model from MSP or MGF spectra files. It outputs the model metadata as JSON and the peak embeddings as NPY; a future downstream step can read both with the spec2vec.serialization module to compute Spec2Vec similarity scores. Optional outputs are a Python pickle file of the model and model checkpoints saved at user-defined iterations.

Closes #315

@xtrojak
Contributor

xtrojak commented Dec 7, 2022

@maximskorik can you please explain the role of docker image? Is it just to get the newest version of the tool available in master branch?

@maximskorik
Member Author

maximskorik commented Dec 7, 2022

> @maximskorik can you please explain the role of docker image? Is it just to get the newest version of the tool available in master branch?

Exactly. We recently added functionality to Spec2Vec for exporting a model to JSON + NPY files, and it hasn't been released yet. Hence the Docker image, whose tag points to the merge commit with this update. It is a stopgap until these changes are released and become available via Conda.

@maximskorik
Member Author

The CI fails because the linter complains about size suffixes in the value and delta attributes of <has_size>. I've submitted an issue to Planemo: galaxyproject/planemo#1335.
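As an illustration of the workaround discussed here, suffixed sizes in <has_size> can be rewritten as plain byte counts before linting. The following is a minimal sketch, not the change made in this PR. The helper names `to_bytes` and `strip_size_suffixes`, the sample XML, and the suffix table (decimal k/M/G and binary Ki/Mi/Gi) are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Assumed suffix grammar: an integer, optionally followed by k, M or G,
# optionally followed by "i" for binary (1024-based) multiples.
_SIZE_PATTERN = re.compile(r"\s*(\d+)\s*(?:([kMG])(i)?)?\s*")


def to_bytes(size):
    """Convert a size such as '300k' or '1Mi' to a plain number of bytes."""
    match = _SIZE_PATTERN.fullmatch(size)
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    number, prefix, binary = match.groups()
    if prefix is None:
        return int(number)
    base = 1024 if binary else 1000
    exponent = "kMG".index(prefix) + 1
    return int(number) * base ** exponent


def strip_size_suffixes(xml_text):
    """Rewrite value/delta on every <has_size> element as plain byte counts."""
    root = ET.fromstring(xml_text)
    for element in root.iter("has_size"):
        for attribute in ("value", "delta"):
            if attribute in element.attrib:
                element.set(attribute, str(to_bytes(element.get(attribute))))
    return ET.tostring(root, encoding="unicode")


# Hypothetical test fragment, for demonstration only.
SAMPLE = (
    '<test><output name="model"><assert_contents>'
    '<has_size value="300k" delta="50k"/>'
    "</assert_contents></output></test>"
)

if __name__ == "__main__":
    print(strip_size_suffixes(SAMPLE))
```

Note that ElementTree does not preserve comments or the original formatting, so for a real tool wrapper it may be simpler to edit the few affected attributes by hand.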

Contributor

@xtrojak xtrojak left a comment

Looks pretty good! Great job @maximskorik

@hechth
Member

hechth commented Dec 16, 2022

Can we change this to comply with the CI so that it passes? I know the fault is not on our side, but once you merge while the CI is failing, it becomes harder to see when it starts failing for a different reason.

@maximskorik
Member Author

maximskorik commented Dec 19, 2022

I removed suffixes from the size attributes in tests (70991af) since planemo v0.75.3 wouldn't accept them anymore, causing the tests to fail. The CI now passes.

cc @hechth

@hechth hechth merged commit 2e4bdc2 into RECETOX:master Jan 5, 2023