The choice of fragmentation strategy used for data acquisition in untargeted metabolomics greatly affects the coverage and spectral quality of identified metabolites. In a typical data-dependent acquisition strategy, the N most intense ions (Top-N) in the survey MS1 scan are chosen for fragmentation. This strategy is not entirely data-driven, as it cannot learn from or adapt to changes in incoming signals when deciding which ions to target for fragmentation.
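To make the Top-N idea concrete, here is a minimal sketch in plain Python. It is only an illustration: the `select_top_n` function, the `(m/z, intensity)` tuple layout, the `min_intensity` threshold and the example scan values are all assumptions made for this example and are not taken from ViMMS or vimms-gym. Real Top-N controllers typically also apply refinements such as dynamic exclusion windows, which are omitted here.

```python
from typing import List, Tuple

# A peak is represented as (m/z, intensity); this layout is an assumption.
Peak = Tuple[float, float]


def select_top_n(ms1_peaks: List[Peak], n: int, min_intensity: float = 0.0) -> List[Peak]:
    """Return up to n peaks from an MS1 scan, most intense first."""
    # Drop peaks below the (hypothetical) minimum intensity threshold.
    candidates = [peak for peak in ms1_peaks if peak[1] >= min_intensity]
    # Sort by intensity in descending order and keep the first n.
    return sorted(candidates, key=lambda peak: peak[1], reverse=True)[:n]


# Made-up survey scan used purely for illustration.
scan = [
    (100.05, 2.0e4),
    (150.10, 8.5e5),
    (200.20, 3.1e5),
    (250.30, 9.0e3),
    (300.40, 6.2e5),
]

targets = select_top_n(scan, n=3)
for mz, intensity in targets:
    print(f"fragment m/z {mz:.2f} (intensity {intensity:.1e})")
```

Note that the selection rule is fixed: it depends only on the current scan and never changes in response to what was observed earlier in the run, which is the limitation described above.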
Reinforcement learning has been widely used to train intelligent agents that manage the operations of various scientific instruments (example). However, its use in mass spectrometry instrumentation control and untargeted metabolomics has never been explored.
In vimms-gym, we provide an OpenAI Gym environment to develop data-dependent fragmentation strategies that control a simulated mass spectrometry instrument and learn from the data during acquisition. This is built upon ViMMS, a general framework to develop, test and optimise fragmentation strategies. We hope that vimms-gym will encourage further research into applying reinforcement learning to data acquisition for untargeted metabolomics.
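As a rough picture of how an agent interacts with a Gym-style environment, the sketch below defines a toy environment following the classic `reset()` / `step(action)` convention and runs one episode with a greedy baseline. Everything in it is hypothetical: `ToyFragmentationEnv`, its observation (a list of simulated intensities), its action (the index of the ion to fragment) and its reward are invented for illustration and are not the vimms-gym API. Please refer to the example notebooks for the actual environment.

```python
import random
from typing import Dict, List, Tuple


class ToyFragmentationEnv:
    """Hypothetical toy environment; NOT the vimms-gym API.

    Each step presents one simulated MS1 scan with `n_ions` intensities.
    The action is the index of the ion to fragment. The reward is that
    ion's intensity the first time it is fragmented, and 0 for repeats.
    """

    def __init__(self, n_ions: int = 5, n_scans: int = 10, seed: int = 0):
        self.n_ions = n_ions
        self.n_scans = n_scans
        self.rng = random.Random(seed)

    def _observe(self) -> List[float]:
        # Simulated, normalised intensities in [0, 1) for the current scan.
        return [self.rng.random() for _ in range(self.n_ions)]

    def reset(self) -> List[float]:
        self.scan_index = 0
        self.fragmented = set()
        self.obs = self._observe()
        return self.obs

    def step(self, action: int) -> Tuple[List[float], float, bool, Dict]:
        # Fragmenting the same ion again yields no reward.
        reward = 0.0 if action in self.fragmented else self.obs[action]
        self.fragmented.add(action)
        self.scan_index += 1
        done = self.scan_index >= self.n_scans
        self.obs = self._observe()
        return self.obs, reward, done, {}


def top1_policy(obs: List[float]) -> int:
    # Baseline: always fragment the most intense ion (Top-N with N = 1).
    return max(range(len(obs)), key=obs.__getitem__)


env = ToyFragmentationEnv()
obs = env.reset()
total_reward, done, steps = 0.0, False, 0
while not done:
    obs, reward, done, _ = env.step(top1_policy(obs))
    total_reward += reward
    steps += 1
print(f"steps={steps}, total reward={total_reward:.3f}")
```

In a reinforcement learning setting, `top1_policy` would be replaced by a learned policy that maps observations to actions so as to maximise the cumulative reward over an episode.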
No Python package is provided at the moment as the project is still under active development.
To use vimms-gym, please clone this repository first, then use your preferred method to install the required dependencies.
A. Managing Dependencies using Pipenv
- Install pipenv (https://pipenv.readthedocs.io).
- In the cloned GitHub repo, run
$ pipenv install
to create a new virtual environment and install all the packages needed to run ViMMS.
- Enter the virtual environment created in the previous step by typing
$ pipenv shell
- In this environment, you can run the environment, train models, etc. by launching the notebooks:
$ jupyter lab
B. Managing Dependencies using Anaconda
- Install Anaconda Python (https://www.anaconda.com/products/individual).
- In the cloned GitHub repo, run
$ conda env create --file environment.yml
to create a new virtual environment and install all the packages needed to run ViMMS.
- Enter the virtual environment created in the previous step by typing
$ conda activate vimms-gym
- In this environment, you can run the environment, train models, etc. by launching the notebooks:
$ jupyter lab
The Stable-Baselines3 package has been included as a dependency of this project, although you may use other RL frameworks to work with vimms-gym if desired.
Example notebooks can be found here. These include a demonstration of the environment, as well as notebooks to train models and evaluate the results.