Proteins play a major role in every living organism. They keep us healthy by performing functions such as transporting oxygen, supporting the immune system, and developing muscle. These functions only work if the protein folds into a stable (native) state, in which its free energy is at a minimum. Predicting this minimum free energy helps bioinformatics and biotechnology, for example in developing novel drugs and vaccines. This project simulates the folding process in the 2D HP model, a problem known to be NP-complete.
If you have ever played around with any OpenAI Gym environment, you should be familiar with the reset(), step(), and render() functions.
This project follows the behaviour of those OpenAI Gym functions too.
Before using the environment
git clone https://github.com/alvinwatner/protein_folding.git
After cloning, run create_background.py inside the 'protein_folding\Code' folder. It generates the initial background '.npy' file used for visualization; without this file, an error is raised.
- numpy >= 1.18.2
- matplotlib >= 3.2.1
- opencv-python >= 4.2.0.34
- Pillow >= 7.1.2
Open a terminal and install the prerequisite libraries listed above:
pip install numpy
pip install matplotlib
pip install opencv-python
pip install pillow
The HP model looks like a simple board game: the 20 different amino acids found in proteins are classified into two types:
- ‘H’ = Hydrophobic (Black Circle)
- ‘P’ = Hydrophilic (White Circle)
Given a sequence of amino acids, each either 'H' or 'P', the agent's task is to place every amino acid in the sequence onto a 2D grid. Each amino acid must be placed directly adjacent to the previous one: up, left, right, or down. This process repeats until all amino acids in the sequence have been placed.
Example :
Up, Left, Right, Down
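The placement rule can be sketched in a few lines of Python. This is only an illustration: the action numbering (0 = up, 1 = left, 2 = right, 3 = down), the `MOVES` table, and the `place_sequence` helper are assumptions made for this example and are not taken from the project's code. It also assumes the first amino acid sits at the origin, so a sequence of n amino acids needs n - 1 moves.

```python
# Hypothetical action encoding; the project's own numbering may differ.
MOVES = {
    0: (0, 1),   # up
    1: (-1, 0),  # left
    2: (1, 0),   # right
    3: (0, -1),  # down
}


def place_sequence(actions, start=(0, 0)):
    """Return the grid coordinates visited when applying `actions` in order.

    The first amino acid is assumed to sit at `start`; every action places
    the next amino acid next to the previous one.
    """
    coords = [start]
    for action in actions:
        dx, dy = MOVES[action]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


# Four amino acids: start, then right, up, left
path = place_sequence([2, 0, 1])
```

Note that this helper does not check for collisions; it only shows how each move relates to the previous position.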
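The goal is to find the minimum total free energy for a given amino acid sequence. Free energy is determined by H-H pairs that are adjacent on the grid but not directly connected in the protein's primary structure (that is, not consecutive in the sequence). Each such pair contributes -1 to the free energy.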
Example :
Free Energy = -1, Free Energy = -3, Free Energy = -9
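As a rough sketch of the free energy rule, the function below counts H-H pairs that touch on the grid without being consecutive in the sequence. The function name `free_energy` and the coordinate-list representation are assumptions for this example, not the project's internal implementation, and the coordinates are assumed to be collision-free.

```python
def free_energy(sequence, coords):
    """Return -1 for every H-H pair that is adjacent on the grid
    but not consecutive in the chain.

    sequence: list of 'H' / 'P' characters
    coords:   list of (x, y) positions, one per amino acid, all distinct
    """
    position_to_index = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != 'H':
            continue
        # Look only right and up so that each pair is counted once
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position_to_index.get(neighbor)
            if j is not None and sequence[j] == 'H' and abs(i - j) > 1:
                energy -= 1
    return energy


# A 2x2 square: the first and last 'H' touch but are not chain neighbours
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
energy = free_energy(['H', 'P', 'P', 'H'], square)
```

Here `energy` is -1, because the only qualifying contact is between the first and last amino acid.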
- Placing an amino acid on a space already occupied by another amino acid is not allowed. This is considered a collision and receives a collision penalty of -2.
Example :
- If an amino acid has nowhere to go while there are still amino acids left in the sequence, this is considered a trap and receives a trap penalty of -4.
Example :
- When a collision or a trap occurs, the agent should pick another action to move in a different direction. There are also situations where the agent cannot move in any other direction because all surrounding spaces are occupied. I call this a multiple trap, and the episode then terminates (Done = True).
Example :
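The collision and trap conditions above could be checked roughly as follows. The helper names (`is_collision`, `free_neighbors`, `is_trapped`) and the set-of-coordinates representation are hypothetical and chosen for this sketch; the environment itself may detect these cases differently.

```python
# Offsets for up, left, right, down (illustrative ordering)
NEIGHBOR_OFFSETS = ((0, 1), (-1, 0), (1, 0), (0, -1))


def is_collision(target, occupied):
    """A move collides if the target cell already holds an amino acid."""
    return target in occupied


def free_neighbors(position, occupied):
    """Return the empty cells directly next to `position`."""
    x, y = position
    candidates = [(x + dx, y + dy) for dx, dy in NEIGHBOR_OFFSETS]
    return [cell for cell in candidates if cell not in occupied]


def is_trapped(position, occupied, remaining):
    """Trapped: amino acids are still left to place, but every
    neighbouring cell of the last placed one is occupied."""
    return remaining > 0 and not free_neighbors(position, occupied)


# A chain that coils around the origin and ends in the middle
coil = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (0, 0)]
occupied = set(coil)
trapped = is_trapped(coil[-1], occupied, remaining=3)
```

In this example `trapped` is True, since all four cells around (0, 0) are already taken.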
The reward is calculated at the end of the episode, which makes this a sparse-reward RL problem: every step has a reward of 0 except the terminal state.
Note that the argument to env.reset() is optional. If amino_input is not specified, a random sequence is generated.
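To see what a sparse reward means for learning, here is a small generic sketch that computes discounted returns for an episode whose only non-zero reward comes at the last step. The `discounted_returns` helper, the discount factor, and the terminal reward of -3 are illustrative assumptions and are not part of this project.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one after it
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Illustrative episode: zero reward everywhere except the terminal step
episode_rewards = [0, 0, 0, -3]
returns = discounted_returns(episode_rewards, gamma=0.5)
```

With gamma = 0.5 the returns are [-0.375, -0.75, -1.5, -3.0]: the earlier the step, the weaker the signal it receives from the terminal reward.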
- env.render() : visualizes the folding process (OpenCV window followed by a Matplotlib figure)
- env.render(plot = True) : shows the folding result only (Matplotlib figure)
from simulation import environment
import numpy as np
env = environment()
current_state = env.reset(amino_input = ['P', 'P', 'P', 'H', 'H', 'P', 'P', 'H', 'H', 'P', 'P', 'P', 'P', 'P', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'P', 'P', 'H', 'H', 'P', 'P', 'P', 'P', 'H', 'H', 'P', 'P', 'H', 'P', 'P'])
done = False
while not done:
    # Pick a random action from the action space
    action = np.random.randint(0, env.action_space_size)
    new_state, reward, done = env.step(action)
    # env.render()  # uncomment to visualize every step
env.render(plot = True)  # show the result figure only
Output
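If you want to run several random episodes on the same sequence, a small helper like the one below may be handy. It is only a sketch: `collect_episode_rewards` is a hypothetical function, and it relies solely on the parts of the interface shown in the usage example (`reset(amino_input=...)`, `step(action)` returning three values, and `action_space_size`).

```python
import random


def collect_episode_rewards(env, amino_input, episodes=10, seed=None):
    """Run `episodes` random rollouts and return each final reward."""
    rng = random.Random(seed)
    final_rewards = []
    for _ in range(episodes):
        env.reset(amino_input=amino_input)
        done = False
        reward = 0
        while not done:
            # Uniformly random action, as in the usage example
            action = rng.randrange(env.action_space_size)
            _, reward, done = env.step(action)
        # Only the terminal reward is non-zero, so keep the last one
        final_rewards.append(reward)
    return final_rewards
```

Assuming the project's environment is importable, it could be called as `collect_episode_rewards(environment(), ['H', 'P', 'H', 'H'], episodes=5)`.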
Please feel free to use and modify this project, but keep the information below. Thanks!
----------------------------------------
Author : Alvin Watner
Email : alvinsetiadi22@gmail.com
Website : -
License : MIT
----------------------------------------
This project is licensed under the MIT License - see the LICENSE.md file for details