"Train Your Reinforcement Learning Algorithm To This Protein Folding Problem Simulation". This is part of my bachelor thesis. The latest version of this env is in the LogDQN_ProteinHP repository.

alvinwatner/HP_Protein_Fold-GymEnv

Protein Folding 2D HP-Model Simulation

Proteins play a major role in every living organism: they keep us healthy by performing functions such as transporting oxygen, supporting the immune system, and building muscle. A protein can carry out its function only after it 'folds' into a stable (native) state, the state with minimum 'free energy'. Predicting that minimum free energy helps bioinformatics and biotechnology develop novel drugs, vaccines, and more. This project simulates the folding process in the 2D HP model, a problem that is well known to be NP-complete.

Getting Started

If you have ever played around with an OpenAI Gym environment, you should be familiar with the reset(), step(), and render() functions.
This project follows the same function behaviour.

Before using the environment

git clone https://github.com/alvinwatner/protein_folding.git

After cloning, run create_background.py inside the 'protein_folding\Code' folder. It generates the initial background '.npy' file used for visualization; without that file, an error is raised.

Prerequisites

  • numpy >= 1.18.2
  • matplotlib >= 3.2.1
  • opencv-python >= 4.2.0.34
  • Pillow >= 7.1.2

Installation

Open a terminal and install the prerequisite libraries listed above:

pip install numpy
pip install matplotlib
pip install opencv-python
pip install pillow

How it works?

The HP model looks like a simple board game, since the 20 different amino acids found in proteins are classified into just 2 types:

  • ‘H’ = Hydrophobic (Black Circle)
  • ‘P’ = Hydrophilic (White Circle)

Given a sequence of amino acids, each either 'H' or 'P', the agent's task is to place each amino acid of the sequence into 2D space. Each amino acid must be placed directly next to the previous one: up, left, right, or down. This process repeats until every amino acid in the sequence has been placed.

Example :

     UP                  Left                 Right                Down
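The placement rule above can be sketched in a few lines of Python. This is an illustration only: the `MOVES` table, the direction names, and the `place_sequence` helper are hypothetical and are not taken from the environment's code, whose action numbering may differ.

```python
# Hypothetical direction encoding; the environment's real action ids may differ.
MOVES = {
    "up": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
    "down": (0, -1),
}


def place_sequence(directions, start=(0, 0)):
    """Return the grid coordinate of every amino acid in the chain.

    The first amino acid sits at `start`; each direction places the next
    amino acid on a cell adjacent to the previous one.
    """
    coords = [start]
    for direction in directions:
        dx, dy = MOVES[direction]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


# Three moves place a four-residue chain.
coords = place_sequence(["up", "right", "down"])
print(coords)  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```

Note that this helper does not check for collisions; it only shows how each action translates into a neighbouring grid cell.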

Goals

Find the minimum total free energy for a given amino acid sequence. Free energy comes from H-H pairs that are adjacent on the grid but not connected along the protein's primary structure (that is, not consecutive in the sequence). Each such pair contributes -1.

Example :

 Free Energy = -1           Free Energy = -3             Free Energy = -9         
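As a rough sketch of how this count could be computed, the function below takes a sequence and the grid coordinates of each amino acid and returns the free energy. The name `free_energy` and the coordinate-list representation are assumptions made for illustration; the environment may store its state differently. The sketch also assumes a valid fold in which no two amino acids share a cell.

```python
def free_energy(sequence, coords):
    """Return -1 for every H-H pair adjacent on the grid but not in the chain."""
    position_to_index = {pos: i for i, pos in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position_to_index.get(neighbour)
            # abs(i - j) > 1 excludes pairs that are consecutive in the sequence.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


sequence = ["H", "P", "P", "H"]
coords = [(0, 0), (0, 1), (1, 1), (1, 0)]  # a U-shaped fold
print(free_energy(sequence, coords))  # -1
```

In this example the first and last amino acids are both 'H', sit side by side on the grid, and are not neighbours in the sequence, so they form one pair.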

Punishment

  • Placing an amino acid on a space already occupied by another amino acid is not allowed. It is considered a collision and receives a collision punishment of -2.

Example :

  • If an amino acid has nowhere to go while there are still amino acids left in the sequence, it is considered a trap condition and receives a trap punishment of -4.

Example :

  • If a collision or trap occurs, the agent should pick another action to move in a different direction. There are also conditions where the agent cannot move in any other direction because every space is occupied. I call this a multiple trap, and the episode terminates (Done = True).

Example :
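The collision check and the "nowhere to go" check described above can be sketched as follows. The helper names `is_collision` and `has_nowhere_to_go` and the set-of-coordinates representation are hypothetical; they only illustrate the conditions, not how the environment implements or scores them.

```python
MOVES = [(0, 1), (-1, 0), (1, 0), (0, -1)]  # up, left, right, down


def is_collision(occupied, position, move):
    """True if moving from `position` by `move` lands on an occupied cell."""
    target = (position[0] + move[0], position[1] + move[1])
    return target in occupied


def has_nowhere_to_go(occupied, position):
    """True if all four neighbours of `position` are already occupied."""
    return all(is_collision(occupied, position, move) for move in MOVES)


# A chain that winds around the origin and ends in the middle of the ring.
path = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (0, 0)]
occupied = set(path)

print(is_collision(occupied, (1, 0), (0, 1)))  # True: (1, 1) is taken
print(is_collision(occupied, (1, 0), (1, 0)))  # False: (2, 0) is free
print(has_nowhere_to_go(occupied, path[-1]))   # True: (0, 0) is surrounded
```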

Reward Function

The reward is calculated at the end of the episode, which makes this a sparse-reward RL problem: every step has a reward of 0 except the terminal state.
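A minimal sketch of that sparse-reward shape is shown below. The function `step_reward` and its `terminal_value` argument are hypothetical: `terminal_value` stands in for whatever the environment computes at the end of the episode, which this README does not spell out in detail.

```python
def step_reward(done, terminal_value):
    """Return 0 for every intermediate step and `terminal_value` at the end."""
    return terminal_value if done else 0


# A four-step episode whose (assumed) terminal value is -3.
num_steps = 4
rewards = [step_reward(step == num_steps - 1, -3) for step in range(num_steps)]
print(rewards)  # [0, 0, 0, -3]
```

Because only the last step carries a signal, an agent has to attribute that single value back to all the placements that led to it.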

Try it!

Note that the argument to env.reset() is optional: if amino_input is not specified, a random sequence is generated.

  • env.render() : visualize the folding process (OpenCV window followed by a Matplotlib figure)
  • env.render(plot = True) : show the folding result only (Matplotlib figure)

from simulation import environment
import numpy as np

env = environment()
# Passing amino_input is optional; omit it to fold a random sequence.
current_state = env.reset(amino_input = ['P', 'P', 'P', 'H', 'H', 'P', 'P', 'H', 'H', 'P', 'P', 'P', 'P', 'P', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'P', 'P', 'H', 'H', 'P', 'P', 'P', 'P', 'H', 'H', 'P', 'P', 'H', 'P', 'P'])
done = False

# Take random actions until the episode terminates.
while not done:
    action = np.random.randint(0, env.action_space_size)
    new_state, reward, done = env.step(action)
    # env.render()  # uncomment to watch the folding process step by step
env.render(plot = True)  # show the result figure only

Output

Authors Info

Please feel free to use and modify this, but keep the information below. Thanks!

----------------------------------------
Author  : Alvin Watner
Email   : alvinsetiadi22@gmail.com
Website : -
License : MIT
----------------------------------------

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
