Code and datasets for UniPMT, the model proposed in the paper "UniPMT: A Unified Deep Framework for Peptide, MHC, and TCR Binding Prediction".
Operating system: Linux Ubuntu 20.04.
Software dependencies and versions: see `./code/requirements.txt`.
Hardware: CPU: Intel Xeon Platinum 8360Y @ 2.40GHz × 144; GPU: Nvidia A100 (for training); Nvidia A100 or Nvidia RTX 3090 (for evaluation).
- Install the required packages listed in `./code/requirements.txt` with `pip install -r requirements.txt`. Normal install time: within 1 hour.
- Download the datasets (see the `./data/` folder) and place them in that folder, e.g., `./data/pmt_pmt/`.
- Download the trained model file (see the `./model/` folder) and place it in that folder, e.g., `./model/model_pmt_pmt.pt`.
- To reproduce the PMT results:
  - Modify the `code/config/config.py` file: `data_folder = pmt_pmt`.
  - Run the evaluation with `python main.py`. Expected running time: within 1 min on an Nvidia RTX 3090, or about 5 seconds on an Nvidia A100.
- To reproduce the PM results (a config sketch follows this list):
  - Modify the `code/config/config.py` file: `data_folder = pm_iedbsame`.
  - Run the evaluation with `python main.py`. Expected running time: within 1 min on an Nvidia RTX 3090, or about 5 seconds on an Nvidia A100.
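The two runs differ only in the `data_folder` setting. A minimal sketch of the edit in `code/config/config.py` (the exact assignment form in the repo may differ; treat this as illustrative):

```python
# code/config/config.py -- illustrative excerpt; only this line changes per task
data_folder = "pmt_pmt"  # PMT task; set to "pm_iedbsame" to reproduce the PM task
```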
- The test results (AUC, PRAUC) on the test set are printed to the terminal.
- The predicted score for each sample in the test set is stored in `./output/predictions/`.
- To make the scores easier to interpret, we also provide score files with the corresponding sequences in `./output/predictions/`, ending with `_with_name.csv` (e.g., `result_pm_iedbsame_with_name.csv`, `result_pmt_pmt_with_name.csv`).
UniPMT Training Process
- Data Processing and Graph Construction (a sketch follows this list)
  - Load and preprocess datasets for P-M, P-T, and P-M-T bindings.
  - Remove duplicates and anomalies from the data.
  - Create edge sets E for P-M, P-T, and P-M-T bindings.
  - Represent peptides (P), MHCs (M), and TCRs (T) as nodes, forming a heterogeneous graph G(V, E).
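A minimal sketch of this step, assuming pandas edge lists with `peptide`/`mhc`/`tcr` columns (the column names and file layout are assumptions, not the repo's actual schema):

```python
import pandas as pd

def build_graph_inputs(pm_df, pt_df, pmt_df):
    """Deduplicate binding records and collect the node and edge sets of G(V, E)."""
    # Remove duplicates and anomalies (here: rows with missing values)
    pm_df, pt_df, pmt_df = (df.drop_duplicates().dropna() for df in (pm_df, pt_df, pmt_df))

    # Nodes V: unique peptides (P), MHCs (M), and TCRs (T)
    peptides = set(pm_df["peptide"]) | set(pt_df["peptide"]) | set(pmt_df["peptide"])
    mhcs = set(pm_df["mhc"]) | set(pmt_df["mhc"])
    tcrs = set(pt_df["tcr"]) | set(pmt_df["tcr"])

    # Edge sets E for P-M, P-T, and P-M-T bindings
    e_pm = set(zip(pm_df["peptide"], pm_df["mhc"]))
    e_pt = set(zip(pt_df["peptide"], pt_df["tcr"]))
    e_pmt = set(zip(pmt_df["peptide"], pmt_df["mhc"], pmt_df["tcr"]))
    return (peptides, mhcs, tcrs), (e_pm, e_pt, e_pmt)
```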
- Initial Embedding Representation (an embedding sketch follows this list)
  - Generate initial embeddings for P and T nodes using the ESM method: hp, ht <- ESM(P, T)
  - Generate initial embeddings for M nodes using pseudo sequences: hm <- Pseudo(M)
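A sketch of this step using the public fair-esm package and an ESM-2 checkpoint. The checkpoint choice, the mean-pooling, and feeding the MHC pseudo sequence through the same encoder are assumptions, and the example sequences are illustrative:

```python
import torch
import esm  # pip install fair-esm

# Load a pretrained ESM-2 model (checkpoint choice is an assumption)
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
batch_converter = alphabet.get_batch_converter()
model.eval()

def embed_sequences(seqs):
    """Mean-pooled per-sequence embeddings: h <- ESM(seq)."""
    data = [(f"seq{i}", s) for i, s in enumerate(seqs)]
    _, _, tokens = batch_converter(data)
    with torch.no_grad():
        out = model(tokens, repr_layers=[33])
    reps = out["representations"][33]
    # Average over residue positions, skipping the BOS/EOS tokens
    return torch.stack([reps[i, 1:len(s) + 1].mean(0) for i, (_, s) in enumerate(data)])

h_p = embed_sequences(["SIINFEKL"])        # peptide nodes: hp <- ESM(P)
h_t = embed_sequences(["CASSIRSSYEQYF"])   # TCR nodes: ht <- ESM(T)
h_m = embed_sequences(["YFAMYQENMAHTDANTLYIIYRDYTWVARVYRGY"])  # MHC pseudo sequence: hm <- Pseudo(M)
```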
- Graph Neural Network Learning
  - def GraphSAGE (a PyTorch sketch follows this list):
    - For each node ni at layer l+1: h_ni^(l+1) = ReLU(W^(l) * MEAN({h_nj^(l) | nj in Neighbors(ni)}))
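A minimal PyTorch sketch of the mean-aggregator update above, written as a hand-rolled layer (the repo may instead use a library implementation such as PyTorch Geometric's SAGEConv):

```python
import torch
import torch.nn as nn

class SAGEMeanLayer(nn.Module):
    """h_ni^(l+1) = ReLU(W^(l) * MEAN({h_nj^(l) | nj in Neighbors(ni)}))."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # W^(l)

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) binary adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)  # avoid division by zero
        mean_neigh = (adj @ h) / deg                     # mean over neighbor features
        return torch.relu(self.W(mean_neigh))
```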
- Multi-task Learning
  - def P-M Task Learning (see the P-M sketch after this list):
    - Generate vector representation for P-M binding: v_pm = f_pm(hp, hm)
    - Calculate P-M binding probability: P_pm = sigmoid(w_pm * v_pm)
    - Compute cross-entropy loss: L_pm = -(1/N_pm) * sum(y_pm^(i) * log(P_pm^(i)) + (1 - y_pm^(i)) * log(1 - P_pm^(i)))
  - def P-M-T Task Learning (see the P-M-T sketch after this list):
    - Reuse the P-M representation v_pm.
    - Generate vector representation for M-T binding: v_mt = f_mt(hm, ht)
    - Calculate the P-M-T binding score and probability: P_pmt = sigmoid(f_DMF(v_pm * v_mt))
    - Optimize with the Info-NCE contrastive loss: L_pmt = -(1/N_pmt) * sum_i log( exp(P_pmt^(i)/tau) / (exp(P_pmt^(i)/tau) + sum_j exp(P_pmt^(i,j)/tau)) )
  - def P-T Task Learning (see the P-T sketch after this list):
    - Aggregate the P-M-T binding probabilities over the M candidate MHCs: P_pt = (1/M) * sum_{j=1..M} P_pm_j,t
    - Compute cross-entropy loss: L_pt = -(1/N_pt) * sum(y_pt^(i) * log(P_pt^(i)) + (1 - y_pt^(i)) * log(1 - P_pt^(i)))
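P-M sketch. A minimal implementation of the P-M head under the equations above; `f_pm` is shown as a hypothetical MLP over concatenated embeddings, and the actual fusion function in UniPMT may differ:

```python
import torch
import torch.nn as nn

class PMHead(nn.Module):
    """P-M task: v_pm = f_pm(hp, hm); P_pm = sigmoid(w_pm * v_pm)."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.f_pm = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU())  # assumed form of f_pm
        self.w_pm = nn.Linear(hidden, 1)                                  # scoring vector w_pm

    def forward(self, hp, hm):
        v_pm = self.f_pm(torch.cat([hp, hm], dim=-1))
        p_pm = torch.sigmoid(self.w_pm(v_pm)).squeeze(-1)
        return v_pm, p_pm

# L_pm is standard binary cross-entropy over the N_pm labeled pairs:
# nn.BCELoss()(p_pm, y_pm) == -(1/N_pm) * sum(y*log p + (1-y)*log(1-p))
```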
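P-M-T sketch. The score and the Info-NCE loss above; the form of `f_DMF` and the negative-sampling scheme are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# P_pmt = sigmoid(f_DMF(v_pm * v_mt)); f_DMF sketched here as a small MLP (assumed form)
f_dmf = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

def pmt_score(v_pm, v_mt):
    return torch.sigmoid(f_dmf(v_pm * v_mt)).squeeze(-1)

def info_nce_loss(pos_scores, neg_scores, tau=0.1):
    """L_pmt = -(1/N) * sum_i log( exp(s_i/tau) / (exp(s_i/tau) + sum_j exp(s_ij/tau)) ).
    pos_scores: (N,) scores of observed P-M-T triples; neg_scores: (N, K) sampled negatives."""
    logits = torch.cat([pos_scores.unsqueeze(1), neg_scores], dim=1) / tau
    targets = torch.zeros(len(pos_scores), dtype=torch.long)  # positive sits in column 0
    return F.cross_entropy(logits, targets)
```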
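P-T sketch. The P-T prediction averages P-M-T probabilities over the M candidate MHCs; the indexing (one column per candidate MHC) is an assumption:

```python
import torch.nn.functional as F

def pt_probability(p_pmt_per_mhc):
    """P_pt = (1/M) * sum_j P_pm_j,t: mean P-M-T probability over the M candidate MHCs.
    p_pmt_per_mhc: (N_pt, M) tensor of P-M-T probabilities for each P-T pair."""
    return p_pmt_per_mhc.mean(dim=1)

def pt_loss(p_pt, y_pt):
    # L_pt: binary cross-entropy over the N_pt labeled P-T pairs
    return F.binary_cross_entropy(p_pt, y_pt.float())
```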
- Training Process (a schematic loop follows this list)
  - For each epoch:
    - For each batch in the dataset:
      - Update node embeddings using GraphSAGE.
      - Perform P-M task learning and compute L_pm.
      - Perform P-M-T task learning and compute L_pmt.
      - Perform P-T task learning and compute L_pt.
      - Combine the task losses: L = lambda_pm * L_pm + lambda_pmt * L_pmt + lambda_pt * L_pt
      - Update the model parameters by minimizing L.
    - Check for convergence or stopping criteria.
  - Continue training until the model converges or meets the predefined stopping criteria.
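Putting the steps together, a schematic of the training loop. The module, losses, and weights below are runnable placeholders so the example is self-contained, not the repo's actual components; in UniPMT the three terms are L_pm, L_pmt (Info-NCE), and L_pt from the sketches above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder stand-ins: one linear layer for the GNN + task heads, random batches
model = nn.Linear(32, 3)                                  # emits one logit per task
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_pm, lambda_pmt, lambda_pt = 1.0, 1.0, 1.0          # task weights (assumed values)

for epoch in range(10):                                   # "for each epoch"
    for _ in range(5):                                    # "for each batch in the dataset"
        x = torch.randn(64, 32)                           # stand-in for GraphSAGE embeddings
        y = torch.randint(0, 2, (64, 3)).float()          # stand-in labels per task
        p = torch.sigmoid(model(x))
        # Surrogate BCE per task so the loop runs end to end
        L_pm, L_pmt, L_pt = (F.binary_cross_entropy(p[:, i], y[:, i]) for i in range(3))
        L = lambda_pm * L_pm + lambda_pmt * L_pmt + lambda_pt * L_pt
        optimizer.zero_grad()
        L.backward()
        optimizer.step()
    # Convergence / early-stopping checks would go here.
```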