This software de-immunizes therapeutic sequences by solving the bi-objective optimization problem of reducing immunogenicity while increasing the likelihood of protein function, using model output from EVcouplings.
- Python 2.7
- Numpy 1.9.1
- Cplex (+Python API) 12.5
- Polygon 2.0.7
- Psutil
After installing the requirements above, clone this repository onto your machine. To run the tool from any location, add the repository folder to your PATH in .bashrc.
This module preprocesses the model file from EVcouplings for use with EVdeimmunization.
usage: deimm preprocess [options] alignment model config out_basename
positional arguments:
alignment alignment in fasta format that served as input to
model binary eij couplings file with the biotherapeutic
sequence as target
config config file in YAML format
out_basename basename of out file (multiple files will be created
with appropriate extensions added)
optional arguments:
-h, --help show this help message and exit
--freq_thresh FREQ_THRESH, -t FREQ_THRESH
amino acid frequency threshold used to determine
allowed mutations
--ev_file_format {plmc_v1,plmc_v2}, -f {plmc_v1,plmc_v2}
file format of EVcouplings model file (default:
The config file contains parameters for building the model and pointers to an allele file. See an example below:
# amino acid frequency threshold used to determine allowed mutations
frequency_thresh: 0.01
# path to PSSM file
allele_file: peptide_design_allele_file.txt
# positions excluded from mutations
exclude_pos: [1, 2, 3, 50, 51]
# positions completely ignored by model, i.e. mutations excluded AND eijs do not influence decision process
# epitope length
epi_len: 9
# number of mutations to introduce
k: 2
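To illustrate how frequency_thresh and exclude_pos might interact, the following minimal sketch computes per-column amino acid frequencies from a toy alignment and keeps every residue whose frequency reaches the threshold. The function name allowed_mutations, the 1-based positions, the handling of gaps, and the ">=" comparison are assumptions made for illustration; they are not taken from the package's code.

```python
from collections import Counter


def allowed_mutations(alignment, wildtype, freq_thresh=0.01, exclude_pos=()):
    """Return {position: sorted list of allowed residues}, positions 1-based.

    Assumptions (hypothetical, for illustration only): gaps ('-') are ignored
    when computing frequencies, the wild-type residue is always allowed, and
    excluded positions keep only the wild-type residue.
    """
    allowed = {}
    for i, wt in enumerate(wildtype):
        pos = i + 1
        if pos in exclude_pos:
            allowed[pos] = [wt]
            continue
        # Collect the non-gap residues observed in this alignment column.
        column = [seq[i] for seq in alignment if seq[i] != "-"]
        counts = Counter(column)
        total = float(len(column)) or 1.0  # avoid division by zero
        keep = {aa for aa, n in counts.items() if n / total >= freq_thresh}
        keep.add(wt)
        allowed[pos] = sorted(keep)
    return allowed


# Toy data, not taken from the example folder.
toy_alignment = ["MKV", "MRV", "MKI", "MK-"]
result = allowed_mutations(toy_alignment, "MKV", freq_thresh=0.3, exclude_pos={1})
print(result)
```

With a threshold of 0.3, the R at position 2 (frequency 0.25) is rejected, while the I at position 3 (frequency 1/3 among non-gap residues) is kept.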
The example allele file includes representative alleles of supertypes [1]. Each allele is listed with a frequency parameter and a PSSM correction value. The correction shifts the PSSM scores so that peptides in the top 1% of binders, as computed by MixMHCpred2.0 [2], receive a positive score.
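The role of the correction value and the frequency parameter can be sketched as follows. This is a minimal illustration, assuming that the correction is added to the summed PSSM score, that only positive scores count as predicted binders, and that each allele's contribution is weighted by its frequency. The function names, the data layout, and the toy numbers are hypothetical and do not reflect the package's file format.

```python
def pssm_score(peptide, pssm, correction=0.0):
    """Sum the position-specific scores of a peptide and add the correction."""
    return sum(pssm[i][aa] for i, aa in enumerate(peptide)) + correction


def immunogenicity(sequence, alleles, epi_len=9):
    """Frequency-weighted sum of positive epitope scores over all windows.

    `alleles` maps an allele name to (frequency, correction, pssm), where
    pssm is a list of dicts (one per peptide position) mapping an amino
    acid to its score. This layout is an assumption for the sketch.
    """
    total = 0.0
    for start in range(len(sequence) - epi_len + 1):
        peptide = sequence[start:start + epi_len]
        for freq, correction, pssm in alleles.values():
            score = pssm_score(peptide, pssm, correction)
            # Only peptides with a positive corrected score contribute.
            total += freq * max(0.0, score)
    return total


# Toy example with epitope length 2 to keep the matrix small.
toy_pssm = [{"A": 1.0, "K": -1.0}, {"A": 0.5, "K": -0.5}]
toy_alleles = {"toy_allele": (0.5, -0.25, toy_pssm)}
toy_total = immunogenicity("AAK", toy_alleles, epi_len=2)
print(toy_total)
```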
This module takes the model files from the preprocessing step and solves the bi-objective mixed integer problem using a rectangle splitting approach [3].
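The idea behind rectangle splitting can be sketched on a toy problem. The code below is a simplified illustration in the spirit of [3], not the solver shipped with this package: the single-objective MIP solves are replaced by a brute-force search over a small list of (f1, f2) points, both objectives are minimized, and all names are hypothetical. A real MIP solver cannot express the strict inequality used here and would need a small epsilon instead.

```python
def lexmin(points, first, f1_below=None, f2_at_most=None):
    """Toy stand-in for one single-objective solve.

    Minimizes objective `first` (0 or 1), breaks ties with the other
    objective, and respects optional bounds f1 < f1_below, f2 <= f2_at_most.
    """
    second = 1 - first
    feasible = [
        p for p in points
        if (f1_below is None or p[0] < f1_below)
        and (f2_at_most is None or p[1] <= f2_at_most)
    ]
    return min(feasible, key=lambda p: (p[first], p[second]))


def rectangle_split(points):
    """Enumerate all nondominated points by repeatedly splitting rectangles."""
    top_left = lexmin(points, first=0)      # best f1, then best f2
    bottom_right = lexmin(points, first=1)  # best f2, then best f1
    front = {top_left, bottom_right}
    stack = [(top_left, bottom_right)] if top_left != bottom_right else []
    while stack:
        z1, z2 = stack.pop()
        mid = (z1[1] + z2[1]) / 2.0
        # Bottom half: best f1 among points with f2 at or below the midline.
        zb = lexmin(points, first=0, f2_at_most=mid)
        if zb != z2:
            front.add(zb)
            stack.append((zb, z2))
        # Top half: best f2 among points strictly left of the point just found.
        zt = lexmin(points, first=1, f1_below=zb[0])
        if zt != z1:
            front.add(zt)
            stack.append((z1, zt))
    return sorted(front)


toy_points = [(1, 9), (2, 7), (3, 8), (4, 4), (5, 5), (6, 2), (7, 3), (9, 1), (9, 9)]
pareto_front = rectangle_split(toy_points)
print(pareto_front)
```

Each rectangle is spanned by two known nondominated points. A rectangle is discarded once neither half yields a new point, so the number of solves grows with the size of the front rather than with the number of feasible solutions.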
usage: deimm solve [options] --port PORT --output OUTPUT
positional arguments:
model model files (generated from preprocessing)
optional arguments:
-h, --help show this help message and exit
--port PORT, -p PORT Port
--output OUTPUT, -o OUTPUT
Solution output file (ex: output.pcl)
Bound on approximation (default:0.009)
--relTol RELTOL, -rel RELTOL
The relative tolerance for floating point comparison
--absTol ABSTOL, -abs ABSTOL
The absolute tolerance for floating point comparison, also used as epsilon in the model
-t THREADS, --threads THREADS
Number of threads (default: based on cpu count)
--verbose VERBOSE, -v VERBOSE
Verbosity (default 0)
deimm solve ./test/test_imm.lp ./test/test_en.lp -p 6882 -o ./test/out.pcl -v 1
First process the EVcouplings output:
deimm preprocess ./example/PvLEA4_repeats_b0.75.fasta \
./example/PvLEA4_repeats_b0.75.model \
./example/peptide_design_config.cfg \
Next, solve:
deimm solve ./example/PvLEA4_repeats_b0.75.k2_imm.lp ./example/PvLEA4_repeats_b0.75.k2_en.lp \
-o ./example/PvLEA4_repeats_b0.75.k2.pcl \
-p 6882 \
-v 1
The solver can be used in distributed systems. To do so, first start the manager script, which implements the algorithm and handles the work distribution. Make sure that the chosen IP address and port combination is reachable from all other participating machines.
usage: [-h] [--grid GRID] --port PORT
[--approximate APPROXIMATE] [--key KEY]
--output OUTPUT [--resolve RESOLVE]
Rectangle Manager implementation
-h, --help show this help message and exit
--grid GRID, -g GRID Number of Epsilon grid points (default:3)
--port PORT, -p PORT Port
Bound on approximation (default:0.009)
--key KEY, -k KEY Authentication key (default: rectangle)
--output OUTPUT, -o OUTPUT
Solution output as pickle
--resolve RESOLVE, -r RESOLVE
Reinitialize with partial solution
After initializing the manager process, one can start multiple worker processes on the same machine or on distributed machines. Each worker process connects to the manager process via TCP/IP at the specified port (which must be the port the manager listens on) and obtains single work packages from the manager process. With the manager flag -r (--resolve), one can specify an intermediate solution and refine or restart the solving process from there. The specified intermediate solution must be a pickled list of Solution objects.
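As a minimal sketch of the file layout expected by --resolve, the snippet below writes and reads back a pickled list. Plain dictionaries with toy values serve as stand-ins here so that the example is self-contained; the helper names are hypothetical. A real intermediate solution contains Solution objects, and unpickling it requires the package's Solution class to be importable.

```python
import os
import pickle
import tempfile


def save_solutions(solutions, path):
    """Write the solutions as a pickled list (binary mode is required)."""
    with open(path, "wb") as handle:
        pickle.dump(list(solutions), handle, protocol=2)


def load_solutions(path):
    """Load a pickled list and fail early if the file holds something else."""
    with open(path, "rb") as handle:
        solutions = pickle.load(handle)
    if not isinstance(solutions, list):
        raise ValueError("expected a pickled list, got %s" % type(solutions).__name__)
    return solutions


# Toy placeholders standing in for Solution objects.
demo = [{"objs": (12.5, -310.25)}, {"objs": (9.0, -295.75)}]
demo_path = os.path.join(tempfile.mkdtemp(), "partial.pcl")
save_solutions(demo, demo_path)
restored = load_solutions(demo_path)
print(len(restored))
```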
usage: [-h] --input INPUT1 INPUT2 --masterip MASTERIP
--port PORT --authkey AUTHKEY --threads THREADS
Rectangle Worker Grid implementation
-h, --help show this help message and exit
model files
--masterip MASTERIP, -m MASTERIP
The IP of the master node
--port PORT, -p PORT port to connect
--authkey AUTHKEY, -a AUTHKEY
authentication key
--threads THREADS, -t THREADS
number of cores
INPUT1 and INPUT2 are single-objective files (in CPLEX-compatible formats) derived from the bi-objective problem. Each file contains one of the two objective functions together with all constraints of the bi-objective problem.
python -p 6882 -g 3 -a 0.009 -k rectangle -o ./example/FA8_HUMAN_hmmer_plm_n5_m50_f70_t01_g_r2188-2345_e20_k1_output.pcl
python -i ./example/FA8_HUMAN_hmmer_plm_n5_m50_f70_t01_g_r2188-2345_e20_k1_pssm05_f01_f_he_experiment_local_imm.lp ./example/FA8_HUMAN_hmmer_plm_n5_m50_f70_t01_g_r2188-2345_e20_k1_pssm05_f01_f_he_experiment_local_en.lp -m -p 6882 -a rectangle -t 4
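The manager/worker handshake behind these two commands (shared port, shared authentication key) can be sketched with Python's multiprocessing.managers module. Whether EVdeimmunization is built on exactly this mechanism is an assumption; the class and function names below are hypothetical, and the sketch is written for Python 3. For brevity, the manager runs in a background thread of the same process and listens on localhost only.

```python
import queue
import threading
from multiprocessing.managers import BaseManager


class ToyManager(BaseManager):
    """Hypothetical manager-side class that serves a shared task queue."""


class ToyWorkerView(BaseManager):
    """Hypothetical worker-side class that only knows the registered name."""


def start_manager(port, authkey):
    """Serve a task queue on localhost; return the server and the queue."""
    tasks = queue.Queue()
    ToyManager.register("get_tasks", callable=lambda: tasks)
    manager = ToyManager(address=("127.0.0.1", port), authkey=authkey)
    server = manager.get_server()
    # Serve in a daemon thread so the sketch stays within a single process.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, tasks


def connect_worker(ip, port, authkey):
    """Connect to the manager and return a proxy to its task queue."""
    ToyWorkerView.register("get_tasks")
    view = ToyWorkerView(address=(ip, port), authkey=authkey)
    view.connect()  # fails if the authentication key does not match
    return view.get_tasks()


# Port 0 lets the operating system pick a free port for this demo.
server, tasks = start_manager(port=0, authkey=b"rectangle")
tasks.put(("rectangle", 0))  # one toy work package
worker_tasks = connect_worker("127.0.0.1", server.address[1], authkey=b"rectangle")
package = worker_tasks.get()
print(package)
```

In a distributed setup the manager would bind to an address reachable from the worker machines, and every worker would pass the same port and key.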
Please cite the following reference for the EVdeimmunization software:
Schubert B, Schärfe C, Dönnes P, Hopf T, Marks D, Kohlbacher O. Population-specific design of de-immunized protein biotherapeutics. PLoS Comput Biol. 2018;14(3):e1005983. Published 2018 Mar 2. doi:10.1371/journal.pcbi.1005983
The first version of the EVdeimmunization software package was released with this publication at FRED-2 EVdeimmunization.
Other sources:
[1] Lund O, Nielsen M, Kesmir C, et al. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics. 2004;55(12):797–810. doi:10.1007/s00251-004-0647-4
[2] Gfeller D, Guillaume P, Michaux J, Pak H-S, Daniel RT, Racle J, Coukos G, Bassani-Sternberg M. The length distribution and multiple specificity of naturally presented HLA-I ligands. J Immunol. 2018;201:3705–3716.
[3] Boland N, Charkhgard H, Savelsbergh M. A criterion space search algorithm for biobjective integer programming: the balanced box method. INFORMS J Comput. 2015;27:735–754.