ShapeLinker is a method for shape-conditioned de novo linker design for PROTACs. It is based on Link-INVENT and uses reinforcement learning to steer linker generation towards a query shape with desired physicochemical properties. Shape alignment is performed with a novel, fast attention-based point cloud alignment method.
Preprint: Reinforcement Learning-Driven Linker Design via Fast Attention-based Point Cloud Alignment
- Multi-parameter optimization using shape alignment requires two different conda environments (see below).
- Requires a CUDA-enabled GPU.
- The code was tested on Debian 10 only.
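Before installing, it can help to confirm that the machine exposes an NVIDIA GPU at all. The following is a minimal sketch, not part of ShapeLinker: the helper name `has_cuda_gpu` is hypothetical, and it only checks that the driver tool `nvidia-smi` lists at least one device. It does not verify that the PyTorch build in the conda environment was compiled with CUDA support.

```shell
# Hypothetical helper (not shipped with ShapeLinker): succeeds if nvidia-smi
# is available and lists at least one GPU.
has_cuda_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    # "nvidia-smi -L" prints one line per device, each starting with "GPU <index>:"
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

if has_cuda_gpu; then
    echo "NVIDIA GPU detected"
else
    echo "No NVIDIA GPU detected; ShapeLinker requires a CUDA-enabled GPU"
fi
```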
- Create the ShapeLinker conda environment:
conda env create -f env.yml
- Create the shape_align environment:
conda install -c conda-forge mamba
mamba create -n shape_align python=3.9 pytorch=1.13.0 torchvision pytorch-cuda=11.6 fvcore iopath nvidiacub pytorch3d -c bottler -c fvcore -c iopath -c pytorch -c nvidia -c pytorch3d
conda activate shape_align
pip install pykeops biotite open3d plyfile ProDy rdkit==2022.9.5 tqdm==4.49.0 unidip pytorch-lightning
pip install git+https://github.com/hesther/espsim.git
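Since shape-alignment runs depend on a second environment, a quick check that `shape_align` was actually created can save a failed run later. This is a sketch under assumptions: the helper name `env_listed` is hypothetical, and it relies on `conda env list` printing the environment name as the first whitespace-separated field of each line.

```shell
# Hypothetical helper: reads "conda env list" output on stdin and succeeds
# if an environment with the given name appears in the first column.
env_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Only query conda if it is on PATH.
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_listed shape_align; then
        echo "shape_align environment found"
    else
        echo "shape_align environment missing"
    fi
fi
```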
Download data and models from https://storage.googleapis.com/vantai-public-archive/shapelinker. This data dump includes:
- folder data
  - folder xtal_poses: Processed and fragmented crystal structures
  - protacdb_extended_linkers.csv: Processed PROTAC-DB data
  - pdb_systems_data.csv: Processed data for the investigated crystal structures
- folder models
  - protacdb_extlinker_model_align.pth: Trained model for shape alignment
  - folder agents: Trained RL agents for the different crystal structures
The Link-INVENT prior, which is needed for any RL run, is available from the ReinventCommunity repository (the download URL is given in the steps below).
Steps to get directory structure used in notebooks:
- Store folder data in ShapeLinker/utils
cd ShapeLinker/utils
gsutil cp -r gs://vantai-public-archive/shapelinker/data .
- Store folder models in ShapeLinker
cd ShapeLinker
gsutil cp -r gs://vantai-public-archive/shapelinker/models . # includes trained RL agents
- Dump linkinvent.prior in ShapeLinker/models
cd ShapeLinker/models
wget https://github.com/MolecularAI/ReinventCommunity/raw/master/notebooks/models/linkinvent.prior
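After the three download steps, the resulting layout can be checked with a short script. This is a minimal sketch: the helper name `missing_paths` and the `SHAPELINKER_ROOT` variable are hypothetical, and the list of expected paths is inferred from the data dump description and the steps above rather than taken from the repository itself.

```shell
# Hypothetical helper: prints every expected path (relative to the given
# ShapeLinker root) that does not exist. Prints nothing if all are present.
missing_paths() {
    root="$1"
    for rel in \
        utils/data/xtal_poses \
        utils/data/protacdb_extended_linkers.csv \
        utils/data/pdb_systems_data.csv \
        models/protacdb_extlinker_model_align.pth \
        models/agents \
        models/linkinvent.prior
    do
        [ -e "$root/$rel" ] || echo "$rel"
    done
}

# Assumes the repository was cloned as ./ShapeLinker unless SHAPELINKER_ROOT is set.
missing_paths "${SHAPELINKER_ROOT:-ShapeLinker}"
```

Any path that is printed still needs to be downloaded or moved into place before running the notebooks.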
The notebooks (folder ShapeLinker/notebooks) were adapted from ReinventCommunity and help with preparing runs for RL or sampling. There is also a notebook for training a shape alignment model (notebooks/train_shape_alignment_model.ipynb). We recommend training a new model for poses that differ from the crystal structures investigated here (for which the extended linkers were used).
The folder utils/postprocessing contains additional Jupyter notebooks for postprocessing and evaluating the generated data.