Notebooks and data associated with Loell, et al. 2023, 'Transcription factor interactions explain the context-dependent activity of CRX binding sites'.
This repository contains:
(1) a Jupyter notebook and the associated files needed to load all saved models and reproduce all of the figures from the paper, and
(2) a Jupyter notebook with the code used to train the models described in the paper.
Original FASTQ files for the CDNMR library massively parallel reporter assay (MPRA) data are available at the NCBI Gene Expression Omnibus, accession number GSE225867. All processed data are available here in /data_files. See below for a description of the files in the repository directories.
The data for the CRX-NRL library is taken from supplementary Data S3 of White, Kwasnieski, et al., Cell Reports 2016 Oct 25;17(5):1247-1254, available at https://doi.org/10.1016/j.celrep.2016.09.066.
These notebooks require MAVE-NN to run. For instructions, see https://mavenn.readthedocs.io and Tareen, et al., Genome Biology 2022 Apr 15;23(1):98 at https://doi.org/10.1186/s13059-022-02661-7.
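Before opening the notebooks, it can help to confirm that MAVE-NN is importable in the active Python environment. The snippet below is a minimal sketch, not part of the repository: the helper name `mavenn_available` is hypothetical, and it assumes the package is installed under the import name `mavenn` (the name used in the MAVE-NN documentation, which also gives `pip install mavenn` as the install command).

```python
import importlib.util


def mavenn_available() -> bool:
    """Return True if a package named 'mavenn' can be found in this environment.

    Hypothetical helper for illustration; it only checks that the package
    is discoverable and does not import it or check its version.
    """
    return importlib.util.find_spec("mavenn") is not None


if mavenn_available():
    print("MAVE-NN found; the notebooks should be able to import it.")
else:
    print("MAVE-NN not found; see https://mavenn.readthedocs.io for installation.")
```

This check says nothing about version compatibility with the saved models; consult the MAVE-NN documentation if a model fails to load.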