GitHub - Bornelov-lab/Camformer: Code repository for the paper "Predicting gene expression using millions of yeast promoters reveals cis-regulatory logic"

Camformer

Predicting gene expression using millions of yeast promoters reveals cis-regulatory logic

Problem: Let $S = \{A,C,G,T,N\}^{110}$ denote the set of promoter sequences of length $110$, where $A$, $C$, $G$, $T$ are the four nucleotides and $N$ represents an unknown nucleotide. The gene expression prediction task is then to learn a mapping $f: S \to \mathbb{R}$ from a promoter sequence to its expression level.
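To make the input space concrete, here is a minimal sketch of how a sequence from $S$ could be turned into a numeric array before being passed to a model. The function name `one_hot_encode`, the channel order `ACGT`, and the choice to encode `N` as an all-zero column are assumptions made for illustration; the encoding used in `base` may differ.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed channel order; N gets no channel of its own
SEQ_LEN = 110      # promoter length from the problem statement


def one_hot_encode(seq: str, seq_len: int = SEQ_LEN) -> np.ndarray:
    """Return a (4, seq_len) float32 array; N becomes an all-zero column."""
    seq = seq.upper()
    if len(seq) != seq_len:
        raise ValueError(f"expected {seq_len} nucleotides, got {len(seq)}")
    encoded = np.zeros((len(ALPHABET), seq_len), dtype=np.float32)
    for pos, base in enumerate(seq):
        if base == "N":
            continue  # unknown nucleotide: leave the column at zero
        row = ALPHABET.find(base)
        if row < 0:
            raise ValueError(f"unexpected character {base!r} at position {pos}")
        encoded[row, pos] = 1.0
    return encoded


# Toy sequence of length 110 (22 repeats of "ACGTN"), not real data.
example = one_hot_encode("ACGTN" * 22)
print(example.shape)  # (4, 110)
```

A model $f$ would then map such an array to a single real number.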

Data: We use data from the DREAM Challenge, consisting of 7 million random promoter sequences paired with their yellow fluorescent protein (YFP) levels. We then evaluate our trained model(s) on the official test set from the challenge.
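As an illustration of what evaluating on a test set can look like for this kind of regression task, the sketch below computes the Pearson correlation between measured and predicted expression levels. This is an assumption for illustration only: the official challenge scoring is defined in the resources listed under References, and `pearson_r` is a hypothetical helper, not a function from this repository. The numbers are made up.

```python
import numpy as np


def pearson_r(y_true, y_pred) -> float:
    """Pearson correlation between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal entry.
    return float(np.corrcoef(y_true, y_pred)[0, 1])


measured = [0.5, 1.0, 1.5, 2.0]   # made-up expression levels
predicted = [1.0, 2.0, 3.0, 4.0]  # made-up model outputs
score = pearson_r(measured, predicted)
print(round(score, 6))
```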

Model: A residual convolutional neural network, strategically optimised using automated hyperparameter tuning.
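For readers unfamiliar with the term, the sketch below shows the general idea of a residual convolutional block on a one-dimensional sequence: two convolutions whose output is added back to the block input. It is written in plain NumPy for clarity and is not the Camformer architecture; the layer sizes, kernel width, and function names are assumptions chosen for illustration.

```python
import numpy as np


def conv1d_same(x, weight, bias):
    """1D convolution with zero padding so the output length equals the input length.

    x: (c_in, length); weight: (c_out, c_in, k) with odd k; bias: (c_out,)
    """
    c_out, _, k = weight.shape
    pad = k // 2
    x_padded = np.pad(x, ((0, 0), (pad, pad)))
    length = x.shape[1]
    out = np.empty((c_out, length), dtype=float)
    for i in range(length):
        window = x_padded[:, i:i + k]  # (c_in, k) slice centred on position i
        out[:, i] = np.tensordot(weight, window, axes=([1, 2], [0, 1])) + bias
    return out


def residual_block(x, w1, b1, w2, b2):
    """conv -> ReLU -> conv, then add the input back (skip connection) and apply ReLU."""
    hidden = np.maximum(conv1d_same(x, w1, b1), 0.0)
    return np.maximum(x + conv1d_same(hidden, w2, b2), 0.0)


rng = np.random.default_rng(0)
channels, length, kernel = 8, 110, 3  # illustrative sizes only
x = rng.standard_normal((channels, length))
w1 = 0.1 * rng.standard_normal((channels, channels, kernel))
w2 = 0.1 * rng.standard_normal((channels, channels, kernel))
b1 = np.zeros(channels)
b2 = np.zeros(channels)
y = residual_block(x, w1, b1, w2, b2)
print(y.shape)  # (8, 110)
```

Because of the skip connection, a block whose convolution weights are all zero simply passes its (rectified) input through, which is part of why such blocks can be stacked deeply.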

The figure above shows the structure of the original (large variant) model (16M parameters). An almost equally good model has roughly 90% fewer parameters (1.4M). Please see the associated manuscript (preprint) for more details.
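The size comparison can be checked with a short calculation. The parameter counts below are the rounded figures quoted above (16M and 1.4M), so the result is approximate.

```python
large_params = 16_000_000  # large variant, rounded figure from the text
small_params = 1_400_000   # small variant, rounded figure from the text

# Fraction of parameters removed when going from the large to the small model.
reduction = 1 - small_params / large_params
print(f"{reduction:.0%} fewer parameters")  # 91% fewer parameters
```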

Assessment: Predictive, comparative

Assessment: Explanatory, Scientific discovery

Evaluating a trained model for explanatory assessment

File information

The purpose of each file is as follows:

| File | Purpose |
| --- | --- |
| `gen_figs.ipynb` | A notebook to show (re-generate) some figures in the manuscript. |
| `train_rep.py` | Program to train several replicates of a Camformer model using training data. |
| `score_rep.py` | Program to test several replicates of a trained Camformer model on test data. |

Directory structure

| Directory | Contents |
| --- | --- |
| `base` | Contains core codebase, utility functions, auxiliary helper files etc. |
| `manuscript_figures` | Contains data, script and figures present in the manuscript. |
| `readme_figs` | Images used to prepare this nice-looking README file. |
| `analysis` | Contains some basic analysis of results. Contents may be updated. |

References

Relevant resources and previous Camformer repositories.

Camformer repository (2022 version): DREAM2022 Submission
DREAM 2022 Challenge Wiki Page
Rafi et al., 2023: Paper
Rafi et al., 2023: Official Evaluation
