MIST (Multi-InSight for T cell)

MIST: an interpretable and flexible deep learning framework for single-T cell transcriptome and receptor analysis

Installation

Install from GitHub

Install the latest development version:

pip install git+https://github.com/aapupu/MIST.git

Or clone the repository and install it in editable mode:

git clone https://github.com/aapupu/MIST.git
cd MIST
pip install -e .

Install from PyPI

pip install mist-vae

Note: Python 3.8 and scirpy 0.12.0 are recommended. MIST is implemented in PyTorch. If CUDA is available, MIST runs on the GPU automatically.

Usage

1. API function

from mist import MIST
adata, model = MIST(rna_path, tcr_path, batch, rna_data_type, tcr_data_type, type)

The parameters of the API function mirror the command-line options.
The output comprises a trained model and an AnnData object, which can be further analyzed using scanpy and scirpy.

  • rna_path
    List of paths to scRNA-seq data files.
  • tcr_path
    List of paths to scTCR-seq data files.
  • batch
    List of batch labels.
  • rna_data_type
    Type of scRNA-seq data file (e.g., 'h5ad').
  • tcr_data_type
    Type of scTCR-seq data file (e.g., '10X').
  • type
    Type of model to train ('joint', 'rna', or 'tcr').
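
Because rna_path, tcr_path, and batch are passed as separate lists, it can help to build them from one record per sample so they stay aligned. The sketch below is a minimal illustration, not part of MIST: the helper build_mist_inputs and the file names are hypothetical, and it assumes that the i-th entry of each list describes the same sample, as the command-line example suggests.

```python
from typing import Dict, List, Sequence, Tuple


def build_mist_inputs(
    samples: Sequence[Dict[str, str]],
) -> Tuple[List[str], List[str], List[str]]:
    """Turn per-sample records into three parallel lists (hypothetical helper)."""
    rna_path: List[str] = []
    tcr_path: List[str] = []
    batch: List[str] = []
    for sample in samples:
        # Require all three fields so the lists cannot drift out of step.
        missing = {"rna", "tcr", "batch"} - sample.keys()
        if missing:
            raise ValueError(f"sample {sample!r} is missing: {sorted(missing)}")
        rna_path.append(sample["rna"])
        tcr_path.append(sample["tcr"])
        batch.append(sample["batch"])
    return rna_path, tcr_path, batch


# Hypothetical file names, for illustration only.
samples = [
    {"rna": "donor1.h5ad", "tcr": "donor1_contigs.csv", "batch": "donor1"},
    {"rna": "donor2.h5ad", "tcr": "donor2_contigs.csv", "batch": "donor2"},
]
rna_path, tcr_path, batch = build_mist_inputs(samples)

# With MIST installed, the documented call would then be:
# from mist import MIST
# adata, model = MIST(rna_path, tcr_path, batch, "h5ad", "10X", "joint")
```

The MIST call itself is left commented out so the sketch runs without the package or any data files.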

2. Command line

MIST --rna_path rna_path1 rna_path2 --tcr_path tcr_path1 tcr_path2 --batch batch1 batch2 --rna_data_type h5ad --tcr_data_type 10X --type joint
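
When MIST is launched from a Python script or pipeline, the same command can be assembled as an argument list. The sketch below uses only flags that appear in this README; the helper build_mist_command is hypothetical, and the placeholder paths are the ones from the command above.

```python
import shlex
from typing import List, Optional, Sequence


def build_mist_command(
    rna_path: Sequence[str],
    tcr_path: Sequence[str],
    batch: Sequence[str],
    rna_data_type: str = "h5ad",
    tcr_data_type: str = "10X",
    model_type: str = "joint",
    outdir: Optional[str] = None,
) -> List[str]:
    """Assemble the MIST command line as an argument list (hypothetical helper)."""
    cmd = [
        "MIST",
        "--rna_path", *rna_path,
        "--tcr_path", *tcr_path,
        "--batch", *batch,
        "--rna_data_type", rna_data_type,
        "--tcr_data_type", tcr_data_type,
        "--type", model_type,
    ]
    # --outdir is appended only when the caller supplies one.
    if outdir is not None:
        cmd += ["--outdir", outdir]
    return cmd


cmd = build_mist_command(
    ["rna_path1", "rna_path2"],
    ["tcr_path1", "tcr_path2"],
    ["batch1", "batch2"],
)
print(shlex.join(cmd))

# To actually launch MIST (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems when paths contain spaces.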

Output

  • adata.h5ad: preprocessed data and results
  • model.pt: saved model

Option

  • --rna_path
    Paths to scRNA-seq data files. (example: XXX1.h5ad XXX2.h5ad)
  • --tcr_path
    Paths to scTCR-seq data files. (example: XXX1.csv XXX2.csv)
  • --batch
    Batch labels.
  • --rna_data_type
    Type of scRNA-seq data file (e.g., 10X mtx, h5, or h5ad). Default: h5ad
  • --tcr_data_type
    Type of scTCR-seq data file (e.g., 10X, tracer, BD, or h5ad). Default: 10X
  • --protein_path
    Path to merged protein (ADT) data file.
  • --type
    Type of model to train (e.g., joint, rna, or tcr). Default: joint
  • --min_genes
    Filter out cells in which fewer than min_genes genes are detected. Default: 600
  • --min_cells
    Filter out genes that are detected in fewer than min_cells cells. Default: 3
  • --pct_mt
    Filter out cells whose percentage of mitochondrial gene counts exceeds this value. If None, mitochondrial genes are filtered out instead. Default: None
  • --n_top_genes
    Number of highly-variable genes to keep. Default: 2000
  • --batch_size
    Batch size for training. Default: 128
  • --pooling_dims
    Dimensionality of pooling layer. Default: 16
  • --z_dims
    Dimensionality of latent space. If type='rna', z_dims=pooling_dims. Default: 128
  • --drop_prob
    Dropout probability. Default: 0.1
  • --lr
    Learning rate for the optimizer. Default: 1e-4
  • --weight_decay
    L2 regularization strength. Default: 1e-3
  • --max_epoch
    Maximum number of epochs. Default: 300
  • --patience
    Patience for early stopping. Default: 30
  • --warmup
    Warmup epochs. Default: 30
  • --gpu
    Index of GPU to use if GPU is available. Default: 0
  • --seed
    Random seed. Default: 42
  • --outdir
    Output directory.
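
To make the --min_genes and --min_cells thresholds concrete, the sketch below applies both to a toy count matrix in plain Python. It illustrates the meaning of the thresholds only and is not MIST's implementation: the function filter_counts is hypothetical, and filtering cells before genes is an assumption. The --pct_mt option is not covered.

```python
from typing import List, Tuple


def filter_counts(
    counts: List[List[int]], min_genes: int, min_cells: int
) -> Tuple[List[int], List[int]]:
    """Return indices of kept cells and kept genes for a cells x genes matrix."""
    # A gene counts as detected in a cell when its count is above zero.
    kept_cells = [
        i for i, row in enumerate(counts)
        if sum(c > 0 for c in row) >= min_genes
    ]
    n_genes = len(counts[0]) if counts else 0
    # Assumption: genes are assessed only over the cells that survived.
    kept_genes = [
        j for j in range(n_genes)
        if sum(counts[i][j] > 0 for i in kept_cells) >= min_cells
    ]
    return kept_cells, kept_genes


# Toy matrix: 4 cells (rows) x 4 genes (columns).
counts = [
    [1, 0, 2, 0],  # 2 genes detected
    [0, 0, 0, 3],  # 1 gene detected
    [4, 1, 0, 0],  # 2 genes detected
    [2, 0, 1, 0],  # 2 genes detected
]
kept_cells, kept_genes = filter_counts(counts, min_genes=2, min_cells=2)
print(kept_cells)  # [0, 2, 3]
print(kept_genes)  # [0, 2]
```

With the defaults listed above (min_genes 600, min_cells 3) the same rule applies at realistic scale.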

Help

Explore further applications of MIST

MIST.py --help 

Worked examples of MIST can be found in the jupyter folder.

Citation

MIST: an interpretable and flexible deep learning framework for single-T cell transcriptome and receptor analysis
Wenpu Lai, Yangqiu Li, Oscar Junhong Luo
bioRxiv 2024.07.05.602192; doi: https://doi.org/10.1101/2024.07.05.602192

Contacts

kyzy850520@163.com
luojh@jnu.edu.cn
