scRNA-seq Cheatsheet

The tables below list functions and commands that you will find useful throughout this module.

Each table represents a different library/tool and its corresponding commands.

Please note that these tables are a quick reference and do not cover everything you need to know about each command.

The hyperlinks found in each piece of code will take you to the documentation for further information on the usage of each command. Please be aware that the documentation generally describes the most current version of a function (or a recent version, depending on how often the documentation site is updated). This will usually (but not always!) match what you have installed on your machine. If you have a different version of R or of an R package installed, the behavior on your machine may differ from what the documentation describes.

Base R

Read the Base R documentation.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| Base R | `rowSums()` | Row sums | Calculates sums for each row |
| Base R | `colSums()` | Column sums | Calculates sums for each column |
| Base R | `t()` | Transpose | Returns the transpose of a matrix or data frame |
| Base R | `prcomp()` | Principal Components Analysis | Executes a principal components analysis on specified matrix or data frame |
| Base R | `<- function(x) { <code> }` | Function | Creates a function that would take the defined parameters as input and execute the commands within the curly braces |
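
The following is a minimal sketch of how the Base R functions in this table fit together. The count matrix is made up for illustration, and `counts_to_proportions()` is a hypothetical function name that is not part of the module.

```r
# A small made-up count matrix: 3 genes (rows) x 4 cells (columns)
counts_matrix <- matrix(
  c(5, 0, 2, 1,
    0, 3, 0, 4,
    7, 1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

gene_totals <- rowSums(counts_matrix)  # one total per gene (row)
cell_totals <- colSums(counts_matrix)  # one total per cell (column)

# prcomp() treats rows as observations, so transpose to put cells in rows
cells_by_genes <- t(counts_matrix)
pca_results <- prcomp(cells_by_genes)

# A user-defined function (hypothetical name) that converts
# each cell's counts into proportions of that cell's total
counts_to_proportions <- function(x) {
  t(t(x) / colSums(x))
}
proportions <- counts_to_proportions(counts_matrix)
```

The PCA coordinates for each cell are stored in `pca_results$x`, with one row per cell.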

Salmon and alevinQC

Read the command-line tool Salmon documentation.

Read the R package alevinQC documentation.

| Software/package | Piece of Code | What it's called | What it does |
|------------------|---------------|------------------|--------------|
| Salmon | `salmon alevin` | Salmon Alevin | Runs the Alevin quantification from the command line |
| alevinQC | `alevinQCReport()` | Alevin QC Report | Produces a QC (quality check) report from the salmon alevin output |

SingleCellExperiment, tximeta, and DropletUtils

Read the SingleCellExperiment package documentation (and e-book), and a vignette on its usage. Note that some of the SingleCellExperiment functions link to documentation from other packages like SummarizedExperiment or ExperimentSubset. In fact, SingleCellExperiment objects are based around existing Bioconductor functions in those packages, so the function usage is equivalent!

Read the tximeta package documentation, and a vignette on its usage.

Read the DropletUtils package documentation.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| SingleCellExperiment | `SingleCellExperiment()` | Single Cell Experiment | Creates a SingleCellExperiment object |
| SingleCellExperiment | `colData()` | Column Data | Extracts or stores cell-level metadata (one row per cell) in a SingleCellExperiment object |
| SingleCellExperiment | `rowData()` | Row Data | Extracts or stores gene-level metadata (one row per gene) in a SingleCellExperiment object |
| SingleCellExperiment | `assay()` | Assay | Extracts or stores a given assay from a SingleCellExperiment object |
| SingleCellExperiment | `assayNames()` | Assay names | Returns a vector of the names of all assays in a SingleCellExperiment object |
| SingleCellExperiment | `logcounts()` | Log counts | Extracts or stores log-transformed single-cell experiment count data as an assay of the SingleCellExperiment object |
| SingleCellExperiment | `counts()` | Counts | Extracts or stores raw single-cell experiment count data as an assay of the SingleCellExperiment object |
| SingleCellExperiment | `reducedDim()` | Reduced dim | Extracts or stores a given reduced dimension from a SingleCellExperiment object |
| SingleCellExperiment | `reducedDimNames()` | Reduced dim names | Returns a vector of the names of all reduced dimensions in a SingleCellExperiment object |
| S4Vectors | `DataFrame()` | Data frame | Not to be confused with `data.frame()` from Base R. This is a slightly different data frame-like object needed for storing information in a SingleCellExperiment object's colData slot |
| tximeta | `tximeta()` | Transcript Quantification Import with Automatic Metadata | Loads quantification results produced by Salmon or alevin, including the associated metadata |
| DropletUtils | `read10xCounts()` | Read 10x counts | Loads data from a 10x Genomics experiment into R |
| DropletUtils | `emptyDrops()` | Empty drops | Uses the overall gene expression patterns in the sample to identify empty droplets |
| DropletUtils | `emptyDropsCellRanger()` | Empty drops Cell Ranger | Uses an approach analogous to Cell Ranger's algorithm to identify empty droplets |
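
As a rough illustration of the SingleCellExperiment accessors in this table, the sketch below builds an object from a made-up count matrix and then stores and extracts metadata. The `sample_id` and `gene_total` column names are assumptions chosen for this example only.

```r
library(SingleCellExperiment)

# Made-up counts: 3 genes (rows) x 4 cells (columns)
counts_matrix <- matrix(
  c(5, 0, 2, 1,
    0, 3, 0, 4,
    7, 1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Create the object with a single assay named "counts"
sce <- SingleCellExperiment(assays = list(counts = counts_matrix))

# Store cell-level metadata; colData needs a DataFrame, not a data.frame
colData(sce) <- S4Vectors::DataFrame(
  sample_id = c("A", "A", "B", "B"),  # hypothetical sample labels
  row.names = colnames(sce)
)

# Store gene-level metadata as a new rowData column
rowData(sce)$gene_total <- rowSums(counts(sce))

# Extract information back out of the object
assayNames(sce)
counts(sce)
colData(sce)
```

The same accessor is used for both directions: on its own it extracts, and on the left side of `<-` it stores.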

scran and scater

Read the scran package documentation, and a vignette on its usage.

Read the scater package documentation, and a vignette on its usage.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| scran | `quickCluster()` | Quick Clustering | Groups similar cells into clusters, returning the cluster assignments for use in calculating size factors with `scran::computeSumFactors()` |
| scran | `computeSumFactors()` | Compute Sum Factors | Computes size factors by pooling counts across cells within each cluster. The pool-based size factors are deconvolved into cell-based size factors, which are stored in the SingleCellExperiment object and used by `scater::logNormCounts()` to normalize each cell's gene expression profile |
| scran | `getTopHVGs()` | Get top highly variable genes | Identifies variable genes in a SingleCellExperiment object, based on variance |
| scran | `modelGeneVar()` | Model per-gene variance | Models the per-gene variance of a SingleCellExperiment object |
| scran | `findMarkers()` | Find marker genes | Finds candidate marker genes for clusters of cells |
| scater | `logNormCounts()` | Log-normalize counts | Returns the SingleCellExperiment object with log-transformed normalized expression values for each cell, using the size factors stored in the object |
| scater | `addPerCellQC()` | Add per cell quality control | For a SingleCellExperiment object, calculates per-cell quality control metrics and stores them in colData |
| scater | `addPerFeatureQC()` | Add per feature quality control | For a SingleCellExperiment object, calculates per-feature (usually per-gene) quality control metrics and stores them in rowData |
| scater | `calculatePCA()` | Calculate PCA | Calculates principal components analysis on a SingleCellExperiment object, returning a PCA matrix |
| scater | `runPCA()` | Run PCA | Calculates principal components analysis on a SingleCellExperiment object, returning an SCE object with a PCA reduced dimension |
| scater | `calculateUMAP()` | Calculate UMAP | Calculates uniform manifold approximation and projection on a SingleCellExperiment object, returning a UMAP matrix |
| scater | `runUMAP()` | Run UMAP | Calculates uniform manifold approximation and projection on a SingleCellExperiment object, returning an SCE object with a UMAP reduced dimension |
| scater | `calculateTSNE()` | Calculate t-SNE | Calculates t-distributed stochastic neighbor embedding on a SingleCellExperiment object, returning a t-SNE matrix |
| scater | `runTSNE()` | Run t-SNE | Calculates t-distributed stochastic neighbor embedding on a SingleCellExperiment object, returning an SCE object with a TSNE reduced dimension |
| scater | `plotReducedDim()` | Plot reduced dimensions | Plots a given reduced dimension slot from a SingleCellExperiment object by its name |
| scater | `plotPCA()` | Plot PCA | Plots the "PCA"-named reduced dimension slot from a SingleCellExperiment object |
| scater | `plotUMAP()` | Plot UMAP | Plots the "UMAP"-named reduced dimension slot from a SingleCellExperiment object |
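
The sketch below illustrates the difference between the `calculate*()` and `run*()` functions, using PCA as the example. It assumes simulated data from `mockSCE()` (used here only so the example is self-contained) and an arbitrary choice of 10 components; neither is a recommendation for real data.

```r
library(scater)

set.seed(2024)

# Simulated SingleCellExperiment object, for illustration only
sce <- mockSCE()

# Add per-cell QC metrics (such as `sum` and `detected`) to colData
sce <- addPerCellQC(sce)

# Add a "logcounts" assay of log-transformed normalized expression
sce <- logNormCounts(sce)

# run*() returns the SCE object with the result stored as a reduced dimension
sce <- runPCA(sce, ncomponents = 10)

# calculate*() returns only the matrix of coordinates
pca_matrix <- calculatePCA(sce, ncomponents = 10)

reducedDimNames(sce)

# Plot the stored PCA, coloring cells by their total counts
pca_plot <- plotReducedDim(sce, dimred = "PCA", colour_by = "sum")
```

Use the `run*()` form when you want to keep the results alongside the expression data, and the `calculate*()` form when you only need the coordinates.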

purrr, stringr, and tibble

Read the purrr package documentation.

Read the stringr package documentation.

Read the tibble package documentation.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| purrr | `map()` | map | Apply a function across each element of a list; return a list |
| purrr | `map_df()` | map df | Apply a function across each element of a list; return a data frame |
| purrr | `imap()` | imap | Apply a function across each element of a list and its index/names |
| stringr | `str_remove()` | String remove | Remove matched string patterns |
| tibble | `as_tibble()` | As tibble | Coerce data.frame or matrix to a tibble |

Note that purrr::map() functions can take advantage of R's new (as of version 4.1.0) anonymous function syntax:

```r
# One-line syntax:
\(x) # function code goes here #

# Multi-line syntax:
\(x) {
  # function code goes      #
  # inside the curly braces #
}

# Example: Use an anonymous function with `purrr::map()`
# to get the colData's rownames for each SCE in `list_of_sce_objects`
purrr::map(
  list_of_sce_objects,
  \(x) rownames(colData(x))
)
```
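
For a version that runs without any SCE objects, here is a small sketch combining `map()`, `imap()`, `str_remove()`, and `as_tibble()` with the anonymous function syntax. The file names, data frames, and the `sample_id` column are all made up for illustration.

```r
library(purrr)
library(stringr)
library(tibble)

# Hypothetical file names, used only for illustration
file_names <- c("sample01.rds", "sample02.rds")

# str_remove() drops the first match of a pattern from each string
sample_ids <- str_remove(file_names, "\\.rds$")

# A named list of made-up data frames, one per sample
sample_tables <- list(
  sample01 = data.frame(cell = c("c1", "c2"), total = c(10, 20)),
  sample02 = data.frame(cell = c("c3", "c4"), total = c(5, 15))
)

# map() applies the function to each element and returns a list
mean_totals <- map(sample_tables, \(df) mean(df$total))

# imap() also passes each element's name as the second argument
labeled_tables <- imap(sample_tables, \(df, name) {
  df$sample_id <- name
  as_tibble(df)
})
```

Both `mean_totals` and `labeled_tables` keep the names of the input list, so results can be looked up by sample.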

bluster

Read the bluster package documentation and this vignette on its usage.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| bluster | `clusterRows()` | Cluster rows of a matrix | Perform clustering using a variety of algorithms on a matrix-like object |
| bluster | `KmeansParam()` | K-means clustering parameters | Set up parameters to run clustering using `kmeans()` within `bluster::clusterRows()` |
| bluster | `NNGraphParam()` | Graph-based clustering parameters | Set up parameters for nearest-neighbor (NN) graph-based clustering algorithms within `bluster::clusterRows()` |
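
The sketch below shows how the parameter objects plug into `clusterRows()`. It assumes a made-up matrix standing in for PCA coordinates (cells in rows), and the values `centers = 3` and `k = 10` are arbitrary choices for this example.

```r
library(bluster)

set.seed(2024)

# Made-up matrix standing in for PCA coordinates: 90 cells x 5 dimensions,
# simulated as three shifted groups of 30 cells each
pca_like <- rbind(
  matrix(rnorm(30 * 5, mean = 0), ncol = 5),
  matrix(rnorm(30 * 5, mean = 6), ncol = 5),
  matrix(rnorm(30 * 5, mean = 12), ncol = 5)
)

# k-means clustering: the parameter object sets the number of centers
kmeans_clusters <- clusterRows(pca_like, KmeansParam(centers = 3))

# Graph-based clustering: k is the number of nearest neighbors
# used when building the graph
graph_clusters <- clusterRows(pca_like, NNGraphParam(k = 10))

# Each result is a factor with one cluster assignment per row
table(kmeans_clusters)
table(graph_clusters)
```

Switching algorithms only requires swapping the parameter object; the call to `clusterRows()` stays the same.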

SingleR

Read the SingleR package documentation, and an e-book on its usage.

| Library/Package | Piece of Code | What it's called | What it does |
|-----------------|---------------|------------------|--------------|
| SingleR | `trainSingleR()` | Train the SingleR classifier | Build a SingleR classifier model object from an annotated reference dataset |
| SingleR | `classifySingleR()` | Classify cells with SingleR | Use a SingleR model object to assign cell types to the cells in an SCE object |
| SingleR | `SingleR()` | Annotate scRNA-seq data | Combines `trainSingleR()` and `classifySingleR()` to assign cell types to an SCE object from an annotated reference dataset |