
<br class="long"/>

# Introduction
The CyTOF analysis pipeline <b>cytoMine</b> is developed by the IMC bioinformatics platform and combines state-of-the-art R packages with custom code. Its goal is to give users a simple interface to inspect their FCS files and to transform, visualize and cluster their mass cytometry data.

<br>

# Input files

This pipeline requires .FCS files as its starting input. You can provide one or more .FCS files. If multiple files are provided, the pipeline will either concatenate them or keep them separate and process each one individually.

<br>

# Pipeline Structure

<br>

* <a href="preprocessing.html">Preprocessing</a>

    + <a href="preprocessing.html">Normalization</a>
    + <a href="preprocessing.html">Debarcoding</a>
    + <a href="processing.html#rand">Randomize 0 values between -1 and 0</a>
    + Transformations


<br>

* Visualize
    + Basic plots for cell counts per sample, density plots for markers and PCA
    + Plot markers for inspection
    + t-SNE
    + t-SNE with different marker expressions

<br>

* Cell Clustering - automated

    + FlowSOM
    + ClusterX

<br>

* Discovery

    + Biomarker discovery: differential expression analysis (edgeR)
    + Cell Progression: ISOMAPS (In progress)

<br>

# Usage


All commands in this pipeline can be run on the Snyder console. Each command is structured in the following way:

```
COMMAND (COMPULSORY inputs) [OPTIONAL inputs]
```
<br/>

Compulsory inputs are shown inside parentheses and optional inputs inside square brackets. Optional inputs have default values, which can be listed by typing:

```
cytoMine --help
```
<br/>

Here is an example for the CATALYST command used for bead normalization:

```
cytoMine (--inputDir=<directory> --outputDir=<directory> --mode=<exMode>) [--beadMasses=<beads>]
```

In this example, --inputDir, --outputDir and --mode are compulsory parameters, while --beadMasses is optional in general but must be supplied to run bead normalization.

<br/>

An example of how to perform bead normalization on FCS files:

```
cytoMine  --inputDir=data --beadMasses=140,151,153,165,175 --outputDir=results --mode=full

```

# Modes

cytoMine lets users run the pipeline at different levels, so they do not have to wait for the full pipeline to finish when they only want a few basic plots to understand their data. For this purpose, cytoMine has four working modes:


## --mode=channels
In this mode, a user can find out how the channels are named and described in their FCS files. This is important because the marker list that the user provides must match the names given in the description parameter of the FCS file.

### An example of how to run this mode is:

```
cytoMine  --inputDir=data --outputDir=results --mode=channels

```

A Channels.csv file is stored in the outputDir for a quick look.
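
The matching requirement described above can be checked before a longer run. The following is only a minimal sketch: the column layout of Channels.csv and marker_list.csv is not specified here, so it assumes plain comma-separated files, a header on the first line of marker_list.csv, and the marker name in its first column. The helper name `missing_markers` is hypothetical and not part of cytoMine.

```shell
#!/bin/sh
# Sketch only: assumes comma-separated files, with the marker name in the
# first column of the marker list and a header row on its first line.

# Print every marker from the marker list that does not appear as an exact
# field anywhere in the channels file. Prints nothing when all markers match.
missing_markers() {
    channels_csv=$1
    marker_csv=$2
    awk -F',' '
        # Strip optional double quotes and surrounding whitespace from a field.
        function clean(s) { gsub(/^[ \t"]+|[ \t"\r]+$/, "", s); return s }
        # First file: remember every field seen in the channels file.
        NR == FNR { for (i = 1; i <= NF; i++) seen[clean($i)] = 1; next }
        # Second file: skip the header, report markers that were never seen.
        FNR > 1 { m = clean($1); if (m != "" && !(m in seen)) print m }
    ' "$channels_csv" "$marker_csv"
}
```

For example, `missing_markers results/Channels.csv data/marker_list.csv` lists the markers that would need to be renamed; empty output means every marker was found.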

## --mode=basic
In the basic mode, only the most basic plots are generated to give the user an idea of their data.

### An example of how to run this mode is:

```
cytoMine  --inputDir=data --outputDir=results --mode=basic

```
The output is saved in the basicPlots folder in outputDir:

Filename | Description
---------|------------------------------------------
counts_plot.png| Bar plots of cell counts in each sample file
density_plot.png| Density plots of each marker in the marker list, for each sample
PCA_plot.png| PCA plots of the samples
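
To confirm that a basic run produced the files listed above, a small check along the following lines may help. It is a sketch that relies only on the folder and file names given in the table; the function name `check_basic_plots` is hypothetical.

```shell
#!/bin/sh
# Report which of the documented basic-mode plots are missing from
# <outputDir>/basicPlots. Returns non-zero if any file is absent.
check_basic_plots() {
    plots_dir=$1/basicPlots
    status=0
    for f in counts_plot.png density_plot.png PCA_plot.png; do
        if [ ! -f "$plots_dir/$f" ]; then
            echo "missing: $plots_dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_basic_plots results` after the example above prints one line per missing plot and succeeds silently when all three are present.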


## --mode=full

In this mode, the whole cytoMine pipeline is executed. All results are saved in the outputDir folder or in appropriate subfolders. All the data is also stored in an R object called cytoMine.RDATA, which is used if the user wants to re-run the code for expert-annotated re-clustering or merging of the data.

```
cytoMine  --inputDir=data --outputDir=results --mode=full

```


## --mode=mergeCluster

This mode can be used only after the cytoMine pipeline has been run in full mode. It re-assigns or merges clusters generated by FlowSOM or ClusterX.

### An example of invoking mergeCluster mode with FlowSOM merging of clusters

```
cytoMine  --inputDir=data --outputDir=results --mode=mergeCluster --reclustFSFile=merge_clusters.csv

```
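
Because merging depends on an earlier full run, it can be useful to check for the saved cytoMine.RDATA object first. The sketch below assumes only the file name mentioned in the full mode section; since the exact location inside outputDir is not stated, it searches the folder and its subfolders. The function name `has_full_run` is hypothetical.

```shell
#!/bin/sh
# Succeeds if a cytoMine.RDATA file exists in the given output directory or
# in any of its subfolders (the exact location is an assumption, so search).
has_full_run() {
    [ -n "$(find "$1" -type f -name 'cytoMine.RDATA' 2>/dev/null | head -n 1)" ]
}
```

A typical guard would be `has_full_run results || echo "run --mode=full first"` before starting the merge.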

# Required Input Files
All required input files mentioned below should be in the inputDir. Default names for each file are provided. Please be consistent with the file format.

Filename | Description
---------|------------------------------------------
<a href="examples/sample_info.csv">sample_info.csv</a>| This file holds information about your filenames, grouping, type, etc.
<a href="examples/marker_list.csv">marker_list.csv</a>| This file holds information about the markers of interest
<a href="examples/merge_clusters.csv">merge_clusters.csv</a>|Optional file only needed when --mode=mergeCluster
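
The requirement above can be verified before launching a run. This is a minimal sketch that uses the default file names from the table and additionally assumes that the FCS files sit directly in inputDir; the function name `check_input_dir` is hypothetical.

```shell
#!/bin/sh
# Check that the two required files exist in the input directory and that
# it holds at least one FCS file (assumed to sit directly in that folder).
check_input_dir() {
    input_dir=$1
    status=0
    for f in sample_info.csv marker_list.csv; do
        if [ ! -f "$input_dir/$f" ]; then
            echo "missing required file: $input_dir/$f" >&2
            status=1
        fi
    done
    # Case-insensitive match so both .fcs and .FCS extensions are counted.
    n_fcs=$(find "$input_dir" -maxdepth 1 -type f -iname '*.fcs' | wc -l | tr -d ' ')
    if [ "$n_fcs" -eq 0 ]; then
        echo "no FCS files found in $input_dir" >&2
        status=1
    fi
    return "$status"
}
```

Running `check_input_dir data` reports each problem on standard error and returns a non-zero status if anything is missing.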

*Note: Please avoid spaces in your FCS file names. Feel free to use '_' instead of a space.
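
If existing files already contain spaces, they can be renamed in bulk. The helper below is a sketch, not part of cytoMine: the name `rename_fcs_spaces` is hypothetical, and it only touches files ending in .fcs or .FCS directly inside the given folder. Remember to update sample_info.csv if it refers to the old names.

```shell
#!/bin/sh
# Rename every FCS file directly inside the given directory so that spaces
# in the file name become underscores. Existing targets are never overwritten.
rename_fcs_spaces() {
    dir=$1
    for path in "$dir"/*.fcs "$dir"/*.FCS; do
        [ -e "$path" ] || continue      # glob matched nothing, or already moved
        name=$(basename "$path")
        case $name in
            *" "*) ;;                   # name contains a space: rename below
            *) continue ;;              # nothing to do for this file
        esac
        new=$(printf '%s' "$name" | tr ' ' '_')
        if [ -e "$dir/$new" ]; then
            echo "skipping '$name': '$new' already exists" >&2
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

For instance, `rename_fcs_spaces data` would turn `sample 1.fcs` into `sample_1.fcs` and leave files without spaces untouched.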