Avi Srivastava edited this page Aug 23, 2016
To run the RapClust pipeline, we need the following beforehand:
**1.** RNA-seq reads from the experiment under two different conditions, possibly with multiple replicates.
**2.** A *de novo* assembly (set of contigs) of the RNA-seq reads. The assembly can be performed using Trinity, which can be found [here](https://github.com/trinityrnaseq/trinityrnaseq/wiki).
**Note**: The input assembly can come from any standard assembler; Trinity is used here only as an example.
**3.** Quantification of the RNA-seq reads, separately for each condition, using the above set of contigs as the reference.
**Note**: Currently only [Sailfish](https://github.com/kingsfordgroup/sailfish) and [Salmon](https://github.com/COMBINE-lab/salmon) are supported.
**4.** RapClust source code/binary can be found [here](https://github.com/COMBINE-lab/RapClust).
## Pipeline:
### 1. *de novo* assembly:
`Trinity --seqType fq --left reads_1.fq --right reads_2.fq --CPU 6 --max_memory 20G`
* The output will be available as `Trinity.fasta` (i.e. the set of contigs).
* If you run into problems in this step, some tips are available [here](https://github.com/Oshlack/Corset/wiki/Example#perform-the-de-novo-assembly), or you can raise an issue [here](https://github.com/trinityrnaseq/trinityrnaseq).
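Before moving on to quantification, it can be worth confirming that the assembly is not empty. The following is a minimal sketch, assuming a FASTA file in which every contig has exactly one header line starting with `>`. The function name `count_contigs` and the file `demo_contigs.fa` are hypothetical and used only so the example runs without a real assembly.

```shell
# Count contigs by counting FASTA header lines (lines starting with '>').
count_contigs() {
    grep -c '^>' "$1"
}

# Tiny demo file (hypothetical) standing in for a real assembly.
printf '>contig1\nACGT\n>contig2\nGGCA\n' > demo_contigs.fa

count_contigs demo_contigs.fa   # prints 2
```

On a real run, point the function at the Trinity output instead, e.g. `count_contigs Trinity.fasta`.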
### 2. Quantification:
Here we can use either Sailfish or Salmon; the example below uses Sailfish.
* Clone and build Sailfish:
`git clone https://github.com/kingsfordgroup/sailfish.git`
`cd sailfish && mkdir build && cd build`
`cmake .. && make`
* Build the index for the reference (i.e. the set of contigs in our case), where `<kmer_len>` is the k-mer length (e.g. 31):
`sailfish index -t <ref_transcripts>/Trinity.fasta -o <out_dir>/index -k <kmer_len>`
* Quantify reads:
Depending on the number of replicates in each condition, Sailfish has to be run multiple times. Our example assumes two conditions (**A** and **B**) with three replicates (**1**, **2**, **3**) in each:
`parallel -j 6 "sailfish quant -i index -l IU -1 <(gunzip -c reads/{}_1.fq.gz) -2 <(gunzip -c reads/{}_2.fq.gz) -o {}_quant --dumpEq -p 4" ::: A1 A2 A3 B1 B2 B3`
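If GNU parallel is not available, the same six runs can be written as a plain loop. The sketch below only prints one Sailfish command per sample so the commands can be reviewed first. It assumes the same layout as above (`reads/<sample>_1.fq.gz`, `reads/<sample>_2.fq.gz`, an index directory named `index`); the helper name `quant_cmd` is hypothetical.

```shell
# Print the sailfish command for one sample (dry run; nothing is executed).
quant_cmd() {
    printf 'sailfish quant -i index -l IU -1 <(gunzip -c reads/%s_1.fq.gz) -2 <(gunzip -c reads/%s_2.fq.gz) -o %s_quant --dumpEq -p 4\n' "$1" "$1" "$1"
}

# Two conditions (A, B) with three replicates each, as in the example above.
for samp in A1 A2 A3 B1 B2 B3; do
    quant_cmd "$samp"
done
```

To actually run the commands one after another, pipe the loop's output to `bash` (the `<(...)` process substitution requires bash rather than plain `sh`).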
### 3. Clustering:
**Note**: A detailed explanation of this step can also be found [here](https://github.com/keyavi/RapClust/tree/master#using-rapclust).
* If you have conda, then RapClust can be installed directly, without having to handle its dependencies yourself.
`conda create --name rapclust_env python=3`
`source activate rapclust_env`
`conda install rapclust`
* [optional] If conda is not available, the commands below can be used to install Miniconda.
`wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh`
`bash Miniconda3-latest-Linux-x86_64.sh`
* Create the configuration file:
Create a file with the extension **.yaml** containing the following mandatory fields:
```
conditions:
    - A
    - B
samples:
    A:
        - A1_quant
        - A2_quant
        - A3_quant
    B:
        - B1_quant
        - B2_quant
        - B3_quant
outdir: <output_dir>/human_rapclust
```
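With many samples, writing this file by hand is error-prone, so it may be easier to generate it. The sketch below writes a config with the same three fields for conditions A and B with three replicates each. The file name `rapclust_config.yaml` and the output directory `rapclust_out` are illustrative assumptions, not names RapClust requires.

```shell
# Hypothetical file and directory names; adjust to your own layout.
config=rapclust_config.yaml
outdir=rapclust_out

{
    echo "conditions:"
    for cond in A B; do
        echo "    - $cond"
    done
    echo "samples:"
    for cond in A B; do
        echo "    $cond:"
        for rep in 1 2 3; do
            # Directory names match the -o argument used during quantification.
            echo "        - ${cond}${rep}_quant"
        done
    done
    echo "outdir: $outdir"
} > "$config"

cat "$config"
```

The nested entries are indented with spaces because YAML uses indentation to decide which samples belong to which condition.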
* Run RapClust
`RapClust --config <Name_of_file>.yaml`
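If the run fails because an input cannot be found, a quick check of the quantification directories listed in the config can narrow things down. This is a minimal sketch, assuming the directories are named `<sample>_quant` and sit in the current working directory; the function name `missing_quant_dirs` is hypothetical.

```shell
# Print the name of every expected <sample>_quant directory that does not exist.
missing_quant_dirs() {
    for samp in "$@"; do
        [ -d "${samp}_quant" ] || echo "${samp}_quant"
    done
}

missing=$(missing_quant_dirs A1 A2 A3 B1 B2 B3)
if [ -n "$missing" ]; then
    echo "Missing quantification output:" >&2
    echo "$missing" >&2
else
    echo "All quantification directories found."
fi
```

This only checks that the directories exist; it does not verify their contents.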