smRNA / miRNA Processing

Usage

Usage:

pipeline.sh
            [-r <genome_reference_file (FASTA)>]
            [-g <genome_feature_file (GTF/GFF)>]
            [-m <mirbase_file (FASTA)>]
            [-o <output_directory>]
            [-n <cores>]
            [-f (overwrite existing files)]
            [-k (keep temp files)]
            <sequence_file> [<additional_sequence_files> <will_be_merged> <before_processing>]
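
The option layout above maps naturally onto the shell's built-in getopts. The following is a minimal sketch of such parsing, not the code from pipeline.sh: the function name, the variable names, and the default of 1 core are assumptions made for illustration.

```shell
# Hypothetical sketch: parse the documented flags with POSIX getopts.
# Variable names and the default core count are illustrative assumptions.
parse_args() {
    REFERENCE="" ANNOTATION="" MIRBASE="" OUTDIR=""
    CORES=1 FORCE=0 KEEP_TMP=0
    OPTIND=1   # reset so the function can be called more than once
    while getopts "r:g:m:o:n:fk" opt; do
        case "$opt" in
            r) REFERENCE="$OPTARG" ;;   # genome reference (FASTA)
            g) ANNOTATION="$OPTARG" ;;  # feature file (GTF/GFF)
            m) MIRBASE="$OPTARG" ;;     # miRBase file (FASTA)
            o) OUTDIR="$OPTARG" ;;      # output directory
            n) CORES="$OPTARG" ;;       # number of cores
            f) FORCE=1 ;;               # overwrite existing files
            k) KEEP_TMP=1 ;;            # keep temp files
            *) return 1 ;;              # unknown flag or missing argument
        esac
    done
    shift $((OPTIND - 1))
    # Everything left over is a sequence file; at least one is required.
    if [ "$#" -lt 1 ]; then
        echo "error: at least one sequence file is required" >&2
        return 1
    fi
    N_SEQ_FILES=$#
    SEQ_FILES="$*"   # simplified: breaks on paths containing spaces
}
```

Calling `parse_args -r ref.fa -n 8 -f a.fastq b.fq` in this sketch leaves REFERENCE set to ref.fa, CORES to 8, FORCE to 1, and N_SEQ_FILES to 2.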

What It Does

Aligns single-read (not paired-end) RNA samples to a reference genome, annotates the alignments, and counts how many reads align to each annotated feature. Also aligns the reads against miRBase and plots the read lengths after trimming. More specifically:

Merges multiple FASTQ files into a single FASTQ file (simple concatenation)
Trims adapter sequences using cutadapt
Aligns to a reference file specified by the user using bowtie2
Tallies the number of times reads align to each feature in an annotation file specified by the user
Aligns to the miRBase microRNA database (http://www.mirbase.org)
Visualizes the read lengths after trimming on a histogram created with matplotlib
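
The first three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's actual code: the function names, the intermediate file names, and the ADAPTER placeholder are hypothetical, and the trimming and alignment commands are only printed, not run. The flags shown (cutadapt -a and -o; bowtie2 -p, -x, -U, and -S) are standard options of those tools, but the options pipeline.sh really passes may differ.

```shell
# Step 1: merge inputs by simple concatenation. FASTQ is line-based, so
# whole-file concatenation keeps every record intact, provided each input
# file ends with a newline.
merge_fastq() {
    merged="$1"; shift
    cat "$@" > "$merged"
}

# Steps 2-3: print (do not run) typical trimming and alignment commands.
# ADAPTER, trimmed.fastq and aligned.sam are placeholders.
print_trim_and_align() {
    merged="$1"; index_base="$2"; cores="$3"
    echo "cutadapt -a ADAPTER -o trimmed.fastq $merged"
    echo "bowtie2 -p $cores -x $index_base -U trimmed.fastq -S aligned.sam"
}
```

For example, `merge_fastq merged.fastq a.fastq b.fq` writes both inputs into merged.fastq, and `print_trim_and_align merged.fastq ref 8` prints the two commands that would trim and then align the merged reads on 8 cores.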

What It Does Not Do

Automatically generate a report or statistics. This must be done manually, although an example report in RST format is supplied.

Required Software

cutadapt
bowtie2
HTSeq (Python module)
matplotlib (Python module)

Input

Single-read sample data in FASTQ format
Reference genome for alignment in FASTA format (with bowtie2 indexes)
Annotation file in GTF/GFF format
miRBase file for alignment in FASTA format (with bowtie2 indexes)
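
Because both FASTA inputs need bowtie2 indexes, it can help to check for them before starting a run. The sketch below assumes the standard files written by bowtie2-build (six files ending in .bt2, or .bt2l for large indexes). The function name is hypothetical, and how the pipeline derives the index base name from the FASTA path is not stated here, so the base name is passed in explicitly.

```shell
# Hypothetical helper: succeed only if all six bowtie2 index files exist
# for the given base name (small .bt2 or large .bt2l variants).
has_bowtie2_index() {
    base="$1"
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -e "$base.$suffix.bt2" ] && [ ! -e "$base.$suffix.bt2l" ]; then
            echo "missing bowtie2 index file: $base.$suffix.bt2" >&2
            return 1
        fi
    done
    return 0
}
```

If the check fails, an index can be built with `bowtie2-build reference_file.fa <index_base_name>`.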

Output

Alignment files in BAM format
Annotated alignment files in BAM format
Feature frequency counts in CSV format
Visualization of read lengths in JPG format
A log of the commands executed and their output
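
The read-length plot summarizes how many trimmed reads have each length. The same numbers can be inspected as text with a short awk tally, sketched below. This is not part of the pipeline: the function name is hypothetical, and the sketch assumes four-line FASTQ records (sequences not wrapped across lines).

```shell
# Hypothetical helper: print "length,count" for every read length in a
# FASTQ file, sorted by length. Assumes 4 lines per record, with the
# sequence on line 2 of each record.
read_length_counts() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len "," n[len] }' "$1" |
        sort -t, -k1,1n
}
```

For a file holding reads of lengths 4, 6, and 4, `read_length_counts trimmed.fastq` prints `4,2` followed by `6,1`.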

Example Usage

bash pipeline.sh \
    -r path/to/reference_file.fa \
    -g path/to/annotation_file.gff \
    -m path/to/miRBase.fa \
    -o output_dir/ \
    -n 8 \
    -f \
    -k \
    data_file1.fastq data_file2.fq data_file3.fastq

mariogiov/miRNA_processing

Folders and files

Latest commit

History

Repository files navigation

smRNA / miRNA Processing

Usage

What It Does

What It Does Not Do

Required Software

Input

Output

Example Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages