A project housing some code and documentation for performing kinetics analyses using R.

oakeley/R-kinetics

The R-kinetics Project

This project provides a vignette describing how to perform kinetic analyses using the R tools provided by Pacific Biosciences. Its aim is to introduce users to Pacific Biosciences data, specifically the kinetics data collected during a sequencing experiment. In addition to describing the data structures, we provide instructions on using the pbh5 R API to explore the rich set of information provided.

Contents

  • analysis.pdf: This document provides an overview of the API functionality as well as instructions/guidelines on how to perform a kinetics analysis using PacBio data.

  • Src: contains example code and scripts that perform modification detection.

  • Data: contains two example datasets which have been generated by Pacific Biosciences with known modifications to facilitate development of detection algorithms. These datasets are identical to those that one would obtain by performing a sequencing run with a PacBio instrument.

  • ReferenceRepository: contains the two reference sequences used during this analysis.

Installation

This setup has been tested on Ubuntu 10.04, but it should work with any recent version of Linux or Mac OS X. The instructions assume a system-wide installation; if that is not what you want, you will probably need to install or build some components locally.

The pbh5 package, which provides all of the important tools for accessing the data and conducting the analysis, runs on stock R (>= 2.10) with the h5r package as its only dependency. To get started with stock R, it is therefore enough to install the h5r and pbh5 packages (steps 2 and 3 below).

1.) Obtaining a Recent Version of R, i.e., R >= 2.11.0

The default R on most package-managed Linux distributions is relatively old. The best way to get a newer version is to update R through your package management system. Clear instructions for different versions of Linux can be found here:

http://cran.fhcrc.org/

For Ubuntu lucid (10.04), add the following line to /etc/apt/sources.list:

deb http://cran.fhcrc.org/bin/linux/ubuntu lucid/

Then,

sudo apt-get update
sudo apt-get install r-base
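
After installing, it can help to confirm that the R on your PATH meets the 2.11.0 minimum named in this step. The following is a minimal sketch, not part of this project: the helper name version_ge is hypothetical, it assumes a sort that supports -V (as GNU coreutils does), and it assumes a released R whose first `R --version` line reads "R version X.Y.Z ...".

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Hypothetical helper; relies on `sort -V` (version sort) being available.
version_ge() {
    # If B is the smaller (or equal) of the two, it sorts first, so A >= B.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v R >/dev/null 2>&1; then
    # For released versions the first line looks like "R version X.Y.Z (date) ...".
    installed="$(R --version | awk 'NR == 1 { print $3 }')"
    if version_ge "$installed" "2.11.0"; then
        echo "R $installed is recent enough"
    else
        echo "R $installed is older than 2.11.0; consider upgrading"
    fi
else
    echo "R not found on PATH"
fi
```

If the check reports an older version, re-check that the CRAN line was added to /etc/apt/sources.list before running apt-get update.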

2.) Obtaining the HDF5 Libraries

We need to install both the headers and the shared libraries to access HDF5 files.

sudo apt-get install libhdf5-serial-1.8.4 libhdf5-serial-dev 
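
Because h5r is compiled against HDF5, a quick look for the hdf5.h header can catch a missing development package before the R install fails. This is only a sketch: the function name find_hdf5_header is hypothetical, and the candidate directories are assumptions that vary between distributions and releases.

```shell
#!/bin/sh
# find_hdf5_header DIR...: print the first directory that contains hdf5.h.
# Hypothetical helper; returns non-zero when no directory has the header.
find_hdf5_header() {
    for dir in "$@"; do
        if [ -f "$dir/hdf5.h" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Candidate include directories (assumptions; adjust for your system).
if found="$(find_hdf5_header /usr/include /usr/include/hdf5/serial /usr/local/include)"; then
    echo "HDF5 headers found in $found"
else
    echo "hdf5.h not found; install the HDF5 development package first"
fi
```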

3.) Installing h5r and pbh5 Packages

The easiest way to install h5r is via the "install.packages" function from within R, i.e.,

> install.packages("h5r")

Following successful installation of the h5r package, we install the pbh5 R package using the following:

 wget https://github.com/PacificBiosciences/R-pbh5/zipball/master -O pbh5.zip && unzip pbh5.zip 
 sudo R CMD INSTALL PacificBiosciences-R-pbh5*

4.) Installing Additional R Packages

A couple of other packages are necessary to execute the analysis.Rnw document. These are reasonably standard R packages and should install straightforwardly.

> install.packages(c("ggplot2", "xtable"))
> source("http://bioconductor.org/biocLite.R")
> biocLite("Biostrings")

5.) Installing pbutils R Package

Finally, we provide some useful functions that aren't specific to our data structures.

 wget https://github.com/PacificBiosciences/R-pbutils/zipball/master -O pbutils.zip && unzip pbutils.zip && sudo R CMD INSTALL PacificBiosciences-R-pbutils*
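
Once steps 3 through 5 are done, it may be worth checking that every package actually loads before running the document. The sketch below is an illustration rather than part of this project: missing_r_packages is a hypothetical helper, and it assumes that Rscript is on the PATH and that the installed package names match the names used in these instructions (h5r, pbh5, ggplot2, xtable, Biostrings, pbutils).

```shell
#!/bin/sh
# missing_r_packages PKG...: print the name of each package that R cannot load.
# Hypothetical helper; it starts one Rscript process per package.
missing_r_packages() {
    for pkg in "$@"; do
        # require() returns FALSE instead of stopping when a package is absent,
        # so we turn that into a non-zero exit status.
        if ! Rscript -e "if (!suppressWarnings(require('$pkg', character.only = TRUE, quietly = TRUE))) quit(save = 'no', status = 1)" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

if command -v Rscript >/dev/null 2>&1; then
    # Prints nothing when all packages load.
    missing_r_packages h5r pbh5 ggplot2 xtable Biostrings pbutils
else
    echo "Rscript not found on PATH; install R first (step 1)"
fi
```

Any name printed by the function points back to the step that installs it.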

6.) Running the Document

To generate analysis.pdf, simply run make. This downloads the data and then runs the code in analysis.Rnw.

make

At this point, a new analysis.pdf should have been generated. The step that most often fails is generating the PDF from the intermediate analysis.tex. This requires pdflatex, which may pull in more packages than a sysadmin is willing to install. In any case, a user can simply step through the document and run the individual code snippets.
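
For the case where pdflatex is unavailable, one possible way to get at the individual snippets is R's standard Stangle tool, which extracts the code chunks from an .Rnw file. The sketch below only decides which route applies and prints a suggestion. The build_mode helper is hypothetical, and using Stangle is a suggestion here, not something this project's Makefile is known to do.

```shell
#!/bin/sh
# build_mode: print "pdf" when pdflatex is on the PATH, otherwise "snippets".
# Hypothetical helper for choosing between `make` and stepping through the code.
build_mode() {
    if command -v pdflatex >/dev/null 2>&1; then
        echo "pdf"
    else
        echo "snippets"
    fi
}

case "$(build_mode)" in
    pdf)
        echo "pdflatex found: 'make' should be able to produce analysis.pdf"
        ;;
    snippets)
        # R CMD Stangle writes the extracted code chunks to analysis.R.
        echo "pdflatex missing: try 'R CMD Stangle analysis.Rnw' and run analysis.R step by step"
        ;;
esac
```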

Disclaimer

THIS WEBSITE AND CONTENT AND ALL SITE-RELATED SERVICES, INCLUDING ANY DATA, ARE PROVIDED "AS IS," WITH ALL FAULTS, WITH NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT OR FITNESS FOR A PARTICULAR PURPOSE. YOU ASSUME TOTAL RESPONSIBILITY AND RISK FOR YOUR USE OF THIS SITE, ALL SITE-RELATED SERVICES, AND ANY THIRD PARTY WEBSITES OR APPLICATIONS. NO ORAL OR WRITTEN INFORMATION OR ADVICE SHALL CREATE A WARRANTY OF ANY KIND. ANY REFERENCES TO SPECIFIC PRODUCTS OR SERVICES ON THE WEBSITES DO NOT CONSTITUTE OR IMPLY A RECOMMENDATION OR ENDORSEMENT BY PACIFIC BIOSCIENCES.
