- Authors: Kushal K Dey, Zhengrong Xing, Dongyue Xie, Matthew Stephens
dashr is a companion package to the ashr package by Matthew Stephens (see paper here) which is aimed at adaptive shrinkage of compositional counts data.
This model assumes a mixture Dirichlet distribution for the true compositional proportions of categories, with the flexibility that the mixture may in theory comprise of infinitely many components of varying concentrations but all having the same known mean. The mixing proportions of these components are estimated from the model
When there are only two categories of composition, the mixture Dirichlet distribution would essentially reduce to mixture Beta distribution and dashr uses this distribution to model the message flow proportions along a Poisson multiscale tree model for time varying counts data to perform adaptive smoothing
To install the dashr package, please install the dependencies if not loaded already.
install.packages("devtools")
install.packages("Rcpp")
install.packages("inline")
install.packages("LaplacesDemon")
devtools::install_github("stephens999/ashr")
To perform the logo plot representations, please install the Logolas package and for comparing with the SMASH output as in the vignettes, please install the smashr package.
Then install the dashr package.
devtools::install_github("kkdey/dashr")
and then load the package in your R environment.
library(dashr)
For a more detailed description of the models underlying the dashr package, together with example applications, please check our vignette.
We first provide an example application of dashr for shrinking the base compositional probabilities in a simulated Trasncription Factor Binding Site (TFBS) example. We first generate the compositional counts data under this setting.
xmat <- cbind(c(5, 0, 2, 0),
c(1, 1, 0, 1),
c(100, 100, 50, 100),
c(20, 50, 100, 10),
c(10, 10, 200, 20),
c(50, 54, 58, 53),
c(1,1,1,3),
c(2, 4, 1, 1))
rownames(xmat) <- c("A", "C", "G", "T")
colnames(xmat) <- paste0("pos-", 1:dim(xmat)[2])
xmat_norm <- apply(xmat, 2, function(x) return(x/sum(x)))
xmat
We then fit the Dirichlet adaptive shrinkage (dash) model to the above compositional counts matrix.
out <- dash(xmat, optmethod = "mixEM", verbose=FALSE, bf=TRUE)
We present the logo plot representations (package Logolas) of the compositional weight matrix.
grid.newpage()
layout.rows <- 1
layout.cols <- 2
top.vp <- viewport(layout=grid.layout(layout.rows, layout.cols,
widths=unit(rep(6,layout.cols), rep("null", 2)),
heights=unit(c(20,50), rep("lines", 2))))
plot_reg <- vpList()
l <- 1
for(i in 1:layout.rows){
for(j in 1:layout.cols){
plot_reg[[l]] <- viewport(layout.pos.col = j, layout.pos.row = i, name = paste0("plotlogo", l))
l <- l+1
}
}
plot_tree <- vpTree(top.vp, plot_reg)
color_profile = list("type" = "per_row",
"col" = RColorBrewer::brewer.pal(4,name ="Spectral"))
pushViewport(plot_tree)
seekViewport(paste0("plotlogo", 1))
logomaker(xmat_norm,color_profile = color_profile,
frame_width = 1,
pop_name = "pre dash PWM",
newpage = F)
seekViewport(paste0('plotlogo',2))
logomaker(out$posmean,color_profile = color_profile,
frame_width = 1,
pop_name = "post dash PWM",
newpage = F)
We now present an example illustration of adaptive smoothing using the dashr package.
mu <- c(rep(10, 100), rep(20, 100), rep(30, 100), rep(10, 100))
x <- sapply(mu, function(x) rpois(1,x))
out <- dashr::dash_smooth(x)
plot(x, col = "gray80")
lines(mu, col = "black", lwd = 4)
lines(out$estimate, col = "red", lwd = 4)
legend("topright", # places a legend at the appropriate place
c("truth","dash-smooth"), # puts text in the legend
lty=c(1,1), # gives the legend appropriate symbols (lines)
lwd=c(2.5,2.5),
cex = 0.6,
col=c("black","red"))
The dashr package is availabel under GPL (>=2) license.
The authors would like to acknowledge Peter Carbonetto for helpful feedback.
For any queries or questions or bugs, please open an issue in this Github page or write to Kushal K Dey at kshldey@gmail.com.