forked from GuangchuangYu/enrichment4GTEx_clusterProfiler
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.rmd
56 lines (32 loc) · 2.15 KB
/
README.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
# functional enrichment for GTEx paper
### Guangchuang Yu
#### 07/15, 2015
Refer to the [issue](https://github.com/GuangchuangYu/DOSE/issues/6), this document will reproduce functional analysis using [DOSE](http://www.bioconductor/packages/DOSE) and [clusterProfiler](http://www.bioconductor.org/packages/clusterProfiler) packages.
The paper and supplemental files are located at [paper folder](paper).
In page 63 of [supplemental file](paper/Mele.SM.pdf), the authors mentioned that *genes with high contribution of individuals to splicing variation* were used in GO enrichment analysis.
![](figures/Screenshot 2015-07-15 20.03.05.png)
The **Individual and Tissue contribution to variation of splicing** was stored in [supplemental table 17](paper/LargeSupplementTABLES_May1st_2015.xlsx), which was exported as a [csv file](tableS17.csv).
```{r}
require(magrittr)
require(DOSE)
require(RDAVIDWebService)
require(clusterProfiler)
table17 <- read.csv("paper/tableS17.csv")
gene <- with(table17, gene_id[ Individuals > quantile(Individuals, 0.98)])
gene %<>% as.character %>% gsub("\\.\\d+", "", .)
head(gene)
length(gene)
save(gene, file="cache/gene.rda")
```
The authors did not mention how they define __*high contribution*__, here I use the top 2% of genes and get `r length(gene)` genes with 168 genes that could be mapped in the DAVID database. Slightly large than the number (139) reported in [supplemental file](paper/Mele.SM.pdf).
The GO enrichment result reported in [supplemental table 18](paper/LargeSupplementTABLES_May1st_2015.xlsx) is:
![](figures/Screenshot 2015-07-15 19.49.01.png)
I pasted the genes into DAVID and got similar results.
![](figures/Screenshot 2015-07-15 21.10.11.png)
![](figures/Screenshot 2015-07-15 21.10.28.png)
Although the gene list I selected here is slighly different from the one author selected (which we don't know), it can reproduce the results reported in the paper.
Here, I use these genes to perform enrichment analyses and compare results from DAVID and clusterProfiler.
+ [GO (BP) enrichment analysis](GO_BP.md)
+ [KEGG enrichment analysis](KEGG.md)
+ [DO enrichment analysis](DO.md)
+ [Reactome enrichment analysis](Reactome.md)