We have noted a substantial increase in the proportion of non-biallelic SNPs between dbSNP 144 and 155 (from ~1% to 24%). This is explained in detail here and summarised in the code below (function written by @hpages).
Note that this is based on the dbSNP versions in Bioconductor (SNPlocs.Hsapiens.dbSNP), so we would like to confirm that it aligns with dbSNP versions outside of Bioconductor, but there don't seem to be any numbers available online. The only piece of work about this is quite old now and doesn't give specific numbers: "SNPs that come in threes". It would be great if anyone has any insight on this.
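As a rough illustration of how a per-category summary like this could be tabulated, here is a minimal base R sketch. It is not @hpages's function, and the two data frames are toy values, not dbSNP data. The column names `rsid` and `ambig`, and the reading of "dbSNP 155 non-biallelic" as "biallelic in 144 but not in 155", are assumptions made for the example. It relies on the IUPAC ambiguity codes, where the two-allele codes are R, Y, S, W, K and M.

```r
# Toy stand-ins for the two SNP tables (NOT real dbSNP data).
# Column names `rsid` and `ambig` are hypothetical, chosen for this sketch.
snps144 <- data.frame(rsid  = c("rs1", "rs2", "rs3", "rs4"),
                      ambig = c("R", "Y", "B", "K"))
snps155 <- data.frame(rsid  = c("rs1", "rs2", "rs3", "rs4"),
                      ambig = c("R", "H", "N", "K"))

# IUPAC codes that stand for exactly two alleles; every other code is
# treated as non-biallelic here.
biallelic_codes <- c("R", "Y", "S", "W", "K", "M")

# Compare only SNPs present in both tables.
both  <- merge(snps144, snps155, by = "rsid", suffixes = c("_144", "_155"))
bi144 <- both$ambig_144 %in% biallelic_codes
bi155 <- both$ambig_155 %in% biallelic_codes

# Assign each shared SNP to one of four categories.
lvls <- c("both biallelic", "both non-biallelic",
          "dbSNP 155 non-biallelic", "dbSNP 144 non-biallelic")
category <- ifelse(bi144 & bi155, lvls[1],
            ifelse(!bi144 & !bi155, lvls[2],
            ifelse(bi144, lvls[3], lvls[4])))

# One row per category, with counts and proportions.
counts <- table(factor(category, levels = lvls))
summ <- data.frame(match = names(counts), N = as.integer(counts))
summ$prop <- summ$N / sum(summ$N)
print(summ)
```

The resulting `summ` has the `match`, `N` and `prop` columns that a bar chart of these categories would need. With real data the same tabulation would run over the shared rsIDs of the two SNPlocs packages rather than toy vectors.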
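library(ggplot2)
library(cowplot)

# `summ` is assumed to hold one row per category, with columns `match`
# (category label), `N` (SNP count) and `prop` (proportion of all SNPs).
pastel_cols <- c("#9A8822", "#F5CDB4", "#F8AFA8",
                 "#FDDDA0", "#74A089", "#85D4E3",
                 # added extra to make 7
                 "#78A2CC")

# Horizontal bar chart of SNP counts per category, labelled with count and percentage.
ggplot(summ, aes(x = factor(match, levels = rev(c("both biallelic",
                                                  "both non-biallelic",
                                                  "dbSNP 155 non-biallelic",
                                                  "dbSNP 144 non-biallelic"))),
                 y = N)) +
  geom_bar(stat = "identity", fill = pastel_cols[c(2, 3, 2, 2)]) +
  geom_text(label = with(summ, paste(prettyNum(N, big.mark = ",", scientific = FALSE),
                                     paste0("\n (", round(prop * 100, 2), "%)"))),
            hjust = -.3) +
  # extra room on the count axis so the text labels are not clipped
  ylim(0, max(summ$N) + 20000000) +
  coord_flip() +
  theme_cowplot() +
  theme(axis.title.y = element_blank())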
Al-Murphy changed the title from "Hig proportion of non-biallelic SNPs in dbSNP 155 vs 144" to "High proportion of non-biallelic SNPs in dbSNP 155 vs 144" on Aug 9, 2022
And just to visually represent them: