GWAS: 曼哈顿图,QQ plot 图,膨胀系数( manhattan、Genomic Inflation Factor)

画曼哈顿图和QQ plot 首推R包“qqman”,简约方便。下面具体介绍以下。

一、画曼哈顿图

install.packages("qqman")

library(qqman)

  

1、准备包含SNP, CHR, BP, P的文件gwasResults(如果没有zscore可以不用管),如下所示:

2、上代码,如下所示:

manhattan(gwasResults)

  

plot of chunk unnamed-chunk-4

如果觉得不够美观,考虑添加一下参数:

manhattan(gwasResults, main = "Manhattan Plot", ylim = c(0, 10), cex = 0.6, cex.axis = 0.9, col = c("blue4", "orange3"), suggestiveline = F, genomewideline = 6, chrlabs = c(1:20, "P", "Q"))

  

plot of chunk unnamed-chunk-5

二、画 QQ plot 图

直接上代码:

qq(gwasResults$P)

  

plot of chunk unnamed-chunk-12

同样的,还可以修改参数,美观一下:

qq(gwasResults$P, main = "Q-Q plot of GWAS p-values", xlim = c(0, 7), ylim = c(0,    12), pch = 18, col = "blue4", cex = 1.5, las = 1)

  

plot of chunk unnamed-chunk-13

 

三、计算膨胀系数(Calculating Genomic Inflation Factor)

 

如果数据类型为Z值,则膨胀系数为:

z = gwasResults$zscore
lambda = round(median(z^2) / 0.454, 3)

  

如果数据类型是P值,则膨胀系数为:

p_value=gwasResults$P

z = qnorm(p_value/ 2)

lambda = round(median(z^2, na.rm = TRUE) / 0.454, 3)

  

​如果数据类型是CHISQ值,则膨胀系数为:

z = gwasResults$CHISQ

lambda = round(median(z^2, na.rm = TRUE) / 0.454, 3)

  

 

#关于0.454的由来:

#qchisq(0.5, 1)

#[1] 0.4549364

 

膨胀系数lambda的解读:

基因组膨胀因子λ定义为经验观察到的检验统计分布与预期中位数的中值之比,从而量化了因大量膨胀而造成结果的假阳性率。换句话说,λ定义为得到的卡方检验统计量的中值除以卡方分布的预期中值。预期的P值膨胀系数为1,当实际膨胀系数越偏离1,说明存在群体分层的现象越严重,容易有假阳性结果,需要重新矫正群体分层。



 

参考链接:

https://www.biostars.org/p/43328/

https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html

https://en.wikipedia.org/wiki/Population_stratification

chrome-extension://cdonnmffkdaoajfknoeeecmchibpmkmg/static/pdf/web/viewer.html?file=https%3A%2F%2Fpersonal.broadinstitute.org%2Fchrisnc%2Fothers%2FGWAS_Smith_MethMolBiol09.pdf

chrome-extension://cdonnmffkdaoajfknoeeecmchibpmkmg/static/pdf/web/viewer.html?file=http%3A%2F%2Fwww3.stat.sinica.edu.tw%2Fsisg2015%2Fdownload%2Fhandout%2FS1-notes.pdf

posted @ 2019-01-25 11:27  橙子牛奶糖  阅读(14869)  评论(18编辑  收藏  举报