Joint Modelling of Confounding Factors and Prominent Genetic Regulators Provides Increased Accuracy in Genetical Genomics Studies

Cited by: 73
Authors
Fusi, Nicolo [1]
Stegle, Oliver [2]
Lawrence, Neil D. [1]
Affiliations
[1] Univ Sheffield, Sheffield Inst Translat Neurosci, Sheffield, S Yorkshire, England
[2] Max Planck Inst Dev Biol, Machine Learning & Computat Biol Res Grp, Tubingen, Germany
Keywords
EXPRESSION; ASSOCIATION;
DOI
10.1371/journal.pcbi.1002330
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Expression quantitative trait loci (eQTL) studies are an integral tool for investigating the genetic component of gene expression variation. A major challenge in the analysis of such studies is the presence of hidden confounding factors, such as unobserved covariates or unknown subtle environmental perturbations. These factors can induce a pronounced artifactual correlation structure in the expression profiles, which may create spurious associations or mask real genetic association signals. Here, we report PANAMA (Probabilistic ANAlysis of genoMic dAta), a novel probabilistic model to account for confounding factors within an eQTL analysis. In contrast to previous methods, PANAMA learns hidden factors jointly with the effect of prominent genetic regulators. As a result, this new model can more accurately distinguish true genetic association signals from confounding variation. We applied our model and compared it to existing methods on different datasets and biological systems. PANAMA consistently performs better than alternative methods and, in particular, finds substantially more trans regulators. Importantly, our approach not only identifies a greater number of associations, but also yields hits that are biologically more plausible and can be better reproduced between independent studies. A software implementation of PANAMA is freely available online at http://ml.sheffield.ac.uk/qtl/.
Pages: 9