Learning a Prior on Regulatory Potential from eQTL Data

Cited by: 145
Authors
Lee, Su-In [1]
Dudley, Aimee M. [2]
Drubin, David [3]
Silver, Pamela A. [3]
Krogan, Nevan J. [4]
Pe'er, Dana [5]
Koller, Daphne [1]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Inst Syst Biol, Seattle, WA USA
[3] Harvard Univ, Sch Med, Dept Syst Biol, Boston, MA USA
[4] Univ Calif San Francisco, Dept Mol & Cellular Pharmacol, San Francisco, CA 94143 USA
[5] Columbia Univ, Dept Biol Sci, New York, NY 10027 USA
Source
PLOS GENETICS | 2009 / Volume 5 / Issue 01
Funding
National Science Foundation (USA);
Keywords
CYTOPLASMIC PROCESSING BODIES; SINGLE-NUCLEOTIDE POLYMORPHISMS; QUANTITATIVE TRAIT LOCUS; SACCHAROMYCES-CEREVISIAE; GENE-EXPRESSION; MESSENGER-RNAS; TRANSCRIPTION FACTORS; GENOMICS APPROACH; GLOBAL ANALYSIS; BUDDING YEAST;
DOI
10.1371/journal.pgen.1000358
Chinese Library Classification
Q3 [Genetics];
Discipline classification codes
071007 ; 090102 ;
Abstract
Genome-wide RNA expression data provide a detailed view of an organism's biological state; hence, a dataset measuring expression variation between genetically diverse individuals (eQTL data) may provide important insights into the genetics of complex traits. However, with data from a relatively small number of individuals, it is difficult to distinguish true causal polymorphisms from the large number of possibilities. The problem is particularly challenging in populations with significant linkage disequilibrium, where traits are often linked to large chromosomal regions containing many genes. Here, we present a novel method, Lirnet, that automatically learns a regulatory potential for each sequence polymorphism, estimating how likely it is to have a significant effect on gene expression. This regulatory potential is defined in terms of "regulatory features" (including the function of the gene and the conservation, type, and position of genetic polymorphisms) that are available for any organism. The extent to which the different features influence the regulatory potential is learned automatically, making Lirnet readily applicable to different datasets, organisms, and feature sets. We apply Lirnet both to the human HapMap eQTL dataset and to a yeast eQTL dataset and provide statistical and biological results demonstrating that Lirnet produces significantly better regulatory programs than other recent approaches. We demonstrate in the yeast data that Lirnet can correctly suggest a specific causal sequence variation within a large, linked chromosomal region. In one example, Lirnet uncovered a novel, experimentally validated connection between Puf3 (a sequence-specific RNA binding protein) and P-bodies (cytoplasmic structures that regulate translation and RNA stability), as well as the particular causative polymorphism, a SNP in Mkt1, that induces the variation in the pathway.
Pages: 24