Exploring the functional landscape of gene expression: directed search of large microarray compendia

Cited by: 186
Authors
Hibbs, Matthew A.
Hess, David C.
Myers, Chad L.
Huttenhower, Curtis
Li, Kai
Troyanskaya, Olga G. [1]
Affiliations
[1] Princeton Univ, Lewis Sigler Inst Integrat Genom, Carl Icahn Lab, Princeton, NJ 08544 USA
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
DOI
10.1093/bioinformatics/btm403
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The increasing availability of gene expression microarray technology has resulted in the publication of thousands of microarray gene expression datasets investigating various biological conditions. This vast repository is still underutilized due to the lack of methods for fast, accurate exploration of the entire compendium.

Results: We have collected Saccharomyces cerevisiae gene expression microarray data containing roughly 2400 experimental conditions. We analyzed the functional coverage of this collection and we designed a context-sensitive search algorithm for rapid exploration of the compendium. A researcher using our system provides a small set of query genes to establish a biological search context; based on this query, we weight each dataset's relevance to the context, and within these weighted datasets we identify additional genes that are co-expressed with the query set. Our method exhibits an average increase in accuracy of 273% compared to previous mega-clustering approaches when recapitulating known biology. Further, we find that our search paradigm identifies novel biological predictions that can be verified through further experimentation. Our methodology provides the ability for biological researchers to explore the totality of existing microarray data in a manner useful for drawing conclusions and formulating hypotheses, which we believe is invaluable for the research community.
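
The query-driven search described in the abstract (weight each dataset by its relevance to the query genes, then rank other genes by co-expression within the weighted datasets) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how relevance or co-expression is measured, so the sketch assumes Pearson correlation for both, a dataset weight equal to the mean pairwise correlation among query genes (floored at zero), and a weighted mean as the final gene score. The function names (`pearson`, `dataset_weight`, `search`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dataset_weight(dataset, query):
    """Assumed relevance measure: mean pairwise correlation among the
    query genes present in this dataset, floored at zero."""
    present = [g for g in query if g in dataset]
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    mean_r = sum(pearson(dataset[a], dataset[b]) for a, b in pairs) / len(pairs)
    return max(mean_r, 0.0)


def search(compendium, query):
    """Rank non-query genes by their weighted mean correlation with the query.

    compendium: list of datasets, each a dict mapping gene name -> expression profile.
    Returns (genes ordered from most to least co-expressed, per-dataset weights).
    """
    weights = [dataset_weight(d, query) for d in compendium]
    totals, norm = {}, {}
    for dataset, w in zip(compendium, weights):
        if w == 0.0:
            continue  # dataset judged irrelevant to this query context
        present = [g for g in query if g in dataset]
        for gene, profile in dataset.items():
            if gene in query:
                continue
            r = sum(pearson(profile, dataset[q]) for q in present) / len(present)
            totals[gene] = totals.get(gene, 0.0) + w * r
            norm[gene] = norm.get(gene, 0.0) + w
    scores = {g: totals[g] / norm[g] for g in totals}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, weights


# Toy compendium (made-up values, for illustration only).
compendium = [
    {  # query genes rise together here, so this dataset gets a high weight
        "Q1": [1.0, 2.0, 3.0, 4.0],
        "Q2": [2.0, 4.0, 6.0, 8.0],
        "A": [1.0, 2.1, 2.9, 4.2],
        "B": [4.0, 3.0, 2.0, 1.0],
    },
    {  # query genes are anti-correlated here, so this dataset gets weight 0
        "Q1": [1.0, 2.0, 3.0, 4.0],
        "Q2": [4.0, 3.0, 2.0, 1.0],
        "A": [4.0, 1.0, 3.0, 2.0],
        "B": [1.0, 2.0, 3.0, 4.0],
    },
]

ranking, weights = search(compendium, ["Q1", "Q2"])
print(ranking, weights)
```

In the toy data, the second dataset contributes nothing because the query genes do not co-vary in it, which is the intended effect of context-sensitive weighting: only conditions in which the query behaves coherently influence the ranking.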
Pages: 2692-2699
Page count: 8