In silico gene function prediction using ontology-based pattern identification

Cited by: 55
Authors
Zhou, YY [1]
Young, JA
Santrosyan, A
Chen, KS
Yan, SF
Winzeler, EA
Affiliations
[1] Novartis Res Fdn, Genom Inst, San Diego, CA 92121 USA
[2] Scripps Res Inst, Dept Cell Biol, La Jolla, CA 92037 USA
Keywords
DOI
10.1093/bioinformatics/bti111
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: With the emergence of genome-wide expression profiling data sets, the guilt by association (GBA) principle has been a cornerstone for deriving gene functional interpretations in silico. Given the limited success of traditional methods for producing clusters of genes with great amounts of functional similarity, new data-mining algorithms are required to fully exploit the potential of high-throughput genomic approaches.

Results: Ontology-based pattern identification (OPI) is a novel data-mining algorithm that systematically identifies expression patterns that best represent existing knowledge of gene function. Instead of relying on a universal threshold of expression similarity to define functionally related groups of genes, OPI finds the optimal analysis settings that yield gene expression patterns and gene lists that best predict gene function using the principle of GBA. We applied OPI to a publicly available gene expression data set on the life cycle of the malarial parasite Plasmodium falciparum and systematically annotated genes for 320 functional categories based on current Gene Ontology annotations. An ontology-based hierarchical tree of the 320 categories provided a systems-wide biological view of this important malarial parasite.
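The abstract's central idea, choosing a similarity cutoff per functional category rather than one universal threshold, can be illustrated with a small sketch. This is not the published OPI procedure: the abstract does not state the similarity measure, the query pattern, or the scoring statistic, so Pearson correlation, a caller-supplied query profile, and a hypergeometric enrichment p-value are all assumptions made here for illustration. The function names (`pearson`, `hypergeom_tail`, `best_cutoff`) and the toy data are hypothetical.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are annotated."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def best_cutoff(profiles, annotated, query):
    """Rank genes by similarity to `query`, then pick the list length whose
    enrichment for `annotated` genes has the smallest p-value (assumed score)."""
    ranked = sorted(profiles, key=lambda g: pearson(profiles[g], query),
                    reverse=True)
    N, K = len(ranked), len(annotated)
    best_size, best_p = 0, 1.0
    hits = 0
    for n, gene in enumerate(ranked, start=1):
        hits += gene in annotated
        p = hypergeom_tail(hits, N, K, n)
        if p < best_p:
            best_size, best_p = n, p
    return ranked[:best_size], best_p


# Toy expression profiles (hypothetical values, four time points each).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 9],
    "g3": [1, 3, 2, 5],
    "g4": [4, 3, 2, 1],
    "g5": [5, 1, 4, 2],
    "g6": [3, 3, 4, 3],
}
annotated = {"g1", "g2"}  # genes already assigned to one category
# Assumed query pattern: the mean profile of the annotated genes.
query = [sum(profiles[g][t] for g in annotated) / len(annotated)
         for t in range(4)]

cluster, p_value = best_cutoff(profiles, annotated, query)
print(sorted(cluster), p_value)
```

Because the cutoff is chosen separately for each category, a tightly co-expressed category can end up with a short, stringent gene list while a looser one keeps a longer list. Scanning many cutoffs inflates significance, so any real use of such a score would need a multiple-testing correction.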
Pages: 1237-1245
Page count: 9