A literature network of human genes for high-throughput analysis of gene expression

Cited by: 146
Authors
Tor-Kristian Jenssen
Astrid Lægreid
Jan Komorowski
Eivind Hovig
Affiliations
[1] Norwegian University of Science and Technology, Department of Computer and Information Science
[2] Norwegian University of Science and Technology, Department of Physiology and Biomedical Engineering
[3] Institute for Cancer Research, Department of Tumour Biology
[4] The Norwegian Radium Hospital, Department of Computer and Information Science
[5] Linköping University
DOI
10.1038/ng0501-21
Abstract
We have carried out automated extraction of explicit and implicit biomedical knowledge from publicly available gene and text databases to create a gene-to-gene co-citation network for 13,712 named human genes by automated analysis of titles and abstracts in over 10 million MEDLINE records. The associations between genes have been annotated by linking genes to terms from the medical subject heading (MeSH) index and terms from the gene ontology (GO) database. The extracted database and accompanying web tools for gene-expression analysis have collectively been named 'PubGene'. We validated the extracted networks by three large-scale experiments showing that co-occurrence reflects biologically meaningful relationships, thus providing an approach to extract and structure known biology. We validated the applicability of the tools by analyzing two publicly available microarray data sets.
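The co-citation idea in the abstract can be illustrated with a minimal sketch: two genes are linked when they are mentioned in the same title or abstract, and the link weight is the number of records mentioning both. This is not the PubGene implementation. The function name `build_cocitation_network`, the naive whole-token matching, and the three toy records are all assumptions made for illustration; recognizing gene names in real MEDLINE text is considerably harder.

```python
from collections import Counter
from itertools import combinations
import re


def build_cocitation_network(records, gene_symbols):
    """Count, for each pair of genes, the records that mention both.

    records: iterable of title/abstract strings (toy stand-ins for MEDLINE records).
    gene_symbols: iterable of gene symbols to look for (hypothetical input).
    Returns a Counter mapping sorted (gene_a, gene_b) tuples to record counts.
    """
    symbols = set(gene_symbols)
    edges = Counter()
    for text in records:
        # Naive whole-token matching; an assumption, not the paper's method.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text))
        found = sorted(symbols & tokens)
        # Each unordered pair found in the same record adds one to its edge weight.
        for pair in combinations(found, 2):
            edges[pair] += 1
    return edges


# Toy records, invented for illustration only.
records = [
    "TP53 and MDM2 interact in the DNA damage response.",
    "MDM2 regulates TP53 stability; BRCA1 is also implicated.",
    "BRCA1 mutations in breast cancer.",
]
network = build_cocitation_network(records, ["TP53", "MDM2", "BRCA1"])
```

In this toy input, the TP53 and MDM2 pair appears in two records, so it gets the heaviest edge. A record that mentions only one gene contributes no edges.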
Pages: 21-28
Number of pages: 8