A text-mining analysis of the human phenome

Cited by: 501
Authors
van Driel, MA
Bruggeman, J
Vriend, G
Brunner, HG
Leunissen, JA
Affiliations
[1] Univ Nijmegen, Ctr Med, Dept Human Genet, NL-6525 GA Nijmegen, Netherlands
[2] Radboud Univ Nijmegen, Ctr Mol & Biomol Informat, NL-6525 ED Nijmegen, Netherlands
[3] Univ Wageningen & Res Ctr, Dept Bioinformat, NL-6703 HA Wageningen, Netherlands
Keywords
phenome; text mining; candidate disease genes; phenotype-genotype relations
DOI
10.1038/sj.ejhg.5201585
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A number of large-scale efforts are underway to define the relationships between genes and proteins in various species. However, few attempts have been made to systematically classify all such relationships at the phenotype level. Also, it is unknown whether such a phenotype map would carry biologically meaningful information. We have used text mining to classify over 5000 human phenotypes contained in the Online Mendelian Inheritance in Man database. We find that similarity between phenotypes reflects biological modules of interacting functionally related genes. These similarities are positively correlated with a number of measures of gene function, including relatedness at the level of protein sequence, protein motifs, functional annotation, and direct protein-protein interaction. Phenotype grouping reflects the modular nature of human disease genetics. Thus, phenotype mapping may be used to predict candidate genes for diseases as well as functional relations between genes and proteins. Such predictions will further improve if a unified system of phenotype descriptors is developed. The phenotype similarity data are accessible through a web interface at http://www.cmbi.ru.nl/MimMiner/.
Pages: 535-542
Page count: 8