Measuring semantic similarity between Gene Ontology terms

Cited by: 138
Authors
Couto, Francisco M. [1]
Silva, Mario J.
Coutinho, Pedro M.
Affiliations
[1] Univ Lisbon, Dept Informat, Fac Ciencias, P-1699 Lisbon, Portugal
[2] CNRS, UMR 6098, Marseille, France
[3] Univ Aix Marseille 1, Marseille, France
[4] Univ Aix Marseille 2, F-13284 Marseille 07, France
Keywords
knowledge manipulation technique; semantic similarity; gene ontology; bioinformatics
DOI
10.1016/j.datak.2006.05.003
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many bioinformatics applications would benefit from comparing proteins based on their biological role rather than their sequence. This paper makes two contributions. First, a study of the correlation between Gene Ontology (GO) terms and family similarity demonstrates that protein families constitute an appropriate baseline for validating GO similarity. Second, we introduce GraSM, a novel method that uses all the information in the graph structure of the Gene Ontology, instead of treating it as a hierarchical tree. GraSM gives a consistently higher family similarity correlation on all aspects of GO than the original semantic similarity measures. © 2006 Elsevier B.V. All rights reserved.
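The difference between treating an ontology as a tree and as a graph can be illustrated with a small sketch. It is not the GraSM algorithm, which this record does not specify. It assumes a Resnik-style measure (similarity is the information content of the most informative common ancestor) and uses a made-up toy ontology. The term names, the `PARENTS` and `PROB` tables, and the `first_parent_only` switch are all hypothetical.

```python
import math

# Hypothetical toy ontology: child -> list of parents (not real GO terms).
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "c": ["a", "b"],  # two parents: this makes the ontology a DAG, not a tree
    "d": ["b"],
}

# Hypothetical annotation probabilities p(term); the root covers everything.
PROB = {"root": 1.0, "a": 0.5, "b": 0.4, "c": 0.1, "d": 0.2}


def ancestors(term, first_parent_only=False):
    """Return the term together with every ancestor reachable from it.

    With first_parent_only=True only the first listed parent is followed,
    which mimics reading the ontology as a tree.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = PARENTS[current]
        stack.extend(parents[:1] if first_parent_only else parents)
    return seen


def information_content(term):
    """Information content: rarer terms are more informative."""
    return -math.log(PROB[term])


def resnik_similarity(t1, t2, first_parent_only=False):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1, first_parent_only) & ancestors(t2, first_parent_only)
    return max(information_content(t) for t in common)


# Tree reading: "c" is reached only through "a", so "c" and "d" share just the root.
tree_sim = resnik_similarity("c", "d", first_parent_only=True)
# Graph reading: "c" also descends from "b", which is an ancestor of "d".
graph_sim = resnik_similarity("c", "d")
```

In this toy case the tree reading finds only the root as a shared ancestor, so the similarity is zero, while the graph reading also finds `b` and returns a positive score. That is the kind of information a tree-only traversal discards.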
Pages: 137-152
Number of pages: 16