Assessing semantic similarity measures for the characterization of human regulatory pathways

Cited by: 126
Authors
Guo, X [1]
Liu, RX
Shriver, CD
Hu, H
Liebman, MN
Affiliations
[1] Windber Res Inst, Windber, PA 15963 USA
[2] GlaxoSmithKline Pharmaceut R&D, King Of Prussia, PA 19420 USA
[3] Walter Reed Army Med Ctr, Washington, DC 20307 USA
Keywords
DOI
10.1093/bioinformatics/btl042
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Pathway modeling requires the integration of multiple data including prior knowledge. In this study, we quantitatively assess the application of Gene Ontology (GO)-derived similarity measures for the characterization of direct and indirect interactions within human regulatory pathways. The characterization would help the integration of prior pathway knowledge for the modeling.
Results: Our analysis indicates information content-based measures outperform graph structure-based measures for stratifying protein interactions. Measures in terms of GO biological process and molecular function annotations can be used alone or together for the validation of protein interactions involved in the pathways. However, GO cellular component-derived measures may not have the ability to separate true positives from noise. Furthermore, we demonstrate that the functional similarity of proteins within known regulatory pathways decays rapidly as the path length between two proteins increases. Several logistic regression models are built to estimate the confidence of both direct and indirect interactions within a pathway, which may be used to score putative pathways inferred from a scaffold of molecular interactions.
Pages: 967-973
Page count: 7