Automatic pathway building in biological association networks

Cited by: 66
Authors
Yuryev, A [1]
Mulyukov, Z [1]
Kotelnikova, E [1]
Maslov, S [1]
Egorov, S [1]
Nikitin, A [1]
Daraselia, N [1]
Mazo, I [1]
Affiliations
[1] Ariadne Genomics Inc, Rockville, MD 20850 USA
Keywords
DOI
10.1186/1471-2105-7-171
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Scientific literature is a source of the most reliable and comprehensive knowledge about molecular interaction networks. Formalization of this knowledge is necessary for computational analysis and is achieved by automatic fact extraction using various text-mining algorithms. Most of these techniques suffer from high false-positive rates and redundancy of the extracted information. The extracted facts form a large network with no pathways defined.
Results: We describe the methodology for automatic curation of Biological Association Networks (BANs) derived by a natural language processing technology called MedScan. The curated data is used for automatic pathway reconstruction. The algorithm for the reconstruction of signaling pathways is also described and validated by comparison with manually curated pathways and tissue-specific gene expression profiles.
Conclusion: Biological Association Networks extracted by MedScan technology contain sufficient information for constructing thousands of mammalian signaling pathways for multiple tissues. The automatically curated MedScan data is adequate for automatic generation of good-quality signaling networks. The automatically generated Regulome pathways and the manually curated pathways used for their validation are freely available in the ResNetCore database from Ariadne Genomics, Inc. [1]. The pathways can be viewed and analyzed with a free demo version of PathwayStudio software. The MedScan technology is also available for evaluation using the free demo version of PathwayStudio software.
Pages: 13
Related papers
16 in total
[1]   [Anonymous], GENE ONTOLOGY
[2]   Extracting human protein interactions from MEDLINE using a full-sentence parser [J].
Daraselia, N ;
Yuryev, A ;
Egorov, S ;
Novichkova, S ;
Nikitin, A ;
Mazo, I .
BIOINFORMATICS, 2004, 20 (05) :604-U43
[3]   DARASELIA N, 2004, P 2 EUR WORKSH DAT M, P11
[4]   DUNNE A, 2003, SCI STKE, V171, P3
[5]   Ideker Trey, 2002, Bioinformatics, V18 Suppl 1, pS233
[6]   Binding properties and evolution of homodimers in protein-protein interaction networks [J].
Ispolatov, I ;
Yuryev, A ;
Mazo, I ;
Maslov, S .
NUCLEIC ACIDS RESEARCH, 2005, 33 (11) :3629-3635
[7]   MARSHALL B, 2004, IEEE T INF TECHN BIO
[8]   MedScan, a natural language processing engine for MEDLINE abstracts [J].
Novichkova, S ;
Egorov, S ;
Daraselia, N .
BIOINFORMATICS, 2003, 19 (13) :1699-1706
[9]   Automated extraction of information on protein-protein interactions from the biological literature [J].
Ono, T ;
Hishigaki, H ;
Tanigami, A ;
Takagi, T .
BIOINFORMATICS, 2001, 17 (02) :155-161
[10]   Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction [J].
Santos, C ;
Eggle, D ;
States, DJ .
BIOINFORMATICS, 2005, 21 (08) :1653-1658