Domain identification by clustering sequence alignments

Cited by: 20
Authors
Guan, XJ [1]
Du, L [1]
Affiliations
[1] Glaxo Wellcome Res & Dev Ltd, Res Triangle Pk, NC 27709 USA
DOI
10.1093/bioinformatics/14.9.783
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: As sequence databases grow rapidly, the results of sequence comparison searches using fast search methods such as BLAST and FASTA tend to be long and difficult to digest.
Results: We present a new method to extract domain information from sequence comparison searches by clustering the resulting alignments according to their similarity to the query sequence. Efficient tree structures and algorithms organize the alignment data so that structurally conserved elements can be easily identified. The hierarchical data structures and the flexible X-Window-based interface provide an efficient and intuitive way to explore the alignment data at different levels, so that both common domains and distantly related features can be examined.
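The abstract does not specify the clustering procedure or the tree structures, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It groups search hits by the region of the query they cover, on the assumption that hits stacking up on the same query region suggest a shared domain. The names `Hit`, `cluster_by_query_region`, and `min_overlap` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One local alignment from a search (hypothetical, simplified record)."""
    subject: str  # identifier of the matched database sequence
    q_start: int  # first aligned query position, 1-based inclusive
    q_end: int    # last aligned query position, inclusive


def cluster_by_query_region(hits, min_overlap=0.5):
    """Greedily group hits whose query intervals overlap.

    A hit joins the first existing cluster whose region overlaps it by at
    least `min_overlap` of the shorter of the two intervals; otherwise it
    starts a new cluster. This is an assumed heuristic for illustration.
    """
    clusters = []
    for hit in sorted(hits, key=lambda h: (h.q_start, h.q_end)):
        for cluster in clusters:
            lo, hi = cluster["region"]
            # Overlap length; negative when the intervals are disjoint.
            overlap = min(hi, hit.q_end) - max(lo, hit.q_start) + 1
            shorter = min(hi - lo, hit.q_end - hit.q_start) + 1
            if overlap / shorter >= min_overlap:
                cluster["hits"].append(hit)
                # Widen the cluster region to cover the new hit.
                cluster["region"] = (min(lo, hit.q_start), max(hi, hit.q_end))
                break
        else:
            clusters.append({"region": (hit.q_start, hit.q_end), "hits": [hit]})
    return clusters


if __name__ == "__main__":
    example = [
        Hit("A", 1, 100),
        Hit("B", 5, 95),
        Hit("C", 200, 300),
        Hit("D", 210, 290),
        Hit("E", 400, 450),
    ]
    for c in cluster_by_query_region(example):
        print(c["region"], [h.subject for h in c["hits"]])
```

Each resulting cluster is a candidate domain region on the query. A hierarchical view like the one the abstract describes could in principle be approximated by rerunning the grouping with different `min_overlap` values, though that too is an assumption rather than the published design.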
Pages: 783-788
Page count: 6