APPLICATIONS AND STATISTICS FOR MULTIPLE HIGH-SCORING SEGMENTS IN MOLECULAR SEQUENCES

Cited by: 279
Authors
KARLIN, S
ALTSCHUL, SF
Affiliations
[1] NIH,NATL LIB MED,NATL CTR BIOTECHNOL INFORMAT,BETHESDA,MD 20894
[2] STANFORD UNIV,DEPT MATH,STANFORD,CA 94305
Keywords
SEQUENCE COMPARISON; PATTERN RECOGNITION; MOLECULAR FEATURES; STATISTICAL SIGNIFICANCE
DOI
10.1073/pnas.90.12.5873
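Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract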
Score-based measures of molecular-sequence features provide versatile aids for the study of proteins and DNA. They are used by many sequence data base search programs, as well as for identifying distinctive properties of single sequences. For any such measure, it is important to know what can be expected to occur purely by chance. The statistical distribution of high-scoring segments has been described elsewhere. However, molecular sequences will frequently yield several high-scoring segments for which some combined assessment is in order. This paper describes the statistical distribution for the sum of the scores of multiple high-scoring segments and illustrates its application to the identification of possible transmembrane segments and the evaluation of sequence similarity.
Pages: 5873-5877
Page count: 5