BACKPROPAGATION AND COUNTER-PROPAGATION NEURAL NETWORKS FOR PHYLOGENETIC CLASSIFICATION OF RIBOSOMAL-RNA SEQUENCES

Cited by: 30
Authors
WU, C
SHIVAKUMAR, S
Affiliation
[1] Department of Epidemiology/Biomathematics, University of Texas Health Center at Tyler, Tyler
DOI
10.1093/nar/22.20.4291
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A neural network system has been developed for rapid and accurate classification of ribosomal RNA sequences according to phylogenetic relationship. The molecular sequences are encoded into neural input vectors using an n-gram hashing method. An SVD (singular value decomposition) method is used to compress and reduce the size of long and sparse n-gram input vectors. The neural networks used are three-layered, feed-forward networks that employ supervised learning paradigms, including the backpropagation algorithm and a modified counter-propagation algorithm. A pedagogical pattern selection strategy is used to reduce the training time. After being trained with ribosomal RNA sequences of the RDP (Ribosomal Database Project) database, the system can classify query sequences into more than one hundred phylogenetic classes with 100% accuracy at a rate of less than 0.3 CPU second per sequence on a workstation. When compared to other sequence similarity search methods, including Similarity Rank, Blast and Fasta, the neural network method has a higher classification accuracy and is about an order of magnitude faster. The software tool will be made available to the biology community, and the system may be extended into a gene identification system for classifying indiscriminately sequenced DNA fragments.
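The encoding step in the abstract (n-gram vectors followed by SVD compression) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `ngram_vector` and `svd_compress`, the choice of trigrams, the plain counting scheme, the toy sequences, and the use of NumPy's dense SVD are all assumptions made for illustration.

```python
from itertools import product

import numpy as np

ALPHABET = "ACGU"  # assumed RNA alphabet; ambiguous bases are skipped below


def ngram_vector(seq, n=3):
    """Count every n-gram of seq into a fixed-length vector of size 4**n."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=n))}
    vec = np.zeros(len(index))
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in index:  # ignore n-grams containing non-ACGU symbols
            vec[index[gram]] += 1
    return vec


def svd_compress(X, k):
    """Project the rows of X onto the top-k right singular vectors."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:k].T  # shape: (n_features, k)
    return X @ basis, basis


# Toy sequences for illustration only (not taken from the RDP database).
seqs = ["ACGUACGUAGCUAGCU", "GGGCCCAUAUAUGCGC", "ACGUACGAAGCUAGCA"]
X = np.array([ngram_vector(s) for s in seqs])  # sparse, 64-dimensional rows
Z, basis = svd_compress(X, k=2)                # compressed 2-dimensional rows
print(X.shape, Z.shape)
```

In a pipeline like the one the abstract describes, the compressed rows of `Z` would serve as inputs to the classifier, and a query sequence would be projected with the same `basis` before classification.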
Pages: 4291-4299
Number of pages: 9