Identification of protein coding regions in the human genome by quadratic discriminant analysis

Cited by: 209
Author
Zhang, MQ
Affiliation
[1] Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724
Keywords
DOI
10.1073/pnas.94.2.565
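Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code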
07 ; 0710 ; 09 ;
Abstract
A new method for predicting internal coding exons in genomic DNA sequences has been developed. This method is based on a prediction algorithm that uses the quadratic discriminant function for multivariate statistical pattern recognition. Substantial improvements have been made (with only 9 discriminant variables) when compared with existing methods: HEXON [Solovyev, V. V., Salamov, A. A. & Lawrence, C. B. (1994) Nucleic Acids Res. 22, 5156-5163] (based on linear discriminant analysis) and GRAIL2 [Uberbacher, E. C. & Mural, R. J. (1991) Proc. Natl. Acad. Sci. USA 88, 11261-11265] (based on neural networks). A computer program called MZEF is freely available to the genome community and allows users to adjust prior probability and to output alternative overlapping exons.
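To make the idea of a quadratic discriminant function with an adjustable prior concrete, here is a minimal sketch of two-class quadratic discriminant analysis. It is not the MZEF implementation: it uses two made-up features instead of the nine discriminant variables mentioned in the abstract, the training data are synthetic, and all function names (`fit_gaussian`, `qda_score`, `classify`) and the `exon_prior` parameter are hypothetical names chosen for illustration.

```python
import math
import random

def fit_gaussian(points):
    """Estimate the mean and 2x2 covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def qda_score(x, mean, cov, prior):
    """Quadratic discriminant: log(prior) - 0.5*log|S| - 0.5*Mahalanobis^2."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha

def classify(x, exon_model, non_model, exon_prior=0.5):
    """Assign x to the class with the larger discriminant score."""
    s_exon = qda_score(x, exon_model[0], exon_model[1], exon_prior)
    s_non = qda_score(x, non_model[0], non_model[1], 1.0 - exon_prior)
    return "exon" if s_exon > s_non else "non-exon"

# Synthetic training data (an assumption for illustration only): the two
# classes have different covariances, so the decision boundary is quadratic.
rng = random.Random(0)
exon_train = [(rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)) for _ in range(200)]
non_train = [(rng.gauss(0.0, 1.5), rng.gauss(0.0, 1.5)) for _ in range(200)]

exon_model = fit_gaussian(exon_train)
non_model = fit_gaussian(non_train)

print(classify((2.0, 2.0), exon_model, non_model))
print(classify((-3.0, -3.0), exon_model, non_model))
print(classify((2.0, 2.0), exon_model, non_model, exon_prior=1e-4))
```

Because each class keeps its own covariance matrix, the boundary between the classes is a quadratic curve rather than the straight line a linear discriminant would give. Lowering `exon_prior` makes the classifier more conservative about calling exons, which mirrors the kind of prior adjustment the abstract describes.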
Pages: 565-568
Number of pages: 4