GeneMachine: gene prediction and sequence annotation

Cited by: 16
Authors
Makalowska, I [1]
Ryan, JF [1]
Baxevanis, AD [1]
Affiliations
[1] NHGRI, Genome Technol Branch, NIH, Bethesda, MD 20892 USA
Keywords
DOI
10.1093/bioinformatics/17.9.843
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Motivation: A number of free-standing programs have been developed to help researchers find potential coding regions and deduce gene structure in long stretches of what is essentially 'anonymous DNA'. Because these programs apply inherently different criteria to the question of what is and is not a coding region, multiple algorithms should be used in the course of positional cloning and positional candidate projects to ensure that all potential coding regions within a previously identified critical region are found. Results: We have developed a gene identification tool called GeneMachine, which allows users to query multiple exon and gene prediction programs in an automated fashion. BLAST searches are also performed to determine whether a previously characterized coding region corresponds to a region in the query sequence. A suite of Perl programs and modules is used to run MZEF, GENSCAN, GRAIL 2, FGENES, RepeatMasker, Sputnik, and BLAST. The results of these runs are then parsed and written into ASN.1 format. Output files can be opened using NCBI Sequin, in essence using Sequin as both a workbench and a graphical viewer. The main feature of GeneMachine is that the process is fully automated; the user is only required to launch GeneMachine and then open the resulting file with Sequin. Annotations can then be added to these results prior to submission to GenBank, thereby increasing the intrinsic value of these data.
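The motivation above, that different predictors flag different regions and so should be consulted together, can be illustrated with a minimal Python sketch. This is not GeneMachine's code (the abstract describes a Perl suite that writes ASN.1 for Sequin) and it is not a claim about how GeneMachine combines results. The function name `merge_predictions`, the interval representation, and all coordinates are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]  # assumed convention: 1-based inclusive (start, end)


def merge_predictions(
    predictions: Dict[str, List[Interval]],
) -> List[Tuple[int, int, List[str]]]:
    """Merge overlapping predicted regions and record which programs support each."""
    # Flatten to (start, end, program) and sort by start coordinate.
    tagged = sorted(
        (start, end, program)
        for program, intervals in predictions.items()
        for start, end in intervals
    )
    merged: List[Tuple[int, int, Set[str]]] = []
    for start, end, program in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it and add this program.
            last_start, last_end, programs = merged[-1]
            merged[-1] = (last_start, max(last_end, end), programs | {program})
        else:
            merged.append((start, end, {program}))
    return [(s, e, sorted(p)) for s, e, p in merged]


# Hypothetical coordinates, not real output from any of these programs.
example = {
    "GENSCAN": [(100, 250), (900, 1050)],
    "MZEF": [(120, 260)],
    "GRAIL 2": [(400, 480), (910, 1040)],
}
regions = merge_predictions(example)
for start, end, programs in regions:
    print(start, end, ",".join(programs))
```

In this toy input, the region at 400-480 is reported by only one program, which is the kind of candidate that would be missed if a single predictor were used alone.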
Pages: 843-844
Page count: 2
Related papers
7 in total
[1]  
Abajian C., 1994, Sputnik
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]   Prediction of complete gene structures in human genomic DNA [J].
Burge, C ;
Karlin, S .
JOURNAL OF MOLECULAR BIOLOGY, 1997, 268 (01) :78-94
[4]  
MURAL RJ, 1992, TRENDS BIOTECHNOL, V10, P67
[5]  
SMIT A, REPEAT MASKER
[6]  
SOLOVYEV VV, 1995, INTELLIGENT SYSTEMS, V3, P367
[7]   Identification of protein coding regions in the human genome by quadratic discriminant analysis [J].
Zhang, MQ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1997, 94 (02) :565-568