An integrated gene annotation and transcriptional profiling approach towards the full gene content of the Drosophila genome

Cited by: 15
Authors
M Hild
B Beckmann
SA Haas
B Koch
V Solovyev
C Busold
K Fellenberg
M Boutros
M Vingron
F Sauer
JD Hoheisel
R Paro
Affiliations
[1] Zentrum für Molekulare Biologie Heidelberg (ZMBH), University of Heidelberg, Im Neuenheimer Feld 282, Heidelberg
[2] Division of Functional Genome Analysis, Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 580, Heidelberg
[3] Max Planck Institute for Molecular Genetics, Ihnestraße 73, Berlin
[4] Softberry, Inc., 116 Radio Circle, Suite 400, Mount Kisco, 10549, NY
[5] Deutsches Krebsforschungszentrum (DKFZ), Im Neuenheimer Feld 580, Heidelberg
[6] Department of Biochemistry, University of California, Riverside, 92521, CA
Keywords
Additional Data File; Additional Exon; Berkeley Drosophila Genome Project; Embryonic Stage; Gene Prediction
DOI
10.1186/gb-2003-5-1-r3
Abstract
Background: While the genome sequences for a variety of organisms are now available, the precise number of the genes encoded is still a matter of debate. For the human genome several stringent annotation approaches have resulted in the same number of potential genes, but a careful comparison revealed only limited overlap. This indicates that only the combination of different computational prediction methods and experimental evaluation of such in silico data will provide more complete genome annotations. In order to get a more complete gene content of the Drosophila melanogaster genome, we based our new D. melanogaster whole-transcriptome microarray, the Heidelberg FlyArray, on the combination of the Berkeley Drosophila Genome Project (BDGP) annotation and a novel ab initio gene prediction of lower stringency using the Fgenesh software.
Results: Here we provide evidence for the transcription of approximately 2,600 additional genes predicted by Fgenesh. Validation of the developmental profiling data by RT-PCR and in situ hybridization indicates a lower limit of 2,000 novel annotations, thus substantially raising the number of genes that make a fly.
Conclusions: The successful design and application of this novel Drosophila microarray on the basis of our integrated in silico/wet biology approach confirms our expectation that in silico approaches alone will always tend to be incomplete. The identification of at least 2,000 novel genes highlights the importance of gathering experimental evidence to discover all genes within a genome. Moreover, as such an approach is independent of homology criteria, it will allow the discovery of novel genes unrelated to known protein families or those that have not been strictly conserved between species.
© 2003, Hild et al; licensee BioMed Central Ltd.