The mouse genome: Experimental examination of gene predictions and transcriptional start sites

Cited by: 12
Authors
Dike, S [1]
Balija, VS [1]
Nascimento, LU [1]
Xuan, ZY [1]
Ou, J [1]
Zutavern, T [1]
Palmer, LE [1]
Hannon, G [1]
Zhang, MQ [1]
McCombie, WR [1]
Affiliations
[1] Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
DOI
10.1101/gr.3158304
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The completion of the mouse and other mammalian genome sequences will provide necessary, but not sufficient, knowledge for an understanding of much of mouse biology at the molecular level. As a requisite next step in this process, the genes in mouse and their structure must be elucidated. In particular, knowledge of the transcriptional start site of these genes will be necessary for further study of their regulatory regions. To assess the current state of mouse genome annotation to support this activity, we identified several hundred gene predictions in mouse with varying levels of supporting evidence and tested them using RACE-PCR. Modifications were made to the procedure allowing pooling of RNA samples, resulting in a scalable procedure. The results illustrate potential errors or omissions in the current 5' end annotations in 58% of the genes detected. In testing experimentally unsupported gene predictions, we were able to identify 58 that are not usually annotated as genes but produced spliced transcripts (~25% success rate). In addition, in many genes we were able to detect novel exons not predicted by any gene prediction algorithms. In 19.8% of the genes detected in this study, multiple transcript species were observed. These data show an urgent need to provide direct experimental validation of gene annotations. Moreover, these results show that direct validation using RACE-PCR can be an important component of genome-wide validation. This approach can be a useful tool in the ongoing efforts to increase the quality of gene annotations, especially transcriptional start sites, in complex genomes.
Pages: 2424-2429
Page count: 6