Whole genome sequence comparisons and "full-length" cDNA sequences: A combined approach to evaluate and improve Arabidopsis genome annotation

Cited by: 63
Authors
Castelli, V
Aury, JM
Jaillon, O
Wincker, P
Clepet, C
Menard, M
Cruaud, C
Quétier, F
Scarpelli, C
Schächter, V
Temple, G
Caboche, M
Weissenbach, J
Salanoubat, M [1]
Affiliations
[1] Life Technol, Carlsbad, CA 92008 USA
[2] Genoscope, Ctr Natl Sequencage, F-91000 Evry, France
[3] CNRS, UMR 3080, F-91000 Evry, France
[4] INRA, Unite Rech Genom Vegetale, F-91000 Evry, France
Keywords
DOI
10.1101/gr.1515604
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
To evaluate the existing annotation of the Arabidopsis genome further, we generated a collection of evolutionarily conserved regions (ecores) between Arabidopsis and rice. The ecore analysis provides evidence that the gene catalog of Arabidopsis is not yet complete, and that a number of these annotations require re-examination. To improve the Arabidopsis genome annotation further, we used a novel "full-length" enriched cDNA collection prepared from several tissues. An additional 1931 genes were covered by new "full-length" cDNA sequences, raising the number of annotated genes with a corresponding "full-length" cDNA sequence to about 14,000. Detailed comparisons between these "full-length" cDNA sequences and annotated genes show that this resource is very helpful in determining the correct structure of genes, in particular those not yet supported by "full-length" cDNAs. In addition, a total of 326 genomic regions not previously included in the Arabidopsis genome annotation were detected by this cDNA resource, providing clues for new gene discovery. Because, as expected, the two data sets only partially overlap, their combination produces very useful information for improving the Arabidopsis genome annotation.
Pages: 406-413
Number of pages: 8
Related papers
29 in total
  • [1] Normalization and subtraction: Two approaches to facilitate gene discovery
    Bonaldo, MDF
    Lennon, G
    Soares, MB
    [J]. GENOME RESEARCH, 1996, 6 (09) : 791 - 806
  • [2] A large family of genes that share homology with CLAVATA3
    Cock, JM
    McCormick, S
    [J]. PLANT PHYSIOLOGY, 2001, 126 (03) : 939 - 942
  • [3] A computer program for aligning a cDNA sequence with a genomic DNA sequence
    Florea, L
    Hartzell, G
    Zhang, Z
    Rubin, GM
    Miller, W
    [J]. GENOME RESEARCH, 1998, 8 (09) : 967 - 974
  • [4] A draft sequence of the rice genome (Oryza sativa L. ssp japonica)
    Goff, SA
    Ricke, D
    Lan, TH
    Presting, G
    Wang, RL
    Dunn, M
    Glazebrook, J
    Sessions, A
    Oeller, P
    Varma, H
    Hadley, D
    Hutchinson, D
    Martin, C
    Katagiri, F
    Lange, BM
    Moughamer, T
    Xia, Y
    Budworth, P
    Zhong, JP
    Miguel, T
    Paszkowski, U
    Zhang, SP
    Colbert, M
    Sun, WL
    Chen, LL
    Cooper, B
    Park, S
    Wood, TC
    Mao, L
    Quail, P
    Wing, R
    Dean, R
    Yu, YS
    Zharkikh, A
    Shen, R
    Sahasrabudhe, S
    Thomas, A
    Cannings, R
    Gutin, A
    Pruss, D
    Reid, J
    Tavtigian, S
    Mitchell, J
    Eldredge, G
    Scholl, T
    Miller, RM
    Bhatnagar, S
    Adey, N
    Rubano, T
    Tusneem, N
    [J]. SCIENCE, 2002, 296 (5565) : 92 - 100
  • [5] Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies
    Haas, BJ
    Delcher, AL
    Mount, SM
    Wortman, JR
    Smith, RK
    Hannick, LI
    Maiti, R
    Ronning, CM
    Rusch, DB
    Town, CD
    Salzberg, SL
    White, O
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (19) : 5654 - 5666
  • [6] HAAS BJ, 2002, GENOME BIOL, V3, pH20
  • [7] Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information
    Hebsgaard, SM
    Korning, PG
    Tolstrup, N
    Engelbrecht, J
    Rouze, P
    Brunak, S
    [J]. NUCLEIC ACIDS RESEARCH, 1996, 24 (17) : 3439 - 3452
  • [8] Assessing the Drosophila melanogaster and Anopheles gambiae genome annotations using genome-wide sequence comparisons
    Jaillon, O
    Dossat, C
    Eckenberg, R
    Eiglmeier, K
    Segurens, A
    Aury, JM
    Roth, CW
    Scarpelli, C
    Brey, PT
    Weissenbach, J
    Wincker, P
    [J]. GENOME RESEARCH, 2003, 13 (07) : 1595 - 1599
  • [9] JAILLON O, 2004, IN PRESS COLD SPRING, V68
  • [10] Alternative splicing of transcripts encoding Toll-like plant resistance proteins - what's the functional relevance to innate immunity?
    Jordan, T
    Schornack, S
    Lahaye, T
    [J]. TRENDS IN PLANT SCIENCE, 2002, 7 (09) : 392 - 398