Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release

Cited by: 109
Authors
Haas, BJ [1 ]
Wortman, JR [1 ]
Ronning, CM [1 ]
Hannick, LI [1 ]
Smith, RK [1 ]
Maiti, R [1 ]
Chan, AP [1 ]
Yu, CH [1 ]
Farzad, M [1 ]
Wu, DY [1 ]
White, O [1 ]
Town, CD [1 ]
Affiliations
[1] The Institute for Genomic Research, Rockville, MD 20850, USA
DOI
10.1186/1741-7007-3-7
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: Since the initial publication of its complete genome sequence, Arabidopsis thaliana has become more important than ever as a model for plant research. However, the initial genome annotation was submitted by multiple centers using inconsistent methods, making the data difficult to use for many applications.
Results: Over the course of three years, TIGR has completed its effort to standardize the structural and functional annotation of the Arabidopsis genome. Using both manual and automated methods, Arabidopsis gene structures were refined, and gene products were renamed and assigned to Gene Ontology categories. We present an overview of the methods employed, tools developed, and protocols followed, summarizing the contents of each data release with special emphasis on our final annotation release (version 5).
Conclusion: Over the entire period, several thousand new genes and pseudogenes were added to the annotation. Approximately one third of the originally annotated gene models were significantly refined, yielding improved gene structure annotations, and every protein-coding gene was manually inspected and classified using Gene Ontology terms.
Pages: 19
Related papers
104 entries in total
  • [1] Andrews J, 1996, GENETICS, V143, P1699
  • [2] Analysis of the genome sequence of the flowering plant Arabidopsis thaliana
    Kaul, S
    Koo, HL
    Jenkins, J
    Rizzo, M
    Rooney, T
    Tallon, LJ
    Feldblyum, T
    Nierman, W
    Benito, MI
    Lin, XY
    Town, CD
    Venter, JC
    Fraser, CM
    Tabata, S
    Nakamura, Y
    Kaneko, T
    Sato, S
    Asamizu, E
    Kato, T
    Kotani, H
    Sasamoto, S
    Ecker, JR
    Theologis, A
    Federspiel, NA
    Palm, CJ
    Osborne, BI
    Shinn, P
    Conway, AB
    Vysotskaia, VS
    Dewar, K
    Conn, L
    Lenz, CA
    Kim, CJ
    Hansen, NF
    Liu, SX
    Buehler, E
    Altafi, H
    Sakano, H
    Dunn, P
    Lam, B
    Pham, PK
    Chao, Q
    Nguyen, M
    Yu, GX
    Chen, HM
    Southwick, A
    Lee, JM
    Miranda, M
    Toriumi, MJ
    Davis, RW
    [J]. NATURE, 2000, 408 (6814) : 796 - 815
  • [3] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [4] PRINTS and its automatic supplement, prePRINTS
    Attwood, TK
    Bradley, P
    Flower, DR
    Gaulton, A
    Maudling, N
    Mitchell, AL
    Moulton, G
    Nordle, A
    Paine, K
    Taylor, P
    Uddin, A
    Zygouri, C
    [J]. NUCLEIC ACIDS RESEARCH, 2003, 31 (01) : 400 - 402
  • [5] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
  • [6] Comparisons with Caenorhabditis (∼100 Mb) and Drosophila (∼175 Mb) using flow cytometry show genome size in Arabidopsis to be ∼157 Mb and thus ∼25 % larger than the Arabidopsis genome initiative estimate of ∼125 Mb
    Bennett, MD
    Leitch, IJ
    Price, HJ
    Johnston, JS
    [J]. ANNALS OF BOTANY, 2003, 91 (05) : 547 - 557
  • [7] Functional annotation of the Arabidopsis genome using controlled vocabularies
    Berardini, TZ
    Mundodi, S
    Reiser, L
    Huala, E
    Garcia-Hernandez, M
    Zhang, PF
    Mueller, LA
    Yoon, J
    Doyle, A
    Lander, G
    Moseyko, N
    Yoo, D
    Xu, I
    Zoeckler, B
    Montoya, M
    Miller, N
    Weems, D
    Rhee, SY
    [J]. PLANT PHYSIOLOGY, 2004, 135 (02) : 745 - 755
  • [8] Berriman M, 2004, METH MOL B, V270, P17
  • [9] A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome
    Blanc, G
    Hokamp, K
    Wolfe, KH
    [J]. GENOME RESEARCH, 2003, 13 (02) : 137 - 144
  • [10] Blumenthal T, 1998, BIOESSAYS, V20, P480, DOI 10.1002/(SICI)1521-1878(199806)20:6&lt