Curated genome annotation of Oryza sativa ssp japonica and comparative genome analysis with Arabidopsis thaliana: The Rice Annotation Project

Cited by: 163
Authors
Gojobori, Takashi
Affiliations
[1] Division of Genome and Biodiversity Research, National Institute of Agrobiological Sciences, Tsukuba
[2] Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Koto-ku
[3] Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, Mishima
[4] Japan Biological Information Research Center, Japan Biological Informatics Consortium, Koto-ku
[5] EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus
[6] Biometrics and Bioinformatics Unit, International Rice Research Institute, DAPO Box 7777, Metro Manila
[7] Department of Biology, McGill University, Montreal
[8] Biology Department, Brookhaven National Laboratory, Upton
[9] Department of Genetics, University of Georgia, Athens
[10] Waksman Institute of Microbiology, Rutgers University, Piscataway
[11] Institute for Bioinformatics, GSF National Research Center for Environment and Health
[12] Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233
[13] Institute of the Society for Techno-innovation of Agriculture, Forestry and Fisheries, Tsukuba
[14] Institute of Botany, Academia Sinica, Nankang
[15] Tsukuba Division, Mitsubishi Space Software Co., Ltd., Tsukuba
[16] Graduate School of Information Science and Technology, Hokkaido University, Sapporo
[17] Department of Plant Breeding, Cornell University, Ithaca
[18] Department of Biological Sciences, Tokyo Metropolitan University, Hachiojishi
[19] Department of Plant Molecular Biology, University of Delhi South Campus
[20] National Institute of Crop Science, National Agriculture and Food Research Organization, Tsukuba
[21] SWISS-PROT Group, Swiss Institute of Bioinformatics
[22] Cold Spring Harbor Laboratory, Cold Spring Harbor
[23] Division of Biology, California Institute of Technology, Pasadena
[24] Institute of Molecular Evolutionary Genetics, Department of Biology, Pennsylvania State University, University Park
[25] RIKEN BioResource Center, RIKEN Tsukuba Institute, Tsukuba
[26] Department of Molecular Genetics and Microbiology, Center for Infectious Diseases, State University of New York at Stony Brook, Stony Brook
[27] Genoscope
[28] Metabolomics Research Group, RIKEN Plant Science Center, Yokohama
[29] Technische Universität München, Genome Oriented Bioinformatics
[30] Plant Computational Biology, Max-Planck-Institute for Plant Breeding Research
[31] Plant Functional Genomics Research Group, RIKEN Plant Science Center, Yokohama
[32] RIKEN Plant Science Center, Yokohama
[33] National Research Centre on Plant Biotechnology, Indian Agricultural Research Institute
[34] National Center for Biotechnology Information, National Institutes of Health, Bethesda
[35] Rice Gene Discovery Unit, Kasetsart University
[36] Institute for Genomic Research, Rockville
[37] Arizona Genomics Institute, University of Arizona, Tucson
[38] National Institute of Agrobiological Sciences, Tsukuba
[39] Bio-Oriented Technology Research Advancement Institution, Minato-ku
DOI
10.1101/gr.5509507
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is approximately 32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.
Pages: 175-183
Number of pages: 9
Related papers
52 entries in total
[31]   Bioverse: functional, structural and contextual annotation of proteins and proteomes [J].
McDermott, J ;
Samudrala, R .
NUCLEIC ACIDS RESEARCH, 2003, 31 (13) :3736-3737
[32]  
Misra S., 2002, Genome Biol, V3, DOI 10.1186/gb-2002-3-12-research0083
[33]   Target site specificity of the Tos17 retrotransposon shows a preference for insertion within genes and against insertion in retrotransposon-rich regions of the genome [J].
Miyao, A ;
Tanaka, K ;
Murata, K ;
Sawaki, H ;
Takeda, S ;
Abe, K ;
Shinozuka, Y ;
Onosato, K ;
Hirochika, H .
PLANT CELL, 2003, 15 (08) :1771-1780
[34]  
Nei M, 2000, MOL EVOLUTIONARY PHY
[35]   The Rice Annotation Project Database (RAP-DB): hub for Oryza sativa ssp japonica genome information [J].
Ohyanagi, Hajime ;
Tanaka, Tsuyoshi ;
Sakai, Hiroaki ;
Shigemoto, Yasumasa ;
Yamaguchi, Kaori ;
Habara, Takuya ;
Fujii, Yasuyuki ;
Antonio, Baltazar A. ;
Nagamura, Yoshiaki ;
Imanishi, Tadashi ;
Ikeo, Kazuho ;
Itoh, Takeshi ;
Gojobori, Takashi ;
Sasaki, Takuji .
NUCLEIC ACIDS RESEARCH, 2006, 34 :D741-D744
[36]  
OTA T, 1994, J MOL EVOL, V38, P642
[37]   InterProScan: protein domains identifier [J].
Quevillon, E ;
Silventoinen, V ;
Pillai, S ;
Harte, N ;
Mulder, N ;
Apweiler, R ;
Lopez, R .
NUCLEIC ACIDS RESEARCH, 2005, 33 :W116-W120
[38]   Microbial genes in the human genome: Lateral transfer or gene loss? [J].
Salzberg, SL ;
White, O ;
Peterson, J ;
Eisen, JA .
SCIENCE, 2001, 292 (5523) :1903-1906
[39]   The genome sequence and structure of rice chromosome 1 [J].
Sasaki, T ;
Matsumoto, T ;
Yamamoto, K ;
Sakata, K ;
Baba, T ;
Katayose, Y ;
Wu, JZ ;
Niimura, Y ;
Cheng, ZK ;
Nagamura, Y ;
Antonio, BA ;
Kanamori, H ;
Hosokawa, S ;
Masukawa, M ;
Arikawa, K ;
Chiden, Y ;
Hayashi, M ;
Okamoto, M ;
Ando, T ;
Aoki, H ;
Arita, K ;
Hamada, M ;
Harada, C ;
Hijishita, S ;
Honda, M ;
Ichikawa, Y ;
Idonuma, A ;
Iijima, M ;
Ikeno, M ;
Ito, S ;
Ito, T ;
Ito, Y ;
Ito, Y ;
Iwabuchi, A ;
Kamiya, K ;
Karasawa, W ;
Katagiri, S ;
Kikuta, A ;
Kobayashi, N ;
Kono, I ;
Machita, K ;
Maehara, T ;
Mizuno, H ;
Mizubayashi, T ;
Mukai, Y ;
Nagasaki, H ;
Nakashima, M ;
Nakama, Y ;
Nakamichi, Y ;
Nakamura, M .
NATURE, 2002, 420 (6913) :312-316
[40]   MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource for plant genomics [J].
Schoof, H ;
Ernst, R ;
Nazarov, V ;
Pfeifer, L ;
Mewes, HW ;
Mayer, KFX .
NUCLEIC ACIDS RESEARCH, 2004, 32 :D373-D376