Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics

Cited by: 160
Authors
Aoki, Koh [1]
Yano, Kentaro [2]
Suzuki, Ayako [2]
Kawamura, Shingo [2]
Sakurai, Nozomu [1]
Suda, Kunihiro [1]
Kurabayashi, Atsushi [1]
Suzuki, Tatsuya [3]
Tsugane, Taneaki [3]
Watanabe, Manabu [3]
Ooga, Kazuhide [1]
Torii, Maiko [1]
Narita, Takanori [4]
Shin-i, Tadasu [4]
Kohara, Yuji [4]
Yamamoto, Naoki [2]
Takahashi, Hideki [5]
Watanabe, Yuichiro [6]
Egusa, Mayumi [7]
Kodama, Motoichiro [7]
Ichinose, Yuki [8]
Kikuchi, Mari [9]
Fukushima, Sumire [9]
Okabe, Akiko [9]
Arie, Tsutomu [9]
Sato, Yuko [10]
Yazawa, Katsumi [10]
Satoh, Shinobu [10]
Omura, Toshikazu [11]
Ezura, Hiroshi [11]
Shibata, Daisuke [1]
Affiliations
[1] Kazusa DNA Res Inst, Kisarazu 2920818, Japan
[2] Meiji Univ, Tama Ku, Kawasaki, Kanagawa 2148571, Japan
[3] Chiba Prefectural Agr & Forestry Res Ctr, Midori Ku, Chiba 2660006, Japan
[4] Natl Inst Genet, Mishima, Shizuoka 4118540, Japan
[5] Tohoku Univ, Aoba Ku, Sendai, Miyagi 9818555, Japan
[6] Univ Tokyo, Meguro Ku, Tokyo 1538902, Japan
[7] Tottori Univ, Tottori 6808553, Japan
[8] Okayama Univ, Kita Ku, Okayama 7008530, Japan
[9] Tokyo Univ Agr & Technol, Fuchu, Tokyo 1838509, Japan
[10] Univ Tsukuba, Inst Biol Sci, Tsukuba, Ibaraki 3058571, Japan
[11] Univ Tsukuba, Ctr Gene Res, Tsukuba, Ibaraki 3058571, Japan
Source
BMC GENOMICS | 2010 / Volume 11
Keywords
TRANSCRIPTION FACTORS; GENES; ANNOTATION; SEQUENCES; RESOURCE; CLONES; INFORMATION; COLLECTION; DATABASE; LIBRARY;
DOI
10.1186/1471-2164-11-210
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines, have been established by an international alliance.
Results: To accelerate progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. By checking for redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Classification of the functions of proteins predicted from the coding sequences demonstrated that the nrFLcDNAs covered a broad range of functions. A comparison of the nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%.
Conclusion: The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotate the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom (http://www.pgb.kazusa.or.jp/kaftom/) via the website of the National Bioresource Project Tomato (http://tomato.nbrp.jp).
Pages: 16