Characterization of soybean genomic features by analysis of its expressed sequence tags

Cited by: 83
Authors
Tian, AG
Wang, J
Cui, P
Han, YJ
Xu, H
Cong, LJ
Huang, XG
Wang, XL
Jiao, YZ
Wang, BJ
Wang, YJ
Zhang, JS
Chen, SY
Affiliations
[1] Chinese Acad Sci, Plant Biotechnol Lab, Inst Genet & Dev Biol, Beijing 100101, Peoples R China
[2] Chinese Acad Sci, Beijing Genom Inst, Beijing 101300, Peoples R China
DOI
10.1007/s00122-003-1499-2
Chinese Library Classification (CLC) number
S3 [Agronomy]
Discipline classification code
0901
Abstract
We analyzed 314,254 soybean expressed sequence tags (ESTs), including 29,540 from our laboratory and 284,714 from GenBank. These ESTs were assembled into 56,147 unigenes. About 76.92% of the unigenes were homologous to genes from Arabidopsis thaliana (Arabidopsis). The putative products of these unigenes were annotated according to their homology with the categorized proteins of Arabidopsis. Genes corresponding to cell growth and/or maintenance, enzymes and cell communication belonged to the slow-evolving class, whereas genes related to transcription regulation, cell, binding and death appeared to be fast-evolving. Soybean unigenes with no match to genes within the Arabidopsis genome were identified as soybean-specific genes. These genes were mainly involved in nodule development and the synthesis of seed storage proteins. In addition, we also identified 61 genes regulated by salicylic acid, 1,322 transcription factor genes and 326 disease resistance-like genes from soybean unigenes. SSR analysis showed that the soybean genome was more complex than the Arabidopsis and the Medicago truncatula genomes. GC content in soybean unigene sequences is similar to that in Arabidopsis and M. truncatula. Furthermore, the combined analysis of the EST database and the BAC-contig sequences revealed that the total gene number in the soybean genome is about 63,501.
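The counts quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, and the derived unigene counts are rounded estimates computed from the stated "about 76.92%", not figures reported by the authors.

```python
# Figures quoted in the abstract (names are illustrative, not from the paper).
ESTS_OWN_LAB = 29_540         # ESTs from the authors' laboratory
ESTS_GENBANK = 284_714        # ESTs taken from GenBank
ESTS_TOTAL = 314_254          # total ESTs analyzed
UNIGENES = 56_147             # unigenes after assembly
HOMOLOGOUS_FRACTION = 0.7692  # "about 76.92%" homologous to Arabidopsis genes


def check_counts():
    """Return (total_is_consistent, est_homologous, est_without_match)."""
    # The two EST sources should sum to the stated total.
    total_is_consistent = ESTS_OWN_LAB + ESTS_GENBANK == ESTS_TOTAL
    # Rounded estimate of unigenes with an Arabidopsis homolog.
    est_homologous = round(UNIGENES * HOMOLOGOUS_FRACTION)
    # Remaining unigenes, i.e. those without an Arabidopsis match (estimate).
    est_without_match = UNIGENES - est_homologous
    return total_is_consistent, est_homologous, est_without_match


if __name__ == "__main__":
    consistent, homologous, without_match = check_counts()
    print("EST total consistent:", consistent)
    print("Estimated unigenes with Arabidopsis homolog:", homologous)
    print("Estimated unigenes without a match:", without_match)
```

Under these assumptions the two EST sources sum to the stated total, and roughly 43,000 of the unigenes would have an Arabidopsis homolog, leaving roughly 13,000 without a match.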
Pages: 903-913
Number of pages: 11