A BAC end view of the Musa acuminata genome

Cited by: 40
Authors
Cheung, Foo [1]
Town, Christopher D. [1]
Affiliations
[1] J Craig Venter Inst, Rockville, MD 20850 USA
DOI
10.1186/1471-2229-7-29
Chinese Library Classification number
Q94 [Botany]
Discipline classification code
071001
Abstract
Background: Musa species include the fourth most important crop in developing countries. Here, we report the analysis of 6,252 BAC end-sequences in order to view the sequence composition of the Musa acuminata genome in a cost-effective and efficient manner.
Results: BAC end sequencing generated 6,252 reads representing 4,420,944 bp, including 2,979 clone pairs, with an average read length of 707 bp after cleaning and filtering. All sequences have been submitted to GenBank under accession numbers DX451975 - DX458350. The BAC end-sequences were searched against several databases, and significant similarity was found to mitochondrial and chloroplast sequences (2.6%), transposons and repetitive sequences (36%), and proteins (11%). Functional interpretation of the protein matches was carried out by Gene Ontology assignments from matches to Arabidopsis and was shown to cover a broad range of categories. From protein-matching regions of Musa BAC end-sequences, the GC content of coding regions was determined to be 47%. Where protein matches encompassed a start codon, GC content as a function of position (5' to 3') across 129 bp sliding windows generated a "rice-like" gradient. A total of 352 potential SSR markers were discovered. The most abundant simple sequence repeats in four size categories were AT-rich. After filtering out mitochondrial and chloroplast matches, thousands of BAC end-sequences had a significant BLASTN match to the Oryza sativa and Arabidopsis genome sequences. Of these, a small number of BAC end-sequence pairs mapped to neighboring regions of the Oryza sativa genome, representing regions of potential microsynteny.
Conclusion: Database searches with the BAC end-sequences and ab initio analysis identified those reads likely to contain transposons, repeat sequences, proteins and simple sequence repeats. Approximately 600 BAC end-sequences contained protein sequences that were not found in the existing available Musa expressed sequence tag, repeat or transposon databases. In addition, gene statistics, GC content and GC profile could be estimated from the region matching the top protein hit. A small number of BAC end-sequence pairs can be mapped to neighboring regions of the Oryza sativa genome, representing regions of potential microsynteny. These results suggest that a large-scale BAC end sequencing strategy has the potential to anchor a small proportion of the Musa acuminata genome to the genomes of Oryza sativa and possibly Arabidopsis.
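The GC profile mentioned in the abstract (GC content as a function of position across 129 bp sliding windows) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `gc_fraction` and `sliding_gc`, the 3 bp step, and the handling of ambiguous bases are assumptions made for illustration. Only the 129 bp window size comes from the abstract.

```python
def gc_fraction(seq):
    """Return the fraction of G/C among unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0  # assumption: windows with no unambiguous bases score 0
    return sum(base in "GC" for base in counted) / len(counted)


def sliding_gc(seq, window=129, step=3):
    """Return (start, GC fraction) for each full window along seq.

    window=129 follows the abstract; step=3 (one codon) is an assumption.
    Sequences shorter than the window yield an empty list.
    """
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: GC-rich at the 5' end, AT-rich at the 3' end.
    toy = "GC" * 100 + "AT" * 100
    for start, gc in sliding_gc(toy)[::20]:
        print(start, round(gc, 3))
```

Applied to coding regions aligned at the start codon and averaged over many genes, a profile that declines from 5' to 3' would correspond to the "rice-like" gradient described above.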
Pages: 7