A global assembly of cotton ESTs

Cited by: 111
Authors
Udall, JA
Swanson, JM
Haller, K
Rapp, RA
Sparks, ME
Hatfield, J
Yu, YS
Wu, YR
Dowd, C
Arpat, AB
Sickler, BA
Wilkins, TA
Guo, JY
Chen, XY
Scheffler, J
Taliercio, E
Turley, R
McFadden, H
Payton, P
Klueva, N
Allen, R
Zhang, DS
Haigler, C
Wilkerson, C
Suo, JF
Schulze, SR
Pierce, ML
Essenberg, M
Kim, H
Llewellyn, DJ
Dennis, ES
Kudrna, D
Wing, R
Paterson, AH
Soderlund, C
Wendel, JF [1]
Affiliations
[1] Iowa State Univ, Dept Ecol Evolut & Organismal Biol, Ames, IA 50011 USA
[2] BIO5 Inst, Arizona Genom Computat Lab, Tucson, AZ 85721 USA
[3] Univ Arizona, Dept Plant Sci, Genom Inst, Tucson, AZ 85721 USA
[4] CSIRO Plant Ind, Canberra, ACT 2601, Australia
[5] Univ Calif Davis, Dept Plant Sci, Davis, CA 95616 USA
[6] Shanghai Inst Biol Sci, Inst Plant Physiol & Ecol, Shanghai 200032, Peoples R China
[7] USDA ARS, Stoneville, MS 38776 USA
[8] USDA, ARS, Lubbock, TX 79415 USA
[9] Texas Tech Univ, Dept Biol, Lubbock, TX 79409 USA
[10] N Carolina State Univ, Dept Crop Sci, Raleigh, NC 27695 USA
[11] N Carolina State Univ, Dept Bot, Raleigh, NC 27695 USA
[12] Michigan State Univ, Bioinformat Core Facil, E Lansing, MI 48824 USA
[13] Inst Genet & Dev Biol, Beijing 100101, Peoples R China
[14] Univ Georgia, Plant Genome Mapping Lab, Athens, GA 30602 USA
[15] Oklahoma State Univ, Oklahoma Agr Expt Stn, Stillwater, OK 74078 USA
DOI
10.1101/gr.4602906
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Approximately 185,000 Gossypium EST sequences comprising >94,800,000 nucleotides were amassed from 30 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including drought stress and pathogen challenges. These libraries were derived from allopolyploid cotton (Gossypium hirsutum; A_T and D_T genomes) as well as its two diploid progenitors, Gossypium arboreum (A genome) and Gossypium raimondii (D genome). ESTs were assembled using the Program for Assembling and Viewing ESTs (PAVE), resulting in 22,030 contigs and 29,077 singletons (51,107 unigenes). Further comparisons among the singletons and contigs led to recognition of 33,665 exemplar sequences that represent a nonredundant set of putative Gossypium genes containing partial or full-length coding regions and usually one or two UTRs. The assembly, along with its UniProt BLASTX hits, GO annotation, and Pfam analysis results, is freely accessible as a public resource for cotton genomics. Because ESTs from diploid and allotetraploid Gossypium were combined in a single assembly, we were in many cases able to bioinformatically distinguish duplicated genes in allotetraploid cotton and assign them to either the A or D genome. The assembly and associated information provide a framework for future investigation of cotton functional and evolutionary genomics.
Pages: 441-450
Page count: 10