The TIGR plant transcript assemblies database

Cited by: 133
Authors
Childs, Kevin L. [1]
Hamilton, John P. [1]
Zhu, Wei [1]
Ly, Eugene [1]
Cheung, Foo [1]
Wu, Hank [1]
Rabinowicz, Pablo D. [1]
Town, Chris D. [1]
Buell, C. Robin [1]
Chan, Agnes P. [1]
Affiliation
[1] The Institute for Genomic Research (TIGR), Rockville, MD 20850, USA
Funding
National Science Foundation (USA)
DOI
10.1093/nar/gkl785
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The TIGR Plant Transcript Assemblies (TA) database (http://plantta.tigr.org) uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed sequence tags (ESTs) and full-length and partial cDNAs, but exclude computationally predicted gene sequences. The TA database includes all plant species for which more than 1000 EST or cDNA sequences are publicly available. The EST and cDNA sequences are first clustered based on an all-versus-all pairwise sequence comparison, followed by the generation of consensus sequences (TAs) from individual clusters. The clustering and assembly procedures use the TGICL tool, Megablast and the CAP3 assembler. The UniProt Reference Clusters (UniRef100) protein database is used as the reference database for the functional annotation of the assemblies. The transcription orientation of each TA is determined based on the orientation of the alignment with the best protein hit. The TA sequences and annotation are available via web interfaces and FTP downloads. Assemblies can be retrieved by a text-based keyword search or a sequence-based BLAST search. The current version of the TA database is Release 2 (July 17, 2006) and includes a total of 215 plant species.
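The two decision steps described in the abstract, grouping sequences from pairwise comparisons and orienting each assembly by its best protein hit, can be illustrated with a minimal sketch. This is not the TGICL, Megablast, or CAP3 implementation: the function names, the single-linkage (union-find) grouping rule, and the `(score, strand)` hit format are all assumptions made for illustration only, and the consensus-building step is omitted.

```python
from collections import defaultdict


def cluster_by_pairwise_hits(seq_ids, hits):
    """Group sequences that are linked by at least one pairwise hit.

    Hypothetical single-linkage clustering via union-find; `hits` is an
    iterable of (id_a, id_b) pairs that passed some similarity threshold.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


def orient_by_best_protein_hit(protein_hits):
    """Return the strand ('+' or '-') of the highest-scoring protein hit.

    `protein_hits` is a list of (score, strand) tuples (assumed format).
    Returns None when the assembly has no protein hit.
    """
    if not protein_hits:
        return None
    best_score, best_strand = max(protein_hits, key=lambda hit: hit[0])
    return best_strand


if __name__ == "__main__":
    ids = ["est1", "est2", "est3", "cdna1"]
    links = [("est1", "est2"), ("est2", "cdna1")]
    print(cluster_by_pairwise_hits(ids, links))
    print(orient_by_best_protein_hit([(50.0, "+"), (120.5, "-")]))
```

In this sketch a sequence with no qualifying hit stays in its own single-member cluster, and an assembly with no protein hit is left unoriented.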
Pages: D846-D851
Number of pages: 6