PlantTribes: a gene and gene family resource for comparative genomics in plants

Cited by: 77
Authors
Wall, P. Kerr [1 ,2 ]
Leebens-Mack, Jim [1 ,2 ,3 ]
Mueller, Kai F. [1 ,2 ,4 ]
Field, Dawn [5 ]
Altman, Naomi S. [2 ,6 ]
dePamphilis, Claude W. [1 ,2 ]
Affiliations
[1] Penn State Univ, Inst Mol Evolutionary Genet, Dept Biol, University Pk, PA 16802 USA
[2] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
[3] Univ Georgia, Dept Plant Biol, Athens, GA 30602 USA
[4] Univ Bonn, Nees Inst Biodivers Plants, D-53115 Bonn, Germany
[5] NERC, Ctr Ecol & Hydrol, Mol Evolut & Bioinformat Grp, Oxford OX1 3SR, England
[6] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
Natural Environment Research Council (UK);
Keywords
DOI
10.1093/nar/gkm972
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575-1584)] to classify all of these species' protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting approximately 4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study.
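The clustering step named in the abstract can be illustrated with a minimal sketch of Markov clustering (alternating expansion and inflation on a column-stochastic similarity matrix). This is not the authors' pipeline or the MCL program itself: the function names, the toy similarity graph, and the inflation values standing in for the low, medium and high stringencies are all assumptions made for illustration.

```python
# Minimal pure-Python sketch of Markov clustering (MCL).
# Hypothetical helper names; the toy graph and inflation values are assumptions.

def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])

def mcl(edges, n, inflation, max_iter=100, tol=1e-9):
    """Cluster n nodes given (i, j, weight) similarity edges."""
    m = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        m[i][j] = w
        m[j][i] = w
    for i in range(n):
        m[i][i] = 1.0  # self-loops keep every column sum positive
    m = normalize_columns(m)
    # Iterate until the matrix stops changing or max_iter is reached.
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that still carries weight lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Toy similarity graph: two tight triangles joined by one weak edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 0.1)]

# Illustrative inflation values standing in for the three stringencies.
for label, inflation in [("low", 1.2), ("medium", 3.0), ("high", 5.0)]:
    print(label, mcl(edges, 6, inflation))
```

In this sketch a larger inflation value sharpens the contrast between strong and weak similarities, so higher stringency tends to yield smaller, tighter clusters. A second clustering pass over the resulting clusters, as described for SuperTribes, would need its own between-cluster similarity measure, which the abstract does not specify.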
Pages: D970-D976
Page count: 7