InParanoid 7: new algorithms and tools for eukaryotic orthology analysis

Cited by: 477
Authors
Ostlund, Gabriel [1]
Schmitt, Thomas [1]
Forslund, Kristoffer [1]
Kostler, Tina [1]
Messina, David N. [1]
Roopra, Sanjit [1]
Frings, Oliver [1]
Sonnhammer, Erik L. L. [1]
Affiliations
[1] Stockholm Univ, Dept Biochem & Biophys, Stockholm Bioinformat Ctr, AlbaNova Univ Ctr, SE-10691 Stockholm, Sweden
Funding
Swedish Research Council
Keywords
PROTEIN-SEQUENCE; RESOURCE; GENE; PERFORMANCE; COMPLEXITY; DATABASE;
DOI
10.1093/nar/gkp931
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site (http://InParanoid.sbc.su.se). Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters.
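The two-pass idea in the abstract can be illustrated with a minimal sketch. It assumes that the first pass (run with compositional score matrix adjustment) decides which protein pairs count as homologs, and that the second pass (run without the adjustment) supplies the untruncated alignments for those pairs only. The abstract does not give these details, so the `Hit` record, the `two_pass_merge` function, and the rule of keeping the highest-scoring second-pass hit are all hypothetical choices made for illustration, not the InParanoid implementation.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one pairwise hit (not a real BLAST output type)."""
    query: str
    subject: str
    score: float   # bit score reported for the alignment
    aln_len: int   # alignment length in residues


def two_pass_merge(pass1: Iterable[Hit], pass2: Iterable[Hit]) -> Dict[Tuple[str, str], Hit]:
    """Keep second-pass alignments only for pairs the first pass accepted.

    Assumption: pass1 comes from the precise (compositionally adjusted) run,
    pass2 from the run that yields full-length alignments.
    """
    accepted = {(h.query, h.subject) for h in pass1}
    merged: Dict[Tuple[str, str], Hit] = {}
    for h in pass2:
        key = (h.query, h.subject)
        if key not in accepted:
            continue  # pair was not confirmed by the precise first pass
        best = merged.get(key)
        if best is None or h.score > best.score:
            merged[key] = h  # keep the highest-scoring alignment per pair
    return merged


# Toy data: a1/b1 is confirmed in pass one; a1/b2 appears only in pass two.
pass1_hits = [Hit("a1", "b1", 50.0, 80)]
pass2_hits = [Hit("a1", "b1", 120.0, 300), Hit("a1", "b2", 90.0, 200)]
merged_hits = two_pass_merge(pass1_hits, pass2_hits)
```

In this toy run, the pair a1/b1 keeps its longer second-pass alignment, while a1/b2 is discarded because the first pass never reported it.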
Pages: D196-D203
Number of pages: 8