InterProScan 5: genome-scale protein function classification

被引:5891
作者
Jones, Philip [1 ,2 ]
Binns, David [1 ]
Chang, Hsin-Yu [1 ]
Fraser, Matthew [1 ]
Li, Weizhong [1 ]
McAnulla, Craig [1 ]
McWilliam, Hamish [1 ]
Maslen, John [1 ,2 ]
Mitchell, Alex [1 ]
Nuka, Gift [1 ]
Pesseat, Sebastien [1 ]
Quinn, Antony F. [1 ]
Sangrador-Vegas, Amaia [1 ]
Scheremetjew, Maxim [1 ]
Yong, Siew-Yit [1 ]
Lopez, Rodrigo [1 ]
Hunter, Sarah [1 ]
机构
[1] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Hinxton CB10 1SD, England
[2] Wellcome Trust Sanger Inst, Hinxton CB10 1SA, England
基金
英国生物技术与生命科学研究理事会;
关键词
FAMILY CLASSIFICATION; ANNOTATION; TOPOLOGY;
D O I
10.1093/bioinformatics/btu031
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBl-EBI FTP site and the open source code is hosted at Google Code.
引用
收藏
页码:1236 / 1240
页数:5
相关论文
共 26 条
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
[Anonymous], NUCLEIC ACIDS RES
[3]   Reorganizing the protein space at the Universal Protein Resource (UniProt) [J].
Apweiler, Rolf ;
Martin, Maria Jesus ;
O'Donovan, Claire ;
Magrane, Michele ;
Alam-Faruque, Yasmin ;
Antunes, Ricardo ;
Casanova, Elisabet Barrera ;
Bely, Benoit ;
Bingley, Mark ;
Bower, Lawrence ;
Bursteinas, Borisas ;
Chan, Wei Mun ;
Chavali, Gayatri ;
Da Silva, Alan ;
Dimmer, Emily ;
Eberhardt, Ruth ;
Fazzini, Francesco ;
Fedotov, Alexander ;
Garavelli, John ;
Castro, Leyla Garcia ;
Gardner, Michael ;
Hieta, Reija ;
Huntley, Rachael ;
Jacobsen, Julius ;
Legge, Duncan ;
Liu, Wudong ;
Luo, Jie ;
Orchard, Sandra ;
Patient, Samuel ;
Pichler, Klemens ;
Poggioli, Diego ;
Pontikos, Nikolas ;
Pundir, Sangya ;
Rosanoff, Steven ;
Sawford, Tony ;
Sehra, Harminder ;
Turner, Edward ;
Wardell, Tony ;
Watkins, Xavier ;
Corbett, Matt ;
Donnelly, Mike ;
van Rensburg, Pieter ;
Goujon, Mickael ;
McWilliam, Hamish ;
Lopez, Rodrigo ;
Xenarios, Ioannis ;
Bougueleret, Lydie ;
Bridge, Alan ;
Poux, Sylvain ;
Redaschi, Nicole .
NUCLEIC ACIDS RESEARCH, 2012, 40 (D1) :D71-D75
[4]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[5]   The PRINTS database: a fine-grained protein sequence annotation and analysis resource-its status in 2012 [J].
Attwood, Teresa K. ;
Coletta, Alain ;
Muirhead, Gareth ;
Pavlopoulou, Athanasia ;
Philippou, Peter B. ;
Popov, Ivan ;
Roma-Mateo, Carlos ;
Theodosiou, Athina ;
Mitchell, Alex L. .
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2012,
[6]   The ENZYME database in 2000 [J].
Bairoch, A .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :304-305
[7]  
Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI [10.1093/nar/gkp985, 10.1093/nar/gkh121, 10.1093/nar/gkr1065]
[8]   The ProDom database of protein domain families: more emphasis on 3D [J].
Bru, C ;
Courcelle, E ;
Carrre, S ;
Beausse, Y ;
Dalmar, S ;
Kahn, D .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D212-D215
[9]  
Eddy Sean R, 2009, Genome Inform, V23, P205
[10]   A new bioinformatics analysis tools framework at EMBL-EBI [J].
Goujon, Mickael ;
McWilliam, Hamish ;
Li, Weizhong ;
Valentin, Franck ;
Squizzato, Silvano ;
Paern, Juri ;
Lopez, Rodrigo .
NUCLEIC ACIDS RESEARCH, 2010, 38 :W695-W699