GHOSTX: An Improved Sequence Homology Search Algorithm Using a Query Suffix Array and a Database Suffix Array

Cited by: 62
Authors
Suzuki, Shuji [1]
Kakuta, Masanori [1]
Ishida, Takashi [1]
Akiyama, Yutaka [1]
Affiliations
[1] Tokyo Inst Technol, Grad Sch Informat Sci & Engn, Meguro Ku, Tokyo 152, Japan
Keywords
LOCAL ALIGNMENT SEARCH; PROTEIN; BLAST; GENERATION; KEGG; TOOL;
DOI
10.1371/journal.pone.0103833
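Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract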
In metagenomic analyses, DNA sequences are translated into protein-coding sequences and then assigned to protein families, because such analyses require high sensitivity. However, the sheer volume of sequence data makes even a general homology search with BLASTX difficult in terms of computational cost. We designed a new homology search algorithm that finds seed sequences using suffix arrays of both the query and the database, and implemented it as GHOSTX. GHOSTX achieved approximately 131-165 times acceleration over a BLASTX search at similar levels of sensitivity. GHOSTX is distributed under the BSD 2-clause license and is available for download at http://www.bi.cs.titech.ac.jp/ghostx/. Sequencing technology continues to improve, and sequencers are producing ever larger quantities of data. This explosion of sequence data makes computational analysis with contemporary tools more difficult. We offer GHOSTX as a potential solution to this problem.
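To make the idea of seed finding with two suffix arrays concrete, here is a minimal Python sketch. It is not the GHOSTX implementation. It assumes exact k-mer matches as seeds, whereas a real homology search tool would typically score approximate matches, and it builds each suffix array by naive sorting. The function names (`build_suffix_array`, `kmer_groups`, `shared_seeds`) and the parameter `k` are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def build_suffix_array(text):
    """Return suffix start positions in lexicographic order of the suffixes.

    Naive construction by sorting; adequate only for toy inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def kmer_groups(text, k):
    """Group positions by their k-mer, in sorted k-mer order.

    Walking a suffix array visits equal k-mers contiguously, because
    suffixes sharing a length-k prefix are adjacent in sorted order.
    """
    sa = [i for i in build_suffix_array(text) if i + k <= len(text)]
    return [(kmer, list(positions))
            for kmer, positions in groupby(sa, key=lambda i: text[i:i + k])]


def shared_seeds(query, database, k):
    """Return sorted (query_pos, database_pos) pairs with identical k-mers.

    Both k-mer lists are sorted, so one merge pass over the two lists
    finds every k-mer present in both sequences.
    """
    q_groups = kmer_groups(query, k)
    d_groups = kmer_groups(database, k)
    seeds = []
    i = j = 0
    while i < len(q_groups) and j < len(d_groups):
        q_kmer, q_positions = q_groups[i]
        d_kmer, d_positions = d_groups[j]
        if q_kmer < d_kmer:
            i += 1
        elif q_kmer > d_kmer:
            j += 1
        else:
            # Same k-mer on both sides: every pairing is a candidate seed.
            seeds.extend((q, d) for q in q_positions for d in d_positions)
            i += 1
            j += 1
    return sorted(seeds)


if __name__ == "__main__":
    # Toy protein fragments; "MKV" and "KVL" occur in both.
    print(shared_seeds("MKVLA", "AAMKVLT", 3))  # [(0, 2), (1, 3)]
```

In a full search pipeline each seed would then be extended into a local alignment and scored. That step is omitted here.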
Pages: 8
Related papers
15 in total
[1]
Protein database searches using compositionally adjusted substitution matrices [J].
Altschul, SF ;
Wootton, JC ;
Gertz, EM ;
Agarwala, R ;
Morgulis, A ;
Schäffer, AA ;
Yu, YK .
FEBS JOURNAL, 2005, 272 (20) :5101-5109
[2]
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schäffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]
BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[4]
Inexact Local Alignment Search over Suffix Arrays [J].
Ghodsi, Mohammadreza ;
Pop, Mihai .
2009 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2009, :83-+
[5]
Structure, function and diversity of the healthy human microbiome [J].
Huttenhower, Curtis ;
Gevers, Dirk ;
Knight, Rob ;
Abubucker, Sahar ;
Badger, Jonathan H. ;
Chinwalla, Asif T. ;
Creasy, Heather H. ;
Earl, Ashlee M. ;
FitzGerald, Michael G. ;
Fulton, Robert S. ;
Giglio, Michelle G. ;
Hallsworth-Pepin, Kymberlie ;
Lobos, Elizabeth A. ;
Madupu, Ramana ;
Magrini, Vincent ;
Martin, John C. ;
Mitreva, Makedonka ;
Muzny, Donna M. ;
Sodergren, Erica J. ;
Versalovic, James ;
Wollam, Aye M. ;
Worley, Kim C. ;
Wortman, Jennifer R. ;
Young, Sarah K. ;
Zeng, Qiandong ;
Aagaard, Kjersti M. ;
Abolude, Olukemi O. ;
Allen-Vercoe, Emma ;
Alm, Eric J. ;
Alvarado, Lucia ;
Andersen, Gary L. ;
Anderson, Scott ;
Appelbaum, Elizabeth ;
Arachchi, Harindra M. ;
Armitage, Gary ;
Arze, Cesar A. ;
Ayvaz, Tulin ;
Baker, Carl C. ;
Begg, Lisa ;
Belachew, Tsegahiwot ;
Bhonagiri, Veena ;
Bihan, Monika ;
Blaser, Martin J. ;
Bloom, Toby ;
Bonazzi, Vivien ;
Brooks, J. Paul ;
Buck, Gregory A. ;
Buhay, Christian J. ;
Busam, Dana A. ;
Campbell, Joseph L. .
NATURE, 2012, 486 (7402) :207-214
[6]
KEGG: Kyoto Encyclopedia of Genes and Genomes [J].
Kanehisa, M ;
Goto, S .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :27-30
[7]
KEGG for integration and interpretation of large-scale molecular data sets [J].
Kanehisa, Minoru ;
Goto, Susumu ;
Sato, Yoko ;
Furumichi, Miho ;
Tanabe, Mao .
NUCLEIC ACIDS RESEARCH, 2012, 40 (D1) :D109-D114
[8]
Kent WJ, 2002, GENOME RES, V12, P656, DOI 10.1101/gr.229202
[9]
Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes [J].
Kurokawa, Ken ;
Itoh, Takehiko ;
Kuwahara, Tomomi ;
Oshima, Kenshiro ;
Toh, Hidehiro ;
Toyoda, Atsushi ;
Takami, Hideto ;
Morita, Hidetoshi ;
Sharma, Vineet K. ;
Srivastava, Tulika P. ;
Taylor, Todd D. ;
Noguchi, Hideki ;
Mori, Hiroshi ;
Ogura, Yoshitoshi ;
Ehrlich, Dusko S. ;
Itoh, Kikuji ;
Takagi, Toshihisa ;
Sakaki, Yoshiyuki ;
Hayashi, Tetsuya ;
Hattori, Masahira .
DNA RESEARCH, 2007, 14 (04) :169-181
[10]
Manber Udi., 1990, SODA 90, P319