A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples

Cited by: 332
Authors
Naccache, Samia N. [1 ,2 ]
Federman, Scot [1 ,2 ]
Veeraraghavan, Narayanan [1 ,2 ]
Zaharia, Matei [3 ]
Lee, Deanna [1 ,2 ]
Samayoa, Erik [1 ,2 ]
Bouquet, Jerome [1 ,2 ]
Greninger, Alexander L. [4 ]
Luk, Ka-Cheung [5 ]
Enge, Barryett [6 ]
Wadford, Debra A. [6 ]
Messenger, Sharon L. [6 ]
Genrich, Gillian L. [1 ]
Pellegrino, Kristen [7 ]
Grard, Gilda [8 ]
Leroy, Eric [8 ]
Schneider, Bradley S. [9 ]
Fair, Joseph N. [9 ]
Martinez, Miguel A. [10 ]
Isa, Pavel [10 ]
Crump, John A. [11 ,12 ,13 ,14 ]
DeRisi, Joseph L. [4 ]
Sittler, Taylor [1 ]
Hackett, John, Jr. [5 ]
Miller, Steve [1 ,2 ]
Chiu, Charles Y. [1 ,2 ,15 ]
Affiliations
[1] UCSF, Dept Lab Med, San Francisco, CA 94107 USA
[2] UCSF Abbott Viral Diagnost & Discovery Ctr, San Francisco, CA 94107 USA
[3] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA
[4] UCSF, Dept Biochem, San Francisco, CA 94107 USA
[5] Abbott Diagnost, Abbott Pk, IL 60064 USA
[6] Calif Dept Publ Hlth, Viral & Rickettsial Dis Lab, Richmond, CA 94804 USA
[7] UCSF, Dept Family & Community Med, San Francisco, CA 94143 USA
[8] Ctr Int Rech Med Franceville, Viral Emergent Dis Unit, Franceville, Gabon
[9] Metabiota Inc, San Francisco, CA 94104 USA
[10] Univ Nacl Autonoma Mexico, Inst Biotecnol, Dept Genet Desarrollo & Fisiol Mol, Cuernavaca 62260, Morelos, Mexico
[11] Duke Univ, Med Ctr, Div Infect Dis & Int Hlth, Durham, NC 27708 USA
[12] Duke Univ, Med Ctr, Duke Global Hlth Inst, Durham, NC 27708 USA
[13] Kilimanjaro Christian Med Ctr, Moshi 7393, Kilimanjaro, Tanzania
[14] Univ Otago, Ctr Int Hlth, Dunedin 9054, New Zealand
[15] UCSF, Div Infect Dis, Dept Med, San Francisco, CA 94143 USA
Funding
U.S. National Institutes of Health (NIH);
Keywords
READ ALIGNMENT; INFECTION; ETIOLOGY; MALARIA; PNEUMONIA; AUSTRALIA; SOFTWARE; PROGRAM; GENOMES; QUALITY;
DOI
10.1101/gr.171934.113
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Unbiased next-generation sequencing (NGS) approaches enable comprehensive pathogen detection in the clinical microbiology laboratory and have numerous applications for public health surveillance, outbreak investigation, and the diagnosis of infectious diseases. However, practical deployment of the technology is hindered by the bioinformatics challenge of analyzing results accurately and in a clinically relevant timeframe. Here we describe SURPI ("sequence-based ultrarapid pathogen identification"), a computational pipeline for pathogen identification from complex metagenomic NGS data generated from clinical samples, and demonstrate use of the pipeline in the analysis of 237 clinical samples comprising more than 1.1 billion sequences. Deployable on both cloud-based and standalone servers, SURPI leverages two state-of-the-art aligners for accelerated analyses, SNAP and RAPSearch, which are as accurate as existing bioinformatics tools but orders of magnitude faster in performance. In fast mode, SURPI detects viruses and bacteria by scanning data sets of 7-500 million reads in 11 min to 5 h, while in comprehensive mode, all known microorganisms are identified, followed by de novo assembly and protein homology searches for divergent viruses in 50 min to 16 h. SURPI has also directly contributed to real-time microbial diagnosis in acutely ill patients, underscoring its potential key role in the development of unbiased NGS-based clinical assays in infectious diseases that demand rapid turnaround times.
Pages: 1180-1192
Page count: 13
Related papers
56 in total
[1]   Understanding diagnostic tests 3: receiver operating characteristic curves [J].
Akobeng, Anthony K. .
ACTA PAEDIATRICA, 2007, 96 (05) :644-647
[2]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[3]   Etiology of acute gastroenteritis in hospitalized children in Melbourne, Australia, from April 1980 to March 1993 [J].
Barnes, GL ;
Uren, E ;
Stevens, KB ;
Bishop, RF .
JOURNAL OF CLINICAL MICROBIOLOGY, 1998, 36 (01) :133-138
[4]   Rapid identification of non-human sequences in high-throughput sequencing datasets [J].
Bhaduri, Aparna ;
Qu, Kun ;
Lee, Carolyn S. ;
Ungewickell, Alexander ;
Khavari, Paul A. .
BIOINFORMATICS, 2012, 28 (08) :1174-1175
[5]   Diagnostic approaches for patients with suspected encephalitis [J].
Bloch K.C. ;
Glaser C. .
Current Infectious Disease Reports, 2007, 9 (4) :315-322
[6]   CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes [J].
Borozan, Ivan ;
Wilson, Shane ;
Blanchette, Paola ;
Laflamme, Philippe ;
Watt, Stuart N. ;
Krzyzanowski, Paul M. ;
Sircoulomb, Fabrice ;
Rottapel, Robert ;
Branton, Philip E. ;
Ferretti, Vincent .
BMC BIOINFORMATICS, 2012, 13
[7]   Genetic Detection and Characterization of Lujo Virus, a New Hemorrhagic Fever-Associated Arenavirus from Southern Africa [J].
Briese, Thomas ;
Paweska, Janusz T. ;
McMullan, Laura K. ;
Hutchison, Stephen K. ;
Street, Craig ;
Palacios, Gustavo ;
Khristova, Marina L. ;
Weyer, Jacqueline ;
Swanepoel, Robert ;
Egholm, Michael ;
Nichol, Stuart T. ;
Lipkin, W. Ian .
PLOS PATHOGENS, 2009, 5 (05)
[8]   Fusobacterium nucleatum infection is prevalent in human colorectal carcinoma [J].
Castellarin, Mauro ;
Warren, Rene L. ;
Freeman, J. Douglas ;
Dreolini, Lisa ;
Krzywinski, Martin ;
Strauss, Jaclyn ;
Barnes, Rebecca ;
Watson, Peter ;
Allen-Vercoe, Emma ;
Moore, Richard A. ;
Holt, Robert A. .
GENOME RESEARCH, 2012, 22 (02) :299-306
[9]   Cross-Species Transmission of a Novel Adenovirus Associated with a Fulminant Pneumonia Outbreak in a New World Monkey Colony [J].
Chen, Eunice C. ;
Yagi, Shigeo ;
Kelly, Kristi R. ;
Mendoza, Sally P. ;
Maninger, Nicole ;
Rosenthal, Ann ;
Spinner, Abigail ;
Bales, Karen L. ;
Schnurr, David P. ;
Lerche, Nicholas W. ;
Chiu, Charles Y. .
PLOS PATHOGENS, 2011, 7 (07)
[10]   Viral pathogen discovery [J].
Chiu, Charles Y. .
CURRENT OPINION IN MICROBIOLOGY, 2013, 16 (04) :468-478