Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

Cited by: 161
Authors
Reid, Jeffrey G. [1]
Carroll, Andrew [2]
Veeraraghavan, Narayanan [1]
Dahdouli, Mahmoud [1]
Sundquist, Andreas [2]
English, Adam [1]
Bainbridge, Matthew [1]
White, Simon [1]
Salerno, William [1]
Buhay, Christian [1]
Yu, Fuli [1, 3]
Muzny, Donna [1]
Daly, Richard [2]
Duyk, Geoff [2]
Gibbs, Richard A. [1, 3]
Boerwinkle, Eric [1, 4]
Affiliations
[1] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[2] DNAnexus, Mountain View, CA 94040 USA
[3] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[4] Univ Texas Hlth Sci Ctr Houston, Ctr Human Genet, Houston, TX 77030 USA
Keywords
NGS data; Variant calling; Annotation; Clinical sequencing; Cloud computing; READ ALIGNMENT; DISCOVERY; FRAMEWORK; GALAXY
DOI
10.1186/1471-2105-15-30
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results.
Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts.
Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.
Pages: 11
Related papers
21 in total
[1]
Exome Sequencing of Head and Neck Squamous Cell Carcinoma Reveals Inactivating Mutations in NOTCH1 [J].
Agrawal, Nishant ;
Frederick, Mitchell J. ;
Pickering, Curtis R. ;
Bettegowda, Chetan ;
Chang, Kyle ;
Li, Ryan J. ;
Fakhry, Carole ;
Xie, Tong-Xin ;
Zhang, Jiexin ;
Wang, Jing ;
Zhang, Nianxiang ;
El-Naggar, Adel K. ;
Jasser, Samar A. ;
Weinstein, John N. ;
Trevino, Lisa ;
Drummond, Jennifer A. ;
Muzny, Donna M. ;
Wu, Yuanqing ;
Wood, Laura D. ;
Hruban, Ralph H. ;
Westra, William H. ;
Koch, Wayne M. ;
Califano, Joseph A. ;
Gibbs, Richard A. ;
Sidransky, David ;
Vogelstein, Bert ;
Velculescu, Victor E. ;
Papadopoulos, Nickolas ;
Wheeler, David A. ;
Kinzler, Kenneth W. ;
Myers, Jeffrey N. .
SCIENCE, 2011, 333 (6046) :1154-1157
[2]
Whole-Genome Sequencing for Optimized Patient Management [J].
Bainbridge, Matthew N. ;
Wiszniewski, Wojciech ;
Murdock, David R. ;
Friedman, Jennifer ;
Gonzaga-Jauregui, Claudia ;
Newsham, Irene ;
Reid, Jeffrey G. ;
Fink, John K. ;
Morgan, Margaret B. ;
Gingras, Marie-Claude ;
Muzny, Donna M. ;
Hoang, Linh D. ;
Yousaf, Shahed ;
Lupski, James R. ;
Gibbs, Richard A. .
SCIENCE TRANSLATIONAL MEDICINE, 2011, 3 (87)
[3]
Integrated genomic analyses of ovarian carcinoma [J].
Bell, D. ;
Berchuck, A. ;
Birrer, M. ;
Chien, J. ;
Cramer, D. W. ;
Dao, F. ;
Dhir, R. ;
DiSaia, P. ;
Gabra, H. ;
Glenn, P. ;
Godwin, A. K. ;
Gross, J. ;
Hartmann, L. ;
Huang, M. ;
Huntsman, D. G. ;
Iacocca, M. ;
Imielinski, M. ;
Kalloger, S. ;
Karlan, B. Y. ;
Levine, D. A. ;
Mills, G. B. ;
Morrison, C. ;
Mutch, D. ;
Olvera, N. ;
Orsulic, S. ;
Park, K. ;
Petrelli, N. ;
Rabeno, B. ;
Rader, J. S. ;
Sikic, B. I. ;
Smith-McCune, K. ;
Sood, A. K. ;
Bowtell, D. ;
Penny, R. ;
Testa, J. R. ;
Chang, K. ;
Dinh, H. H. ;
Drummond, J. A. ;
Fowler, G. ;
Gunaratne, P. ;
Hawes, A. C. ;
Kovar, C. L. ;
Lewis, L. R. ;
Morgan, M. B. ;
Newsham, I. F. ;
Santibanez, J. ;
Reid, J. G. ;
Trevino, L. R. ;
Wu, Y. -Q. ;
Wang, M. .
NATURE, 2011, 474 (7353) :609-615
[4]
Blankenberg, Daniel. Curr Protoc Mol Biol, 2010, Chapter 19, DOI 10.1002/0471142727.mb1910s89
[5]
An integrative variant analysis suite for whole exome next-generation sequencing data [J].
Challis, Danny ;
Yu, Jin ;
Evani, Uday S. ;
Jackson, Andrew R. ;
Paithankar, Sameer ;
Coarfa, Cristian ;
Milosavljevic, Aleksandar ;
Gibbs, Richard A. ;
Yu, Fuli .
BMC BIOINFORMATICS, 2012, 13
[6]
A framework for variation discovery and genotyping using next-generation DNA sequencing data [J].
DePristo, Mark A. ;
Banks, Eric ;
Poplin, Ryan ;
Garimella, Kiran V. ;
Maguire, Jared R. ;
Hartl, Christopher ;
Philippakis, Anthony A. ;
del Angel, Guillermo ;
Rivas, Manuel A. ;
Hanna, Matt ;
McKenna, Aaron ;
Fennell, Tim J. ;
Kernytsky, Andrew M. ;
Sivachenko, Andrey Y. ;
Cibulskis, Kristian ;
Gabriel, Stacey B. ;
Altshuler, David ;
Daly, Mark J. .
NATURE GENETICS, 2011, 43 (05) :491-+
[7]
Galaxy: A platform for interactive large-scale genome analysis [J].
Giardine, B ;
Riemer, C ;
Hardison, RC ;
Burhans, R ;
Elnitski, L ;
Shah, P ;
Zhang, Y ;
Blankenberg, D ;
Albert, I ;
Taylor, J ;
Miller, W ;
Kent, WJ ;
Nekrutenko, A .
GENOME RESEARCH, 2005, 15 (10) :1451-1455
[8]
Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences [J].
Goecks, Jeremy ;
Nekrutenko, Anton ;
Taylor, James .
GENOME BIOLOGY, 2010, 11 (08)
[9]
Chipster: user-friendly analysis software for microarray and other high-throughput data [J].
Kallio, M. Aleksi ;
Tuimala, Jarno T. ;
Hupponen, Taavi ;
Klemela, Petri ;
Gentile, Massimiliano ;
Scheinin, Ilari ;
Koski, Mikko ;
Kaki, Janne ;
Korpelainen, Eija I. .
BMC GENOMICS, 2011, 12
[10]
Fast and accurate long-read alignment with Burrows-Wheeler transform [J].
Li, Heng ;
Durbin, Richard .
BIOINFORMATICS, 2010, 26 (05) :589-595