Computational solutions to large-scale data management and analysis

Cited by: 384
Authors
Schadt, Eric E. [1 ]
Linderman, Michael D. [2 ]
Sorenson, Jon [1 ]
Lee, Lawrence [1 ]
Nolan, Garry P. [3 ]
Affiliations
[1] Pacific Biosci, Menlo Pk, CA 94025 USA
[2] Stanford Univ, Comp Syst Lab, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Microbiol & Immunol, Stanford, CA 94305 USA
Keywords
MOLECULAR DYNAMIC SIMULATION; GRAPHICS PROCESSING UNITS; DISEASE; TIME; NETWORKS; CLOUD;
DOI
10.1038/nrg2857
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist, such as cloud and heterogeneous computing, to successfully tackle our big data problems.
Pages: 647-657
Number of pages: 11
Related papers
39 in total
[1]   Genetic Mapping in Human Disease [J].
Altshuler, David ;
Daly, Mark J. ;
Lander, Eric S. .
SCIENCE, 2008, 322 (5903) :881-888
[2]  
[Anonymous], 2009, CLOUDS BERKELEY VIEW
[3]  
[Anonymous], 1979, Computers and Intractability: A Guide to the Theory of NP-Completeness
[4]   Mass Cytometry: Technique for Real Time Single Cell Multitarget Immunoassay Based on Inductively Coupled Plasma Time-of-Flight Mass Spectrometry [J].
Bandura, Dmitry R. ;
Baranov, Vladimir I. ;
Ornatsky, Olga I. ;
Antonov, Alexei ;
Kinach, Robert ;
Lou, Xudong ;
Pavlov, Serguei ;
Vorobiev, Sergey ;
Dick, John E. ;
Tanner, Scott D. .
ANALYTICAL CHEMISTRY, 2009, 81 (16) :6813-6822
[5]  
Barroso, Luiz André, 2009, DATACENTER COMPUTER, P1
[6]  
Bell G., 2005, Petascale computational systems: Balanced cyberinfrastructure in a data-centric world
[7]   Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility [J].
Buyya, Rajkumar ;
Yeo, Chee Shin ;
Venugopal, Srikumar ;
Broberg, James ;
Brandic, Ivona .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2009, 25 (06) :599-616
[8]   Variations in DNA elucidate molecular networks that cause disease [J].
Chen, Yanqing ;
Zhu, Jun ;
Lum, Pek Yee ;
Yang, Xia ;
Pinto, Shirly ;
MacNeil, Douglas J. ;
Zhang, Chunsheng ;
Lamb, John ;
Edwards, Stephen ;
Sieberts, Solveig K. ;
Leonardson, Amy ;
Castellini, Lawrence W. ;
Wang, Susanna ;
Champy, Marie-France ;
Zhang, Bin ;
Emilsson, Valur ;
Doss, Sudheer ;
Ghazalpour, Anatole ;
Horvath, Steve ;
Drake, Thomas A. ;
Lusis, Aldons J. ;
Schadt, Eric E. .
NATURE, 2008, 452 (7186) :429-435
[9]   VertNet: A New Model for Biodiversity Data Sharing [J].
Constable, Heather ;
Guralnick, Robert ;
Wieczorek, John ;
Spencer, Carol ;
Peterson, A. Townsend .
PLOS BIOLOGY, 2010, 8 (02)
[10]   Bacterial Community Variation in Human Body Habitats Across Space and Time [J].
Costello, Elizabeth K. ;
Lauber, Christian L. ;
Hamady, Micah ;
Fierer, Noah ;
Gordon, Jeffrey I. ;
Knight, Rob .
SCIENCE, 2009, 326 (5960) :1694-1697