CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

Cited by: 160
Authors
Angiuoli, Samuel V. [1, 2]
Matalka, Malcolm [1]
Gussman, Aaron [1]
Galens, Kevin [1]
Vangala, Mahesh [1]
Riley, David R. [1]
Arze, Cesar [1]
White, James R. [1]
White, Owen [1]
Fricke, W. Florian [1]
Affiliations
[1] Univ Maryland, Sch Med, IGS, Baltimore, MD 21201 USA
[2] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
Source
BMC BIOINFORMATICS | 2011 / Vol. 12
Funding
National Science Foundation (USA);
Keywords
RAST SERVER; DATABASE; ANNOTATION; SEARCH; SOFTWARE; ERGATIS; GENES; DNA;
DOI
10.1186/1471-2105-12-356
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. Results: We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion: The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.
Pages: 15