The impact of Docker containers on the performance of genomic pipelines

Cited by: 66
Authors
Di Tommaso, Paolo [1, 2]
Palumbo, Emilio [1, 2]
Chatzou, Maria [1, 2]
Prieto, Pablo [1, 2]
Heuer, Michael L. [3]
Notredame, Cedric [1, 2]
Affiliations
[1] Ctr Genom Regulat CRG, Bioinformat & Genom Program, Barcelona, Spain
[2] Univ Pompeu Fabra, Barcelona, Spain
[3] Natl Marrow Donor Program, Dept Bioinformat Res, Minneapolis, MN USA
Keywords
Workflow; Pipelines; Docker; Virtualisation; Bioinformatics; Read alignment
DOI
10.7717/peerj.1273
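Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering]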
Abstract
Genomic pipelines consist of several pieces of third-party software and, because of their experimental nature, frequent changes and updates are commonly necessary, thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use of Docker containers might affect the performance of these pipelines. Here we address this question and conclude that Docker containers have only a minor impact on the performance of common genomic pipelines, which is negligible when the executed jobs are long in terms of computational time.
Pages: 10