OpenCyto: An Open Source Infrastructure for Scalable, Robust, Reproducible, and Automated, End-to-End Flow Cytometry Data Analysis

Cited by: 136
Authors
Finak, Greg [1]
Frelinger, Jacob [1]
Jiang, Wenxin [1]
Newell, Evan W. [2]
Ramey, John [1]
Davis, Mark M. [3, 4, 5]
Kalams, Spyros A. [6, 7]
De Rosa, Stephen C. [1, 8]
Gottardo, Raphael [1, 9]
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Vaccine & Infect Dis Div, Seattle, WA 98104 USA
[2] Singapore Immunol Network, Agcy Sci Technol & Res, Singapore, Singapore
[3] Stanford Univ, Dept Microbiol & Immunol, Stanford, CA 94305 USA
[4] Stanford Univ, Inst Immun Transplantat & Infect, Stanford, CA 94305 USA
[5] Stanford Univ, Howard Hughes Med Inst, Stanford, CA 94305 USA
[6] Vanderbilt Univ, Sch Med, Dept Med, Div Infect Dis, Nashville, TN 37212 USA
[7] Vanderbilt Univ, Sch Med, Dept Pathol Microbiol & Immunol, Nashville, TN 37212 USA
[8] Univ Washington, Dept Lab Med, Seattle, WA 98195 USA
[9] Univ Washington, Dept Stat, Seattle, WA 98195 USA
Keywords
BIOCONDUCTOR PACKAGE; VISUALIZATION; VALIDATION; CELLS
DOI
10.1371/journal.pcbi.1003806
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Flow cytometry is used increasingly in clinical research for cancer, immunology and vaccines. Technological advances in cytometry instrumentation are increasing the size and dimensionality of data sets, posing a challenge for traditional data management and analysis. Automated analysis methods, despite a general consensus of their importance to the future of the field, have been slow to gain widespread adoption. Here we present OpenCyto, a new BioConductor infrastructure and data analysis framework designed to lower the barrier of entry to automated flow data analysis algorithms by addressing key areas that we believe have held back wider adoption of automated approaches. OpenCyto supports end-to-end data analysis that is robust and reproducible while generating results that are easy to interpret. We have improved the existing, widely used core BioConductor flow cytometry infrastructure by allowing analysis to scale in a memory efficient manner to the large flow data sets that arise in clinical trials, and integrating domain-specific knowledge as part of the pipeline through the hierarchical relationships among cell populations. Pipelines are defined through a text-based csv file, limiting the need to write data-specific code, and are data agnostic to simplify repetitive analysis for core facilities. We demonstrate how to analyze two large cytometry data sets: an intracellular cytokine staining (ICS) data set from a published HIV vaccine trial focused on detecting rare, antigen-specific T-cell populations, where we identify a new subset of CD8 T-cells with a vaccine-regimen specific response that could not be identified through manual analysis, and a CyTOF T-cell phenotyping data set where a large staining panel and many cell populations are a challenge for traditional analysis. The substantial improvements to the core BioConductor flow cytometry packages give OpenCyto the potential for wide adoption. It can rapidly leverage new developments in computational cytometry and facilitate reproducible analysis in a unified environment.
Pages: 12