Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets

Cited by: 457
Authors
Belkina, Anna C. [1 ,2 ]
Ciccolella, Christopher O. [3 ]
Anno, Rina [4 ]
Halpert, Richard [5 ]
Spidlen, Josef [5 ]
Snyder-Cappione, Jennifer E. [2 ,6 ]
Affiliations
[1] Boston Univ, Sch Med, Dept Pathol & Lab Med, Boston, MA 02118 USA
[2] Boston Univ, Sch Med, Flow Cytometry Core Facil, Boston, MA 02118 USA
[3] Omiq Inc, Santa Clara, CA 95050 USA
[4] Kansas State Univ, Dept Math, Manhattan, KS 66506 USA
[5] BD Life Sci FlowJo, Ashland, OR 97520 USA
[6] Boston Univ, Sch Med, Dept Microbiol, Boston, MA 02118 USA
Keywords
MASS CYTOMETRY; FLOW-CYTOMETRY; CELLS; IMMUNE; HETEROGENEITY; FLUORESCENCE; PANEL;
DOI
10.1038/s41467-019-13055-y
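Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract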
Accurate and comprehensive extraction of information from high-dimensional single cell datasets necessitates faithful visualizations to assess biological populations. A state-of-the-art algorithm for non-linear dimension reduction, t-SNE, requires multiple heuristics and fails to produce clear representations of datasets when millions of cells are projected. We develop opt-SNE, an automated toolkit for t-SNE parameter selection that utilizes Kullback-Leibler divergence evaluation in real time to tailor the early exaggeration and overall number of gradient descent iterations in a dataset-specific manner. The precise calibration of early exaggeration together with opt-SNE adjustment of gradient descent learning rate dramatically improves computation time and enables high-quality visualization of large cytometry and transcriptomics datasets, overcoming limitations of analysis tools with hard-coded parameters that often produce poorly resolved or misleading maps of fluorescent and mass cytometry data. In summary, opt-SNE enables superior data resolution in t-SNE space and thereby more accurate data interpretation.
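The real-time Kullback-Leibler divergence evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact switching or stopping criteria, so the two rules below are assumptions made for illustration: end early exaggeration once the relative improvement in KL divergence has passed its peak, and stop iterating once the per-iteration improvement falls below a small fraction of the current KL divergence. The function names `first_rate_peak` and `first_plateau` and the `tolerance` value are hypothetical and not taken from the paper or any library.

```python
def first_rate_peak(kl_values):
    """Return the index i into kl_values at which the relative improvement
    (kl[i-1] - kl[i]) / kl[i-1] first drops below that of the previous step,
    or None if it never does. Assumed rule for ending early exaggeration."""
    prev_rate = None
    for i in range(1, len(kl_values)):
        rate = (kl_values[i - 1] - kl_values[i]) / kl_values[i - 1]
        if prev_rate is not None and rate < prev_rate:
            return i
        prev_rate = rate
    return None


def first_plateau(kl_values, start=1, tolerance=1e-4):
    """Return the first index i >= start at which the KL divergence improved
    by less than tolerance * kl_values[i], or None if it never does.
    Assumed rule for ending gradient descent; tolerance is illustrative."""
    for i in range(max(start, 1), len(kl_values)):
        if kl_values[i - 1] - kl_values[i] < tolerance * kl_values[i]:
            return i
    return None


# Hypothetical KL divergence trace, one value per monitored iteration.
kl_trace = [5.0, 4.9, 4.5, 4.3, 4.25, 4.2499]
switch_at = first_rate_peak(kl_trace)
stop_at = first_plateau(kl_trace, start=switch_at)
print(switch_at, stop_at)
```

In an actual t-SNE run the trace would be produced incrementally by the optimizer rather than supplied up front; the list here only stands in for those per-iteration values.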
Pages: 12
Related papers
47 in total
[1]
Amid E., 2018, PREPRINT
[2]
viSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia [J].
Amir, El-ad David ;
Davis, Kara L. ;
Tadmor, Michelle D. ;
Simonds, Erin F. ;
Levine, Jacob H. ;
Bendall, Sean C. ;
Shenfeld, Daniel K. ;
Krishnaswamy, Smita ;
Nolan, Garry P. ;
Pe'er, Dana .
NATURE BIOTECHNOLOGY, 2013, 31 (06) :545-+
[3]
[Anonymous], 2017, PREPRINT
[4]
[Anonymous], 2017, ARXIV170602582
[5]
Arora S., 2018, PMLR, P1455
[6]
High-dimensional analysis of the murine myeloid cell system [J].
Becher, Burkhard ;
Schlitzer, Andreas ;
Chen, Jinmiao ;
Mair, Florian ;
Sumatoh, Hermi R. ;
Teng, Karen Wei Weng ;
Low, Donovan ;
Ruedl, Christiane ;
Ricciardi-Castagnoli, Paola ;
Poidinger, Michael ;
Greter, Melanie ;
Ginhoux, Florent ;
Newell, Evan W. .
NATURE IMMUNOLOGY, 2014, 15 (12) :1181-1189
[7]
Multivariate Computational Analysis of Gamma Delta T Cell Inhibitory Receptor Signatures Reveals the Divergence of Healthy and ART-Suppressed HIV+ Aging [J].
Belkina, Anna C. ;
Starchenko, Alina ;
Drake, Katherine A. ;
Proctor, Elizabeth A. ;
Pihl, Riley M. F. ;
Olson, Alex ;
Lauffenburger, Douglas A. ;
Lin, Nina ;
Snyder-Cappione, Jennifer E. .
FRONTIERS IN IMMUNOLOGY, 2018, 9
[8]
OMIP-037: 16-color panel to measure inhibitory receptor signatures from multiple human immune cell subsets [J].
Belkina, Anna C. ;
Snyder-Cappione, Jennifer E. .
CYTOMETRY PART A, 2017, 91A (02) :175-179
[9]
Single-Cell Mass Cytometry of Differential Immune and Drug Responses Across a Human Hematopoietic Continuum [J].
Bendall, Sean C. ;
Simonds, Erin F. ;
Qiu, Peng ;
Amir, El-ad D. ;
Krutzik, Peter O. ;
Finck, Rachel ;
Bruggner, Robert V. ;
Melamed, Rachel ;
Trejo, Angelica ;
Ornatsky, Olga I. ;
Balderas, Robert S. ;
Plevritis, Sylvia K. ;
Sachs, Karen ;
Pe'er, Dana ;
Tanner, Scott D. ;
Nolan, Garry P. .
SCIENCE, 2011, 332 (6030) :687-696
[10]
Branson K, 2018, PREPRINT