Interpretable dimensionality reduction of single cell transcriptome data with deep generative models

Cited by: 234
Authors
Ding, Jiarui [1, 2, 3, 4]
Condon, Anne [1]
Shah, Sohrab P. [1, 2, 3, 5]
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
[2] BC Canc Agcy, Dept Mol Oncol, Vancouver, BC V5Z 1L3, Canada
[3] Univ British Columbia, Dept Pathol & Lab Med, Vancouver, BC V6T 2B5, Canada
[4] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[5] Mem Sloan Kettering Canc Ctr, 1275 York Ave, New York, NY 10065 USA
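Funding
Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada; Canadian Institutes of Health Research
Keywords
RNA-SEQ DATA; VISUALIZATION; HETEROGENEITY; CHALLENGES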
DOI
10.1038/s41467-018-04368-5
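Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract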
Single-cell RNA-sequencing has great potential to discover cell types, identify cell states, trace development lineages, and reconstruct the spatial organization of cells. However, dimension reduction to interpret structure in single-cell sequencing data remains a challenge. Existing algorithms are either not able to uncover the clustering structures in the data or lose global information such as groups of clusters that are close to each other. We present a robust statistical model, scvis, to capture and visualize the low-dimensional structures in single-cell gene expression data. Simulation results demonstrate that low-dimensional representations learned by scvis preserve both the local and global neighbor structures in the data. In addition, scvis is robust to the number of data points and learns a probabilistic parametric mapping function to add new data points to an existing embedding. We then use scvis to analyze four single-cell RNA-sequencing datasets, exemplifying interpretable two-dimensional representations of the high-dimensional single-cell RNA-sequencing data.
Pages: 13