f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq

Cited by: 79
Authors
Buettner, Florian [1, 6]
Pratanwanich, Naruemon [1]
McCarthy, Davis J. [1, 2]
Marioni, John C. [1, 3, 4]
Stegle, Oliver [1, 5]
Affiliations
[1] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Cambridge CB10 1SD, England
[2] St Vincents Inst Med Res, 41 Victoria Parade, Fitzroy, Vic 3065, Australia
[3] Canc Res UK Cambridge Inst, Cambridge, England
[4] Wellcome Trust Res Labs, Sanger Inst, Wellcome Genome Campus, Cambridge, England
[5] European Mol Biol Lab, Genome Biol Unit, Meyerhofstr 1, D-69117 Heidelberg, Germany
[6] Helmholtz Zentrum Munchen, German Res Ctr Environm Hlth, Inst Computat Biol, Neuherberg, Germany
Source
GENOME BIOLOGY | 2017 / Vol. 18
Funding
UK Medical Research Council;
Keywords
Single-cell RNA-seq; Sparse factor analysis; Gene set annotations; EMBRYONIC STEM-CELLS; UNWANTED VARIATION; GENE-EXPRESSION; HETEROGENEITY; DIFFERENTIATION; PATHWAY; GROWTH;
DOI
10.1186/s13059-017-1334-8
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005; 0836; 090102; 100705;
Abstract
Single-cell RNA-sequencing (scRNA-seq) allows studying heterogeneity in gene expression in large cell populations. Such heterogeneity can arise due to technical or biological factors, making decomposing sources of variation difficult. We here describe f-scLVM (factorial single-cell latent variable model), a method based on factor analysis that uses pathway annotations to guide the inference of interpretable factors underpinning the heterogeneity. Our model jointly estimates the relevance of individual factors, refines gene set annotations, and infers factors without annotation. In applications to multiple scRNA-seq datasets, we find that f-scLVM robustly decomposes scRNA-seq datasets into interpretable components, thereby facilitating the identification of novel subpopulations.
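The core idea in the abstract, a factor model whose loadings are constrained by gene set annotations, can be illustrated with a small toy example. The sketch below is not the f-scLVM implementation or its inference procedure: it simulates expression from a few factors whose weights are nonzero only on annotated genes, then recovers the factors by plain alternating least squares under that fixed mask. The function names `simulate` and `fit_masked_factors`, the non-overlapping gene sets, and all parameter values are assumptions made for illustration. The sketch also omits the steps the abstract lists beyond guided inference (estimating factor relevance, refining annotations, and inferring unannotated factors).

```python
import numpy as np


def simulate(n_cells=200, n_genes=30, n_factors=3, noise=0.1, seed=0):
    """Simulate expression driven by factors with annotation-masked weights."""
    rng = np.random.default_rng(seed)
    # Hypothetical annotation: mask[g, k] is True when gene g is in gene set k.
    mask = np.zeros((n_genes, n_factors), dtype=bool)
    block = n_genes // n_factors
    for k in range(n_factors):
        mask[k * block:(k + 1) * block, k] = True
    factors = rng.normal(size=(n_cells, n_factors))
    weights = rng.normal(size=(n_genes, n_factors)) * mask
    expr = factors @ weights.T + noise * rng.normal(size=(n_cells, n_genes))
    return expr, mask, factors


def fit_masked_factors(expr, mask, n_iter=50, seed=1):
    """Alternating least squares with weights fixed to zero outside the mask."""
    rng = np.random.default_rng(seed)
    n_cells = expr.shape[0]
    n_genes, n_factors = mask.shape
    factors = rng.normal(size=(n_cells, n_factors))
    weights = np.zeros((n_genes, n_factors))
    for _ in range(n_iter):
        # Update each gene's weights using only the factors annotated for it.
        for g in range(n_genes):
            idx = np.flatnonzero(mask[g])
            if idx.size:
                sol = np.linalg.lstsq(factors[:, idx], expr[:, g], rcond=None)[0]
                weights[g, idx] = sol
        # Update all factor activations given the current weights.
        factors = np.linalg.lstsq(weights, expr.T, rcond=None)[0].T
    return factors, weights


expr, mask, true_factors = simulate()
est_factors, est_weights = fit_masked_factors(expr, mask)
for k in range(mask.shape[1]):
    r = abs(np.corrcoef(est_factors[:, k], true_factors[:, k])[0, 1])
    print(f"factor {k}: |correlation with simulated factor| = {r:.3f}")
```

Because each recovered factor is tied to a named gene set through the mask, it can be read as the activity of that gene set across cells, which is the sense in which annotation-guided factors are interpretable. Factors are identified only up to sign and scale, hence the absolute correlation.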
Pages: 13