Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Times cited: 2473
Authors
Hafemeister, Christoph [1]
Satija, Rahul [1, 2]
Affiliations
[1] New York Genome Ctr, 101 6th Ave, New York, NY 10013 USA
[2] NYU, Ctr Genom & Syst Biol, 12 Waverly Pl, New York, NY 10003 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Single-cell RNA-seq; Normalization; Differential expression analysis; Quantification; Challenges
DOI
10.1186/s13059-019-1874-1
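Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005 [Microbiology]; 090105 [Crop Production Systems and Ecological Engineering]
Abstract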
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from "regularized negative binomial regression," where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
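To make the idea of negative binomial Pearson residuals concrete, the following is a minimal sketch, not the sctransform implementation. It assumes a simplified model in which the expected count for each gene is proportional to the cell's sequencing depth (rather than fitted per gene by regression and then regularized across genes, as the abstract describes), and it uses a single fixed dispersion `theta` for all genes. The function name `nb_pearson_residuals`, the default `theta=100.0`, and the clipping bound of sqrt(number of cells) are illustrative assumptions.

```python
import math


def nb_pearson_residuals(counts, theta=100.0):
    """Pearson residuals under a simplified negative binomial model.

    counts: list of rows, one row per cell, one column per gene (UMI counts).
    theta:  fixed NB dispersion shared by all genes (an assumption of this
            sketch; the paper estimates and regularizes parameters per gene).
    """
    n_cells = len(counts)
    n_genes = len(counts[0])

    # Sequencing depth of each cell and total count of each gene.
    depth = [sum(row) for row in counts]
    gene_total = [sum(row[g] for row in counts) for g in range(n_genes)]
    total = sum(depth)
    if total == 0:
        raise ValueError("count matrix contains no molecules")

    # Clip extreme residuals (bound chosen for illustration).
    clip = math.sqrt(n_cells)

    residuals = []
    for c, row in enumerate(counts):
        out = []
        for g, x in enumerate(row):
            # Expected count: the gene's overall share scaled by cell depth.
            mu = depth[c] * gene_total[g] / total
            # Negative binomial variance: mu + mu^2 / theta.
            var = mu + mu * mu / theta
            z = (x - mu) / math.sqrt(var) if var > 0 else 0.0
            out.append(max(-clip, min(clip, z)))
        residuals.append(out)
    return residuals
```

In this sketch, a count matrix whose cells differ only in depth yields residuals of zero, which illustrates how depth-driven variation is removed, while counts that deviate from the depth-scaled expectation produce nonzero residuals.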
Pages: 15