A test metric for assessing single-cell RNA-seq batch correction

Cited by: 262
Authors
Buettner, Maren [1]
Miao, Zhichao [2, 3]
Wolf, F. Alexander [1]
Teichmann, Sarah A. [2, 3, 4]
Theis, Fabian J. [1, 5]
Affiliations
[1] Helmholtz Zentrum Munchen, Inst Computat Biol, German Res Ctr Environm Hlth, Neuherberg, Germany
[2] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Cambridge, England
[3] Wellcome Sanger Inst, Wellcome Genome Campus, Cambridge, England
[4] Univ Cambridge, Dept Phys, Cavendish Lab, Cambridge, England
[5] Tech Univ Munich, Dept Math, Munich, Germany
Funding
Wellcome Trust (UK);
Keywords
GENE-EXPRESSION; SEQUENCING DATA; NORMALIZATION; PROGRAMS; PACKAGE; FATE;
DOI
10.1038/s41592-018-0254-1
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
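The idea behind a k-nearest-neighbor batch test can be illustrated with a short sketch. This is not the kBET implementation from the linked repository; it is a minimal illustration under stated assumptions: exactly two batches labeled 0 and 1, brute-force Euclidean neighbors, and a Pearson chi-squared test with one degree of freedom comparing each cell's local batch mix to the global mix. The function name `knn_batch_rejection_rate` and its parameters are hypothetical.

```python
import math
import random


def knn_batch_rejection_rate(points, batches, k=10, alpha=0.05):
    """Fraction of cells whose local batch mix differs from the global mix.

    Illustrative sketch only (not the published kBET code): assumes two
    batches labeled 0 and 1, both present in the data.
    """
    n = len(points)
    global_frac = [batches.count(b) / n for b in (0, 1)]
    if min(global_frac) == 0:
        raise ValueError("both batches must be present")
    expected = [k * f for f in global_frac]

    rejected = 0
    for i in range(n):
        # Brute-force k nearest neighbors of cell i, excluding the cell itself.
        others = (j for j in range(n) if j != i)
        neighbours = sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]
        observed = [sum(1 for j in neighbours if batches[j] == b) for b in (0, 1)]
        # Pearson chi-squared statistic: local counts vs. globally expected counts.
        stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Survival function of the chi-squared distribution with 1 degree of freedom.
        p_value = math.erfc(math.sqrt(stat / 2.0))
        if p_value < alpha:
            rejected += 1
    return rejected / n


# Toy data: the same cells, once well mixed and once with batch 1 shifted far away.
rng = random.Random(0)
cells = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
labels = [i % 2 for i in range(200)]
shifted = [(x + 100 * b, y) for (x, y), b in zip(cells, labels)]

mixed_rate = knn_batch_rejection_rate(cells, labels)
separated_rate = knn_batch_rejection_rate(shifted, labels)
print(f"well mixed: {mixed_rate:.2f}, separated: {separated_rate:.2f}")
```

A low rejection rate suggests that batches are well mixed in local neighborhoods, while a rate near 1 indicates a strong batch effect. A practical implementation would handle more than two batches and use an efficient neighbor search.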
Pages: 43-+
Page count: 9