Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors

Cited by: 1343
Authors
Haghverdi, Laleh [1, 2]
Lun, Aaron T. L. [3]
Morgan, Michael D. [4]
Marioni, John C. [1, 3, 4]
Affiliations
[1] EBI, EMBL, Cambridge, England
[2] Helmholtz Zentrum Munchen, Inst Computat Biol, Munich, Germany
[3] Univ Cambridge, Canc Res UK Cambridge Inst, Cambridge, England
[4] Wellcome Trust Sanger Inst, Cambridge, England
Funding
Wellcome Trust (UK);
Keywords
SEQ; STEM; MAP;
DOI
10.1038/nbt.4091
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005; 0836; 090102; 100705;
Abstract
Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.
Pages: 421-+
Number of pages: 9