Computational principles and challenges in single-cell data integration

Cited by: 220
Authors
Argelaguet, Ricard [1, 2]
Cuomo, Anna S. E. [1, 3]
Stegle, Oliver [3, 4, 5]
Marioni, John C. [1, 3, 6]
Affiliations
[1] European Bioinformat Inst EMBL EBI, European Mol Biol Lab, Hinxton, England
[2] Babraham Inst, Epigenet Programme, Cambridge, England
[3] Wellcome Sanger Inst, Wellcome Genome Campus, Cambridge, England
[4] German Canc Res Ctr, Div Computat Genom & Syst Genet, Heidelberg, Germany
[5] Mol Biol Lab, Genome Biol Unit, Heidelberg, Germany
[6] Univ Cambridge, Canc Res UK Cambridge Inst, Cambridge, England
Funding
European Research Council
Keywords
MIXED-MODEL ANALYSIS; RNA-SEQUENCING DATA; GENE-EXPRESSION; MOUSE; GENOME; SEQ; TRANSCRIPTOME; EVOLUTIONARY; CHROMATIN; ATLAS;
DOI
10.1038/s41587-021-00895-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
The development of single-cell multimodal assays provides a powerful tool for investigating multiple dimensions of cellular heterogeneity, enabling new insights into development, tissue homeostasis and disease. A key challenge in the analysis of single-cell multimodal data is to devise appropriate strategies for tying together data across different modalities. The term 'data integration' has been used to describe this task, encompassing a broad collection of approaches ranging from batch correction of individual omics datasets to association of chromatin accessibility and genetic variation with transcription. Although existing integration strategies exploit similar mathematical ideas, they typically have distinct goals and rely on different principles and assumptions. Consequently, new definitions and concepts are needed to contextualize existing methods and to enable development of new methods. As the number of single-cell experiments with multiple data modalities increases, Argelaguet and colleagues review the concepts and challenges of data integration.
Pages: 1202-1215
Number of pages: 14