The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the use of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that performance depends strongly on normalisation and the use of prior knowledge. Only by leveraging prior knowledge in the form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.

© 2015 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).