Non-linear PCA: a missing data approach

Cited by: 146
Authors
Scholz, M [1]
Kaplan, F
Guy, CL
Kopka, J
Selbig, J
Affiliations
[1] Max Planck Inst Mol Plant Physiol, Potsdam, Germany
[2] Univ Florida, Plant Mol & Cellular Bot Program, Dept Environm Hort, Gainesville, FL 32611 USA
[3] Univ Potsdam, Potsdam, Germany
DOI
10.1093/bioinformatics/bti634
CLC classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values.

Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data.

Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress.
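The idea of ignoring missing values during optimization and estimating them afterwards can be sketched in a few lines. The following is only an illustrative toy, not the authors' implementation: it assumes a single non-linear component, one hidden tanh layer, and plain gradient descent, and the names `train_inverse_nlpca` and `fill_missing` are hypothetical. The latent value of each sample is treated as a free parameter and optimized together with the network weights, and the error is summed over observed entries only.

```python
import math
import random


def train_inverse_nlpca(data, n_hidden=4, lr=0.01, epochs=1000, seed=0):
    """Fit x ~ W2 * tanh(w1 * z + b1) + b2 with one latent value z per sample.

    `data` is a list of equal-length rows; a missing value is marked by None.
    Returns (reconstruct, loss_history). Assumed toy architecture.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])

    def small():
        return rng.uniform(-0.1, 0.1)

    z = [small() for _ in range(n)]  # latent inputs, optimized like weights
    w1 = [small() for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[small() for _ in range(n_hidden)] for _ in range(d)]
    b2 = [0.0] * d

    def forward(zi):
        a = [math.tanh(w1[k] * zi + b1[k]) for k in range(n_hidden)]
        x_hat = [sum(w2[j][k] * a[k] for k in range(n_hidden)) + b2[j]
                 for j in range(d)]
        return a, x_hat

    history = []
    for _ in range(epochs):
        g_z = [0.0] * n
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [[0.0] * n_hidden for _ in range(d)]
        g_b2 = [0.0] * d
        loss = 0.0
        for i, row in enumerate(data):
            a, x_hat = forward(z[i])
            # The residual is zero wherever the value is missing, so missing
            # entries contribute nothing to the loss or to any gradient.
            e = [0.0 if row[j] is None else x_hat[j] - row[j] for j in range(d)]
            loss += 0.5 * sum(v * v for v in e)
            for k in range(n_hidden):
                back = sum(e[j] * w2[j][k] for j in range(d))
                delta = back * (1.0 - a[k] ** 2)
                g_w1[k] += delta * z[i]
                g_b1[k] += delta
                g_z[i] += delta * w1[k]
                for j in range(d):
                    g_w2[j][k] += e[j] * a[k]
            for j in range(d):
                g_b2[j] += e[j]
        history.append(loss)  # loss measured before this epoch's update
        # Plain gradient-descent step on weights and latent values alike.
        for i in range(n):
            z[i] -= lr * g_z[i]
        for k in range(n_hidden):
            w1[k] -= lr * g_w1[k]
            b1[k] -= lr * g_b1[k]
            for j in range(d):
                w2[j][k] -= lr * g_w2[j][k]
        for j in range(d):
            b2[j] -= lr * g_b2[j]

    def reconstruct(i):
        """Model output for sample i, including its missing positions."""
        return forward(z[i])[1]

    return reconstruct, history


def fill_missing(data, reconstruct):
    """Keep observed values; replace each None by the model's estimate."""
    filled = []
    for i, row in enumerate(data):
        x_hat = reconstruct(i)
        filled.append([x_hat[j] if v is None else v for j, v in enumerate(row)])
    return filled
```

Calling `train_inverse_nlpca(rows)` on a list of rows with `None` at the missing positions returns a `reconstruct` function and the per-epoch loss; `fill_missing(rows, reconstruct)` then produces a completed copy of the data. Because the model only maps from the latent value to the data, no complete input vector is ever needed, which is why missing entries can simply be left out of the error.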
Pages: 3887-3895
Page count: 9