Non-linear PCA: a missing data approach

Cited by: 146
Authors
Scholz, M [1]
Kaplan, F
Guy, CL
Kopka, J
Selbig, J
Affiliations
[1] Max Planck Inst Mol Plant Physiol, Potsdam, Germany
[2] Univ Florida, Plant Mol & Cellular Bot Program, Dept Environm Hort, Gainesville, FL 32611 USA
[3] Univ Potsdam, Potsdam, Germany
DOI
10.1093/bioinformatics/bti634
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Visualizing and analysing the potential non-linear structure of a dataset is becoming an important task in molecular biology. This is even more challenging when the data have missing values.
Results: Here, we propose an inverse model that performs non-linear principal component analysis (NLPCA) from incomplete datasets. Missing values are ignored while optimizing the model, but can be estimated afterwards. Results are shown for both artificial and experimental datasets. In contrast to linear methods, non-linear methods were able to give better missing value estimations for non-linear structured data.
Application: We applied this technique to a time course of metabolite data from a cold stress experiment on the model plant Arabidopsis thaliana, and could approximate the mapping function from any time point to the metabolite responses. Thus, the inverse NLPCA provides greatly improved information for better understanding the complex response to cold stress.
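The central idea in the abstract, fitting a model while ignoring missing entries and reading estimates off the reconstruction afterwards, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a small one-hidden-layer tanh decoder trained with plain gradient descent, and the names `fit_inverse_nlpca`, `n_hidden`, `lr` and the toy data are hypothetical. In this sketch the component scores `Z` are treated as free parameters and optimized together with the decoder weights, and the squared error is taken over observed entries only.

```python
import numpy as np


def fit_inverse_nlpca(X, n_components=1, n_hidden=8, n_iter=2000, lr=0.05, seed=0):
    """Illustrative decoder-only (inverse) model fitted on observed entries only.

    X is an (n_samples, n_features) array with np.nan marking missing values.
    Returns the latent scores Z, the reconstruction X_hat, and the loss history.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)              # True where a value was observed
    X0 = np.where(mask, X, 0.0)      # placeholder zeros, masked out of the loss
    n, d = X.shape
    n_obs = mask.sum()

    # Free parameters: latent scores Z plus decoder weights and biases.
    Z = 0.1 * rng.standard_normal((n, n_components))
    W1 = 0.5 * rng.standard_normal((n_components, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.5 * rng.standard_normal((n_hidden, d))
    b2 = np.zeros(d)

    losses = []
    for _ in range(n_iter):
        H = np.tanh(Z @ W1 + b1)              # hidden layer
        X_hat = H @ W2 + b2                   # reconstruction of every entry
        R = np.where(mask, X_hat - X0, 0.0)   # missing entries contribute zero error
        losses.append(float((R ** 2).sum() / n_obs))

        # Backpropagate the masked mean squared error.
        G = 2.0 * R / n_obs
        gA = (G @ W2.T) * (1.0 - H ** 2)
        gW2, gb2 = H.T @ G, G.sum(axis=0)
        gW1, gb1 = Z.T @ gA, gA.sum(axis=0)
        gZ = gA @ W1.T

        W2 -= lr * gW2
        b2 -= lr * gb2
        W1 -= lr * gW1
        b1 -= lr * gb1
        Z -= lr * n * gZ   # each row of Z sees about 1/n of the loss, so rescale its step

    X_hat = np.tanh(Z @ W1 + b1) @ W2 + b2
    return Z, X_hat, losses


# Toy data (an assumption for illustration): a noisy curve in 3-D
# with roughly 10% of the entries removed at random.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, size=100)
X_full = np.column_stack([t, t ** 2, t ** 3]) + 0.01 * rng.standard_normal((100, 3))
X_missing = X_full.copy()
X_missing[rng.random(X_full.shape) < 0.1] = np.nan

Z, X_hat, losses = fit_inverse_nlpca(X_missing)
# Keep observed values and fill the gaps with the model's reconstruction.
X_filled = np.where(np.isnan(X_missing), X_hat, X_missing)
```

On the toy data, comparing `X_filled` with `X_full` at the removed positions gives a rough check of the estimates. The quality of the fit depends on the seed, the learning rate and the hidden-layer size, so the sketch should be read as a demonstration of the masking idea and not as a tuned method.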
Pages: 3887-3895
Page count: 9