Dimensionality reduction and visualization in principal component analysis

Cited by: 213
Authors
Ivosev, Gordana [1]
Burton, Lyle [1]
Bonner, Ron [1]
Affiliations
[1] MDS Analyt Technol, Concord, ON L4K 4V8, Canada
DOI
10.1021/ac800110w
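Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302 [Analytical Chemistry]; 081704 [Applied Chemistry]
Abstract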
Many modern applications of analytical chemistry involve the collection of large megavariate data sets and subsequent processing with multivariate analysis (MVA) techniques, two of the more common goals being data analysis (also known as data mining and exploratory data analysis) and classification. Classification attempts to determine variables that can distinguish known classes, allowing unknown samples to be correctly assigned, whereas data analysis seeks to uncover and understand or confirm relationships between the samples and the variables. An important part of analysis is visualization, which allows analysts to apply their expertise and knowledge and is often easier for the samples than the variables since there are frequently far more of the latter. Here we describe principal component variable grouping (PCVG), an unsupervised, intuitive method that assigns a large number of variables to a smaller number of groups that can be more readily visualized and understood. Knowledge of the source or nature of the variables in a group allows them all to be appropriately treated, for example, removed if they result from uninteresting effects or replaced by a single representative for further processing.
Pages: 4933-4944
Page count: 12