Generalization of pair correlation method (PCM) for nonparametric variable selection

Cited by: 43
Authors
Héberger, K
Rajkó, R
Affiliations
[1] Hungarian Acad Sci, Inst Chem, Chem Res Ctr, H-1525 Budapest, Hungary
[2] Univ Szeged, Coll Fac Food Engn, Dept Unit Operat & Environm Engn, H-6701 Szeged, Hungary
Keywords
pair correlation method; non-parametric variable and/or model selection
DOI
10.1002/cem.748
CLC classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The pair correlation method (PCM) has been developed for choosing between two correlated predictor variables (factors) provided that the scatter is caused not only by random effects. The distinction between two variables can be made using an arrangement into a 2 × 2 contingency table. Further on, suitable test statistics can be used to decide the significance of differences between factors. PCM can easily be generalized (GPCM) for variable selection purposes using more than two variables. The comparison of factors can be made pairwise in all possible combinations. If a given statistical test indicates a significant difference between the factors, the following terms are used for the overwhelming and subordinate factors: superior-inferior or winner-loser respectively. Every comparison can mark a factor as superior, inferior or no decision can be made. The following step is ranking of predictor variables. Three ways of ranking have been elaborated: (i) simple ranking, (ii) ranking based on differences and (iii) ranking according to probability-weighted differences. (Difference here means number of wins minus number of losses.) Suitable examples are presented to show the usefulness and applicability of the method in various conditions. Copyright © 2002 John Wiley & Sons, Ltd.
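The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes the pairwise tests have already been run and that each significant comparison is available as a (superior, inferior, p-value) triple; undecided pairs are simply left out. The abstract defines only the difference (wins minus losses), so the function name `rank_factors`, the reading of simple ranking as ordering by number of wins, and the weight `1 - p` used for the probability-weighted variant are all assumptions made for illustration, not the authors' definitions.

```python
from collections import defaultdict


def rank_factors(factors, comparisons):
    """Tally pairwise outcomes for each factor.

    comparisons: iterable of (superior, inferior, p_value) triples,
    one per pair where the test found a significant difference.
    Pairs with no decision are omitted.
    """
    wins = defaultdict(int)
    losses = defaultdict(int)
    weighted = defaultdict(float)
    for superior, inferior, p_value in comparisons:
        wins[superior] += 1
        losses[inferior] += 1
        # Assumed weighting (not from the abstract): a more significant
        # result (smaller p) counts more, via the weight 1 - p.
        weighted[superior] += 1.0 - p_value
        weighted[inferior] -= 1.0 - p_value
    return {
        f: {
            "wins": wins[f],
            "losses": losses[f],
            "difference": wins[f] - losses[f],  # wins minus losses
            "weighted": weighted[f],
        }
        for f in factors
    }


def order_by(table, key):
    """Return factor names sorted from best to worst on the given score."""
    return sorted(table, key=lambda f: table[f][key], reverse=True)


# Hypothetical outcomes: x1 beats x2 and x3; x2 vs x3 gave no decision.
factors = ["x1", "x2", "x3"]
comparisons = [("x1", "x2", 0.01), ("x1", "x3", 0.04)]
table = rank_factors(factors, comparisons)

print(order_by(table, "wins"))        # simple ranking (assumed: by wins)
print(order_by(table, "difference"))  # ranking based on differences
print(order_by(table, "weighted"))    # probability-weighted differences
```

In this toy input x2 and x3 tie on wins and on difference, and only the weighted score separates them, which is the kind of case where the third ranking would be expected to help.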
Pages: 436-443
Number of pages: 8