Dimensionality reduction using genetic algorithms

Cited by: 551
Authors
Raymer, ML [1 ]
Punch, WE
Goodman, ED
Kuhn, LA
Jain, AK
Institutions
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[2] Michigan State Univ, Case Ctr Comp Aided Engn & Mfg, E Lansing, MI 48824 USA
[3] Michigan State Univ, Dept Biochem, E Lansing, MI 48824 USA
Funding
U.S. National Science Foundation;
Keywords
curse of dimensionality; feature extraction; feature selection; genetic algorithms; pattern classification;
DOI
10.1109/4235.850656
CLC classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured, since each new feature may be a linear combination of all of the features in the original pattern vector. Here, we present a new approach to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. A masking vector is also employed to perform simultaneous selection of a subset of the features. We employ this technique in combination with the k nearest neighbor classification rule, and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection and linear discriminant analysis. We also present results for the identification of favorable water-binding sites on protein surfaces, an important problem in biochemistry and drug design.
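The core idea in the abstract — an individual that encodes a per-feature weight vector plus a binary masking vector, evaluated by the accuracy of a k-nearest-neighbor classifier on the transformed data — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy dataset, the leave-one-out fitness, and the random-search loop standing in for the full genetic algorithm (selection, crossover, and mutation are omitted) are all assumptions made here for brevity.

```python
import random

def transform(x, weights, mask):
    """Scale each feature by its weight; masked-out features are dropped."""
    return [w * v for w, v, m in zip(weights, x, mask) if m]

def knn_accuracy(samples, labels, weights, mask, k=3):
    """Fitness: leave-one-out k-NN accuracy on the weighted, masked features."""
    if not any(mask):
        return 0.0  # an all-zero mask selects no features
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        xt = transform(x, weights, mask)
        # Squared Euclidean distance to every other sample in weighted space.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xt, transform(s, weights, mask))),
             labels[j])
            for j, s in enumerate(samples) if j != i
        )
        votes = [lbl for _, lbl in dists[:k]]
        if max(set(votes), key=votes.count) == y:
            correct += 1
    return correct / len(samples)

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
samples = [(0.0, 5.0), (0.1, -3.0), (0.2, 4.0), (1.0, 4.9), (1.1, -2.0), (0.9, 3.0)]
labels  = [0, 0, 0, 1, 1, 1]

def random_individual(n):
    # One candidate solution: real-valued weights plus a binary feature mask.
    return ([random.random() for _ in range(n)],
            [random.randint(0, 1) for _ in range(n)])

# Random search stands in for the GA's evolutionary loop.
random.seed(0)
best = max((random_individual(2) for _ in range(50)),
           key=lambda ind: knn_accuracy(samples, labels, *ind))
print(knn_accuracy(samples, labels, *best))
```

A good individual here masks out the noise feature (or down-weights it), so the k-NN rule classifies on the informative dimension alone; the GA in the paper searches this weight-and-mask space far more systematically than the random sampling shown above.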
Pages: 164-171
Page count: 8