Genetic programming with a genetic algorithm for feature construction and selection

Cited by: 93
Authors
Smith M.G. [1]
Bull L. [1]
Affiliation
[1] Faculty of Computing, Engineering and Mathematical Sciences, University of the West of England, Bristol
Keywords
Classification; Feature construction; Feature selection; Genetic algorithm; Genetic programming; Machine learning
DOI
10.1007/s10710-005-2988-7
Abstract
The use of machine learning techniques to automatically analyse data for information is becoming increasingly widespread. In this paper we primarily examine the use of Genetic Programming and a Genetic Algorithm to pre-process data before it is classified using the C4.5 decision tree learning algorithm. Genetic Programming is used to construct new features from those available in the data, a potentially significant process for data mining since it gives consideration to hidden relationships between features. A Genetic Algorithm is used to determine which such features are the most predictive. Using ten well-known datasets we show that our approach, in comparison to C4.5 alone, provides marked improvement in a number of cases. We then examine its use with other well-known machine learning techniques. © 2005 Springer Science + Business Media, Inc.
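The pre-processing idea in the abstract can be illustrated with a minimal sketch. It assumes a constructed feature is an expression tree over the original features and that the Genetic Algorithm individual is a bit mask over the original and constructed columns. The function set `OPS`, the tuple-based tree encoding, and the names `evaluate` and `transform` are hypothetical and are not taken from the paper. The evolutionary search itself and the C4.5 classifier are omitted; the sketch only shows how one candidate solution would reshape the data before classification.

```python
import operator

# Hypothetical function set for constructed features (an assumption, not the paper's set).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(tree, row):
    """Evaluate an expression tree on one data row.

    A tree is either ("x", i), meaning original feature i,
    or (op, left, right) with op a key of OPS.
    """
    if tree[0] == "x":
        return row[tree[1]]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def transform(rows, trees, mask):
    """Append constructed features to each row, then keep columns whose mask bit is 1."""
    out = []
    for row in rows:
        extended = list(row) + [evaluate(t, row) for t in trees]
        out.append([v for v, keep in zip(extended, mask) if keep])
    return out

# Two rows with three original features each (made-up illustrative values).
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# Two constructed features: x0 * x1 and x2 - x0.
trees = [("mul", ("x", 0), ("x", 1)), ("sub", ("x", 2), ("x", 0))]
# Mask over 3 original + 2 constructed columns: keep x1 and both constructed features.
mask = [0, 1, 0, 1, 1]

new_rows = transform(rows, trees, mask)
print(new_rows)
```

In a full system along the lines the abstract describes, the transformed rows would be handed to a classifier and the resulting accuracy used as the fitness of the candidate trees and mask.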
Pages: 265-281
Page count: 16