Data mining for feature selection in gene expression autism data

Cited by: 49
Authors
Latkowski, Tomasz [1]
Osowski, Stanislaw [1, 2]
Affiliations
[1] Mil Univ Technol, Fac Elect, PL-00908 Warsaw, Poland
[2] Warsaw Univ Technol, Fac Elect Engn, Warsaw, Poland
Keywords
Gene expression microarrays; Feature selection; Clustering; Classification; Autism; CLASSIFICATION; CANCER;
DOI
10.1016/j.eswa.2014.08.043
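Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
140502 [Artificial intelligence];
Abstract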
The paper presents the application of data mining methods to recognize the most significant genes and gene sequences (treated as features) stored in a gene expression microarray dataset. The investigations are performed on autism data. A few selected feature selection methods have been applied and their results integrated into the final outcome. In this way we identify a small set of the most important genes associated with autism. These genes have then been used in a classification procedure aimed at distinguishing autism cases from reference group members. The results of numerical experiments concerning the selection of the most important genes, and the classification of cases on the basis of the selected genes, are discussed. The main contribution of the paper is a system that fuses the results of many selection approaches into a final set of genes most closely associated with autism. We have also proposed a special procedure for estimating the number of highest-ranked genes used in the classification procedure. © 2014 Elsevier Ltd. All rights reserved.
Pages: 864-872
Page count: 9