A Feature Selection Method Combining Random Forest and Neighborhood Rough Set

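Cited by: 27
Authors
Wu Chenwen (吴辰文)
Wang Wei (王伟)
Li Changsheng (李长生)
Liang Jinghan (梁靖涵)
Yan Guanghui (闫光辉)
Affiliation
[1] School of Electronic and Information Engineering, Lanzhou Jiaotong University
Keywords
tumor gene data; random forest wrapper feature selection; Relief algorithm; neighborhood rough set; feature selection
DOI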
10.20009/j.cnki.21-1106/tp.2017.06.034
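Chinese Library Classification (CLC) number
R73 [Oncology]; TP18 [Artificial intelligence theory]
Discipline classification code
100214 [Oncology]; 140502 [Artificial intelligence]
Abstract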
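Tumor gene data are high-dimensional with small sample sizes. To improve the accuracy of traditional gene classification methods on such data, this paper proposes a feature gene selection method that combines random forest and neighborhood rough set (Random Forest and Neighborhood Rough Set, RFNRS). The method first applies the Relief algorithm to weight the features of the original tumor gene data and removes the low-weight features. It then introduces a random-forest-based wrapper feature selection algorithm (Random Forest Wrapper Feature Select, RFWFS), which uses model accuracy as the evaluation criterion to screen the feature subset. Next, the neighborhood rough set is applied to further optimize the resulting continuous-valued feature subset. Finally, several classical classification algorithms are applied to the selected feature subset. Experimental results show that the method performs well in selecting tumor gene feature subsets and also improves classification performance.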
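The filtering and reduction stages named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `relief_weights`, `neighborhood_dependency`, and `greedy_reduct`, the neighborhood radius `delta`, and the choice of distance measures are all assumptions made for illustration. The Relief variant here visits every sample once instead of drawing random samples, and it assumes two classes with features scaled to [0, 1]. The random forest wrapper stage (RFWFS) is omitted because the abstract does not give its details.

```python
import math


def relief_weights(X, y):
    """Relief-style feature weights (two classes, features scaled to [0, 1]).

    Simplification (assumption): every sample is visited once, rather than
    drawing m random samples as in the original Relief algorithm.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance over all features
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward separation from the nearest miss, penalize distance to the nearest hit
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return weights


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood (on `features`) has a single label."""
    def dist(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

    consistent = 0
    for i in range(len(X)):
        if all(y[j] == y[i] for j in range(len(X)) if dist(X[i], X[j]) <= delta):
            consistent += 1
    return consistent / len(X)


def greedy_reduct(X, y, candidates, delta):
    """Forward selection: add the feature that most increases the dependency."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(neighborhood_dependency(X, y, selected + [f], delta), f)
                  for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best:  # no further gain: stop
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best
```

In a pipeline of the kind the abstract describes, features with low Relief weight would be dropped first, and `greedy_reduct` would then be run on the features that survive the wrapper stage. The threshold on the weights and the value of `delta` are tuning choices that the abstract does not specify.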
Pages: 1358-1362
Page count: 5