mRMRe: an R package for parallelized mRMR ensemble feature selection

Times cited: 191
Authors
De Jay, Nicolas [1 ]
Papillon-Cavanagh, Simon [1 ]
Olsen, Catharina [2 ]
El-Hachem, Nehme [1 ]
Bontempi, Gianluca [2 ]
Haibe-Kains, Benjamin [1 ]
Affiliations
[1] Inst Rech Clin Montreal, Bioinformat & Computat Biol Lab, Integrat Syst Biol Axis, Montreal, PQ H2W 1R7, Canada
[2] Univ Libre Bruxelles, Machine Learning Grp, Dept Comp Sci, B-1050 Brussels, Belgium
DOI
10.1093/bioinformatics/btt383
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Feature selection is one of the main challenges in analyzing high-throughput genomic data. Minimum redundancy maximum relevance (mRMR) is a particularly fast feature selection method for finding a set of features that are both relevant and complementary. Here we describe the mRMRe R package, which extends the mRMR technique with an ensemble approach to better explore the feature space and build more robust predictors. To cope with the computational complexity of the ensemble approach, the main functions of the package are implemented and parallelized in C using the OpenMP Application Programming Interface.
Results: Our ensemble mRMR implementations outperform the classical mRMR approach in terms of prediction accuracy. They identify genes more relevant to the biological context and may lead to richer biological interpretations. The parallelized functions included in the package show significant gains in run-time speed compared with previously released packages.
Pages: 2365-2368
Number of pages: 4
Related papers
12 in total
[1]  
[Anonymous], 2011, COSADE 2011
[2]   The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity [J].
Barretina, Jordi ;
Caponigro, Giordano ;
Stransky, Nicolas ;
Venkatesan, Kavitha ;
Margolin, Adam A. ;
Kim, Sungjoon ;
Wilson, Christopher J. ;
Lehar, Joseph ;
Kryukov, Gregory V. ;
Sonkin, Dmitriy ;
Reddy, Anupama ;
Liu, Manway ;
Murray, Lauren ;
Berger, Michael F. ;
Monahan, John E. ;
Morais, Paula ;
Meltzer, Jodi ;
Korejwa, Adam ;
Jane-Valbuena, Judit ;
Mapa, Felipa A. ;
Thibault, Joseph ;
Bric-Furlong, Eva ;
Raman, Pichai ;
Shipway, Aaron ;
Engels, Ingo H. ;
Cheng, Jill ;
Yu, Guoying K. ;
Yu, Jianjun ;
Aspesi, Peter, Jr. ;
de Silva, Melanie ;
Jagtap, Kalpana ;
Jones, Michael D. ;
Wang, Li ;
Hatton, Charles ;
Palescandolo, Emanuele ;
Gupta, Supriya ;
Mahan, Scott ;
Sougnez, Carrie ;
Onofrio, Robert C. ;
Liefeld, Ted ;
MacConaill, Laura ;
Winckler, Wendy ;
Reich, Michael ;
Li, Nanxin ;
Mesirov, Jill P. ;
Gabriel, Stacey B. ;
Getz, Gad ;
Ardlie, Kristin ;
Chan, Vivien ;
Myer, Vic E. .
NATURE, 2012, 483 (7391) :603-607
[3]  
Ding Chris, 2005, Journal of Bioinformatics and Computational Biology, V3, P185, DOI 10.1142/S0219720005001004
[4]   Systematic identification of genomic markers of drug sensitivity in cancer cells [J].
Garnett, Mathew J. ;
Edelman, Elena J. ;
Heidorn, Sonja J. ;
Greenman, Chris D. ;
Dastur, Anahita ;
Lau, King Wai ;
Greninger, Patricia ;
Thompson, I. Richard ;
Luo, Xi ;
Soares, Jorge ;
Liu, Qingsong ;
Iorio, Francesco ;
Surdez, Didier ;
Chen, Li ;
Milano, Randy J. ;
Bignell, Graham R. ;
Tam, Ah T. ;
Davies, Helen ;
Stevenson, Jesse A. ;
Barthorpe, Syd ;
Lutz, Stephen R. ;
Kogera, Fiona ;
Lawrence, Karl ;
McLaren-Douglas, Anne ;
Mitropoulos, Xeni ;
Mironenko, Tatiana ;
Thi, Helen ;
Richardson, Laura ;
Zhou, Wenjun ;
Jewitt, Frances ;
Zhang, Tinghu ;
O'Brien, Patrick ;
Boisvert, Jessica L. ;
Price, Stacey ;
Hur, Wooyoung ;
Yang, Wanjuan ;
Deng, Xianming ;
Butler, Adam ;
Choi, Hwan Geun ;
Chang, JaeWon ;
Baselga, Jose ;
Stamenkovic, Ivan ;
Engelman, Jeffrey A. ;
Sharma, Sreenath V. ;
Delattre, Olivier ;
Saez-Rodriguez, Julio ;
Gray, Nathanael S. ;
Settleman, Jeffrey ;
Futreal, P. Andrew ;
Haber, Daniel A. .
NATURE, 2012, 483 (7391) :570-U87
[5]  
GUZMANMARTINEZ R, 2011, ECML PKDD 2011 SPRIN, V6911, P597
[6]  
Harrell FE, 1996, STAT MED, V15, P361, DOI 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4
[8]   On combining classifiers [J].
Kittler, J ;
Hatef, M ;
Duin, RPW ;
Matas, J .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (03) :226-239
[9]  
Kuncheva LI, 2007, PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND APPLICATIONS, P390
[10]   minet: A R/Bioconductor Package for Inferring Large Transcriptional Networks Using Mutual Information [J].
Meyer, Patrick E. ;
Lafitte, Frederic ;
Bontempi, Gianluca .
BMC BIOINFORMATICS, 2008, 9 (1)