A New Avenue for Classification and Prediction of Olive Cultivars Using Supervised and Unsupervised Algorithms

Cited by: 26
Authors
Beiki, Amir H. [1, 2]
Saboor, Saba [3]
Ebrahimi, Mansour [1, 2]
Affiliations
[1] Univ Qom, Dept Biol, Sch Basic Sci, Qom, Iran
[2] Univ Qom, Bioinformat Res Grp, Qom, Iran
[3] IKIU, Fac Engn & Technol, Dept Agr Biotechnol, Qazvin, Iran
Keywords
OLEA-EUROPAEA L; GENETIC DIVERSITY; ENZYME THERMOSTABILITY; FEATURE-SELECTION; SP-NOV; SEQUENCE; FEATURES; IDENTIFICATION; MARKER; OIL;
DOI
10.1371/journal.pone.0044164
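Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract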
Various methods have been used to identify cultivars of olive trees; herein we used different bioinformatics algorithms to propose new tools for classifying 10 olive cultivars based on RAPD and ISSR genetic marker datasets generated from PCR reactions. Five RAPD markers (OPA0a21, OPD16a, OP01a1, OPD16a1 and OPA0a8) and five ISSR markers (UBC841a4, UBC868a7, UBC841a14, U12BC807a and UBC810a13) were selected as the most important markers by all attribute weighting models. K-Medoids unsupervised clustering run on the SVM dataset assigned every olive cultivar to the correct class. All 176 trees induced by the decision tree models were meaningful, and the UBC841a4 attribute distinguished between foreign and domestic olive cultivars with 100% accuracy. Predictive machine learning algorithms (SVM and Naive Bayes) also predicted the correct class of olive cultivars with 100% accuracy. For the first time, our results showed that data mining techniques can be used effectively to distinguish between plant cultivars, and that the machine learning based systems proposed in this study can predict new olive cultivars with the best possible accuracy.
Pages: 9