Verification for generalizability and accuracy of a thinning-trees selection model with the ensemble learning algorithm and the cross-validation method

Cited: 15
Author
Minowa, Yasushi [1 ]
Affiliation
[1] Kyoto Prefectural Univ, Sakyo Ku, Kyoto 6068522, Japan
Keywords
ensemble learning; m-fold cross validation; pattern-recognition algorithm; thinning-trees selection; WEKA;
DOI
10.1007/s10310-008-0084-6
Chinese Library Classification (CLC) number
S7 [Forestry];
Subject classification code
0829 ; 0907 ;
Abstract
To build a highly effective model for selecting trees for thinning under various forestry goals, the author examined the generalizability and accuracy of models built with several ensemble learning algorithms and the m-fold cross-validation method. These techniques improve discrimination accuracy by combining or integrating multiple learning results whose individual accuracies are not very high. WEKA, a machine learning tool for data mining written in Java, was used to verify the results of the simulation models. The number of samples was 503. The pattern-recognition algorithms in this study comprised five classification-type models and one function-type model. It was found that: (1) without cross validation, two pattern-recognition algorithms showed comparatively high discrimination accuracy; (2) with cross validation, discrimination accuracy decreased overall but did not differ greatly from that without cross validation; and (3) from the viewpoint of generalizability, a model with around 70% discrimination accuracy was constructed. To construct more effective models, the model should be designed to use suitable algorithms or to incorporate re-sampling methods such as ensemble learning and cross validation. Additionally, for small sample datasets, ensemble learning is an effective method for constructing efficient models.
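As an illustrative sketch of the workflow described in the abstract, the Java snippet below evaluates an ensemble classifier with m-fold cross-validation using the WEKA API. The dataset file name (thinning_trees.arff), the choice of bagged J48 (C4.5) decision trees as the ensemble learner, and the 10-fold / 50-iteration settings are assumptions made for illustration; they are not taken from the paper.

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.Bagging;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class ThinningSelectionCV {
    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF file of candidate trees; the last attribute is assumed
        // to be the class label (thin / do not thin) for this sketch.
        Instances data = new DataSource("thinning_trees.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // One example of ensemble learning: bagging 50 J48 (C4.5) decision trees.
        Bagging ensemble = new Bagging();
        ensemble.setClassifier(new J48());
        ensemble.setNumIterations(50);

        // m-fold cross-validation (here m = 10) to estimate generalizability.
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(ensemble, data, 10, new Random(1));

        System.out.println(eval.toSummaryString("=== 10-fold cross-validation ===", false));
        System.out.printf("Discrimination accuracy: %.1f%%%n", eval.pctCorrect());
    }
}

Swapping the Bagging meta-classifier for other WEKA ensemble learners (for example weka.classifiers.meta.AdaBoostM1 or weka.classifiers.trees.RandomForest) allows the same cross-validated comparison across algorithms.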
Pages: 275-285
Page count: 11