Representing and reducing error in natural-resource classification using model combination

Cited by: 18
Authors
Huang, Z [1]
Lees, BG [1]
Affiliation
[1] Australian Natl Univ, Sch Resources Environm & Soc, Canberra, ACT 0200, Australia
Keywords
representing error; reducing error; natural-resource classification; model combination;
DOI
10.1080/13658810500032446
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Artificial Intelligence (AI) models such as Artificial Neural Networks (ANNs), Decision Trees and Dempster-Shafer's Theory of Evidence have long been claimed to be more error-tolerant than conventional statistical models, but the way error propagates through these models is unclear. Two sources of error are identified in this study: sampling error and attribute error. The results show that these errors propagate differently through the three AI models. The Decision Tree was the most affected by error, the Artificial Neural Network was less affected, and the Theory of Evidence model was not affected by the errors at all. The study indicates that AI models have very different modes of handling errors. In this case, the machine-learning models, including ANNs and Decision Trees, are more sensitive to input errors. Dempster-Shafer's Theory of Evidence has demonstrated better potential in dealing with input errors when multisource data sets are involved. The study suggests a strategy of combining AI models to improve classification accuracy. Several combination approaches have been applied, based on a 'majority voting system', a simple average, Dempster-Shafer's Theory of Evidence, and fuzzy-set theory. These approaches all increased classification accuracy to some extent. Two of them also demonstrated good performance in handling input errors. Second-stage combination approaches, which use statistical evaluation of the initial combinations, are able to further improve classification results. One of these second-stage combination approaches increased the overall classification accuracy on forest types from the 46.5% of the original Decision Tree model to 54%, and its visual appearance is also much closer to the ground data. By combining models, it becomes possible to calculate quantitative confidence measurements for the classification results, which can then serve as a better error representation. Final classification products include not only the predicted hard classes for individual cells, but also estimates of the probability and the confidence measurements of the prediction.
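As an illustration of two of the combination approaches named in the abstract, the following is a minimal sketch of majority voting and simple averaging for a single grid cell. It is not the authors' implementation: the function names, the class labels ("A", "B", "C") and the probability values are all hypothetical, and the "confidence" values shown (share of agreeing models, mean probability of the winning class) are assumed stand-ins for whatever measures the paper actually uses.

```python
from collections import Counter


def majority_vote(labels):
    """Combine hard class labels from several models for one cell.

    Returns the most frequent label and the share of models that agree
    with it (an assumed, simple confidence measure).
    """
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


def simple_average(prob_vectors):
    """Combine per-class probabilities from several models for one cell.

    Each element of prob_vectors is a dict mapping class -> probability.
    Returns the class with the highest mean probability and that mean.
    """
    n = len(prob_vectors)
    classes = prob_vectors[0].keys()
    mean = {c: sum(p[c] for p in prob_vectors) / n for c in classes}
    label = max(mean, key=mean.get)
    return label, mean[label]


# Hypothetical outputs of three models (e.g. Decision Tree, ANN,
# evidence-based model) for one cell; the values are invented.
model_probs = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.2, "B": 0.5, "C": 0.3},
    {"A": 0.3, "B": 0.5, "C": 0.2},
]

# Hard label per model is the class with the highest probability.
hard_labels = [max(p, key=p.get) for p in model_probs]

vote_label, vote_agreement = majority_vote(hard_labels)
avg_label, avg_probability = simple_average(model_probs)
```

In this sketch each cell ends up with a hard class plus a numeric score, which mirrors the abstract's point that combined outputs can carry probability and confidence estimates alongside the predicted class.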
Pages: 603-621
Page count: 19