Knowledge discovery from soil maps using inductive learning

Cited by: 79
Authors
Qi, F
Zhu, AX
Affiliations
[1] Univ Wisconsin, Dept Geog, Madison, WI 53706 USA
[2] Chinese Acad Sci, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
[3] Chinese Acad Sci, Inst Geog Sci & Nat Resources, Beijing 100101, Peoples R China
DOI
10.1080/13658810310001596049
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper develops a knowledge discovery procedure for extracting knowledge of soil-landscape models from a soil map. It has broad relevance to knowledge discovery from other natural resource maps. The procedure consists of four major steps: data preparation, data preprocessing, pattern extraction, and knowledge consolidation. In order to recover true expert knowledge from the error-prone soil maps, our study pays specific attention to the reduction of representation noise in soil maps. The data preprocessing step has exhibited an important role in obtaining greater accuracy. A specific method for sampling pixels based on modes of environmental histograms has proven to be effective in terms of reducing noise and constructing representative sample sets. Three inductive learning algorithms, the See5 decision tree algorithm, Naive Bayes, and artificial neural network, are investigated for a comparison concerning learning accuracy and result comprehensibility. See5 proves to be an accurate method and produces the most comprehensible results, which are consistent with the rules (expert knowledge) used in producing the soil map. The incorporation of spatial information into the knowledge discovery process is found not only to improve the accuracy of the extracted knowledge, but also to add to the explicitness and extensiveness of the extracted soil-landscape model.
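The abstract mentions sampling pixels based on the modes of environmental histograms but gives no detail. The sketch below is one possible reading of that idea, not the authors' implementation: for each soil class, it bins one environmental variable into equal-width histogram bins and keeps only the pixels that fall in the most populated bin. The function names (`modal_bin_pixels`, `sample_by_modes`), the equal-width binning, and the single-variable input are all assumptions made for illustration.

```python
from collections import Counter


def modal_bin_pixels(values, n_bins=10):
    """Return positions of the values that fall in the most populated bin.

    Hypothetical helper: uses equal-width bins between min and max.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so every pixel sits at the mode.
        return list(range(len(values)))
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value lands in the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    mode_bin, _ = Counter(bins).most_common(1)[0]
    return [i for i, b in enumerate(bins) if b == mode_bin]


def sample_by_modes(pixels, n_bins=10):
    """Group pixels by soil class and keep those near each class's mode.

    pixels: list of (soil_class, environmental_value) tuples.
    Returns a dict mapping soil_class to a list of pixel indices.
    """
    by_class = {}
    for idx, (soil, value) in enumerate(pixels):
        by_class.setdefault(soil, []).append((idx, value))

    samples = {}
    for soil, items in by_class.items():
        idxs = [i for i, _ in items]
        vals = [v for _, v in items]
        samples[soil] = [idxs[k] for k in modal_bin_pixels(vals, n_bins)]
    return samples


# Toy data: class "A" has three pixels near 10 and one outlier at 30.
pixels = [("A", 10.0), ("A", 10.5), ("A", 10.2), ("A", 30.0),
          ("B", 5.0), ("B", 5.0)]
samples = sample_by_modes(pixels, n_bins=4)
```

In this toy case the outlying pixel of class "A" is dropped, which illustrates how mode-based sampling could filter pixels that are atypical of their mapped soil class. The retained samples could then be passed to any inductive learner, such as a decision tree.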
Pages: 771-795
Page count: 25